From balay at mcs.anl.gov Mon Jun 1 16:11:28 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 1 Jun 2009 16:11:28 -0500 (CDT) Subject: Mismatch in explicit fortran interface for MatGetInfo In-Reply-To: <4A215AB2.2010900@imperial.ac.uk> References: <4A18016F.6030805@imperial.ac.uk> <0A67546F-4327-4265-B94D-B889B94644E5@mcs.anl.gov> <4A212A19.3090404@imperial.ac.uk> <4A215AB2.2010900@imperial.ac.uk> Message-ID: On Sat, 30 May 2009, Stephan Kramer wrote: > Satish Balay wrote: > > On Sat, 30 May 2009, Stephan Kramer wrote: > > > > > Thanks a lot for looking into this. The explicit fortran interfaces are in > > > general very useful. The problem occurred for me with petsc-3.0.0-p1. I'm > > > happy to try it out with a more recent patch-level or with petsc-dev. > > > > Did you configure with '--with-fortran-interfaces=1' or are you > > directly using '#include "finclude/ftn-auto/petscmat.h90"'? > > > > Configured with '--with-fortran-interfaces=1', yes, and then using them via > the fortran modules: "use petscksp", "use petscmat", etc. ok. --with-fortran-interfaces was broken in p0, worked in p1,p2,p3,p4 - broken in curent p5. The next patch update - p6 will have the fix for this issue [along with the fix for MatGetInfo() interface] Satish From Andreas.Grassl at student.uibk.ac.at Tue Jun 2 15:23:37 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 02 Jun 2009 22:23:37 +0200 Subject: VecView behaviour In-Reply-To: <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> Message-ID: <4A258A49.2020502@student.uibk.ac.at> Barry Smith schrieb: > > On May 29, 2009, at 4:34 AM, Andreas Grassl wrote: > >> Hello, >> >> I'm working with the PCNN preconditioner and hence with >> ISLocalToGlobalMapping. >> After solving I want to write the solution to an ASCII-file where only >> the >> values belonging to the "external" global numbering are given and not >> followed > ^^^^^^^^^^^^^^^^^^^^^^^ >> >> by the zeros. > > What do you mean? What parts of the vector do you want? I want the first actdof entries actdof is the number of DOF the system has. the values of indices is in the range of 0 to actdof-1. I create the mapping by ISLocalToGlobalMappingCreate(commw,ind_length,indices,&gridmapping); Due to the "existence" of interface DOF's the sum over all ind_length is greather than actdof, namely the size of the Vectors, but only actdof entries of this Vector are nonzero, if I view it. >> >> Currently I'm giving this commands: >> >> ierr = >> PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_SYMMODU);CHKERRQ(ierr); >> ierr = VecView(X,viewer);CHKERRQ(ierr); I hope you got an idea, what problem I have. Cheers, ando -- /"\ \ / ASCII Ribbon X against HTML email / \ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 315 bytes Desc: OpenPGP digital signature URL: From bsmith at mcs.anl.gov Tue Jun 2 16:07:22 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 2 Jun 2009 16:07:22 -0500 Subject: VecView behaviour In-Reply-To: <4A258A49.2020502@student.uibk.ac.at> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> Message-ID: <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> Hmm, it sounds like the difference between local "ghosted" vectors and the global parallel vectors. 
But I do not understand why any of the local vector entries would be zero. Doesn't the vector X that is passed into KSP (or SNES) have the global entries and uniquely define the solution? Why is viewing that not right? Barry On Jun 2, 2009, at 3:23 PM, Andreas Grassl wrote: > Barry Smith schrieb: >> >> On May 29, 2009, at 4:34 AM, Andreas Grassl wrote: >> >>> Hello, >>> >>> I'm working with the PCNN preconditioner and hence with >>> ISLocalToGlobalMapping. >>> After solving I want to write the solution to an ASCII-file where >>> only >>> the >>> values belonging to the "external" global numbering are given and >>> not >>> followed >> ^^^^^^^^^^^^^^^^^^^^^^^ >>> >>> by the zeros. >> >> What do you mean? What parts of the vector do you want? > > I want the first actdof entries > > actdof is the number of DOF the system has. > the values of indices is in the range of 0 to actdof-1. > I create the mapping by > ISLocalToGlobalMappingCreate(commw,ind_length,indices,&gridmapping); > > Due to the "existence" of interface DOF's the sum over all > ind_length is > greather than actdof, namely the size of the Vectors, but only > actdof entries of > this Vector are nonzero, if I view it. > >>> >>> Currently I'm giving this commands: >>> >>> ierr = >>> PetscViewerSetFormat >>> (viewer,PETSC_VIEWER_ASCII_SYMMODU);CHKERRQ(ierr); >>> ierr = VecView(X,viewer);CHKERRQ(ierr); > > I hope you got an idea, what problem I have. > > Cheers, > > ando > > -- > /"\ > \ / ASCII Ribbon > X against HTML email > / \ > > From Andreas.Grassl at student.uibk.ac.at Wed Jun 3 05:29:00 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Wed, 03 Jun 2009 12:29:00 +0200 Subject: VecView behaviour In-Reply-To: <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> Message-ID: <4A26506C.5050002@student.uibk.ac.at> Barry Smith schrieb: > Hmm, it sounds like the difference between local "ghosted" vectors > and the global parallel vectors. But I do not understand why any of the > local vector entries would be zero. > Doesn't the vector X that is passed into KSP (or SNES) have the global > entries and uniquely define the solution? Why is viewing that not right? > I still don't understand fully the underlying processes of the whole PCNN solution procedure, but trying around I substituted MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, gridmapping, &A); by MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A); and received the needed results. Furthermore it seems, that the load balance is now better, although I still don't reach the expected values, e.g. ilu-cg 320 iterations, condition 4601 cg only 1662 iterations, condition 84919 nn-cg on 2 nodes 229 iterations, condition 6285 nn-cg on 4 nodes 331 iterations, condition 13312 or is it not to expect, that nn-cg is faster than ilu-cg? cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 
13 Zi 709 / \ +43 (0)512 507 6091 From jed at 59A2.org Wed Jun 3 09:41:00 2009 From: jed at 59A2.org (Jed Brown) Date: Wed, 03 Jun 2009 16:41:00 +0200 Subject: VecView behaviour In-Reply-To: <4A26506C.5050002@student.uibk.ac.at> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> Message-ID: <4A268B7C.3010305@59A2.org> Andreas Grassl wrote: > Barry Smith schrieb: >> Hmm, it sounds like the difference between local "ghosted" vectors >> and the global parallel vectors. But I do not understand why any of the >> local vector entries would be zero. >> Doesn't the vector X that is passed into KSP (or SNES) have the global >> entries and uniquely define the solution? Why is viewing that not right? >> > > I still don't understand fully the underlying processes of the whole PCNN > solution procedure, but trying around I substituted > > MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, > gridmapping, &A); This creates a matrix that is bigger than you want, and gives you the dead values at the end (global dofs that are not in the range of the LocalToGlobalMapping. This from the note on MatCreateIS: | m and n are NOT related to the size of the map, they are the size of the part of the vector owned | by that process. m + nghosts (or n + nghosts) is the length of map since map maps all local points | plus the ghost points to global indices. > by > > MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A); This creates a matrix of the correct size, but it looks like it could easily end up with the "wrong" dofs owned locally. What you probably want to do is: 1. Resolve ownership just like with any other DD method. This partitions your dofs into n owned dofs and ngh ghosted dofs on each process. The global sum of n is N, the size of the global vectors that the solver will interact with. 2. Make an ISLocalToGlobalMapping where all the owned dofs come first, mapping (0..n-1) to (rstart..rstart+n-1), followed by the ghosted dofs (local index n..ngh-1) which map to remote processes. (rstart is the global index of the first owned dof) One way to do this is to use MPI_Scan to find rstart, then number all the owned dofs and scatter the result. The details will be dependent on how you store your mesh. (I'm assuming it's unstructured, this step is trivial if you use a DA.) 3. Call MatCreateIS(comm,n,n,PETSC_DECIDE,PETSC_DECIDE,mapping,&A); > Furthermore it seems, that the load balance is now better, although I still > don't reach the expected values, e.g. > ilu-cg 320 iterations, condition 4601 > cg only 1662 iterations, condition 84919 > > nn-cg on 2 nodes 229 iterations, condition 6285 > nn-cg on 4 nodes 331 iterations, condition 13312 > > or is it not to expect, that nn-cg is faster than ilu-cg? It depends a lot on the problem. As you probably know, for a second order elliptic problem with exact subdomain solves, the NN preconditioned operator (without a coarse component) has condition number that scales as (1/H^2)(1 + log(H/h))^2 where H is the subdomain diameter and h is the element size. In contrast, overlapping additive Schwarz is 1/H^2 and block Jacobi is 1/(Hh) (the original problem was 1/h^2) In particular, there is no reason to expect that NN is uniformly better than ASM, although it may be for certain problems. 
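Taken together, the condition-number estimates above read (H = subdomain diameter, h = element size; NN here without a coarse component):

\[
  \kappa_{\mathrm{NN}} \sim \frac{1}{H^{2}}\Bigl(1+\log\frac{H}{h}\Bigr)^{2},\qquad
  \kappa_{\mathrm{ASM}} \sim \frac{1}{H^{2}},\qquad
  \kappa_{\mathrm{BJacobi}} \sim \frac{1}{Hh},\qquad
  \kappa_{\mathrm{original}} \sim \frac{1}{h^{2}}.
\]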
When a coarse solve is used, NN becomes (1 + log(H/h))^2 which is quasi-optimal (these methods are known as BDDC, which is essentially equivalent to FETI-DP). The key advantage over multigrid (or multilivel Schwarz) is improved robustness with variable coefficients. My understanding is that PCNN is BDDC, and uses direct subdomain solves by default, but I could have missed something. In particular, if the coarse solve is missing or inexact solves are used, you could easily see relatively poor scaling. AFAIK, it's not for vector problems at this time. Good luck. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From bsmith at mcs.anl.gov Wed Jun 3 10:51:04 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 3 Jun 2009 10:51:04 -0500 Subject: VecView behaviour In-Reply-To: <4A26506C.5050002@student.uibk.ac.at> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> Message-ID: <21552795-DCF3-4DED-BDBB-34F2037C571B@mcs.anl.gov> When properly running nn-cg (are you sure everything is symmetric?) should require 10-30 iterations (certainly for model problems) > nn-cg on 2 nodes 229 iterations, condition 6285 > nn-cg on 4 nodes 331 iterations, condition 13312 Are you sure that your operator has the null space of only constants? Barry On Jun 3, 2009, at 5:29 AM, Andreas Grassl wrote: > Barry Smith schrieb: >> Hmm, it sounds like the difference between local "ghosted" vectors >> and the global parallel vectors. But I do not understand why any of >> the >> local vector entries would be zero. >> Doesn't the vector X that is passed into KSP (or SNES) have the >> global >> entries and uniquely define the solution? Why is viewing that not >> right? >> > > I still don't understand fully the underlying processes of the whole > PCNN > solution procedure, but trying around I substituted > > MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, > gridmapping, &A); > > by > > MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, > gridmapping, &A); > > and received the needed results. > > Furthermore it seems, that the load balance is now better, although > I still > don't reach the expected values, e.g. > ilu-cg 320 iterations, condition 4601 > cg only 1662 iterations, condition 84919 > > nn-cg on 2 nodes 229 iterations, condition 6285 > nn-cg on 4 nodes 331 iterations, condition 13312 > > or is it not to expect, that nn-cg is faster than ilu-cg? > > cheers, > > ando > > -- > /"\ Grassl Andreas > \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik > X against HTML email Technikerstr. 
13 Zi 709 > / \ +43 (0)512 507 6091 From Andreas.Grassl at student.uibk.ac.at Wed Jun 3 17:29:22 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Thu, 04 Jun 2009 00:29:22 +0200 Subject: VecView behaviour In-Reply-To: <21552795-DCF3-4DED-BDBB-34F2037C571B@mcs.anl.gov> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <21552795-DCF3-4DED-BDBB-34F2037C571B@mcs.anl.gov> Message-ID: <4A26F942.7030704@student.uibk.ac.at> Barry Smith schrieb: > > When properly running nn-cg (are you sure everything is symmetric?) > should require 10-30 iterations (certainly for model problems) ok, this was the number I expected. > >> nn-cg on 2 nodes 229 iterations, condition 6285 >> nn-cg on 4 nodes 331 iterations, condition 13312 > > Are you sure that your operator has the null space of only constants? no, I didn't touch anything regarding the null space since I thought it would be done inside the NN-preconditioner. Does this mean I have to set up a null space of the size of the Schur complement system, i.e. the number of interface DOF's? cheers, ando > > Barry > > > On Jun 3, 2009, at 5:29 AM, Andreas Grassl wrote: > >> Barry Smith schrieb: >>> Hmm, it sounds like the difference between local "ghosted" vectors >>> and the global parallel vectors. But I do not understand why any of the >>> local vector entries would be zero. >>> Doesn't the vector X that is passed into KSP (or SNES) have the global >>> entries and uniquely define the solution? Why is viewing that not right? >>> >> >> I still don't understand fully the underlying processes of the whole PCNN >> solution procedure, but trying around I substituted >> >> MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, >> gridmapping, &A); >> >> by >> >> MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, >> gridmapping, &A); >> >> and received the needed results. >> >> Furthermore it seems, that the load balance is now better, although I >> still >> don't reach the expected values, e.g. >> ilu-cg 320 iterations, condition 4601 >> cg only 1662 iterations, condition 84919 >> >> nn-cg on 2 nodes 229 iterations, condition 6285 >> nn-cg on 4 nodes 331 iterations, condition 13312 >> >> or is it not to expect, that nn-cg is faster than ilu-cg? >> >> cheers, >> >> ando >> >> -- >> /"\ Grassl Andreas >> \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik >> X against HTML email Technikerstr. 13 Zi 709 >> / \ +43 (0)512 507 6091 > -- /"\ \ / ASCII Ribbon X against HTML email / \ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 315 bytes Desc: OpenPGP digital signature URL: From Andreas.Grassl at student.uibk.ac.at Wed Jun 3 17:35:23 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Thu, 04 Jun 2009 00:35:23 +0200 Subject: VecView behaviour In-Reply-To: <4A268B7C.3010305@59A2.org> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <4A268B7C.3010305@59A2.org> Message-ID: <4A26FAAB.1000603@student.uibk.ac.at> Thank you for the explanation, first I'll try the null space set up and then I come back to your hints. 
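For reference, a minimal sketch (PETSc 3.0-style C) of the ownership/mapping setup Jed outlines in the quoted steps below. The function name and the assumption that the global indices of the ghost dofs have already been gathered from their owning processes are illustrative only; the signatures of MPI_Scan, ISLocalToGlobalMappingCreate and MatCreateIS follow the usage already shown in this thread.

#include "petscmat.h"

/* Sketch: build the local-to-global mapping with the n owned dofs first
   (global numbers rstart..rstart+n-1), then the ngh ghost dofs, and create
   the MATIS matrix with local size n.  ghosts[] must already hold the global
   indices of the ghost dofs, gathered from their owning processes
   (mesh-dependent, not shown here). */
PetscErrorCode CreateISMatrix(MPI_Comm comm,PetscInt n,PetscInt ngh,
                              const PetscInt ghosts[],Mat *A)
{
  ISLocalToGlobalMapping mapping;
  PetscInt               rstart,i,*idx;
  PetscErrorCode         ierr;

  PetscFunctionBegin;
  /* rstart = global index of the first dof owned by this process */
  ierr    = MPI_Scan(&n,&rstart,1,MPIU_INT,MPI_SUM,comm);CHKERRQ(ierr);
  rstart -= n;                                  /* MPI_Scan is inclusive */

  ierr = PetscMalloc((n+ngh)*sizeof(PetscInt),&idx);CHKERRQ(ierr);
  for (i=0; i<n; i++)   idx[i]   = rstart + i;  /* owned dofs come first */
  for (i=0; i<ngh; i++) idx[n+i] = ghosts[i];   /* ghosted dofs follow   */

  ierr = ISLocalToGlobalMappingCreate(comm,n+ngh,idx,&mapping);CHKERRQ(ierr);
  ierr = PetscFree(idx);CHKERRQ(ierr);

  /* step 3 of the quoted message */
  ierr = MatCreateIS(comm,n,n,PETSC_DECIDE,PETSC_DECIDE,mapping,A);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With this layout the first n local indices map to the contiguous global range rstart..rstart+n-1, so the solver-facing vectors contain no unused trailing entries.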
Jed Brown schrieb: > Andreas Grassl wrote: >> Barry Smith schrieb: >>> Hmm, it sounds like the difference between local "ghosted" vectors >>> and the global parallel vectors. But I do not understand why any of the >>> local vector entries would be zero. >>> Doesn't the vector X that is passed into KSP (or SNES) have the global >>> entries and uniquely define the solution? Why is viewing that not right? >>> >> I still don't understand fully the underlying processes of the whole PCNN >> solution procedure, but trying around I substituted >> >> MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, >> gridmapping, &A); > > This creates a matrix that is bigger than you want, and gives you the > dead values at the end (global dofs that are not in the range of the > LocalToGlobalMapping. > > This from the note on MatCreateIS: > > | m and n are NOT related to the size of the map, they are the size of the part of the vector owned > | by that process. m + nghosts (or n + nghosts) is the length of map since map maps all local points > | plus the ghost points to global indices. > >> by >> >> MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A); > > This creates a matrix of the correct size, but it looks like it could > easily end up with the "wrong" dofs owned locally. What you probably > want to do is: > > 1. Resolve ownership just like with any other DD method. This > partitions your dofs into n owned dofs and ngh ghosted dofs on each > process. The global sum of n is N, the size of the global vectors that > the solver will interact with. > > 2. Make an ISLocalToGlobalMapping where all the owned dofs come first, > mapping (0..n-1) to (rstart..rstart+n-1), followed by the ghosted dofs > (local index n..ngh-1) which map to remote processes. (rstart is the > global index of the first owned dof) > > One way to do this is to use MPI_Scan to find rstart, then number all > the owned dofs and scatter the result. The details will be dependent on > how you store your mesh. (I'm assuming it's unstructured, this step is > trivial if you use a DA.) > > 3. Call MatCreateIS(comm,n,n,PETSC_DECIDE,PETSC_DECIDE,mapping,&A); > >> Furthermore it seems, that the load balance is now better, although I still >> don't reach the expected values, e.g. >> ilu-cg 320 iterations, condition 4601 >> cg only 1662 iterations, condition 84919 >> >> nn-cg on 2 nodes 229 iterations, condition 6285 >> nn-cg on 4 nodes 331 iterations, condition 13312 >> >> or is it not to expect, that nn-cg is faster than ilu-cg? > > It depends a lot on the problem. As you probably know, for a second > order elliptic problem with exact subdomain solves, the NN > preconditioned operator (without a coarse component) has condition > number that scales as > > (1/H^2)(1 + log(H/h))^2 > > where H is the subdomain diameter and h is the element size. > In contrast, overlapping additive Schwarz is > > 1/H^2 > > and block Jacobi is > > 1/(Hh) > > (the original problem was 1/h^2) > > In particular, there is no reason to expect that NN is uniformly better > than ASM, although it may be for certain problems. When a coarse > solve is used, NN becomes > > (1 + log(H/h))^2 > > which is quasi-optimal (these methods are known as BDDC, which is > essentially equivalent to FETI-DP). The key advantage over multigrid > (or multilivel Schwarz) is improved robustness with variable > coefficients. My understanding is that PCNN is BDDC, and uses direct > subdomain solves by default, but I could have missed something. 
In > particular, if the coarse solve is missing or inexact solves are used, > you could easily see relatively poor scaling. AFAIK, it's not for > vector problems at this time. > > Good luck. > > > Jed > -- /"\ \ / ASCII Ribbon X against HTML email / \ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 315 bytes Desc: OpenPGP digital signature URL: From bsmith at mcs.anl.gov Wed Jun 3 17:38:32 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 3 Jun 2009 17:38:32 -0500 Subject: VecView behaviour In-Reply-To: <4A26F942.7030704@student.uibk.ac.at> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <21552795-DCF3-4DED-BDBB-34F2037C571B@mcs.anl.gov> <4A26F942.7030704@student.uibk.ac.at> Message-ID: <4ADA90D3-69D6-4DA5-A6D6-7EC8B0B13DCE@mcs.anl.gov> On Jun 3, 2009, at 5:29 PM, Andreas Grassl wrote: > Barry Smith schrieb: >> >> When properly running nn-cg (are you sure everything is symmetric?) >> should require 10-30 iterations (certainly for model problems) > > ok, this was the number I expected. > >> >>> nn-cg on 2 nodes 229 iterations, condition 6285 >>> nn-cg on 4 nodes 331 iterations, condition 13312 >> >> Are you sure that your operator has the null space of only >> constants? > > no, I didn't touch anything regarding the null space since I thought > it would be > done inside the NN-preconditioner. Does this mean I have to set up a > null space > of the size of the Schur complement system, i.e. the number of > interface DOF's? No, I don't think you need to do anything about the null space. The code in PETSc for NN is for (and only for) a null space of constants. BTW: with 2 or 4 subdomains they all touch the boundary and likely don't have a null space anyways. Run with -ksp_view and make sure the local solves are being done with LU Barry > > > cheers, > > ando > >> >> Barry >> >> >> On Jun 3, 2009, at 5:29 AM, Andreas Grassl wrote: >> >>> Barry Smith schrieb: >>>> Hmm, it sounds like the difference between local "ghosted" vectors >>>> and the global parallel vectors. But I do not understand why any >>>> of the >>>> local vector entries would be zero. >>>> Doesn't the vector X that is passed into KSP (or SNES) have the >>>> global >>>> entries and uniquely define the solution? Why is viewing that not >>>> right? >>>> >>> >>> I still don't understand fully the underlying processes of the >>> whole PCNN >>> solution procedure, but trying around I substituted >>> >>> MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, >>> PETSC_DECIDE, >>> gridmapping, &A); >>> >>> by >>> >>> MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, >>> gridmapping, &A); >>> >>> and received the needed results. >>> >>> Furthermore it seems, that the load balance is now better, >>> although I >>> still >>> don't reach the expected values, e.g. >>> ilu-cg 320 iterations, condition 4601 >>> cg only 1662 iterations, condition 84919 >>> >>> nn-cg on 2 nodes 229 iterations, condition 6285 >>> nn-cg on 4 nodes 331 iterations, condition 13312 >>> >>> or is it not to expect, that nn-cg is faster than ilu-cg? >>> >>> cheers, >>> >>> ando >>> >>> -- >>> /"\ Grassl Andreas >>> \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. >>> Mathematik >>> X against HTML email Technikerstr. 
13 Zi 709 >>> / \ +43 (0)512 507 6091 >> > > -- > /"\ > \ / ASCII Ribbon > X against HTML email > / \ > > From Andreas.Grassl at student.uibk.ac.at Thu Jun 4 11:07:07 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Thu, 04 Jun 2009 18:07:07 +0200 Subject: VecView behaviour In-Reply-To: <4ADA90D3-69D6-4DA5-A6D6-7EC8B0B13DCE@mcs.anl.gov> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <21552795-DCF3-4DED-BDBB-34F2037C571B@mcs.anl.gov> <4A26F942.7030704@student.uibk.ac.at> <4ADA90D3-69D6-4DA5-A6D6-7EC8B0B13DCE@mcs.anl.gov> Message-ID: <4A27F12B.3070403@student.uibk.ac.at> Barry Smith schrieb: > > On Jun 3, 2009, at 5:29 PM, Andreas Grassl wrote: > >> Barry Smith schrieb: >>> >>> When properly running nn-cg (are you sure everything is symmetric?) >>> should require 10-30 iterations (certainly for model problems) >> >> ok, this was the number I expected. >> >>> >>>> nn-cg on 2 nodes 229 iterations, condition 6285 >>>> nn-cg on 4 nodes 331 iterations, condition 13312 >>> >>> Are you sure that your operator has the null space of only constants? >> >> no, I didn't touch anything regarding the null space since I thought >> it would be >> done inside the NN-preconditioner. Does this mean I have to set up a >> null space >> of the size of the Schur complement system, i.e. the number of >> interface DOF's? > > No, I don't think you need to do anything about the null space. The > code in PETSc for NN is for (and only for) a null space of constants. > BTW: with 2 or 4 subdomains they all touch the boundary and likely don't > have a null space anyways. > > Run with -ksp_view and make sure the local solves are being done with LU > I don't find the anomalies... 
setting local_ksp-rtol to 1e-8 doesn't change anything the options passed are: -is_localD_ksp_type preonly -is_localD_pc_factor_shift_positive_definite -is_localD_pc_type lu -is_localN_ksp_type preonly -is_localN_pc_factor_shift_positive_definite -is_localN_pc_type lu -ksp_rtol 1e-8 -ksp_view #-is_localD_ksp_view #-is_localN_ksp_view #-nn_coarse_ksp_view # -pc_is_remove_nullspace_fixed this option doesn't produce any effect -log_summary -options_left and produce: -ksp_view: KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-08, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: nn linear system matrix = precond matrix: Matrix Object: type=is, rows=28632, cols=28632 Matrix Object:(is) type=seqaij, rows=7537, cols=7537 total: nonzeros=359491, allocated nonzeros=602960 using I-node routines: found 4578 nodes, limit used is 5 Matrix Object:(is) type=seqaij, rows=7515, cols=7515 total: nonzeros=349347, allocated nonzeros=601200 using I-node routines: found 5159 nodes, limit used is 5 Matrix Object:(is) type=seqaij, rows=7533, cols=7533 total: nonzeros=357291, allocated nonzeros=602640 using I-node routines: found 4739 nodes, limit used is 5 Matrix Object:(is) type=seqaij, rows=7360, cols=7360 total: nonzeros=364390, allocated nonzeros=588800 using I-node routines: found 3602 nodes, limit used is 5 -is_local...: KSP Object:(is_localD_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(is_localD_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: using Manteuffel shift LU: factor fill ratio needed 4.73566 Factored matrix follows Matrix Object: type=seqaij, rows=6714, cols=6714 package used to perform factorization: petsc total: nonzeros=1479078, allocated nonzeros=1479078 using I-node routines: found 2790 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=6714, cols=6714 total: nonzeros=312328, allocated nonzeros=312328 using I-node routines: found 4664 nodes, limit used is 5 KSP Object:(is_localN_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(is_localN_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: using Manteuffel shift LU: factor fill ratio needed 5.07571 Factored matrix follows Matrix Object: type=seqaij, rows=7537, cols=7537 package used to perform factorization: petsc total: nonzeros=1824671, allocated nonzeros=1824671 using I-node routines: found 2939 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object:(is) type=seqaij, rows=7537, cols=7537 total: nonzeros=359491, allocated nonzeros=602960 using I-node routines: found 4578 nodes, limit used is 5 -nn_coarse_ksp_view: KSP Object:(nn_coarse_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(nn_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 1 Factored matrix follows Matrix 
Object: type=seqaij, rows=4, cols=4 package used to perform factorization: petsc total: nonzeros=4, allocated nonzeros=4 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=4, cols=4 total: nonzeros=4, allocated nonzeros=4 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=4, cols=4 total: nonzeros=4, allocated nonzeros=68 not using I-node (on process 0) routines cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From sapphire.jxy at gmail.com Thu Jun 4 12:15:22 2009 From: sapphire.jxy at gmail.com (xiaoyin ji) Date: Thu, 4 Jun 2009 13:15:22 -0400 Subject: Need help with makefile for multiple source files Message-ID: <6985a8f00906041015j2ec33d87u5b847453afe92f80@mail.gmail.com> Hi, I've got a .F90 code using PETSc ksp solver working, and now I'm trying to separate the module (which contains parameter define and PETSc include files) and main program into two .F90 files. I have trouble with the makefile now because I cannot find the example makefile for such purpose, it seems all PETSc example makefiles are for single code only. Thank you very much! Best, Xiaoyin Ji ---------------------------------------------- Xiaoyin Ji Graduate Student Department of Materials Science and Engineering North Carolina State University From knepley at gmail.com Thu Jun 4 12:21:35 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 Jun 2009 12:21:35 -0500 Subject: Need help with makefile for multiple source files In-Reply-To: <6985a8f00906041015j2ec33d87u5b847453afe92f80@mail.gmail.com> References: <6985a8f00906041015j2ec33d87u5b847453afe92f80@mail.gmail.com> Message-ID: You jsut list all the *.o files in the build rule for the executable. Matt On Thu, Jun 4, 2009 at 12:15 PM, xiaoyin ji wrote: > Hi, > > I've got a .F90 code using PETSc ksp solver working, and now I'm > trying to separate the module (which contains parameter define and > PETSc include files) and main program into two .F90 files. I have > trouble with the makefile now because I cannot find the example > makefile for such purpose, it seems all PETSc example makefiles are > for single code only. > > Thank you very much! > > Best, > > Xiaoyin Ji > > ---------------------------------------------- > > Xiaoyin Ji > Graduate Student > Department of Materials Science and Engineering > North Carolina State University > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Thu Jun 4 12:24:16 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Thu, 04 Jun 2009 10:24:16 -0700 Subject: Need help with makefile for multiple source files In-Reply-To: <6985a8f00906041015j2ec33d87u5b847453afe92f80@mail.gmail.com> References: <6985a8f00906041015j2ec33d87u5b847453afe92f80@mail.gmail.com> Message-ID: <4A280340.6080701@gmail.com> Hi Xiaoyin, I'm not sure what I do is the most elegant, but it works well for me. First, I use a program called makedepf90, which you can find here: http://personal.inet.fi/private/erikedelmann/makedepf90/ This program was designed for f90 and modules and it creates the dependency list you need to compile f90 programs. It puts this in a .depend file. 
Then, my makefile is simple: ============================================================================== # Include the dependency-list created by makedepf90 below include .depend CFLAGS = FFLAGS = CPPFLAGS = FPPFLAGS = include ${PETSC_DIR}/conf/base csemfwd: ${FOBJ} chkopts -${FLINKER} -o csemfwd ${FOBJ} ${PETSC_FORTRAN_LIB} ${PETSC_KSP_LIB} depend .depend: makedepf90 -o DO_NOT_COMPILE *.f *.F > .depend ============================================================================== Of course, you can modify this to suit your own needs. Good luck, Randy M. xiaoyin ji wrote: > Hi, > > I've got a .F90 code using PETSc ksp solver working, and now I'm > trying to separate the module (which contains parameter define and > PETSc include files) and main program into two .F90 files. I have > trouble with the makefile now because I cannot find the example > makefile for such purpose, it seems all PETSc example makefiles are > for single code only. > > Thank you very much! > > Best, > > Xiaoyin Ji > > ---------------------------------------------- > > Xiaoyin Ji > Graduate Student > Department of Materials Science and Engineering > North Carolina State University From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Thu Jun 4 12:45:15 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Thu, 04 Jun 2009 19:45:15 +0200 Subject: Need help with makefile for multiple source files In-Reply-To: <4A280340.6080701@gmail.com> References: <6985a8f00906041015j2ec33d87u5b847453afe92f80@mail.gmail.com> <4A280340.6080701@gmail.com> Message-ID: <20090604194515.anah9mht5iko44gw@webmail.ec-nantes.fr> Hi, I did something like this. RM = /bin/rm MYOBJ = main.o module.o include ${PETSC_DIR}/conf/base test: $(MYOBJS) ${FLINKER} -o test $(MYOBJS) ${PETSC_KSP_LIB} ${RM} *.o .F90.o: $(FCOMPILE) -c -o $@ $< $(MYOBJS): incl.h include ${PETSC_DIR}/conf/test good luck Jarunan -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE Randall Mackie a ??crit??: > Hi Xiaoyin, > > I'm not sure what I do is the most elegant, but it works well for me. > First, I use a program called makedepf90, which you can find here: > > http://personal.inet.fi/private/erikedelmann/makedepf90/ > > This program was designed for f90 and modules and it creates the > dependency list you need to compile f90 programs. It puts this > in a .depend file. > > Then, my makefile is simple: > > > ============================================================================== > # Include the dependency-list created by makedepf90 below > > include .depend > > > CFLAGS = > FFLAGS = > CPPFLAGS = > FPPFLAGS = > > include ${PETSC_DIR}/conf/base > > csemfwd: ${FOBJ} chkopts > -${FLINKER} -o csemfwd ${FOBJ} ${PETSC_FORTRAN_LIB} ${PETSC_KSP_LIB} > > > depend .depend: > makedepf90 -o DO_NOT_COMPILE *.f *.F > .depend > > ============================================================================== > > > Of course, you can modify this to suit your own needs. > > > Good luck, > > Randy M. > > > xiaoyin ji wrote: >> Hi, >> >> I've got a .F90 code using PETSc ksp solver working, and now I'm >> trying to separate the module (which contains parameter define and >> PETSc include files) and main program into two .F90 files. I have >> trouble with the makefile now because I cannot find the example >> makefile for such purpose, it seems all PETSc example makefiles are >> for single code only. >> >> Thank you very much! 
>> >> Best, >> >> Xiaoyin Ji >> >> ---------------------------------------------- >> >> Xiaoyin Ji >> Graduate Student >> Department of Materials Science and Engineering >> North Carolina State University > From bsmith at mcs.anl.gov Thu Jun 4 13:44:09 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 Jun 2009 13:44:09 -0500 Subject: VecView behaviour In-Reply-To: <4A27F12B.3070403@student.uibk.ac.at> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <21552795-DCF3-4DED-BDBB-34F2037C571B@mcs.anl.gov> <4A26F942.7030704@student.uibk.ac.at> <4ADA90D3-69D6-4DA5-A6D6-7EC8B0B13DCE@mcs.anl.gov> <4A27F12B.3070403@student.uibk.ac.at> Message-ID: Run with GMRES, what happens? On Jun 4, 2009, at 11:07 AM, Andreas Grassl wrote: > Barry Smith schrieb: >> >> On Jun 3, 2009, at 5:29 PM, Andreas Grassl wrote: >> >>> Barry Smith schrieb: >>>> >>>> When properly running nn-cg (are you sure everything is symmetric?) >>>> should require 10-30 iterations (certainly for model problems) >>> >>> ok, this was the number I expected. >>> >>>> >>>>> nn-cg on 2 nodes 229 iterations, condition 6285 >>>>> nn-cg on 4 nodes 331 iterations, condition 13312 >>>> >>>> Are you sure that your operator has the null space of only >>>> constants? >>> >>> no, I didn't touch anything regarding the null space since I thought >>> it would be >>> done inside the NN-preconditioner. Does this mean I have to set up a >>> null space >>> of the size of the Schur complement system, i.e. the number of >>> interface DOF's? >> >> No, I don't think you need to do anything about the null space. The >> code in PETSc for NN is for (and only for) a null space of constants. >> BTW: with 2 or 4 subdomains they all touch the boundary and likely >> don't >> have a null space anyways. >> >> Run with -ksp_view and make sure the local solves are being done >> with LU >> > > I don't find the anomalies... 
setting local_ksp-rtol to 1e-8 doesn't > change anything > > the options passed are: > > -is_localD_ksp_type preonly > -is_localD_pc_factor_shift_positive_definite > -is_localD_pc_type lu > -is_localN_ksp_type preonly > -is_localN_pc_factor_shift_positive_definite > -is_localN_pc_type lu > -ksp_rtol 1e-8 > -ksp_view > #-is_localD_ksp_view > #-is_localN_ksp_view > #-nn_coarse_ksp_view > # -pc_is_remove_nullspace_fixed this option doesn't produce any effect > -log_summary > -options_left > > and produce: > > -ksp_view: > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: nn > linear system matrix = precond matrix: > Matrix Object: > type=is, rows=28632, cols=28632 > Matrix Object:(is) > type=seqaij, rows=7537, cols=7537 > total: nonzeros=359491, allocated nonzeros=602960 > using I-node routines: found 4578 nodes, limit used is 5 > Matrix Object:(is) > type=seqaij, rows=7515, cols=7515 > total: nonzeros=349347, allocated nonzeros=601200 > using I-node routines: found 5159 nodes, limit used is 5 > Matrix Object:(is) > type=seqaij, rows=7533, cols=7533 > total: nonzeros=357291, allocated nonzeros=602640 > using I-node routines: found 4739 nodes, limit used is 5 > Matrix Object:(is) > type=seqaij, rows=7360, cols=7360 > total: nonzeros=364390, allocated nonzeros=588800 > using I-node routines: found 3602 nodes, limit used is 5 > > -is_local...: > > KSP Object:(is_localD_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(is_localD_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: using Manteuffel shift > LU: factor fill ratio needed 4.73566 > Factored matrix follows > Matrix Object: > type=seqaij, rows=6714, cols=6714 > package used to perform factorization: petsc > total: nonzeros=1479078, allocated nonzeros=1479078 > using I-node routines: found 2790 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=6714, cols=6714 > total: nonzeros=312328, allocated nonzeros=312328 > using I-node routines: found 4664 nodes, limit used is 5 > > KSP Object:(is_localN_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(is_localN_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: using Manteuffel shift > LU: factor fill ratio needed 5.07571 > Factored matrix follows > Matrix Object: > type=seqaij, rows=7537, cols=7537 > package used to perform factorization: petsc > total: nonzeros=1824671, allocated nonzeros=1824671 > using I-node routines: found 2939 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object:(is) > type=seqaij, rows=7537, cols=7537 > total: nonzeros=359491, allocated nonzeros=602960 > using I-node routines: found 4578 nodes, limit used is 5 > > > -nn_coarse_ksp_view: > > KSP Object:(nn_coarse_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(nn_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, 
absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 1 > Factored matrix follows > Matrix Object: > type=seqaij, rows=4, cols=4 > package used to perform factorization: petsc > total: nonzeros=4, allocated nonzeros=4 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=4, cols=4 > total: nonzeros=4, allocated nonzeros=4 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=4, cols=4 > total: nonzeros=4, allocated nonzeros=68 > not using I-node (on process 0) routines > > > cheers, > > ando > > -- > /"\ Grassl Andreas > \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik > X against HTML email Technikerstr. 13 Zi 709 > / \ +43 (0)512 507 6091 From ondrej at certik.cz Thu Jun 4 17:00:52 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 4 Jun 2009 16:00:52 -0600 Subject: petsc4py fails to configure with my own lapack/blas Message-ID: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> Hi, I am trying to build petsc-3.0.0 inside Sage, for which I want it to use Sage's lapack and blas. Here is what I do: ./configure --prefix="$SAGE_LOCAL" --with-blas-lapack-dir="$SAGE_LOCAL/lib" --CFLAGS="-fPIC" --CXXFLAGS="-fPIC" (The -fPIC flags are necessary, otherwise petsc4py fails to build on 64bit when linking with petsc -- that might be another bug) When I execute the above line, I get: ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- You set a value for --with-blas-lapack-dir=, but /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib cannot be used ********************************************************************************* and a configure.log is attached. I also tried: ./configure --prefix="$SAGE_LOCAL" --with-lapack-lib="$SAGE_LOCAL/lib/liblapack.a" --with-blas-lib="$SAGE_LOCAL/lib/libblas.a" --CFLAGS="-fPIC" --CXXFLAGS="-fPIC" but I got the same result. So as a workaround, I configure it with: ./configure --prefix="$SAGE_LOCAL" --with-blas-lapack-dir="/usr/lib" --CFLAGS="-fPIC" --CXXFLAGS="-fPIC" which works fine, petsc4py also installs fine, but unfortunately it doesn't import: http://code.google.com/p/femhub/issues/detail?id=30 which I suspect is due to the wrong lapack and blas during petsc build (but I may be wrong, maybe it's another bug). Does anyone knows what else I can try to get it to work? Thanks, Ondrej -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 1096662 bytes Desc: not available URL: From dalcinl at gmail.com Thu Jun 4 17:13:11 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 4 Jun 2009 19:13:11 -0300 Subject: petsc4py fails to configure with my own lapack/blas In-Reply-To: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> References: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> Message-ID: This smells to a 32/64 bit libs mismatch, or a g77/g95/gfortran mismatch. Addionally, could you try to run 'ldd' on core PETSc libs and on the PETSc.so extension module? 
On Thu, Jun 4, 2009 at 7:00 PM, Ondrej Certik wrote: > Hi, > > I am trying to build petsc-3.0.0 inside Sage, for which I want it to > use Sage's lapack and blas. > > Here is what I do: > > ./configure --prefix="$SAGE_LOCAL" > --with-blas-lapack-dir="$SAGE_LOCAL/lib" --CFLAGS="-fPIC" > --CXXFLAGS="-fPIC" > > (The -fPIC flags are necessary, otherwise petsc4py fails to build on > 64bit when linking with petsc -- that might be another bug) > > When I execute the above line, I get: > > ********************************************************************************* > ? ? ? ? UNABLE to CONFIGURE with GIVEN OPTIONS ? ?(see configure.log > for details): > --------------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-dir=, but > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib cannot be used > ********************************************************************************* > > > and a configure.log is attached. I also tried: > > ./configure --prefix="$SAGE_LOCAL" > --with-lapack-lib="$SAGE_LOCAL/lib/liblapack.a" > --with-blas-lib="$SAGE_LOCAL/lib/libblas.a" --CFLAGS="-fPIC" > --CXXFLAGS="-fPIC" > > but I got the same result. So as a workaround, I configure it with: > > ./configure --prefix="$SAGE_LOCAL" --with-blas-lapack-dir="/usr/lib" > --CFLAGS="-fPIC" --CXXFLAGS="-fPIC" > > which works fine, petsc4py also installs fine, but unfortunately it > doesn't import: > > http://code.google.com/p/femhub/issues/detail?id=30 > > which I suspect is due to the wrong lapack and blas during petsc build > (but I may be wrong, maybe it's another bug). > > > Does anyone knows what else I can try to get it to work? > > Thanks, > Ondrej > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From Andreas.Grassl at student.uibk.ac.at Thu Jun 4 17:13:50 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Fri, 05 Jun 2009 00:13:50 +0200 Subject: VecView behaviour In-Reply-To: References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <21552795-DCF3-4DED-BDBB-34F2037C571B@mcs.anl.gov> <4A26F942.7030704@student.uibk.ac.at> <4ADA90D3-69D6-4DA5-A6D6-7EC8B0B13DCE@mcs.anl.gov> <4A27F12B.3070403@student.uibk.ac.at> Message-ID: <4A28471E.5050905@student.uibk.ac.at> Barry Smith schrieb: > > Run with GMRES, what happens? Same behaviour... > > On Jun 4, 2009, at 11:07 AM, Andreas Grassl wrote: > >> Barry Smith schrieb: >>> >>> On Jun 3, 2009, at 5:29 PM, Andreas Grassl wrote: >>> >>>> Barry Smith schrieb: >>>>> >>>>> When properly running nn-cg (are you sure everything is symmetric?) >>>>> should require 10-30 iterations (certainly for model problems) >>>> >>>> ok, this was the number I expected. >>>> >>>>> >>>>>> nn-cg on 2 nodes 229 iterations, condition 6285 >>>>>> nn-cg on 4 nodes 331 iterations, condition 13312 >>>>> >>>>> Are you sure that your operator has the null space of only constants? >>>> >>>> no, I didn't touch anything regarding the null space since I thought >>>> it would be >>>> done inside the NN-preconditioner. 
Does this mean I have to set up a >>>> null space >>>> of the size of the Schur complement system, i.e. the number of >>>> interface DOF's? >>> >>> No, I don't think you need to do anything about the null space. The >>> code in PETSc for NN is for (and only for) a null space of constants. >>> BTW: with 2 or 4 subdomains they all touch the boundary and likely don't >>> have a null space anyways. >>> >>> Run with -ksp_view and make sure the local solves are being done with LU >>> >> >> I don't find the anomalies... setting local_ksp-rtol to 1e-8 doesn't >> change anything >> >> the options passed are: >> >> -is_localD_ksp_type preonly >> -is_localD_pc_factor_shift_positive_definite >> -is_localD_pc_type lu >> -is_localN_ksp_type preonly >> -is_localN_pc_factor_shift_positive_definite >> -is_localN_pc_type lu >> -ksp_rtol 1e-8 >> -ksp_view >> #-is_localD_ksp_view >> #-is_localN_ksp_view >> #-nn_coarse_ksp_view >> # -pc_is_remove_nullspace_fixed this option doesn't produce any effect >> -log_summary >> -options_left >> >> and produce: >> >> -ksp_view: >> >> KSP Object: >> type: cg >> maximum iterations=10000 >> tolerances: relative=1e-08, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: nn >> linear system matrix = precond matrix: >> Matrix Object: >> type=is, rows=28632, cols=28632 >> Matrix Object:(is) >> type=seqaij, rows=7537, cols=7537 >> total: nonzeros=359491, allocated nonzeros=602960 >> using I-node routines: found 4578 nodes, limit used is 5 >> Matrix Object:(is) >> type=seqaij, rows=7515, cols=7515 >> total: nonzeros=349347, allocated nonzeros=601200 >> using I-node routines: found 5159 nodes, limit used is 5 >> Matrix Object:(is) >> type=seqaij, rows=7533, cols=7533 >> total: nonzeros=357291, allocated nonzeros=602640 >> using I-node routines: found 4739 nodes, limit used is 5 >> Matrix Object:(is) >> type=seqaij, rows=7360, cols=7360 >> total: nonzeros=364390, allocated nonzeros=588800 >> using I-node routines: found 3602 nodes, limit used is 5 >> >> -is_local...: >> >> KSP Object:(is_localD_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(is_localD_) >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: using Manteuffel shift >> LU: factor fill ratio needed 4.73566 >> Factored matrix follows >> Matrix Object: >> type=seqaij, rows=6714, cols=6714 >> package used to perform factorization: petsc >> total: nonzeros=1479078, allocated nonzeros=1479078 >> using I-node routines: found 2790 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=6714, cols=6714 >> total: nonzeros=312328, allocated nonzeros=312328 >> using I-node routines: found 4664 nodes, limit used is 5 >> >> KSP Object:(is_localN_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(is_localN_) >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: using Manteuffel shift >> LU: factor fill ratio needed 5.07571 >> Factored matrix follows >> Matrix Object: >> type=seqaij, rows=7537, cols=7537 >> package used to perform factorization: petsc >> total: nonzeros=1824671, allocated nonzeros=1824671 >> using I-node routines: found 2939 nodes, limit used is 5 >> linear system matrix = precond 
matrix: >> Matrix Object:(is) >> type=seqaij, rows=7537, cols=7537 >> total: nonzeros=359491, allocated nonzeros=602960 >> using I-node routines: found 4578 nodes, limit used is 5 >> >> >> -nn_coarse_ksp_view: >> >> KSP Object:(nn_coarse_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(nn_coarse_) >> type: redundant >> Redundant preconditioner: First (color=0) of 4 PCs follows >> KSP Object:(redundant_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(redundant_) >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: factor fill ratio needed 1 >> Factored matrix follows >> Matrix Object: >> type=seqaij, rows=4, cols=4 >> package used to perform factorization: petsc >> total: nonzeros=4, allocated nonzeros=4 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=4, cols=4 >> total: nonzeros=4, allocated nonzeros=4 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=4, cols=4 >> total: nonzeros=4, allocated nonzeros=68 >> not using I-node (on process 0) routines >> >> >> cheers, >> >> ando >> >> -- >> /"\ Grassl Andreas >> \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik >> X against HTML email Technikerstr. 13 Zi 709 >> / \ +43 (0)512 507 6091 > -- /"\ \ / ASCII Ribbon X against HTML email / \ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 315 bytes Desc: OpenPGP digital signature URL: From ondrej at certik.cz Thu Jun 4 17:59:32 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 4 Jun 2009 16:59:32 -0600 Subject: petsc4py fails to configure with my own lapack/blas In-Reply-To: References: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> Message-ID: <85b5c3130906041559n7da19ceax2b3943e7d09dbcc3@mail.gmail.com> On Thu, Jun 4, 2009 at 4:13 PM, Lisandro Dalcin wrote: > This smells to a 32/64 bit libs mismatch, or a g77/g95/gfortran > mismatch. Addionally, could you try to run 'ldd' on core PETSc libs > and on the PETSc.so extension module? So those core PETSc libs are just .a libraries (not dynamic executables). Could that be a problem? 
As to PETSc.so: $ ldd lib/linux-gnu-c-debug/PETSc.so linux-vdso.so.1 => (0x00007fff551ff000) libX11.so.6 => /usr/lib/libX11.so.6 (0x00007f1b4c0c1000) libcblas.so.3gf => /usr/lib/libcblas.so.3gf (0x00007f1b4bea2000) libf77blas.so.3gf => /usr/lib/libf77blas.so.3gf (0x00007f1b4bc83000) libatlas.so.3gf => /usr/lib/libatlas.so.3gf (0x00007f1b4b2fd000) libdl.so.2 => /lib/libdl.so.2 (0x00007f1b4b0d9000) libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00007f1b4ae36000) libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00007f1b4abee000) libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00007f1b4a982000) libnsl.so.1 => /lib/libnsl.so.1 (0x00007f1b4a768000) libutil.so.1 => /lib/libutil.so.1 (0x00007f1b4a565000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f1b4a34c000) libpthread.so.0 => /lib/libpthread.so.0 (0x00007f1b4a130000) libmpi_f90.so.0 => /usr/lib/libmpi_f90.so.0 (0x00007f1b49f2c000) libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00007f1b49cf3000) libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007f1b49a17000) libm.so.6 => /lib/libm.so.6 (0x00007f1b49792000) libc.so.6 => /lib/libc.so.6 (0x00007f1b4941f000) libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00007f1b49203000) /lib64/ld-linux-x86-64.so.2 (0x00007f1b4d0f7000) libXau.so.6 => /usr/lib/libXau.so.6 (0x00007f1b48fff000) libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00007f1b48dfa000) So I don't know... Ondrej From dalcinl at gmail.com Thu Jun 4 18:14:16 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 4 Jun 2009 20:14:16 -0300 Subject: petsc4py fails to configure with my own lapack/blas In-Reply-To: <85b5c3130906041559n7da19ceax2b3943e7d09dbcc3@mail.gmail.com> References: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> <85b5c3130906041559n7da19ceax2b3943e7d09dbcc3@mail.gmail.com> Message-ID: On Thu, Jun 4, 2009 at 7:59 PM, Ondrej Certik wrote: > On Thu, Jun 4, 2009 at 4:13 PM, Lisandro Dalcin wrote: >> This smells to a 32/64 bit libs mismatch, or a g77/g95/gfortran >> mismatch. Addionally, could you try to run 'ldd' on core PETSc libs >> and on the PETSc.so extension module? > > So those core PETSc libs are just .a libraries (not dynamic > executables). Could that be a problem? > That could be a BIG problem. petsc4py does not "officially" support PETSc builds with static libs, though it could work on some scenarios. Moreover, even if you get it working, you will not be able to use let say slepc4py, or any other C code depending on the PETSc libraries (think of a fast, Cython-implemented Function()/Jacobian() routine for a nonlinear problem solved with SNES). I really recommend you to pass '--with-shared' to PETSc's configure. > As to PETSc.so: > > $ ldd lib/linux-gnu-c-debug/PETSc.so > ? ? ? ?linux-vdso.so.1 => ?(0x00007fff551ff000) > ? ? ? ?libX11.so.6 => /usr/lib/libX11.so.6 (0x00007f1b4c0c1000) > ? ? ? ?libcblas.so.3gf => /usr/lib/libcblas.so.3gf (0x00007f1b4bea2000) > ? ? ? ?libf77blas.so.3gf => /usr/lib/libf77blas.so.3gf (0x00007f1b4bc83000) > ? ? ? ?libatlas.so.3gf => /usr/lib/libatlas.so.3gf (0x00007f1b4b2fd000) > ? ? ? ?libdl.so.2 => /lib/libdl.so.2 (0x00007f1b4b0d9000) > ? ? ? ?libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00007f1b4ae36000) > ? ? ? ?libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00007f1b4abee000) > ? ? ? ?libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00007f1b4a982000) > ? ? ? ?libnsl.so.1 => /lib/libnsl.so.1 (0x00007f1b4a768000) > ? ? ? ?libutil.so.1 => /lib/libutil.so.1 (0x00007f1b4a565000) > ? ? ? ?libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f1b4a34c000) > ? ? ? 
?libpthread.so.0 => /lib/libpthread.so.0 (0x00007f1b4a130000) > ? ? ? ?libmpi_f90.so.0 => /usr/lib/libmpi_f90.so.0 (0x00007f1b49f2c000) > ? ? ? ?libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00007f1b49cf3000) > ? ? ? ?libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007f1b49a17000) > ? ? ? ?libm.so.6 => /lib/libm.so.6 (0x00007f1b49792000) > ? ? ? ?libc.so.6 => /lib/libc.so.6 (0x00007f1b4941f000) > ? ? ? ?libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00007f1b49203000) > ? ? ? ?/lib64/ld-linux-x86-64.so.2 (0x00007f1b4d0f7000) > ? ? ? ?libXau.so.6 => /usr/lib/libXau.so.6 (0x00007f1b48fff000) > ? ? ? ?libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00007f1b48dfa000) > > > So I don't know... > > Ondrej > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From ondrej at certik.cz Thu Jun 4 18:18:17 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 4 Jun 2009 17:18:17 -0600 Subject: petsc4py fails to configure with my own lapack/blas In-Reply-To: References: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> <85b5c3130906041559n7da19ceax2b3943e7d09dbcc3@mail.gmail.com> Message-ID: <85b5c3130906041618pc27c6e0s2840a608a7b1dd36@mail.gmail.com> On Thu, Jun 4, 2009 at 5:14 PM, Lisandro Dalcin wrote: > On Thu, Jun 4, 2009 at 7:59 PM, Ondrej Certik wrote: >> On Thu, Jun 4, 2009 at 4:13 PM, Lisandro Dalcin wrote: >>> This smells to a 32/64 bit libs mismatch, or a g77/g95/gfortran >>> mismatch. Addionally, could you try to run 'ldd' on core PETSc libs >>> and on the PETSc.so extension module? >> >> So those core PETSc libs are just .a libraries (not dynamic >> executables). Could that be a problem? >> > > That could be a BIG problem. petsc4py does not "officially" support > PETSc builds with static libs, though it could work on some scenarios. > Moreover, even if you get it working, you will not be able to use let > say slepc4py, or any other C code depending on the PETSc libraries > (think of a fast, Cython-implemented Function()/Jacobian() routine for > a nonlinear problem solved with SNES). > > I really recommend you to pass '--with-shared' to PETSc's configure. Ah, I didn't know I have to pass it. I thought petsc will do the right thing. Let me do it right now and I'll report back. Ondrej From ondrej at certik.cz Thu Jun 4 18:28:24 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Thu, 4 Jun 2009 17:28:24 -0600 Subject: petsc4py fails to configure with my own lapack/blas In-Reply-To: <85b5c3130906041618pc27c6e0s2840a608a7b1dd36@mail.gmail.com> References: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> <85b5c3130906041559n7da19ceax2b3943e7d09dbcc3@mail.gmail.com> <85b5c3130906041618pc27c6e0s2840a608a7b1dd36@mail.gmail.com> Message-ID: <85b5c3130906041628o589d93e0id244c1680fb5fd6c@mail.gmail.com> On Thu, Jun 4, 2009 at 5:18 PM, Ondrej Certik wrote: > On Thu, Jun 4, 2009 at 5:14 PM, Lisandro Dalcin wrote: >> On Thu, Jun 4, 2009 at 7:59 PM, Ondrej Certik wrote: >>> On Thu, Jun 4, 2009 at 4:13 PM, Lisandro Dalcin wrote: >>>> This smells to a 32/64 bit libs mismatch, or a g77/g95/gfortran >>>> mismatch. Addionally, could you try to run 'ldd' on core PETSc libs >>>> and on the PETSc.so extension module? >>> >>> So those core PETSc libs are just .a libraries (not dynamic >>> executables). Could that be a problem? 
>>> >> >> That could be a BIG problem. petsc4py does not "officially" support >> PETSc builds with static libs, though it could work on some scenarios. >> Moreover, even if you get it working, you will not be able to use let >> say slepc4py, or any other C code depending on the PETSc libraries >> (think of a fast, Cython-implemented Function()/Jacobian() routine for >> a nonlinear problem solved with SNES). >> >> I really recommend you to pass '--with-shared' to PETSc's configure. > > Ah, I didn't know I have to pass it. I thought petsc will do the right > thing. Let me do it right now and I'll report back. Ok, so now everything is built as an .so library, but it still fails in exactly the same way as here: http://code.google.com/p/femhub/issues/detail?id=30 Here is the ldd on petsc libraries: $ ldd lib/libpetsc.so linux-vdso.so.1 => (0x00007fff32bfe000) libX11.so.6 => /usr/lib/libX11.so.6 (0x00007fc62a372000) liblapack.so.3gf => /usr/lib/liblapack.so.3gf (0x00007fc6298ac000) libcblas.so.3gf => /usr/lib/libcblas.so.3gf (0x00007fc62968e000) libf77blas.so.3gf => /usr/lib/libf77blas.so.3gf (0x00007fc62946f000) libatlas.so.3gf => /usr/lib/libatlas.so.3gf (0x00007fc628ae8000) libdl.so.2 => /lib/libdl.so.2 (0x00007fc6288e4000) libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00007fc628641000) libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00007fc6283f8000) libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00007fc62818d000) libnsl.so.1 => /lib/libnsl.so.1 (0x00007fc627f73000) libutil.so.1 => /lib/libutil.so.1 (0x00007fc627d6f000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fc627b57000) libpthread.so.0 => /lib/libpthread.so.0 (0x00007fc62793b000) libmpi_f90.so.0 => /usr/lib/libmpi_f90.so.0 (0x00007fc627736000) libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00007fc6274fe000) libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007fc627222000) libm.so.6 => /lib/libm.so.6 (0x00007fc626f9c000) libc.so.6 => /lib/libc.so.6 (0x00007fc626c2a000) libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00007fc626a0e000) libblas.so.3gf => /usr/lib/libblas.so.3gf (0x00007fc62677d000) /lib64/ld-linux-x86-64.so.2 (0x00007fc62a9ec000) libXau.so.6 => /usr/lib/libXau.so.6 (0x00007fc62657a000) libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00007fc626374000) and here on the PETSc.so: $ ldd lib/python2.5/site-packages/petsc4py/lib/linux-gnu-c-debug/PETSc.so linux-vdso.so.1 => (0x00007fff899ff000) libpetscts.so => /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscts.so (0x00007f2881264000) libpetscsnes.so => /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscsnes.so (0x00007f2880ffb000) libpetscksp.so => /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscksp.so (0x00007f2880b68000) libpetscdm.so => /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscdm.so (0x00007f28808a4000) libpetscmat.so => /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscmat.so (0x00007f28802f9000) libpetscvec.so => /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscvec.so (0x00007f287ffff000) libpetsc.so => /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetsc.so (0x00007f287fc8f000) libX11.so.6 => /usr/lib/libX11.so.6 (0x00007f287f987000) libcblas.so.3gf => /usr/lib/libcblas.so.3gf (0x00007f287f769000) libf77blas.so.3gf => /usr/lib/libf77blas.so.3gf (0x00007f287f54a000) libatlas.so.3gf => /usr/lib/libatlas.so.3gf (0x00007f287ebc3000) libdl.so.2 => /lib/libdl.so.2 (0x00007f287e9bf000) libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00007f287e71c000) libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 
(0x00007f287e4d3000) libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00007f287e268000) libnsl.so.1 => /lib/libnsl.so.1 (0x00007f287e04e000) libutil.so.1 => /lib/libutil.so.1 (0x00007f287de4a000) libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f287dc32000) libpthread.so.0 => /lib/libpthread.so.0 (0x00007f287da16000) libmpi_f90.so.0 => /usr/lib/libmpi_f90.so.0 (0x00007f287d811000) libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00007f287d5d9000) libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007f287d2fd000) libm.so.6 => /lib/libm.so.6 (0x00007f287d077000) libc.so.6 => /lib/libc.so.6 (0x00007f287cd05000) liblapack.so.3gf => /usr/lib/liblapack.so.3gf (0x00007f287c240000) libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00007f287c023000) /lib64/ld-linux-x86-64.so.2 (0x00007f28818b0000) libblas.so.3gf => /usr/lib/libblas.so.3gf (0x00007f287bd92000) libXau.so.6 => /usr/lib/libXau.so.6 (0x00007f287bb8f000) libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00007f287b989000) I also noticed that the -fPIC flags are now put there several times, but I guess this can't hurt, that's not a problem for now. Ondrej From dalcinl at gmail.com Thu Jun 4 18:51:54 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 4 Jun 2009 20:51:54 -0300 Subject: petsc4py fails to configure with my own lapack/blas In-Reply-To: <85b5c3130906041628o589d93e0id244c1680fb5fd6c@mail.gmail.com> References: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> <85b5c3130906041559n7da19ceax2b3943e7d09dbcc3@mail.gmail.com> <85b5c3130906041618pc27c6e0s2840a608a7b1dd36@mail.gmail.com> <85b5c3130906041628o589d93e0id244c1680fb5fd6c@mail.gmail.com> Message-ID: First of all, what Fortran compilers do you have installed? g77? g95? gfortran? all of them? two of them? just one? This is going to be a real mess for you if more than one is installed. If you want to use the system blas/lapack, then you should use a matching compiler, and this dependency will apply when you build numpy and scipy. At first look, it seems you are using blas/lapack libraries built with g77 ?? But PETSc seems to pick gfortran as Fortran compiler ?? Moreover, libpetsc.so is linked with some f77 and f90 MPI bits... I'm not sure if that's fine... On Thu, Jun 4, 2009 at 8:28 PM, Ondrej Certik wrote: > On Thu, Jun 4, 2009 at 5:18 PM, Ondrej Certik wrote: >> On Thu, Jun 4, 2009 at 5:14 PM, Lisandro Dalcin wrote: >>> On Thu, Jun 4, 2009 at 7:59 PM, Ondrej Certik wrote: >>>> On Thu, Jun 4, 2009 at 4:13 PM, Lisandro Dalcin wrote: >>>>> This smells to a 32/64 bit libs mismatch, or a g77/g95/gfortran >>>>> mismatch. Addionally, could you try to run 'ldd' on core PETSc libs >>>>> and on the PETSc.so extension module? >>>> >>>> So those core PETSc libs are just .a libraries (not dynamic >>>> executables). Could that be a problem? >>>> >>> >>> That could be a BIG problem. petsc4py does not "officially" support >>> PETSc builds with static libs, though it could work on some scenarios. >>> Moreover, even if you get it working, you will not be able to use let >>> say slepc4py, or any other C code depending on the PETSc libraries >>> (think of a fast, Cython-implemented Function()/Jacobian() routine for >>> a nonlinear problem solved with SNES). >>> >>> I really recommend you to pass '--with-shared' to PETSc's configure. >> >> Ah, I didn't know I have to pass it. I thought petsc will do the right >> thing. Let me do it right now and I'll report back. 
> > Ok, so now everything is built as an .so library, but it still fails > in exactly the same way as here: > > http://code.google.com/p/femhub/issues/detail?id=30 > > Here is the ldd on petsc libraries: > > $ ldd lib/libpetsc.so > ? ? ? ?linux-vdso.so.1 => ?(0x00007fff32bfe000) > ? ? ? ?libX11.so.6 => /usr/lib/libX11.so.6 (0x00007fc62a372000) > ? ? ? ?liblapack.so.3gf => /usr/lib/liblapack.so.3gf (0x00007fc6298ac000) > ? ? ? ?libcblas.so.3gf => /usr/lib/libcblas.so.3gf (0x00007fc62968e000) > ? ? ? ?libf77blas.so.3gf => /usr/lib/libf77blas.so.3gf (0x00007fc62946f000) > ? ? ? ?libatlas.so.3gf => /usr/lib/libatlas.so.3gf (0x00007fc628ae8000) > ? ? ? ?libdl.so.2 => /lib/libdl.so.2 (0x00007fc6288e4000) > ? ? ? ?libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00007fc628641000) > ? ? ? ?libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00007fc6283f8000) > ? ? ? ?libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00007fc62818d000) > ? ? ? ?libnsl.so.1 => /lib/libnsl.so.1 (0x00007fc627f73000) > ? ? ? ?libutil.so.1 => /lib/libutil.so.1 (0x00007fc627d6f000) > ? ? ? ?libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fc627b57000) > ? ? ? ?libpthread.so.0 => /lib/libpthread.so.0 (0x00007fc62793b000) > ? ? ? ?libmpi_f90.so.0 => /usr/lib/libmpi_f90.so.0 (0x00007fc627736000) > ? ? ? ?libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00007fc6274fe000) > ? ? ? ?libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007fc627222000) > ? ? ? ?libm.so.6 => /lib/libm.so.6 (0x00007fc626f9c000) > ? ? ? ?libc.so.6 => /lib/libc.so.6 (0x00007fc626c2a000) > ? ? ? ?libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00007fc626a0e000) > ? ? ? ?libblas.so.3gf => /usr/lib/libblas.so.3gf (0x00007fc62677d000) > ? ? ? ?/lib64/ld-linux-x86-64.so.2 (0x00007fc62a9ec000) > ? ? ? ?libXau.so.6 => /usr/lib/libXau.so.6 (0x00007fc62657a000) > ? ? ? ?libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00007fc626374000) > > > > and here on the PETSc.so: > > $ ldd lib/python2.5/site-packages/petsc4py/lib/linux-gnu-c-debug/PETSc.so > ? ? ? ?linux-vdso.so.1 => ?(0x00007fff899ff000) > ? ? ? ?libpetscts.so => > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscts.so > (0x00007f2881264000) > ? ? ? ?libpetscsnes.so => > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscsnes.so > (0x00007f2880ffb000) > ? ? ? ?libpetscksp.so => > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscksp.so > (0x00007f2880b68000) > ? ? ? ?libpetscdm.so => > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscdm.so > (0x00007f28808a4000) > ? ? ? ?libpetscmat.so => > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscmat.so > (0x00007f28802f9000) > ? ? ? ?libpetscvec.so => > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetscvec.so > (0x00007f287ffff000) > ? ? ? ?libpetsc.so => > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib/libpetsc.so > (0x00007f287fc8f000) > ? ? ? ?libX11.so.6 => /usr/lib/libX11.so.6 (0x00007f287f987000) > ? ? ? ?libcblas.so.3gf => /usr/lib/libcblas.so.3gf (0x00007f287f769000) > ? ? ? ?libf77blas.so.3gf => /usr/lib/libf77blas.so.3gf (0x00007f287f54a000) > ? ? ? ?libatlas.so.3gf => /usr/lib/libatlas.so.3gf (0x00007f287ebc3000) > ? ? ? ?libdl.so.2 => /lib/libdl.so.2 (0x00007f287e9bf000) > ? ? ? ?libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00007f287e71c000) > ? ? ? ?libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00007f287e4d3000) > ? ? ? ?libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00007f287e268000) > ? ? ? ?libnsl.so.1 => /lib/libnsl.so.1 (0x00007f287e04e000) > ? ? ? 
?libutil.so.1 => /lib/libutil.so.1 (0x00007f287de4a000) > ? ? ? ?libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f287dc32000) > ? ? ? ?libpthread.so.0 => /lib/libpthread.so.0 (0x00007f287da16000) > ? ? ? ?libmpi_f90.so.0 => /usr/lib/libmpi_f90.so.0 (0x00007f287d811000) > ? ? ? ?libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00007f287d5d9000) > ? ? ? ?libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007f287d2fd000) > ? ? ? ?libm.so.6 => /lib/libm.so.6 (0x00007f287d077000) > ? ? ? ?libc.so.6 => /lib/libc.so.6 (0x00007f287cd05000) > ? ? ? ?liblapack.so.3gf => /usr/lib/liblapack.so.3gf (0x00007f287c240000) > ? ? ? ?libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00007f287c023000) > ? ? ? ?/lib64/ld-linux-x86-64.so.2 (0x00007f28818b0000) > ? ? ? ?libblas.so.3gf => /usr/lib/libblas.so.3gf (0x00007f287bd92000) > ? ? ? ?libXau.so.6 => /usr/lib/libXau.so.6 (0x00007f287bb8f000) > ? ? ? ?libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x00007f287b989000) > > > > I also noticed that the -fPIC flags are now put there several times, > but I guess this can't hurt, that's not a problem for now. > > Ondrej > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bsmith at mcs.anl.gov Thu Jun 4 21:33:36 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 Jun 2009 21:33:36 -0500 Subject: petsc4py fails to configure with my own lapack/blas In-Reply-To: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> References: <85b5c3130906041500l2b857070ka36bcb333513b01d@mail.gmail.com> Message-ID: This is a petsc-maint at mcs.anl.gov request, I am sending it over there and removing from the petsc-users list. Please respond only to the petsc-maint at mcs.anl.gov list Back to the original problem using the BLAS/LAPACK you provided. It has nothing to do with petsc4py The FOTRAN compiler your mpif90 is using is gfortran (from the configure.log Driving: /usr/bin/gfortran ) but the Fortran compiler used to build your blas/lapack libraries is g95 (from the log file xerbla.f:(.text+0x1b): undefined reference to `_g95_get_ioparm') You cannot do this. You need to either 1) build your MPI to use g95 or 2) build your BLAS/LAPACK with gfortran then run PETSc's config/configure.py again. Barry Best would be to totally remove g95 from your machine then this kind of problem would not happen. On Jun 4, 2009, at 5:00 PM, Ondrej Certik wrote: > Hi, > > I am trying to build petsc-3.0.0 inside Sage, for which I want it to > use Sage's lapack and blas. > > Here is what I do: > > ./configure --prefix="$SAGE_LOCAL" > --with-blas-lapack-dir="$SAGE_LOCAL/lib" --CFLAGS="-fPIC" > --CXXFLAGS="-fPIC" > > (The -fPIC flags are necessary, otherwise petsc4py fails to build on > 64bit when linking with petsc -- that might be another bug) > > When I execute the above line, I get: > > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > --------------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-dir=, but > /home/ondrej/ext/spd-3.4.2spd3-ubuntu-64bit/local/lib cannot be used > ********************************************************************************* > > > and a configure.log is attached. 
I also tried: > > ./configure --prefix="$SAGE_LOCAL" > --with-lapack-lib="$SAGE_LOCAL/lib/liblapack.a" > --with-blas-lib="$SAGE_LOCAL/lib/libblas.a" --CFLAGS="-fPIC" > --CXXFLAGS="-fPIC" > > but I got the same result. So as a workaround, I configure it with: > > ./configure --prefix="$SAGE_LOCAL" --with-blas-lapack-dir="/usr/lib" > --CFLAGS="-fPIC" --CXXFLAGS="-fPIC" > > which works fine, petsc4py also installs fine, but unfortunately it > doesn't import: > > http://code.google.com/p/femhub/issues/detail?id=30 > > which I suspect is due to the wrong lapack and blas during petsc build > (but I may be wrong, maybe it's another bug). > > > Does anyone knows what else I can try to get it to work? > > Thanks, > Ondrej > From Andreas.Grassl at student.uibk.ac.at Fri Jun 5 01:49:42 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Fri, 05 Jun 2009 08:49:42 +0200 Subject: VecView behaviour In-Reply-To: <4A28471E.5050905@student.uibk.ac.at> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <21552795-DCF3-4DED-BDBB-34F2037C571B@mcs.anl.gov> <4A26F942.7030704@student.uibk.ac.at> <4ADA90D3-69D6-4DA5-A6D6-7EC8B0B13DCE@mcs.anl.gov> <4A27F12B.3070403@student.uibk.ac.at> <4A28471E.5050905@student.uibk.ac.at> Message-ID: <4A28C006.4070906@student.uibk.ac.at> Andreas Grassl schrieb: > Barry Smith schrieb: >> Run with GMRES, what happens? > > Same behaviour... > i.e. 443 iterations. Although I noticed some differences between giving the ksp and pc type hardcoded or as runtime options. (443 vs. 354 its on gmres and 339 vs. 333 on cg) I don't have to reorder the matrix manually, right? output of -ksp_view KSP Object: type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-08, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: nn linear system matrix = precond matrix: Matrix Object: type=is, rows=28632, cols=28632 Matrix Object:(is) type=seqaij, rows=7537, cols=7537 total: nonzeros=359491, allocated nonzeros=602960 using I-node routines: found 4578 nodes, limit used is 5 Matrix Object:(is) type=seqaij, rows=7515, cols=7515 total: nonzeros=349347, allocated nonzeros=601200 using I-node routines: found 5159 nodes, limit used is 5 Matrix Object:(is) type=seqaij, rows=7533, cols=7533 total: nonzeros=357291, allocated nonzeros=602640 using I-node routines: found 4739 nodes, limit used is 5 Matrix Object:(is) type=seqaij, rows=7360, cols=7360 total: nonzeros=364390, allocated nonzeros=588800 using I-node routines: found 3602 nodes, limit used is 5 cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From sapphire.jxy at gmail.com Tue Jun 9 16:27:22 2009 From: sapphire.jxy at gmail.com (xiaoyin ji) Date: Tue, 9 Jun 2009 17:27:22 -0400 Subject: how to set values of petsc matrix into double precision? 
Message-ID: <6985a8f00906091427p5fc3876ncbe0517674240252@mail.gmail.com> Hi, Thanks for the help with Makefile for multiple files in PETSc, I've also found a way that works for me, simply add these two libs: LIBSCOMP = -I/gpfs_irving/irving/xji/petsc-3.0.0-p5/linux-gnu-c-debug/include -I/gpfs_irving/irving/xji/petsc-3.0.0-p5/include -I/usr/X11R6/include LIBSLINK = -Wl,-rpath,/gpfs_irving/irving/xji/petsc-3.0.0-p5/linux-gnu-c-debug/lib -L/gpfs_irving/irving/xji/petsc-3.0.0-p5/linux-gnu-c-debug/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -L/usr/X11R6/lib -lX11 -llapack -lblas -L/usr/local/intel/mpich-1.2.7p1/linux86/9.1/lib -ldl -lmpich -lpthread -lrt -L/usr/local/intel/cc/9.1.045/lib -L/usr/lib/gcc/i386-redhat-linux/3.4.6 -limf -lipgo -lgcc_s -lirc -lirc_s -lmpichf90 -L/usr/local/intel/fc/9.1.040/lib -lifport -lifcore -lm -lm -ldl -lmpich -lpthread -lrt -limf -lipgo -lgcc_s -lirc -lirc_s -ldl LIBSCOMP is for compiling .F90 files into .o and LIBSLINK is for linking .o files into single executable. I did not use "include ${PETSC_DIR}/conf/base" Now I came up with this new problem, I could not find the way to let petsc mat store double precision values. I used double precision value in MatSetValue() but the MatView showed the stored data to be single precision. I've also tried to recompile petsc as: ./config/configure.py --with-precision=double which didnot work for me. Thanks very much. Xiaoyin Ji ---------------------------------------------- Xiaoyin Ji Graduate Student Department of Materials Science and Engineering North Carolina State University From knepley at gmail.com Tue Jun 9 16:40:10 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 Jun 2009 16:40:10 -0500 Subject: how to set values of petsc matrix into double precision? In-Reply-To: <6985a8f00906091427p5fc3876ncbe0517674240252@mail.gmail.com> References: <6985a8f00906091427p5fc3876ncbe0517674240252@mail.gmail.com> Message-ID: On Tue, Jun 9, 2009 at 4:27 PM, xiaoyin ji wrote: > Hi, > > Now I came up with this new problem, I could not find the way to let > petsc mat store double precision values. I used double precision value > in MatSetValue() but the MatView showed the stored data to be single > precision. I've also tried to recompile petsc as: I think you are misinterpreting MatView(). Matt > > ./config/configure.py --with-precision=double > > which didnot work for me. > > Thanks very much. > > Xiaoyin Ji > > ---------------------------------------------- > > Xiaoyin Ji > Graduate Student > Department of Materials Science and Engineering > North Carolina State University > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Andreas.Grassl at student.uibk.ac.at Wed Jun 10 08:56:35 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Wed, 10 Jun 2009 15:56:35 +0200 Subject: VecView behaviour In-Reply-To: <4A268B7C.3010305@59A2.org> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <4A268B7C.3010305@59A2.org> Message-ID: <4A2FBB93.8060108@student.uibk.ac.at> Hi Jed, the BNN-Algorithm in the literature distinguishes always between inner nodes and interface nodes. 
The short question arising from your explanation for me is, if owned DOF's is a synonym for the inner DOF's and ghosted DOF's for the interface DOF's? Below you find more extended thoughts and an example. Jed Brown schrieb: > Andreas Grassl wrote: >> Barry Smith schrieb: >>> Hmm, it sounds like the difference between local "ghosted" vectors >>> and the global parallel vectors. But I do not understand why any of the >>> local vector entries would be zero. >>> Doesn't the vector X that is passed into KSP (or SNES) have the global >>> entries and uniquely define the solution? Why is viewing that not right? >>> >> I still don't understand fully the underlying processes of the whole PCNN >> solution procedure, but trying around I substituted >> >> MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, >> gridmapping, &A); > > This creates a matrix that is bigger than you want, and gives you the > dead values at the end (global dofs that are not in the range of the > LocalToGlobalMapping. > > This from the note on MatCreateIS: > > | m and n are NOT related to the size of the map, they are the size of the part of the vector owned > | by that process. m + nghosts (or n + nghosts) is the length of map since map maps all local points > | plus the ghost points to global indices. > >> by >> >> MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A); > > This creates a matrix of the correct size, but it looks like it could > easily end up with the "wrong" dofs owned locally. What you probably > want to do is: > > 1. Resolve ownership just like with any other DD method. This > partitions your dofs into n owned dofs and ngh ghosted dofs on each > process. The global sum of n is N, the size of the global vectors that > the solver will interact with. do I understand right, that owned dofs are the inner nodes and the ghosted dofs are the interface dofs? > > 2. Make an ISLocalToGlobalMapping where all the owned dofs come first, > mapping (0..n-1) to (rstart..rstart+n-1), followed by the ghosted dofs > (local index n..ngh-1) which map to remote processes. (rstart is the > global index of the first owned dof) currently I set up my ISLocalToGlobalMapping by giving the processes all the dofs in arbitrary order having the effect, that the interface dofs appear more times. Attached I give you a small example with 2 subdomains and 270 DOF's. > > One way to do this is to use MPI_Scan to find rstart, then number all > the owned dofs and scatter the result. The details will be dependent on > how you store your mesh. (I'm assuming it's unstructured, this step is > trivial if you use a DA.) Yes, the mesh is unstructured, I read out from the FE-package the partitioning at element-basis, loop over all elements to find the belonging DOF's and assemble the index vector for the ISLocalToGlobalMapping this way, without regarding interface DOF's, thinking this would be done automatically by setting up the mapping because by this some global DOF's appear more times. > > 3. Call MatCreateIS(comm,n,n,PETSC_DECIDE,PETSC_DECIDE,mapping,&A); > Seeing this function call and interpreting the owned DOF's as the subdomain inner DOF's the Matrix A has not the full size?! Given a 4x6 grid with 1 DOF per node divided into 4 subdomains I get 9 interface DOF's. 0 o o O o 5 | 6 o o O o o | O--O--O--O--O--O | o o o O o 23 My first approach to create the Matrix would give a Matrix size of 35x35, with 11 dead entries at the end of the vector. 
My second approach would give the "correct" Matrix size of 24x24. By splitting up in n owned values and some ghosted values I would expect to receive a Matrix of size 15x15. Otherwise I don't see how I could partition the grid in a consistent way. I would really appreciate, if you could show me, how the partition and ownership of the DOF's in this little example work out. cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From Andreas.Grassl at student.uibk.ac.at Wed Jun 10 09:02:13 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Wed, 10 Jun 2009 16:02:13 +0200 Subject: VecView behaviour In-Reply-To: <4A268B7C.3010305@59A2.org> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <4A268B7C.3010305@59A2.org> Message-ID: <4A2FBCE5.6080102@student.uibk.ac.at> The promised mapping for the last post -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: mapping.dat URL: From jed at 59A2.org Wed Jun 10 09:38:19 2009 From: jed at 59A2.org (Jed Brown) Date: Wed, 10 Jun 2009 16:38:19 +0200 Subject: VecView behaviour In-Reply-To: <4A2FBB93.8060108@student.uibk.ac.at> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <4A268B7C.3010305@59A2.org> <4A2FBB93.8060108@student.uibk.ac.at> Message-ID: <4A2FC55B.80904@59A2.org> Andreas Grassl wrote: > Hi Jed, > > the BNN-Algorithm in the literature distinguishes always between inner nodes and > interface nodes. The short question arising from your explanation for me is, if > owned DOF's is a synonym for the inner DOF's and ghosted DOF's for the interface > DOF's? No, every degree of freedom (interior and interface) must be owned by exactly one process. You want every process to own their interior degrees of freedom, but I don't think there is a way to guarantee this without using a process like I described. > Below you find more extended thoughts and an example. > > Jed Brown schrieb: >> Andreas Grassl wrote: >>> Barry Smith schrieb: >>>> Hmm, it sounds like the difference between local "ghosted" vectors >>>> and the global parallel vectors. But I do not understand why any of the >>>> local vector entries would be zero. >>>> Doesn't the vector X that is passed into KSP (or SNES) have the global >>>> entries and uniquely define the solution? Why is viewing that not right? >>>> >>> I still don't understand fully the underlying processes of the whole PCNN >>> solution procedure, but trying around I substituted >>> >>> MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, >>> gridmapping, &A); >> This creates a matrix that is bigger than you want, and gives you the >> dead values at the end (global dofs that are not in the range of the >> LocalToGlobalMapping. >> >> This from the note on MatCreateIS: >> >> | m and n are NOT related to the size of the map, they are the size of the part of the vector owned >> | by that process. 
m + nghosts (or n + nghosts) is the length of map since map maps all local points >> | plus the ghost points to global indices. >> >>> by >>> >>> MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A); >> This creates a matrix of the correct size, but it looks like it could >> easily end up with the "wrong" dofs owned locally. What you probably >> want to do is: >> >> 1. Resolve ownership just like with any other DD method. This >> partitions your dofs into n owned dofs and ngh ghosted dofs on each >> process. The global sum of n is N, the size of the global vectors that >> the solver will interact with. > > do I understand right, that owned dofs are the inner nodes and the ghosted dofs > are the interface dofs? No, a dof is ghosted on processes that reference it, but do not own it. >> 2. Make an ISLocalToGlobalMapping where all the owned dofs come first, >> mapping (0..n-1) to (rstart..rstart+n-1), followed by the ghosted dofs >> (local index n..ngh-1) which map to remote processes. (rstart is the >> global index of the first owned dof) > > currently I set up my ISLocalToGlobalMapping by giving the processes all the > dofs in arbitrary order having the effect, that the interface dofs appear more > times. Attached I give you a small example with 2 subdomains and 270 DOF's. I think you're ending up with a lot of interior dofs owned by remote processes (this is bad). I'll try to explain for the 24 dof example below. >> One way to do this is to use MPI_Scan to find rstart, then number all >> the owned dofs and scatter the result. The details will be dependent on >> how you store your mesh. (I'm assuming it's unstructured, this step is >> trivial if you use a DA.) > > Yes, the mesh is unstructured, I read out from the FE-package the partitioning > at element-basis, loop over all elements to find the belonging DOF's and > assemble the index vector for the ISLocalToGlobalMapping this way, without > regarding interface DOF's, thinking this would be done automatically by setting > up the mapping because by this some global DOF's appear more times. > >> 3. Call MatCreateIS(comm,n,n,PETSC_DECIDE,PETSC_DECIDE,mapping,&A); >> > > Seeing this function call and interpreting the owned DOF's as the subdomain > inner DOF's the Matrix A has not the full size?! > > Given a 4x6 grid with 1 DOF per node divided into 4 subdomains I get 9 interface > DOF's. > > 0 o o O o 5 > | > 6 o o O o o > | > O--O--O--O--O--O > | > o o o O o 23 > > My first approach to create the Matrix would give a Matrix size of 35x35, with > 11 dead entries at the end of the vector. > > My second approach would give the "correct" Matrix size of 24x24. > > By splitting up in n owned values and some ghosted values I would expect to > receive a Matrix of size 15x15. Otherwise I don't see how I could partition the > grid in a consistent way. > > I would really appreciate, if you could show me, how the partition and ownership > of the DOF's in this little example work out. I see 4 subdomains with interior dofs rank 0: 0 1 2 6 7 8 rank 1: 4 5 10 11 rank 2: 18 19 20 rank 3: 22 23 I'll continue to use this "natural ordering" to describe the dofs, but you don't normally want to use it because it is not compatible with the decomposition you are actually using. Suppose we resolve ownership by assigning it to the lowest rank touching it. 
Then the global vector (seen by the solver) is rank 0: 0 1 2 3 6 7 8 9 12 13 14 15 (global indices 0:12) rank 1: 4 5 10 11 16 17 (global indices 12:18) rank 2: 18 19 20 21 (global indices 18:22) rank 3: 22 23 (global indices 22:24) Your local-to-global map should be with respect to the global indices in the ordering compatible with the decomposition. With respect to the natural ordering, it is rank 0: 0 1 2 3 6 7 8 9 12 13 14 15 rank 1: 3 4 5 9 10 11 15 16 17 rank 2: 12 13 14 15 18 19 20 21 rank 3: 15 16 17 21 22 23 Converting this to the ordering compatible with your decomposition, we have rank 0: 0 1 2 3 4 5 6 7 8 9 10 11 rank 1: 3 12 13 7 14 15 11 16 17 rank 2: 8 9 10 11 18 19 20 21 rank 3: 11 16 17 21 22 23 When you create the matrix, 'n' is the number of owned dofs on each process [12,6,4,2] and you want to use this final form of the local to global mapping. If you just give the total size (24), the partition will balance the number of owned dofs, but interior dofs won't end up being owned on the correct process. If you use the natural ordering, it's hopeless to end up with correct interior ownership. (Note that assigning ownership to the highest touching rank would have been better balanced in this case.) Does this help? Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From Andreas.Grassl at student.uibk.ac.at Wed Jun 10 09:48:14 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Wed, 10 Jun 2009 16:48:14 +0200 Subject: VecView behaviour In-Reply-To: <4A2FC55B.80904@59A2.org> References: <4A1FAC32.4010507@student.uibk.ac.at> <2B57ADF8-D4AB-4938-BCA5-291C96063C8E@mcs.anl.gov> <4A258A49.2020502@student.uibk.ac.at> <4F7307C4-3CB9-4323-BD23-01E577433488@mcs.anl.gov> <4A26506C.5050002@student.uibk.ac.at> <4A268B7C.3010305@59A2.org> <4A2FBB93.8060108@student.uibk.ac.at> <4A2FC55B.80904@59A2.org> Message-ID: <4A2FC7AE.2020704@student.uibk.ac.at> Jed Brown schrieb: > Does this help? I guess, this will help me a lot, but I think I'll give it a try tomorrow, since it is already higher afternoon here and I still have some unfinished work. Thank you so far! cu ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From sapphire.jxy at gmail.com Wed Jun 10 13:10:27 2009 From: sapphire.jxy at gmail.com (xiaoyin ji) Date: Wed, 10 Jun 2009 14:10:27 -0400 Subject: something slowed down MatSetValues Message-ID: <6985a8f00906101110l29b5858ch6dcec66070f3d156@mail.gmail.com> Hi, I've just got a problem with MatSetValues. I have a loop of about 1 million and 6 MatSetValues call inside. The loop runs very slow( about several minutes or more), but if I comment one or two MatSetValues then it will finish in seconds. Any hints? Thank you! Xiaoyin Ji From knepley at gmail.com Wed Jun 10 13:16:50 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 10 Jun 2009 13:16:50 -0500 Subject: something slowed down MatSetValues In-Reply-To: <6985a8f00906101110l29b5858ch6dcec66070f3d156@mail.gmail.com> References: <6985a8f00906101110l29b5858ch6dcec66070f3d156@mail.gmail.com> Message-ID: You have not preallocated the matrix correctly. Matt On Wed, Jun 10, 2009 at 1:10 PM, xiaoyin ji wrote: > Hi, > > I've just got a problem with MatSetValues. I have a loop of about 1 > million and 6 MatSetValues call inside. 
The loop runs very slow( about > several minutes or more), but if I comment one or two MatSetValues > then it will finish in seconds. Any hints? Thank you! > > Xiaoyin Ji > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Jarunan.Panyasantisuk at eleves.ec-nantes.fr Thu Jun 11 03:28:09 2009 From: Jarunan.Panyasantisuk at eleves.ec-nantes.fr (Panyasantisuk Jarunan) Date: Thu, 11 Jun 2009 10:28:09 +0200 Subject: Initial values Message-ID: <20090611102809.ba6l8rhaleok88o8@webmail.ec-nantes.fr> Hello, What is actually the initial values in solving a linear system? And where can I set it? Best regards, Jarunan -- Jarunan PANYASANTISUK MSc. in Computational Mechanics Erasmus Mundus Master Program Ecole Centrale de Nantes 1, rue de la no?, 44321 NANTES, FRANCE From knepley at gmail.com Thu Jun 11 06:19:16 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Jun 2009 06:19:16 -0500 Subject: Initial values In-Reply-To: <20090611102809.ba6l8rhaleok88o8@webmail.ec-nantes.fr> References: <20090611102809.ba6l8rhaleok88o8@webmail.ec-nantes.fr> Message-ID: On Thu, Jun 11, 2009 at 3:28 AM, Panyasantisuk Jarunan < Jarunan.Panyasantisuk at eleves.ec-nantes.fr> wrote: > > Hello, > > What is actually the initial values in solving a linear system? And where > can I set it? The initial values are in x when you call KSPSolve(ksp, b, x) assuming that you call KSPSetInitialGuessNonzero(ksp, PETSC_TRUE) Matt > > Best regards, > Jarunan > > > -- > Jarunan PANYASANTISUK > MSc. in Computational Mechanics > Erasmus Mundus Master Program > Ecole Centrale de Nantes > 1, rue de la no?, 44321 NANTES, FRANCE > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.klettner at ucl.ac.uk Fri Jun 12 09:13:59 2009 From: christian.klettner at ucl.ac.uk (Christian Klettner) Date: Fri, 12 Jun 2009 15:13:59 +0100 (BST) Subject: What is the best solver a poisson type eqn. Message-ID: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> Sorry that I sent this twice. No subject in the first one. Dear PETSc Team, I am writing a CFD finite element code in C. From the discretization of the governing equations I have to solve a Poisson type equation which is really killing my performance. Which solver/preconditioner from PETSc or any external packages would you recommend? The size of my problem is from ~30000-100000 DOF per core. What kind of performance would I be able to expect with this solver/preconditioner? I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the domain with Parmetis. The mesh is unstructured. Also, I am writing a code which studies free surface phenomena so the mesh is continually changing. Does this matter when choosing a solver/preconditioner? My left hand side matrix (A in Ax=b) does not change in time. 
Best regards and thank you in advance, Christian Klettner From christian.klettner at ucl.ac.uk Fri Jun 12 09:09:36 2009 From: christian.klettner at ucl.ac.uk (Christian Klettner) Date: Fri, 12 Jun 2009 15:09:36 +0100 (BST) Subject: Welcome to the "petsc-users" mailing list In-Reply-To: References: Message-ID: <43306.128.40.55.186.1244815776.squirrel@www.squirrelmail.ucl.ac.uk> Dear PETSc Team, I am writing a CFD finite element code in C. From the discretization of the governing equations I have to solve a Poisson type equation which is really killing my performance. Which solver/preconditioner from PETSc or any external packages would you recommend? The size of my problem is from ~30000-100000 DOF per core. What kind of performance would I be able to expect with this solver/preconditioner? I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the domain with Parmetis. Also, I am writing a code which studies free surface phenomena so the mesh is continually changing. Does this matter when choosing a solver/preconditioner? My left hand side matrix (A in Ax=b) does not change in time. Best regards and thank you in advance, Christian Klettner From dalcinl at gmail.com Fri Jun 12 09:31:28 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 12 Jun 2009 11:31:28 -0300 Subject: What is the best solver a poisson type eqn. In-Reply-To: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: On Fri, Jun 12, 2009 at 11:13 AM, Christian Klettner wrote: > Sorry that I sent this twice. No subject in the first one. > > Dear PETSc Team, > I am writing a CFD finite element code in C. From the discretization of > the governing equations I have to solve a Poisson type equation which is > really killing my performance. Which solver/preconditioner from PETSc or > any external packages would you recommend? The size of my problem is from > ~30000-100000 DOF per core. What kind of performance would I be able to > expect with this solver/preconditioner? I would suggest KSPCG. As preconditioner I would use ML or HYPRE/BoomerAMG (both are external packages) > I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the domain > with Parmetis. The mesh is unstructured. > Also, I am writing a code which studies free surface phenomena so the mesh > is continually changing. Does this matter when choosing a > solver/preconditioner? My left hand side matrix (A in Ax=b) does not > change in time. ML has a faster setup that BoomerAMG, but the convergence is a bit slower. If your A matrix do not change, then likely BoomerAMG will be better for you. In any case, you can try both: just build PETSc with both packages, then you can change the preconditioner by just passing a command line option. > > Best regards and thank you in advance, > Christian Klettner > Disclaimer: the convergence of multigrid preconditioners depends a lot on your actual problem. What I've suggested is just my limited experience in a few problems I've run solving electric potentials. 
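 Just to make that last point concrete, switching between them at runtime is only a change of command line options, something like (executable name and process count are placeholders):

$ mpiexec -n 8 ./my_cfd_code -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg
$ mpiexec -n 8 ./my_cfd_code -ksp_type cg -pc_type ml

provided PETSc was configured with the corresponding packages (e.g. --download-hypre=1 --download-ml=1).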
-- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From knepley at gmail.com Fri Jun 12 09:37:54 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Jun 2009 09:37:54 -0500 Subject: What is the best solver a poisson type eqn. In-Reply-To: References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: The problem is small enough that you might be able to use MUMPS. Matt On Fri, Jun 12, 2009 at 9:31 AM, Lisandro Dalcin wrote: > On Fri, Jun 12, 2009 at 11:13 AM, Christian > Klettner wrote: > > Sorry that I sent this twice. No subject in the first one. > > > > Dear PETSc Team, > > I am writing a CFD finite element code in C. From the discretization of > > the governing equations I have to solve a Poisson type equation which is > > really killing my performance. Which solver/preconditioner from PETSc or > > any external packages would you recommend? The size of my problem is from > > ~30000-100000 DOF per core. What kind of performance would I be able to > > expect with this solver/preconditioner? > > I would suggest KSPCG. As preconditioner I would use ML or > HYPRE/BoomerAMG (both are external packages) > > > I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the domain > > with Parmetis. The mesh is unstructured. > > Also, I am writing a code which studies free surface phenomena so the > mesh > > is continually changing. Does this matter when choosing a > > solver/preconditioner? My left hand side matrix (A in Ax=b) does not > > change in time. > > ML has a faster setup that BoomerAMG, but the convergence is a bit > slower. If your A matrix do not change, then likely BoomerAMG will be > better for you. In any case, you can try both: just build PETSc with > both packages, then you can change the preconditioner by just passing > a command line option. > > > > > Best regards and thank you in advance, > > Christian Klettner > > > > Disclaimer: the convergence of multigrid preconditioners depends a lot > on your actual problem. What I've suggested is just my limited > experience in a few problems I've run solving electric potentials. > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From naromero at alcf.anl.gov Fri Jun 12 10:20:15 2009 From: naromero at alcf.anl.gov (naromero at alcf.anl.gov) Date: Fri, 12 Jun 2009 10:20:15 -0500 (CDT) Subject: non-linear partial differential equations In-Reply-To: <12786127.93191244819358791.JavaMail.root@zimbra> Message-ID: <9062949.93531244820015614.JavaMail.root@zimbra> Hi, I would like to understand if the methods in PETSc are applicable to my problem. I work in the area of density functional theory. 
The KS equation in real-space (G) is [-(1/2) (nabla)^2 + V_local(G) + V_nlocal(G) + V_H[rho(G)] psi_nG = E_n*psi_nG rho(G) = \sum_n |psi_nG|^2 n is the index on eigenvalues which correspond to the electron energy levels. This KS equation is sparse in real-space and dense in fourier-space. I think strictly speaking it is a non-linear partial differential equation. V_nlocal(G) is an integral operator (short range though), so maybe it is technically a non-linear integro-partial differential equation. I understand that PETSc is a sparse solvers. Does the non-linearity in the partial differential equation make PETSc less applicable to this problem? On one more technical note, we do not store the matrix in sparse format. It is also matrix*vector based. Argonne Leadership Computing Facility Argonne National Laboratory Building 360 Room L-146 9700 South Cass Avenue Argonne, IL 60490 (630) 252-3441 From knepley at gmail.com Fri Jun 12 11:21:08 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Jun 2009 11:21:08 -0500 Subject: non-linear partial differential equations In-Reply-To: <9062949.93531244820015614.JavaMail.root@zimbra> References: <12786127.93191244819358791.JavaMail.root@zimbra> <9062949.93531244820015614.JavaMail.root@zimbra> Message-ID: You can solve matrix-free nonlinear equations with PETSc. If you are actually solving an eigenproblem, I would recommend using SLEPc which has PETSc underneath. Matt On Fri, Jun 12, 2009 at 10:20 AM, wrote: > Hi, > > I would like to understand if the methods in PETSc are applicable to my > problem. > > I work in the area of density functional theory. The KS equation in > real-space (G) is > > [-(1/2) (nabla)^2 + V_local(G) + V_nlocal(G) + V_H[rho(G)] psi_nG = > E_n*psi_nG > > rho(G) = \sum_n |psi_nG|^2 > > n is the index on eigenvalues which correspond to the electron energy > levels. > > This KS equation is sparse in real-space and dense in fourier-space. I > think > strictly speaking it is a non-linear partial differential equation. > V_nlocal(G) > is an integral operator (short range though), so maybe it is technically a > non-linear integro-partial differential equation. > > I understand that PETSc is a sparse solvers. Does the non-linearity in the > partial differential equation make PETSc less applicable to this problem? > > On one more technical note, we do not store the matrix in sparse format. It > is > also matrix*vector based. > > > > Argonne Leadership Computing Facility > Argonne National Laboratory > Building 360 Room L-146 > 9700 South Cass Avenue > Argonne, IL 60490 > (630) 252-3441 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From naromero at alcf.anl.gov Fri Jun 12 17:39:15 2009 From: naromero at alcf.anl.gov (naromero at alcf.anl.gov) Date: Fri, 12 Jun 2009 17:39:15 -0500 (CDT) Subject: non-linear partial differential equations In-Reply-To: <29419805.106691244846185120.JavaMail.root@zimbra> Message-ID: <17215404.106711244846355011.JavaMail.root@zimbra> Matt, Yes, it is a sparse eigenvalue problem. And yes, I have taken a look at SLEPc before. For some of our very large problems, we may get up to 10,000 (out of 10^7) eigenvalues and then SLEPc might need hooks into ScaLAPACK for the subspace diagonalization. Last time I checked ScaLAPACK interface in SLEPc was not available. Nichols A. Romero, Ph.D. 
Argonne Leadership Computing Facility Argonne National Laboratory Building 360 Room L-146 9700 South Cass Avenue Argonne, IL 60490 (630) 252-3441 ----- Original Message ----- From: "Matthew Knepley" To: "PETSc users list" Sent: Friday, June 12, 2009 11:21:08 AM GMT -06:00 US/Canada Central Subject: Re: non-linear partial differential equations You can solve matrix-free nonlinear equations with PETSc. If you are actually solving an eigenproblem, I would recommend using SLEPc which has PETSc underneath. Matt On Fri, Jun 12, 2009 at 10:20 AM, < naromero at alcf.anl.gov > wrote: Hi, I would like to understand if the methods in PETSc are applicable to my problem. I work in the area of density functional theory. The KS equation in real-space (G) is [-(1/2) (nabla)^2 + V_local(G) + V_nlocal(G) + V_H[rho(G)] psi_nG = E_n*psi_nG rho(G) = \sum_n |psi_nG|^2 n is the index on eigenvalues which correspond to the electron energy levels. This KS equation is sparse in real-space and dense in fourier-space. I think strictly speaking it is a non-linear partial differential equation. V_nlocal(G) is an integral operator (short range though), so maybe it is technically a non-linear integro-partial differential equation. I understand that PETSc is a sparse solvers. Does the non-linearity in the partial differential equation make PETSc less applicable to this problem? On one more technical note, we do not store the matrix in sparse format. It is also matrix*vector based. Argonne Leadership Computing Facility Argonne National Laboratory Building 360 Room L-146 9700 South Cass Avenue Argonne, IL 60490 (630) 252-3441 -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From knepley at gmail.com Fri Jun 12 18:01:20 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Jun 2009 18:01:20 -0500 Subject: non-linear partial differential equations In-Reply-To: <17215404.106711244846355011.JavaMail.root@zimbra> References: <29419805.106691244846185120.JavaMail.root@zimbra> <17215404.106711244846355011.JavaMail.root@zimbra> Message-ID: On Fri, Jun 12, 2009 at 5:39 PM, wrote: > Matt, > > Yes, it is a sparse eigenvalue problem. And yes, I have taken a look at > SLEPc before. For some of our > very large problems, we may get up to 10,000 (out of 10^7) eigenvalues and > then SLEPc might need hooks > into ScaLAPACK for the subspace diagonalization. Last time I checked > ScaLAPACK interface in SLEPc > was not available. Not sure why you would need ScaLAPACK. Everything it has is either in PETSc or SLEPc, or PLAPACK, or you can use it through PETSc (which will download it automatically and build it). Matt > > Nichols A. Romero, Ph.D. > Argonne Leadership Computing Facility > Argonne National Laboratory > Building 360 Room L-146 > 9700 South Cass Avenue > Argonne, IL 60490 > (630) 252-3441 > > > ----- Original Message ----- > From: "Matthew Knepley" > To: "PETSc users list" > Sent: Friday, June 12, 2009 11:21:08 AM GMT -06:00 US/Canada Central > Subject: Re: non-linear partial differential equations > > You can solve matrix-free nonlinear equations with PETSc. If you are > actually > solving an eigenproblem, I would recommend using SLEPc which has PETSc > underneath. > > Matt > > > On Fri, Jun 12, 2009 at 10:20 AM, < naromero at alcf.anl.gov > wrote: > > > Hi, > > I would like to understand if the methods in PETSc are applicable to my > problem. > > I work in the area of density functional theory. 
The KS equation in > real-space (G) is > > [-(1/2) (nabla)^2 + V_local(G) + V_nlocal(G) + V_H[rho(G)] psi_nG = > E_n*psi_nG > > rho(G) = \sum_n |psi_nG|^2 > > n is the index on eigenvalues which correspond to the electron energy > levels. > > This KS equation is sparse in real-space and dense in fourier-space. I > think > strictly speaking it is a non-linear partial differential equation. > V_nlocal(G) > is an integral operator (short range though), so maybe it is technically a > non-linear integro-partial differential equation. > > I understand that PETSc is a sparse solvers. Does the non-linearity in the > partial differential equation make PETSc less applicable to this problem? > > On one more technical note, we do not store the matrix in sparse format. It > is > also matrix*vector based. > > > > Argonne Leadership Computing Facility > Argonne National Laboratory > Building 360 Room L-146 > 9700 South Cass Avenue > Argonne, IL 60490 > (630) 252-3441 > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From torres.pedrozpk at gmail.com Fri Jun 12 18:43:35 2009 From: torres.pedrozpk at gmail.com (Pedro Juan Torres Lopez) Date: Fri, 12 Jun 2009 20:43:35 -0300 Subject: MPE in PETSc Message-ID: Dear PETSc Team, I'm trying to enable mpe log in PETSc. But compiling a example, I get this errors, /home/ptorres/soft/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ex23.c:38: undefined reference to `PetscLogMPEBegin()' /home/ptorres/soft/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ex23.c:193: undefined reference to `PetscLogMPEDump(char const*)' Do I need to define PETSC_HAVE_MPE in the installation step to fix that?. If so, how can I do that?. I really appreciate any clue. I send attached my 'make info' output. Thanks in advance!. Regards Pedro Torres Pos-Gradua??o em Engenharia Mecanica UERJ-Universidade do Estado do Rio de Janeiro Rio de Janeiro - Brasil -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make_info_output Type: application/octet-stream Size: 6557 bytes Desc: not available URL: From knepley at gmail.com Fri Jun 12 19:06:18 2009 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Jun 2009 19:06:18 -0500 Subject: MPE in PETSc In-Reply-To: References: Message-ID: On Fri, Jun 12, 2009 at 6:43 PM, Pedro Juan Torres Lopez < torres.pedrozpk at gmail.com> wrote: > Dear PETSc Team, > > I'm trying to enable mpe log in PETSc. But compiling a example, I get this > errors, > /home/ptorres/soft/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ex23.c:38: > undefined reference to `PetscLogMPEBegin()' > /home/ptorres/soft/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ex23.c:193: > undefined reference to `PetscLogMPEDump(char const*)' > > Do I need to define PETSC_HAVE_MPE in the installation step to fix that?. > If so, how can I do that?. I really appreciate any clue. > I send attached my 'make info' output. > Thanks in advance!. I thought we have taken this support out, however it still seems to be there. 
You can try it by adding #define PETSC_HAVE_MPE 1 to $PETSC_ARCH/include/petscconf.h and rebuilding. Matt > > Regards > > Pedro Torres > Pos-Gradua??o em Engenharia Mecanica > UERJ-Universidade do Estado do Rio de Janeiro > Rio de Janeiro - Brasil > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Jun 12 23:52:38 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 12 Jun 2009 23:52:38 -0500 (CDT) Subject: MPE in PETSc In-Reply-To: References: Message-ID: On Fri, 12 Jun 2009, Matthew Knepley wrote: > I thought we have taken this support out, however it still seems to be > there. You can try it by adding Why was mpe.py deleted? I see it deleted at http://petsc.cs.iit.edu/petsc/petsc-dev/rev/097385087f06 - but I don't see the reason. Satish From jroman at dsic.upv.es Sat Jun 13 04:45:13 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 13 Jun 2009 11:45:13 +0200 Subject: non-linear partial differential equations In-Reply-To: <17215404.106711244846355011.JavaMail.root@zimbra> References: <17215404.106711244846355011.JavaMail.root@zimbra> Message-ID: <4B57BC1E-0292-439E-9C87-2F950BBE96BB@dsic.upv.es> Dear Nichols, You can use SLEPc to solve the linear eigenproblems that arise within the self-consistency iteration for the KS equation. ScaLAPACK cannot be used because your matrices are not stored explicitly. I know you are concerned about the diagonalization that is required within the iterative eigensolver, since you want to compute a large number of eigenpairs. In a previous communication, I said that we wanted to improve support in SLEPc for this case, and this is in fact done and included in version 3.0.0. Now you can control the growth of the subspace that is used internally by the eigensolver. See SLEPc users manual, section 2.6.4. As an illustration, we have recently tried a real symmetric matrix coming from a computational chemistry application. The dimension of the matrix is about 65,000 and we compute 2,000 eigenpairs in chunks so that the internal diagonalization is of order 300 at most. This computation scales very well (we tried up to 500 processors). My bet is that you can reach matrices of order 10^7 and 10,000 eigenvalues without problems, provided that you have enough memory and processors. We can provide support if necessary. Also, if you give us one of your matrices then we could do some tests. Contact us at the SLEPc maintainers email. Best regards, Jose E. Roman On 13/06/2009, naromero at alcf.anl.gov wrote: > Matt, > > Yes, it is a sparse eigenvalue problem. And yes, I have taken a look > at SLEPc before. For some of our > very large problems, we may get up to 10,000 (out of 10^7) > eigenvalues and then SLEPc might need hooks > into ScaLAPACK for the subspace diagonalization. Last time I checked > ScaLAPACK interface in SLEPc > was not available. > > > Nichols A. Romero, Ph.D. > Argonne Leadership Computing Facility > Argonne National Laboratory > Building 360 Room L-146 > 9700 South Cass Avenue > Argonne, IL 60490 > (630) 252-3441 > > > ----- Original Message ----- > From: "Matthew Knepley" > To: "PETSc users list" > Sent: Friday, June 12, 2009 11:21:08 AM GMT -06:00 US/Canada Central > Subject: Re: non-linear partial differential equations > > You can solve matrix-free nonlinear equations with PETSc. 
If you are > actually > solving an eigenproblem, I would recommend using SLEPc which has PETSc > underneath. > > Matt > > > On Fri, Jun 12, 2009 at 10:20 AM, < naromero at alcf.anl.gov > wrote: > > > Hi, > > I would like to understand if the methods in PETSc are applicable to > my > problem. > > I work in the area of density functional theory. The KS equation in > real-space (G) is > > [-(1/2) (nabla)^2 + V_local(G) + V_nlocal(G) + V_H[rho(G)] psi_nG = > E_n*psi_nG > > rho(G) = \sum_n |psi_nG|^2 > > n is the index on eigenvalues which correspond to the electron > energy levels. > > This KS equation is sparse in real-space and dense in fourier-space. > I think > strictly speaking it is a non-linear partial differential equation. > V_nlocal(G) > is an integral operator (short range though), so maybe it is > technically a > non-linear integro-partial differential equation. > > I understand that PETSc is a sparse solvers. Does the non-linearity > in the > partial differential equation make PETSc less applicable to this > problem? > > On one more technical note, we do not store the matrix in sparse > format. It is > also matrix*vector based. > > > > Argonne Leadership Computing Facility > Argonne National Laboratory > Building 360 Room L-146 > 9700 South Cass Avenue > Argonne, IL 60490 > (630) 252-3441 > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From ahuramazda10 at gmail.com Sat Jun 13 12:31:24 2009 From: ahuramazda10 at gmail.com (Santolo Felaco) Date: Sat, 13 Jun 2009 19:31:24 +0200 Subject: Matrix Dense for CG Message-ID: <5f76eef60906131031gabc9e5ft2e9357c2bac803de@mail.gmail.com> Hi, does someone have some examples where CG is used with a dense matrix? I need a system that is difficult to solve with CG. Thank you. S.F. -------------- next part -------------- An HTML attachment was scrubbed... URL: From torres.pedrozpk at gmail.com Sat Jun 13 12:44:12 2009 From: torres.pedrozpk at gmail.com (Pedro Juan Torres Lopez) Date: Sat, 13 Jun 2009 14:44:12 -0300 Subject: MPE in PETSc In-Reply-To: References: Message-ID: 2009/6/12 Matthew Knepley > > I thought we have taken this support out, however it still seems to be > there. You can try it by adding > > #define PETSC_HAVE_MPE 1 > > to $PETSC_ARCH/include/petscconf.h > > and rebuilding. > Thanks Matt, it works. BTW, if you have taken this support out, what other tool for use with PETSc would you recommend for postmortem performance visualization? Thanks again. Regards Pedro > > > Matt > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Sat Jun 13 14:50:34 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Sat, 13 Jun 2009 15:50:34 -0400 Subject: PetscExpScalar(x) Message-ID: <20090613155034.tppaasz688wcos8s@cubmail.cc.columbia.edu> Hi, I always have PetscExpScalar(x) returning 1 no matter what value x is. Is there anything wrong?
Thanks, -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Sat Jun 13 14:58:23 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 13 Jun 2009 14:58:23 -0500 Subject: PetscExpScalar(x) In-Reply-To: <20090613155034.tppaasz688wcos8s@cubmail.cc.columbia.edu> References: <20090613155034.tppaasz688wcos8s@cubmail.cc.columbia.edu> Message-ID: On Sat, Jun 13, 2009 at 2:50 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > I always has PetscExpScalar(x) returning 1 no matter what value x is. Is > there anything wrong? This is unlikely to be the case. I suspect that for some reason you are actually passing 0. I would look at it in the debugger. Matt > > Thanks, > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From irfan.khan at gatech.edu Sat Jun 13 14:56:38 2009 From: irfan.khan at gatech.edu (irfan.khan at gatech.edu) Date: Sat, 13 Jun 2009 15:56:38 -0400 (EDT) Subject: parallel ghosted vectors In-Reply-To: <1699948477.4350561244922993453.JavaMail.root@mail8.gatech.edu> Message-ID: <1735829831.4350581244922998570.JavaMail.root@mail8.gatech.edu> Hi My code makes use of parallel ghosted vectors to transfer data between fluid and solid calculations. I am now trying to model large deformations in solid where the solid may cross several fluid subdomains. This essentially means that the entries for the parallel ghosted vectors would change their owning process. For instance, in case of 2 processes as shown below Process : Entries 0 : 0-n 1 : n+1-m some entries (i,i+1....i+l) can move from process 0 to process 1 at some time t. Is there a way to handled such parallel ghosted vectors without destroying the old and recreating a new parallel ghosted vector with updated entries? Thank you Irfan Graduate Research Assistant G.W. Woodruff School of Mechanical Engineering Georgia Institute of Technology Atlanta, GA. From knepley at gmail.com Sat Jun 13 15:50:39 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 13 Jun 2009 15:50:39 -0500 Subject: parallel ghosted vectors In-Reply-To: <1735829831.4350581244922998570.JavaMail.root@mail8.gatech.edu> References: <1699948477.4350561244922993453.JavaMail.root@mail8.gatech.edu> <1735829831.4350581244922998570.JavaMail.root@mail8.gatech.edu> Message-ID: On Sat, Jun 13, 2009 at 2:56 PM, wrote: > Hi > My code makes use of parallel ghosted vectors to transfer data between > fluid and solid calculations. I am now trying to model large deformations in > solid where the solid may cross several fluid subdomains. This essentially > means that the entries for the parallel ghosted vectors would change their > owning process. > > For instance, in case of 2 processes as shown below > Process : Entries > 0 : 0-n > 1 : n+1-m > > some entries (i,i+1....i+l) can move from process 0 to process 1 at some > time t. > > Is there a way to handled such parallel ghosted vectors without destroying > the old and recreating a new parallel ghosted vector with updated entries? No, the right way to do this is recreate the Vec with the new size, scatter from the old Vec, and destroy the old Vec. 
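A minimal sketch of that recreate-and-scatter pattern, illustrative only and not code from this thread: oldVec is the existing ghosted vector, and nNew, nNewGhost, newGhosts and globals (describing the new layout) are assumed to exist; the calling conventions shown are the petsc-3.0-era ones and differ in later releases.

  PetscErrorCode ierr;
  Vec            newVec;
  IS             from;
  VecScatter     scatter;

  /* New ghosted vector with the new local size and new ghost indices */
  ierr = VecCreateGhost(PETSC_COMM_WORLD, nNew, PETSC_DECIDE, nNewGhost, newGhosts, &newVec);CHKERRQ(ierr);

  /* globals[] lists the global indices this process now owns; pull those
     values out of the old vector into the owned part of the new one */
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, nNew, globals, &from);CHKERRQ(ierr);
  ierr = VecScatterCreate(oldVec, from, newVec, PETSC_NULL, &scatter);CHKERRQ(ierr);
  ierr = VecScatterBegin(scatter, oldVec, newVec, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(scatter, oldVec, newVec, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);

  /* refresh the ghost values of the new vector from their owning processes */
  ierr = VecGhostUpdateBegin(newVec, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecGhostUpdateEnd(newVec, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);

  ierr = VecScatterDestroy(scatter);CHKERRQ(ierr);
  ierr = ISDestroy(from);CHKERRQ(ierr);
  ierr = VecDestroy(oldVec);CHKERRQ(ierr);  /* the old vector is no longer needed */

Passing PETSC_NULL as the destination index set makes the i-th scattered value land in global position i of newVec, so each process receives exactly the entries it now owns, in order.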
Matt > > Thank you > Irfan > Graduate Research Assistant > G.W. Woodruff School of Mechanical Engineering > Georgia Institute of Technology > Atlanta, GA. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.klettner at ucl.ac.uk Sun Jun 14 11:23:48 2009 From: christian.klettner at ucl.ac.uk (Christian Klettner) Date: Sun, 14 Jun 2009 17:23:48 +0100 (BST) Subject: Problems with multiplication scaling In-Reply-To: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: <46665.128.40.55.186.1244996628.squirrel@www.squirrelmail.ucl.ac.uk> Dear PETSc Team, I have used Hypres BoomerAMG to cut the iteration count in solving a Poisson type equation (i.e. Ax=b). The sparse matrix arises from a finite element discretization of the Navier-Stokes equations. However, the performance was very poor and so I checked the multiplication routine in my code. Below are the results for a 1000 250,000x250,000 matrix-vector operations. The time for the multiplications goes from 15.8 seconds to ~11 seconds when changing from 4 to 8 cores. The ratios indicate that there is good load balancing so I was wondering if this is to do with how I configure PETSc??? Or is it my machine-> I am using a 2x quad core 2.3GHz Opteron (Shanghai). Best regards, Christian Klettner ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex4 on a linux-gnu named christian-desktop with 4 processors, by christian Sun Jun 14 16:48:24 2009 Using Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 Max Max/Min Avg Total Time (sec): 1.974e+01 1.00119 1.973e+01 Objects: 1.080e+02 1.00000 1.080e+02 Flops: 8.078e+08 1.00163 8.070e+08 3.228e+09 Flops/sec: 4.095e+07 1.00232 4.090e+07 1.636e+08 Memory: 1.090e+08 1.00942 4.345e+08 MPI Messages: 2.071e+03 2.00000 1.553e+03 6.213e+03 MPI Message Lengths: 2.237e+06 2.00000 1.080e+03 6.712e+06 MPI Reductions: 7.250e+01 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.9730e+01 100.0% 3.2281e+09 100.0% 6.213e+03 100.0% 1.080e+03 100.0% 2.120e+02 73.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. 
len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run config/configure.py # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 5 1.0 1.2703e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyBegin 3 1.0 2.9233e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 3 0 0 0 0 4 0 VecAssemblyEnd 3 1.0 2.2650e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 1003 1.0 1.8717e-01 4.1 0.00e+00 0.0 6.0e+03 1.1e+03 0.0e+00 1 0 97 95 0 1 0 97 95 0 0 VecScatterEnd 1003 1.0 5.3403e+00 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 20 0 0 0 0 20 0 0 0 0 0 MatMult 1000 1.0 1.5877e+01 1.0 8.08e+08 1.0 6.0e+03 1.1e+03 0.0e+00 80100 97 95 0 80100 97 95 0 203 MatAssemblyBegin 7 1.0 3.6728e-01 1.9 0.00e+00 0.0 6.3e+01 5.0e+03 1.4e+01 1 0 1 5 5 1 0 1 5 7 0 MatAssemblyEnd 7 1.0 8.6817e-01 1.2 0.00e+00 0.0 8.4e+01 2.7e+02 7.0e+01 4 0 1 0 24 4 0 1 0 33 0 MatZeroEntries 7 1.0 5.7693e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. 
--- Event Stage 0: Main Stage Application Order 2 0 0 0 Index Set 30 30 18476 0 IS L to G Mapping 10 0 0 0 Vec 30 7 9128 0 Vec Scatter 15 0 0 0 Matrix 21 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 2.14577e-07 Average time for MPI_Barrier(): 5.89848e-05 Average time for zero size MPI_Send(): 6.80089e-05 #PETSc Option Table entries: -log_summary output1 #End o PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Fri Jun 12 16:59:30 2009 Configure options: --with-cc="gcc -fPIC" --download-mpich=1 --download-f-blas-lapack --download-triangle --download-parmetis --with-hypre=1 --download-hypre=1 --with-shared=0 ----------------------------------------- Libraries compiled on Fri Jun 12 17:11:54 BST 2009 on christian-desktop Machine characteristics: Linux christian-desktop 2.6.27-7-generic #1 SMP Fri Oct 24 06:40:41 UTC 2008 x86_64 GNU/Linux Using PETSc directory: /home/christian/Desktop/petsc-3.0.0-p4 Using PETSc arch: linux-gnu-c-debug ----------------------------------------- Using C compiler: /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 Using Fortran compiler: /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpif90 -Wall -Wno-unused-variable -g ----------------------------------------- Using include paths: -I/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/include -I/home/christian/Desktop/petsc-3.0.0-p4/include -I/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/include ------------------------------------------ Using C linker: /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 Using Fortran linker: /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpif90 -Wall -Wno-unused-variable -g Using libraries: -Wl,-rpath,/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -Wl,-rpath,/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -ltriangle -lparmetis -lmetis -lHYPRE -lmpichcxx -lstdc++ -lflapack -lfblas -lnsl -lrt -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -L/usr/lib/gcc/x86_64-linux-gnu/4.3.2 -L/lib -ldl -lmpich -lpthread -lrt -lgcc_s -lmpichf90 -lgfortranbegin -lgfortran -lm -L/usr/lib/gcc/x86_64-linux-gnu -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lmpich -lpthread -lrt -lgcc_s -ldl ------------------------------------------ ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex4 on a linux-gnu named christian-desktop with 8 processors, by christian Sun Jun 14 17:13:40 2009 Using Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 Max Max/Min Avg Total Time (sec): 1.452e+01 1.01190 1.443e+01 Objects: 1.080e+02 1.00000 1.080e+02 Flops: 3.739e+08 1.00373 3.731e+08 2.985e+09 Flops/sec: 2.599e+07 1.01190 2.585e+07 2.068e+08 Memory: 5.157e+07 1.01231 4.117e+08 MPI Messages: 2.071e+03 2.00000 1.812e+03 1.450e+04 MPI Message Lengths: 2.388e+06 2.00000 1.153e+03 1.672e+07 MPI Reductions: 3.625e+01 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.4431e+01 100.0% 2.9847e+09 100.0% 1.450e+04 100.0% 1.153e+03 100.0% 2.120e+02 73.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run config/configure.py # # using --with-debugging=no, the performance will # # be generally two or three times faster. 
# # # ########################################################## Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 5 1.0 6.1178e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyBegin 3 1.0 7.7400e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 3 0 0 0 0 4 0 VecAssemblyEnd 3 1.0 4.1008e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 1003 1.0 1.0858e-01 2.9 0.00e+00 0.0 1.4e+04 1.1e+03 0.0e+00 1 0 97 95 0 1 0 97 95 0 0 VecScatterEnd 1003 1.0 5.3962e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 33 0 0 0 0 33 0 0 0 0 0 MatMult 1000 1.0 1.1430e+01 1.0 3.74e+08 1.0 1.4e+04 1.1e+03 0.0e+00 79100 97 95 0 79100 97 95 0 261 MatAssemblyBegin 7 1.0 4.6307e-01 1.8 0.00e+00 0.0 1.5e+02 5.3e+03 1.4e+01 3 0 1 5 5 3 0 1 5 7 0 MatAssemblyEnd 7 1.0 6.9013e-01 1.3 0.00e+00 0.0 2.0e+02 2.8e+02 7.0e+01 4 0 1 0 24 4 0 1 0 33 0 MatZeroEntries 7 1.0 2.7971e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage Application Order 2 0 0 0 Index Set 30 30 18476 0 IS L to G Mapping 10 0 0 0 Vec 30 7 9128 0 Vec Scatter 15 0 0 0 Matrix 21 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 0.000419807 Average time for zero size MPI_Send(): 0.000115991 #PETSc Option Table entries: -log_summary output18 #End o PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Fri Jun 12 16:59:30 2009 Configure options: --with-cc="gcc -fPIC" --download-mpich=1 --download-f-blas-lapack --download-triangle --download-parmetis --with-hypre=1 --download-hypre=1 --with-shared=0 ----------------------------------------- Libraries compiled on Fri Jun 12 17:11:54 BST 2009 on christian-desktop Machine characteristics: Linux christian-desktop 2.6.27-7-generic #1 SMP Fri Oct 24 06:40:41 UTC 2008 x86_64 GNU/Linux Using PETSc directory: /home/christian/Desktop/petsc-3.0.0-p4 Using PETSc arch: linux-gnu-c-debug ----------------------------------------- Using C compiler: /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 Using Fortran compiler: /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpif90 -Wall -Wno-unused-variable -g ----------------------------------------- Using include paths: -I/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/include -I/home/christian/Desktop/petsc-3.0.0-p4/include -I/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/include ------------------------------------------ Using C linker: /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 Using Fortran linker: /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpif90 -Wall -Wno-unused-variable -g Using libraries: 
-Wl,-rpath,/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -Wl,-rpath,/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -ltriangle -lparmetis -lmetis -lHYPRE -lmpichcxx -lstdc++ -lflapack -lfblas -lnsl -lrt -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -L/usr/lib/gcc/x86_64-linux-gnu/4.3.2 -L/lib -ldl -lmpich -lpthread -lrt -lgcc_s -lmpichf90 -lgfortranbegin -lgfortran -lm -L/usr/lib/gcc/x86_64-linux-gnu -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lmpich -lpthread -lrt -lgcc_s -ldl ------------------------------------------ From knepley at gmail.com Sun Jun 14 15:01:28 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 14 Jun 2009 15:01:28 -0500 Subject: Problems with multiplication scaling In-Reply-To: <46665.128.40.55.186.1244996628.squirrel@www.squirrelmail.ucl.ac.uk> References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> <46665.128.40.55.186.1244996628.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: Matvec is a bandwidth limited operation, so adding more compute power will not usually make it go much faster. Hardware manufacturers don't tell you this stuff. Matt On Sun, Jun 14, 2009 at 11:23 AM, Christian Klettner < christian.klettner at ucl.ac.uk> wrote: > Dear PETSc Team, > I have used Hypres BoomerAMG to cut the iteration count in solving a > Poisson type equation (i.e. Ax=b). The sparse matrix arises from a finite > element discretization of the Navier-Stokes equations. However, the > performance was very poor and so I checked the multiplication routine in > my code. Below are the results for a 1000 250,000x250,000 matrix-vector > operations. The time for the multiplications goes from 15.8 seconds to ~11 > seconds when changing from 4 to 8 cores. The ratios indicate that there is > good load balancing so I was wondering if this is to do with how I > configure PETSc??? Or is it my machine-> > I am using a 2x quad core 2.3GHz Opteron (Shanghai). > Best regards, > Christian Klettner > > > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r > -fCourier9' to print this document *** > > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > ./ex4 on a linux-gnu named christian-desktop with 4 processors, by > christian Sun Jun 14 16:48:24 2009 > Using Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 > > Max Max/Min Avg Total > Time (sec): 1.974e+01 1.00119 1.973e+01 > Objects: 1.080e+02 1.00000 1.080e+02 > Flops: 8.078e+08 1.00163 8.070e+08 3.228e+09 > Flops/sec: 4.095e+07 1.00232 4.090e+07 1.636e+08 > Memory: 1.090e+08 1.00942 4.345e+08 > MPI Messages: 2.071e+03 2.00000 1.553e+03 6.213e+03 > MPI Message Lengths: 2.237e+06 2.00000 1.080e+03 6.712e+06 > MPI Reductions: 7.250e+01 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 1.9730e+01 100.0% 3.2281e+09 100.0% 6.213e+03 > 100.0% 1.080e+03 100.0% 2.120e+02 73.1% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option, # > # To get timing results run config/configure.py # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. 
# > # # > ########################################################## > > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > VecSet 5 1.0 1.2703e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAssemblyBegin 3 1.0 2.9233e-03 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 9.0e+00 0 0 0 0 3 0 0 0 0 4 0 > VecAssemblyEnd 3 1.0 2.2650e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1003 1.0 1.8717e-01 4.1 0.00e+00 0.0 6.0e+03 1.1e+03 > 0.0e+00 1 0 97 95 0 1 0 97 95 0 0 > VecScatterEnd 1003 1.0 5.3403e+00 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 20 0 0 0 0 20 0 0 0 0 0 > MatMult 1000 1.0 1.5877e+01 1.0 8.08e+08 1.0 6.0e+03 1.1e+03 > 0.0e+00 80100 97 95 0 80100 97 95 0 203 > MatAssemblyBegin 7 1.0 3.6728e-01 1.9 0.00e+00 0.0 6.3e+01 5.0e+03 > 1.4e+01 1 0 1 5 5 1 0 1 5 7 0 > MatAssemblyEnd 7 1.0 8.6817e-01 1.2 0.00e+00 0.0 8.4e+01 2.7e+02 > 7.0e+01 4 0 1 0 24 4 0 1 0 33 0 > MatZeroEntries 7 1.0 5.7693e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > > --- Event Stage 0: Main Stage > > Application Order 2 0 0 0 > Index Set 30 30 18476 0 > IS L to G Mapping 10 0 0 0 > Vec 30 7 9128 0 > Vec Scatter 15 0 0 0 > Matrix 21 0 0 0 > > ======================================================================================================================== > Average time to get PetscTime(): 2.14577e-07 > Average time for MPI_Barrier(): 5.89848e-05 > Average time for zero size MPI_Send(): 6.80089e-05 > #PETSc Option Table entries: > -log_summary output1 > #End o PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > Configure run at: Fri Jun 12 16:59:30 2009 > Configure options: --with-cc="gcc -fPIC" --download-mpich=1 > --download-f-blas-lapack --download-triangle --download-parmetis > --with-hypre=1 --download-hypre=1 --with-shared=0 > ----------------------------------------- > Libraries compiled on Fri Jun 12 17:11:54 BST 2009 on christian-desktop > Machine characteristics: Linux christian-desktop 2.6.27-7-generic #1 SMP > Fri Oct 24 06:40:41 UTC 2008 x86_64 GNU/Linux > Using PETSc directory: /home/christian/Desktop/petsc-3.0.0-p4 > Using PETSc arch: linux-gnu-c-debug > ----------------------------------------- > Using C compiler: > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpicc -Wall > -Wwrite-strings -Wno-strict-aliasing -g3 > Using Fortran compiler: > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpif90 -Wall > -Wno-unused-variable -g > ----------------------------------------- > Using include paths: > -I/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/include > -I/home/christian/Desktop/petsc-3.0.0-p4/include > -I/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/include > ------------------------------------------ > Using C linker: > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpicc -Wall > -Wwrite-strings -Wno-strict-aliasing -g3 > 
Using Fortran linker: > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpif90 -Wall > -Wno-unused-variable -g > Using libraries: > -Wl,-rpath,/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -lpetscts > -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc > -Wl,-rpath,/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -ltriangle > -lparmetis -lmetis -lHYPRE -lmpichcxx -lstdc++ -lflapack -lfblas -lnsl > -lrt -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > -L/usr/lib/gcc/x86_64-linux-gnu/4.3.2 -L/lib -ldl -lmpich -lpthread -lrt > -lgcc_s -lmpichf90 -lgfortranbegin -lgfortran -lm > -L/usr/lib/gcc/x86_64-linux-gnu -lm -lmpichcxx -lstdc++ -lmpichcxx > -lstdc++ -ldl -lmpich -lpthread -lrt -lgcc_s -ldl > ------------------------------------------ > > > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r > -fCourier9' to print this document *** > > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > ./ex4 on a linux-gnu named christian-desktop with 8 processors, by > christian Sun Jun 14 17:13:40 2009 > Using Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 > > Max Max/Min Avg Total > Time (sec): 1.452e+01 1.01190 1.443e+01 > Objects: 1.080e+02 1.00000 1.080e+02 > Flops: 3.739e+08 1.00373 3.731e+08 2.985e+09 > Flops/sec: 2.599e+07 1.01190 2.585e+07 2.068e+08 > Memory: 5.157e+07 1.01231 4.117e+08 > MPI Messages: 2.071e+03 2.00000 1.812e+03 1.450e+04 > MPI Message Lengths: 2.388e+06 2.00000 1.153e+03 1.672e+07 > MPI Reductions: 3.625e+01 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 1.4431e+01 100.0% 2.9847e+09 100.0% 1.450e+04 > 100.0% 1.153e+03 100.0% 2.120e+02 73.1% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option, # > # To get timing results run config/configure.py # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. # > # # > ########################################################## > > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > VecSet 5 1.0 6.1178e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAssemblyBegin 3 1.0 7.7400e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 9.0e+00 0 0 0 0 3 0 0 0 0 4 0 > VecAssemblyEnd 3 1.0 4.1008e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1003 1.0 1.0858e-01 2.9 0.00e+00 0.0 1.4e+04 1.1e+03 > 0.0e+00 1 0 97 95 0 1 0 97 95 0 0 > VecScatterEnd 1003 1.0 5.3962e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 33 0 0 0 0 33 0 0 0 0 0 > MatMult 1000 1.0 1.1430e+01 1.0 3.74e+08 1.0 1.4e+04 1.1e+03 > 0.0e+00 79100 97 95 0 79100 97 95 0 261 > MatAssemblyBegin 7 1.0 4.6307e-01 1.8 0.00e+00 0.0 1.5e+02 5.3e+03 > 1.4e+01 3 0 1 5 5 3 0 1 5 7 0 > MatAssemblyEnd 7 1.0 6.9013e-01 1.3 0.00e+00 0.0 2.0e+02 2.8e+02 > 7.0e+01 4 0 1 0 24 4 0 1 0 33 0 > MatZeroEntries 7 1.0 2.7971e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. 
> > --- Event Stage 0: Main Stage > > Application Order 2 0 0 0 > Index Set 30 30 18476 0 > IS L to G Mapping 10 0 0 0 > Vec 30 7 9128 0 > Vec Scatter 15 0 0 0 > Matrix 21 0 0 0 > > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 0.000419807 > Average time for zero size MPI_Send(): 0.000115991 > #PETSc Option Table entries: > -log_summary output18 > #End o PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > Configure run at: Fri Jun 12 16:59:30 2009 > Configure options: --with-cc="gcc -fPIC" --download-mpich=1 > --download-f-blas-lapack --download-triangle --download-parmetis > --with-hypre=1 --download-hypre=1 --with-shared=0 > ----------------------------------------- > Libraries compiled on Fri Jun 12 17:11:54 BST 2009 on christian-desktop > Machine characteristics: Linux christian-desktop 2.6.27-7-generic #1 SMP > Fri Oct 24 06:40:41 UTC 2008 x86_64 GNU/Linux > Using PETSc directory: /home/christian/Desktop/petsc-3.0.0-p4 > Using PETSc arch: linux-gnu-c-debug > ----------------------------------------- > Using C compiler: > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpicc -Wall > -Wwrite-strings -Wno-strict-aliasing -g3 > Using Fortran compiler: > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpif90 -Wall > -Wno-unused-variable -g > ----------------------------------------- > Using include paths: > -I/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/include > -I/home/christian/Desktop/petsc-3.0.0-p4/include > -I/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/include > ------------------------------------------ > Using C linker: > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpicc -Wall > -Wwrite-strings -Wno-strict-aliasing -g3 > Using Fortran linker: > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/bin/mpif90 -Wall > -Wno-unused-variable -g > Using libraries: > -Wl,-rpath,/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -lpetscts > -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc > -Wl,-rpath,/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib -ltriangle > -lparmetis -lmetis -lHYPRE -lmpichcxx -lstdc++ -lflapack -lfblas -lnsl > -lrt -L/home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > -L/usr/lib/gcc/x86_64-linux-gnu/4.3.2 -L/lib -ldl -lmpich -lpthread -lrt > -lgcc_s -lmpichf90 -lgfortranbegin -lgfortran -lm > -L/usr/lib/gcc/x86_64-linux-gnu -lm -lmpichcxx -lstdc++ -lmpichcxx > -lstdc++ -ldl -lmpich -lpthread -lrt -lgcc_s -ldl > ------------------------------------------ > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guillaume.alleon at gmail.com Tue Jun 16 05:38:10 2009 From: guillaume.alleon at gmail.com (tog) Date: Tue, 16 Jun 2009 18:38:10 +0800 Subject: Sorting a vector Message-ID: Hi there This is a kind of newbie question :) I have a vector which is the solution of my simulation, I want then to extract the p indices corresponding to the p biggest values in my vector. Is there a way to do this with PETSc ? What if I want to sort my vector to get the indices in ascending order of their values ? Thanks Guillaume -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.klettner at ucl.ac.uk Tue Jun 16 07:40:21 2009 From: christian.klettner at ucl.ac.uk (Christian Klettner) Date: Tue, 16 Jun 2009 13:40:21 +0100 (BST) Subject: What is the best solver a poisson type eqn. In-Reply-To: References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: <55020.128.40.55.186.1245156021.squirrel@www.squirrelmail.ucl.ac.uk> Hi Matt, I have tried to use MUMPS and came across the following problems. My application solves Ax=b about 10000-100000 times with A remaining constant. When I tried to use it through KSP I was not finding good performance. Could this be because it was refactoring etc. at each time step? With this in mind I have tried to implement the following: A is a parallel matrix created with MatCreateMPIAIJ(). rows is a list of the global row numbers on the process.cols is a list of the global columns that are on the process. I run the program with ./mpiexec -n 4 ./ex4 -pc_factor_mat_solver_package mumps (1) MatFactorInfo *info; (2) ierr=MatFactorInfoInitialize(info);CHKERRQ(ierr); (3) ierr=MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&F);CHKERRQ(ierr); (4) ierr=MatLUFactorSymbolic(F, A, rows, cols,info);CHKERRQ(ierr); (5) MatLUFactorNumeric(F, A,info);CHKERRQ(ierr); for(){ ///TEMPORAL LOOP /*CODING*/ ierr=MatSolve(A,vecr,vecu);CHKERRQ(ierr); } I get the following error messages: Line (2) above gives the following error message: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: [0]PETSC ERROR: Null argument, when expecting valid pointer! [0]PETSC ERROR: Trying to zero at a null pointer! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by christian Tue Jun 16 13:33:33 2009 [0]PETSC ERROR: Libraries linked from /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" --download-mpich=1 --download-f-blas-lapack --download-scalapack --download-blacs --download-mumps --download-parmetis --download-hypre --download-triangle --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMemzero() line 189 in src/sys/utils/memc.c [0]PETSC ERROR: MatFactorInfoInitialize() line 7123 in src/mat/interface/matrix.c [0]PETSC ERROR: main() line 1484 in src/dm/ao/examples/tutorials/ex4.c application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0[cli_0]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Null argument, when expecting valid pointer! [2]PETSC ERROR: Trying to zero at a null pointer! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html[0]0:Return code = 85 [0]1:Return code = 0, signaled with Interrupt [0]2:Return code = 0, signaled with Interrupt [0]3:Return code = 0, signaled with Interrupt ///////////////////////////////////////////////////////////////// Line (3) gives the following error messages: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Wrong type of object: Parameter # 2! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Invalid argument! [2]PETSC ERROR: Wrong type of object: Parameter # 2! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by christian Tue Jun 16 13:37:36 2009 [2]PETSC ERROR: Libraries linked from /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib [2]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 [2]PETSC ERROR: Configure options --with-cc="gcc -fPIC" --download-mpich=1 --download-f-blas-lapack --download-scalapack --download-blacs --downlPetsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by christian Tue Jun 16 13:37:36 2009 [0]PETSC ERROR: Libraries linked from /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" --download-mpich=1 --download-f-blas-lapack --download-scalapack --download-blacs --download-mumps --download-parmetis --download-hypre --download-triangle --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: MatLUFactorSymbolic() line 2311 in src/mat/interface/matrix.c [0]PETSC ERROR: main() line 148oad-mumps --download-parmetis --download-hypre --download-triangle --with-shared=0 [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: MatLUFactorSymbolic() line 2311 in src/mat/interface/matrix.c [2]PETSC ERROR: main() line 1488 in src/dm/ao/examples/tutorials/ex4.c application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2[cli_2]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2 8 in src/dm/ao/examples/tutorials/ex4.c application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0[cli_0]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 [0]0:Return code = 62 [0]1:Return code = 0, signaled with Interrupt [0]2:Return code = 62 [0]3:Return code = 0, signaled with Interrupt Is the matrix (ie parameter 2) in teh wrong state because it's not a MUMPS matrix? Any help would be greatly appreciated, Best regards, Christian Klettner > The problem is small enough that you might be able to use MUMPS. > > Matt > > On Fri, Jun 12, 2009 at 9:31 AM, Lisandro Dalcin > wrote: > >> On Fri, Jun 12, 2009 at 11:13 AM, Christian >> Klettner wrote: >> > Sorry that I sent this twice. No subject in the first one. >> > >> > Dear PETSc Team, >> > I am writing a CFD finite element code in C. From the discretization >> of >> > the governing equations I have to solve a Poisson type equation which >> is >> > really killing my performance. Which solver/preconditioner from PETSc >> or >> > any external packages would you recommend? The size of my problem is >> from >> > ~30000-100000 DOF per core. What kind of performance would I be able >> to >> > expect with this solver/preconditioner? >> >> I would suggest KSPCG. As preconditioner I would use ML or >> HYPRE/BoomerAMG (both are external packages) >> >> > I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the domain >> > with Parmetis. The mesh is unstructured. >> > Also, I am writing a code which studies free surface phenomena so the >> mesh >> > is continually changing. Does this matter when choosing a >> > solver/preconditioner? My left hand side matrix (A in Ax=b) does not >> > change in time. >> >> ML has a faster setup that BoomerAMG, but the convergence is a bit >> slower. If your A matrix do not change, then likely BoomerAMG will be >> better for you. In any case, you can try both: just build PETSc with >> both packages, then you can change the preconditioner by just passing >> a command line option. >> >> > >> > Best regards and thank you in advance, >> > Christian Klettner >> > >> >> Disclaimer: the convergence of multigrid preconditioners depends a lot >> on your actual problem. What I've suggested is just my limited >> experience in a few problems I've run solving electric potentials. 
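For the command-line switch Lisandro describes just above, a typical invocation for Christian's code might look like the following (assuming PETSc was configured with hypre and/or ML; option names as in petsc-3.0):

  mpiexec -n 4 ./ex4 -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg
  mpiexec -n 4 ./ex4 -ksp_type cg -pc_type ml

Switching between the two preconditioners is then just a matter of changing the option, with no change to the source code.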
>> >> >> -- >> Lisandro Dalc?n >> --------------- >> Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) >> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) >> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) >> PTLC - G?emes 3450, (3000) Santa Fe, Argentina >> Tel/Fax: +54-(0)342-451.1594 >> > > > > -- > What most experimenters take for granted before they begin their > experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From knepley at gmail.com Tue Jun 16 07:52:04 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Jun 2009 07:52:04 -0500 Subject: What is the best solver a poisson type eqn. In-Reply-To: <55020.128.40.55.186.1245156021.squirrel@www.squirrelmail.ucl.ac.uk> References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> <55020.128.40.55.186.1245156021.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: On Tue, Jun 16, 2009 at 7:40 AM, Christian Klettner < christian.klettner at ucl.ac.uk> wrote: > Hi Matt, > I have tried to use MUMPS and came across the following problems. My > application solves Ax=b about 10000-100000 times with A remaining > constant. > When I tried to use it through KSP I was not finding good performance. > Could this be because it was refactoring etc. at each time step? > With this in mind I have tried to implement the following: 1) I have no idea what you would mean by good performance. Always, always, always send the -log_summary. 2) A matrix is never refactored during the KSP iteration > A is a parallel matrix created with MatCreateMPIAIJ(). > rows is a list of the global row numbers on the process.cols is a list of > the global columns that are on the process. > I run the program with ./mpiexec -n 4 ./ex4 -pc_factor_mat_solver_package > mumps > > (1) MatFactorInfo *info; > (2) ierr=MatFactorInfoInitialize(info);CHKERRQ(ierr); You have passed a meaningless pointer here. (1) MatFactorInfo info; (2) ierr=MatFactorInfoInitialize(&info);CHKERRQ(ierr); Matt > (3) > ierr=MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&F);CHKERRQ(ierr); > (4) ierr=MatLUFactorSymbolic(F, A, rows, cols,info);CHKERRQ(ierr); > (5) MatLUFactorNumeric(F, A,info);CHKERRQ(ierr); > > > for(){ ///TEMPORAL LOOP > > /*CODING*/ > > ierr=MatSolve(A,vecr,vecu);CHKERRQ(ierr); > } > > I get the following error messages: > > Line (2) above gives the following error message: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: [0]PETSC ERROR: Null argument, when expecting valid > pointer! > [0]PETSC ERROR: Trying to zero at a null pointer! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 > CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by christian > Tue Jun 16 13:33:33 2009 > [0]PETSC ERROR: Libraries linked from > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 > [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" --download-mpich=1 > --download-f-blas-lapack --download-scalapack --download-blacs > --download-mumps --download-parmetis --download-hypre --download-triangle > --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMemzero() line 189 in src/sys/utils/memc.c > [0]PETSC ERROR: MatFactorInfoInitialize() line 7123 in > src/mat/interface/matrix.c > [0]PETSC ERROR: main() line 1484 in src/dm/ao/examples/tutorials/ex4.c > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0[cli_0]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 > --------------------- Error Message ------------------------------------ > [2]PETSC ERROR: Null argument, when expecting valid pointer! > [2]PETSC ERROR: Trying to zero at a null pointer! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 > CST 2009 > [2]PETSC ERROR: See docs/changes/index.html for recent updates. > [2]PETSC ERROR: See docs/faq.html[0]0:Return code = 85 > [0]1:Return code = 0, signaled with Interrupt > [0]2:Return code = 0, signaled with Interrupt > [0]3:Return code = 0, signaled with Interrupt > > ///////////////////////////////////////////////////////////////// > > Line (3) gives the following error messages: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 2! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Invalid argument! > [2]PETSC ERROR: Wrong type of object: Parameter # 2! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 > CST 2009 > [2]PETSC ERROR: See docs/changes/index.html for recent updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [2]PETSC ERROR: See docs/index.html for manual pages. > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by christian > Tue Jun 16 13:37:36 2009 > [2]PETSC ERROR: Libraries linked from > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > [2]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 > [2]PETSC ERROR: Configure options --with-cc="gcc -fPIC" --download-mpich=1 > --download-f-blas-lapack --download-scalapack --download-blacs > --downlPetsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by christian > Tue Jun 16 13:37:36 2009 > [0]PETSC ERROR: Libraries linked from > /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 > [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" --download-mpich=1 > --download-f-blas-lapack --download-scalapack --download-blacs > --download-mumps --download-parmetis --download-hypre --download-triangle > --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: MatLUFactorSymbolic() line 2311 in > src/mat/interface/matrix.c > [0]PETSC ERROR: main() line 148oad-mumps --download-parmetis > --download-hypre --download-triangle --with-shared=0 > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: MatLUFactorSymbolic() line 2311 in > src/mat/interface/matrix.c > [2]PETSC ERROR: main() line 1488 in src/dm/ao/examples/tutorials/ex4.c > application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2[cli_2]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2 > 8 in src/dm/ao/examples/tutorials/ex4.c > application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0[cli_0]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 > [0]0:Return code = 62 > [0]1:Return code = 0, signaled with Interrupt > [0]2:Return code = 62 > [0]3:Return code = 0, signaled with Interrupt > > Is the matrix (ie parameter 2) in teh wrong state because it's not a MUMPS > matrix? > > Any help would be greatly appreciated, > Best regards, > Christian Klettner > > > > > The problem is small enough that you might be able to use MUMPS. > > > > Matt > > > > On Fri, Jun 12, 2009 at 9:31 AM, Lisandro Dalcin > > wrote: > > > >> On Fri, Jun 12, 2009 at 11:13 AM, Christian > >> Klettner wrote: > >> > Sorry that I sent this twice. No subject in the first one. > >> > > >> > Dear PETSc Team, > >> > I am writing a CFD finite element code in C. From the discretization > >> of > >> > the governing equations I have to solve a Poisson type equation which > >> is > >> > really killing my performance. Which solver/preconditioner from PETSc > >> or > >> > any external packages would you recommend? The size of my problem is > >> from > >> > ~30000-100000 DOF per core. What kind of performance would I be able > >> to > >> > expect with this solver/preconditioner? > >> > >> I would suggest KSPCG. As preconditioner I would use ML or > >> HYPRE/BoomerAMG (both are external packages) > >> > >> > I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the domain > >> > with Parmetis. The mesh is unstructured. > >> > Also, I am writing a code which studies free surface phenomena so the > >> mesh > >> > is continually changing. Does this matter when choosing a > >> > solver/preconditioner? My left hand side matrix (A in Ax=b) does not > >> > change in time. > >> > >> ML has a faster setup that BoomerAMG, but the convergence is a bit > >> slower. If your A matrix do not change, then likely BoomerAMG will be > >> better for you. In any case, you can try both: just build PETSc with > >> both packages, then you can change the preconditioner by just passing > >> a command line option. 
> >> > >> > > >> > Best regards and thank you in advance, > >> > Christian Klettner > >> > > >> > >> Disclaimer: the convergence of multigrid preconditioners depends a lot > >> on your actual problem. What I've suggested is just my limited > >> experience in a few problems I've run solving electric potentials. > >> > >> > >> -- > >> Lisandro Dalc?n > >> --------------- > >> Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > >> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > >> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > >> PTLC - G?emes 3450, (3000) Santa Fe, Argentina > >> Tel/Fax: +54-(0)342-451.1594 > >> > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. > > -- Norbert Wiener > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 16 07:56:11 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Jun 2009 07:56:11 -0500 Subject: Sorting a vector In-Reply-To: References: Message-ID: On Tue, Jun 16, 2009 at 5:38 AM, tog wrote: > Hi there > > This is a kind of newbie question :) > I have a vector which is the solution of my simulation, I want then to > extract the p indices corresponding to the p biggest values in my vector. Is > there a way to do this with PETSc ? You can get p=1 with VecMax(). > > What if I want to sort my vector to get the indices in ascending order of > their values ? The p=2+ and permutation cases are sophisticated in parallel and not present. In serial, you can use VecGetArray() and PetscSortReal()/SortRealWithPermutation(). Matt > > Thanks > Guillaume > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From guillaume.alleon at gmail.com Tue Jun 16 08:30:52 2009 From: guillaume.alleon at gmail.com (tog) Date: Tue, 16 Jun 2009 21:30:52 +0800 Subject: Sorting a vector In-Reply-To: References: Message-ID: Thanks Matthew, Well I am interested by the parallel case unfortunately ;) Any reference paper that could help me to implement this ? Guillaume On Tue, Jun 16, 2009 at 8:56 PM, Matthew Knepley wrote: > On Tue, Jun 16, 2009 at 5:38 AM, tog wrote: > >> Hi there >> >> This is a kind of newbie question :) >> I have a vector which is the solution of my simulation, I want then to >> extract the p indices corresponding to the p biggest values in my vector. Is >> there a way to do this with PETSc ? > > > You can get p=1 with VecMax(). > > >> >> What if I want to sort my vector to get the indices in ascending order of >> their values ? > > > The p=2+ and permutation cases are sophisticated in parallel and not > present. In serial, you can use > VecGetArray() and PetscSortReal()/SortRealWithPermutation(). > > Matt > > >> >> Thanks >> Guillaume >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > -- PGP KeyID: 1024D/69B00854 subkeys.pgp.net http://cheztog.blogspot.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jun 16 09:51:29 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Jun 2009 09:51:29 -0500 Subject: Sorting a vector In-Reply-To: References: Message-ID: On Tue, Jun 16, 2009 at 8:30 AM, tog wrote: > Thanks Matthew, > > Well I am interested by the parallel case unfortunately ;) > Any reference paper that could help me to implement this ? Maybe you can use STAPL (http://parasol.tamu.edu/stapl/) Matt > > Guillaume > > > On Tue, Jun 16, 2009 at 8:56 PM, Matthew Knepley wrote: > >> On Tue, Jun 16, 2009 at 5:38 AM, tog wrote: >> >>> Hi there >>> >>> This is a kind of newbie question :) >>> I have a vector which is the solution of my simulation, I want then to >>> extract the p indices corresponding to the p biggest values in my vector. Is >>> there a way to do this with PETSc ? >> >> >> You can get p=1 with VecMax(). >> >> >>> >>> What if I want to sort my vector to get the indices in ascending order of >>> their values ? >> >> >> The p=2+ and permutation cases are sophisticated in parallel and not >> present. In serial, you can use >> VecGetArray() and PetscSortReal()/SortRealWithPermutation(). >> >> Matt >> >> >>> >>> Thanks >>> Guillaume >>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > PGP KeyID: 1024D/69B00854 subkeys.pgp.net > > http://cheztog.blogspot.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From sapphire.jxy at gmail.com Tue Jun 16 12:38:27 2009 From: sapphire.jxy at gmail.com (xiaoyin ji) Date: Tue, 16 Jun 2009 13:38:27 -0400 Subject: Can PETSc detect the number of CPUs on each computer node? Message-ID: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> Hi there, I'm using PETSc MATMPIAIJ and ksp solver. It seems that PETSc will run obviously faster if I set the number of CPUs close to the number of computer nodes in the job file. By default MPIAIJ matrix is stored in different processors and ksp solver will communicate for each step, however since on each node several CPUs share the same memory while ksp may still try to communicate through network card, this may mess up a bit. Is there any way to detect which CPUs are sharing the same memory? Thanks a lot. Best, Xiaoyin Ji From knepley at gmail.com Tue Jun 16 12:53:35 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Jun 2009 12:53:35 -0500 Subject: Can PETSc detect the number of CPUs on each computer node? In-Reply-To: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> References: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> Message-ID: On Tue, Jun 16, 2009 at 12:38 PM, xiaoyin ji wrote: > Hi there, > > I'm using PETSc MATMPIAIJ and ksp solver. It seems that PETSc will run > obviously faster if I set the number of CPUs close to the number of > computer nodes in the job file. 
By default MPIAIJ matrix is stored in > different processors and ksp solver will communicate for each step, > however since on each node several CPUs share the same memory while > ksp may still try to communicate through network card, this may mess > up a bit. Is there any way to detect which CPUs are sharing the same > memory? Thanks a lot. The interface for this is mpirun or the job submission mechanism. Matt > > Best, > Xiaoyin Ji > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Jun 16 13:13:16 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 16 Jun 2009 13:13:16 -0500 (CDT) Subject: Can PETSc detect the number of CPUs on each computer node? In-Reply-To: References: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> Message-ID: On Tue, 16 Jun 2009, Matthew Knepley wrote: > On Tue, Jun 16, 2009 at 12:38 PM, xiaoyin ji wrote: > > > Hi there, > > > > I'm using PETSc MATMPIAIJ and ksp solver. It seems that PETSc will run > > obviously faster if I set the number of CPUs close to the number of > > computer nodes in the job file. By default MPIAIJ matrix is stored in > > different processors and ksp solver will communicate for each step, > > however since on each node several CPUs share the same memory while > > ksp may still try to communicate through network card, this may mess > > up a bit. Is there any way to detect which CPUs are sharing the same > > memory? Thanks a lot. > > > The interface for this is mpirun or the job submission mechanism. One additional note: If you are scheduling multiple MPI jobs on the same machine [because its has multiple cores] - the reduced performance you notice could be due to 2 issues: * MPI not communicating optimally between the cores within the same node. For ex: mpich2-1 default install - i.e device=nemesis tries to be efficient for comunication between multiple cores within the node - as well as between nodes. [There could be similar configs for other MPI impls] * Within multi-core machines - the FPUs scale up with the number of cores, but the memory bandwidth does not scale up in the same linear way. Since achieved performance is a function of both - one should not expect linear speedup on multi-core machines. [What matters is the peak performance all the cores can collectively deliver] Satish From peyser.alex at gmail.com Tue Jun 16 13:13:01 2009 From: peyser.alex at gmail.com (Alex Peyser) Date: Tue, 16 Jun 2009 14:13:01 -0400 Subject: Can PETSc detect the number of CPUs on each computer node? In-Reply-To: References: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> Message-ID: <200906161413.09632.peyser.alex@gmail.com> On Tuesday 16 June 2009 01:53:35 pm Matthew Knepley wrote: > On Tue, Jun 16, 2009 at 12:38 PM, xiaoyin ji > > wrote: Hi there, > > I'm using PETSc MATMPIAIJ and ksp solver. It seems that PETSc will run > obviously faster if I set the number of CPUs close to the number of > computer nodes in the job file. By default MPIAIJ matrix is stored in > different processors and ksp solver will communicate for each step, > however since on each node several CPUs share the same memory while > ksp may still try to communicate through network card, this may mess > up a bit. Is there any way to detect which CPUs are sharing the same > memory? Thanks a lot. 
> > The interface for this is mpirun or the job submission mechanism. > > Matt > > > Best, > Xiaoyin Ji > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. -- Norbert Wiener I had a question on what is the best approach for this. Most of the time is spent inside of BLAS, correct? So wouldn't you maximize your operations by running one MPI/PETSC job per board (per shared memory), and use a multi-threaded BLAS that matches your board? You should cut down communications by some factor proportional to the number of threads per board, and the BLAS itself should better optimize most of your operations across the board, rather than relying on higher order parallelisms. Regards, Alex Peyser -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From knepley at gmail.com Tue Jun 16 13:29:14 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Jun 2009 13:29:14 -0500 Subject: Can PETSc detect the number of CPUs on each computer node? In-Reply-To: <200906161413.09632.peyser.alex@gmail.com> References: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> <200906161413.09632.peyser.alex@gmail.com> Message-ID: On Tue, Jun 16, 2009 at 1:13 PM, Alex Peyser wrote: > On Tuesday 16 June 2009 01:53:35 pm Matthew Knepley wrote: > > On Tue, Jun 16, 2009 at 12:38 PM, xiaoyin ji > > > wrote: Hi there, > > > > I'm using PETSc MATMPIAIJ and ksp solver. It seems that PETSc will run > > obviously faster if I set the number of CPUs close to the number of > > computer nodes in the job file. By default MPIAIJ matrix is stored in > > different processors and ksp solver will communicate for each step, > > however since on each node several CPUs share the same memory while > > ksp may still try to communicate through network card, this may mess > > up a bit. Is there any way to detect which CPUs are sharing the same > > memory? Thanks a lot. > > > > The interface for this is mpirun or the job submission mechanism. > > > > Matt > > > > > > Best, > > Xiaoyin Ji > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > their > > experiments lead. -- Norbert Wiener > > I had a question on what is the best approach for this. Most of the time is > spent inside of BLAS, correct? So wouldn't you maximize your operations by > running one MPI/PETSC job per board (per shared memory), and use a > multi-threaded BLAS that matches your board? You should cut down > communications by some factor proportional to the number of threads per > board, and the BLAS itself should better optimize most of your operations > across the board, rather than relying on higher order parallelisms. This is a common misconception. In fact, most time is spent in MatVec or BLAS1, neither of which benefit from MT BLAS. Matt > > Regards, > Alex Peyser > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Tue Jun 16 13:30:42 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 16 Jun 2009 13:30:42 -0500 (CDT) Subject: Can PETSc detect the number of CPUs on each computer node? In-Reply-To: <200906161413.09632.peyser.alex@gmail.com> References: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> <200906161413.09632.peyser.alex@gmail.com> Message-ID: On Tue, 16 Jun 2009, Alex Peyser wrote: > I had a question on what is the best approach for this. Most of the time is > spent inside of BLAS, correct? Not really. PETSc uses a bit of blas1 operations - that should poerhaps account for arround 10-20% of runtime [depending upon application. Check for Vec operations in -log_summary. They are usually blas calls] > So wouldn't you maximize your operations by > running one MPI/PETSC job per board (per shared memory), and use a > multi-threaded BLAS that matches your board? You should cut down > communications by some factor proportional to the number of threads per > board, and the BLAS itself should better optimize most of your operations > across the board, rather than relying on higher order parallelisms. If the issue is memorybandwidth - then it affects threads or processes [MPI] equally. And if the algorithm needs some data sharing - there is cost associated with explicit communication [MPI] vs implicit data-sharing [shared memory] due to cache conflcits and other synchronization thats required.. There could be implementation inefficiencies between threads vs procs, mpi vs openmp that might tilt things in favor of one approach or the other - But I don't think it should be big margin.. Satish From peyser.alex at gmail.com Tue Jun 16 14:23:27 2009 From: peyser.alex at gmail.com (Alex Peyser) Date: Tue, 16 Jun 2009 15:23:27 -0400 Subject: Can PETSc detect the number of CPUs on each computer node? In-Reply-To: References: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> <200906161413.09632.peyser.alex@gmail.com> Message-ID: <200906161523.33176.peyser.alex@gmail.com> On Tuesday 16 June 2009 02:29:14 pm Matthew Knepley wrote: > On Tue, Jun 16, 2009 at 1:13 PM, Alex Peyser wrote: > > On Tuesday 16 June 2009 01:53:35 pm Matthew Knepley wrote: > > > On Tue, Jun 16, 2009 at 12:38 PM, xiaoyin ji > > > > wrote: Hi > > > there, > > > > > > I'm using PETSc MATMPIAIJ and ksp solver. It seems that PETSc will run > > > obviously faster if I set the number of CPUs close to the number of > > > computer nodes in the job file. By default MPIAIJ matrix is stored in > > > different processors and ksp solver will communicate for each step, > > > however since on each node several CPUs share the same memory while > > > ksp may still try to communicate through network card, this may mess > > > up a bit. Is there any way to detect which CPUs are sharing the same > > > memory? Thanks a lot. > > > > > > The interface for this is mpirun or the job submission mechanism. > > > > > > Matt > > > > > > > > > Best, > > > Xiaoyin Ji > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to which > > > > their > > > > > experiments lead. -- Norbert Wiener > > > > I had a question on what is the best approach for this. Most of the time > > is spent inside of BLAS, correct? So wouldn't you maximize your > > operations by running one MPI/PETSC job per board (per shared memory), > > and use a multi-threaded BLAS that matches your board? 
You should cut > > down communications by some factor proportional to the number of threads > > per board, and the BLAS itself should better optimize most of your > > operations across the board, rather than relying on higher order > > parallelisms. > > This is a common misconception. In fact, most time is spent in MatVec or > BLAS1, neither of which benefit from MT BLAS. > > Matt > > > Regards, > > Alex Peyser Interesting. At least my misconception is common. That makes things tricky with ATLAS, since the number of threads is a compile-time constant. I can't imagine it would be a good idea to have an 8x BLAS running 8xs simultaneously -- unless the mpi jobs were all unsynchronized. It may be only 10-20% of the time, but that's still a large overlap of conflicting threads degrading performance. I'll have to do some benchmarks. Is the 10-20% number still true for fairly dense matrices? Ah, another layer of administration-code may now be required to properly allocate jobs. Alex -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From balay at mcs.anl.gov Tue Jun 16 14:29:47 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 16 Jun 2009 14:29:47 -0500 (CDT) Subject: Can PETSc detect the number of CPUs on each computer node? In-Reply-To: <200906161523.33176.peyser.alex@gmail.com> References: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> <200906161413.09632.peyser.alex@gmail.com> <200906161523.33176.peyser.alex@gmail.com> Message-ID: On Tue, 16 Jun 2009, Alex Peyser wrote: > On Tuesday 16 June 2009 02:29:14 pm Matthew Knepley wrote: > > > > This is a common misconception. In fact, most time is spent in MatVec or > > BLAS1, neither of which benefit from MT BLAS. > Interesting. At least my misconception is common. > That makes things tricky with ATLAS, since the number of threads is a > compile-time constant. I can't imagine it would be a good idea to have an 8x > BLAS running 8xs simultaneously -- unless the mpi jobs were all > unsynchronized. It may be only 10-20% of the time, but that's still a large > overlap of conflicting threads degrading performance. > > I'll have to do some benchmarks. Is the 10-20% number still true for fairly > dense matrices? Its just a number I pulled out of a hat [for sparse matrix solves]. -log_summary would be the correct thing for a given application. If using MATDENSE - a much higher percentage of time will be in blas. Satish > > Ah, another layer of administration-code may now be required to properly > allocate jobs. > > Alex > From peyser.alex at gmail.com Tue Jun 16 14:51:17 2009 From: peyser.alex at gmail.com (Alex Peyser) Date: Tue, 16 Jun 2009 15:51:17 -0400 Subject: Can PETSc detect the number of CPUs on each computer node? In-Reply-To: References: <6985a8f00906161038m67abc966s61a552d4f68a0a1f@mail.gmail.com> <200906161523.33176.peyser.alex@gmail.com> Message-ID: <200906161551.21799.peyser.alex@gmail.com> On Tuesday 16 June 2009 03:29:47 pm Satish Balay wrote: > On Tue, 16 Jun 2009, Alex Peyser wrote: > > On Tuesday 16 June 2009 02:29:14 pm Matthew Knepley wrote: > > > This is a common misconception. In fact, most time is spent in MatVec > > > or BLAS1, neither of which benefit from MT BLAS. > > > > Interesting. At least my misconception is common. > > That makes things tricky with ATLAS, since the number of threads is a > > compile-time constant. 
I can't imagine it would be a good idea to have an > > 8x BLAS running 8xs simultaneously -- unless the mpi jobs were all > > unsynchronized. It may be only 10-20% of the time, but that's still a > > large overlap of conflicting threads degrading performance. > > > > I'll have to do some benchmarks. Is the 10-20% number still true for > > fairly dense matrices? > > Its just a number I pulled out of a hat [for sparse matrix > solves]. -log_summary would be the correct thing for a given > application. > > If using MATDENSE - a much higher percentage of time will be in blas. > > Satish > > > Ah, another layer of administration-code may now be required to properly > > allocate jobs. > > > > Alex Unfortunately, I'm writing a language. I don't know the application apriori. I have my current application -- which I can benchmark and is dense. But I would hate to predetermine the situation for future problems. Recompiling BLAS (and everything depending on it) on a per-use basis isn't a terribly attractive solution -- but that's a future problem. Alex -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From bsmith at mcs.anl.gov Tue Jun 16 19:11:14 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 16 Jun 2009 19:11:14 -0500 Subject: Matrix Dense for CG In-Reply-To: <5f76eef60906131031gabc9e5ft2e9357c2bac803de@mail.gmail.com> References: <5f76eef60906131031gabc9e5ft2e9357c2bac803de@mail.gmail.com> Message-ID: <44AD3560-8A87-4751-92DF-C4BF156FAA14@mcs.anl.gov> Are you asking for sample dense matrices that CG has difficulty with or are you asking for an example code that shows how to use CG on a dense matrix in PETSc. If the later, it is the same as the other examples, KSPCreate(), KSPSetType(ksp,KSPCG). KSPSetOperators() the only difference is using MatCreateSeqDense() or MatCreateMPIDense(). Barry On Jun 13, 2009, at 12:31 PM, Santolo Felaco wrote: > Hi, > someone has some examples where the CG is used with a dense matrix? > > I need a system difficult to solve with the CG > > Thank you. > > S.F. > From yannpaul at bu.edu Wed Jun 17 13:51:03 2009 From: yannpaul at bu.edu (Yann Tambouret) Date: Wed, 17 Jun 2009 14:51:03 -0400 Subject: how big can I go Message-ID: <4A393B17.8070502@bu.edu> Hi, I'm new to PETSc and I'm try to solve a linear system that has nine terms per equation, along the diagonal. I'd like to use a bluegene machine to solve for a large number of unknowns (10's of millions). Does anyone have advice for such problem? Which algorithms scale best on such a machine? I can of course provide more detail, so please don't hesitate to ask. Thanks, Yann From knepley at gmail.com Wed Jun 17 14:16:14 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 Jun 2009 14:16:14 -0500 Subject: how big can I go In-Reply-To: <4A393B17.8070502@bu.edu> References: <4A393B17.8070502@bu.edu> Message-ID: On Wed, Jun 17, 2009 at 1:51 PM, Yann Tambouret wrote: > Hi, > > I'm new to PETSc and I'm try to solve a linear system that has nine terms > per equation, along the diagonal. I'd like to use a bluegene machine to > solve for a large number of unknowns (10's of millions). Does anyone have > advice for such problem? Which algorithms scale best on such a machine? We have run problems on BG/P with hundreds of millions of unknowns. However, the algorithm/solver is always highly system dependent. 
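For a first pass it is usually enough to hand the assembled system to a KSP and leave every
choice of method to the options database -- the sketch below is only illustrative and assumes
A and b have already been created as parallel (MPIAIJ) objects:

  KSP ksp;
  Vec x;
  PetscErrorCode ierr;

  ierr = VecDuplicate(b,&x);CHKERRQ(ierr);
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* -ksp_type, -pc_type, etc. picked at run time */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);

Then -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg, -pc_type ml, or plain
-pc_type bjacobi can be compared without recompiling, and -ksp_monitor together with
-log_summary shows what each choice costs.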
The right idea is to run lots of small examples, preferably on your laptop to understand the system you want to solve. Then scale up in stages. At each stage, something new usually becomes important, and you handle that aspect. The right thing to look at is the output of -log_summary and the iteration counts. Matt > > I can of course provide more detail, so please don't hesitate to ask. > > Thanks, > > Yann > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From yannpaul at bu.edu Wed Jun 17 14:53:41 2009 From: yannpaul at bu.edu (Yann Tambouret) Date: Wed, 17 Jun 2009 15:53:41 -0400 Subject: memory problem with superlu_dist Message-ID: <4A3949C5.4070109@bu.edu> Hi, While trying to use superlu_dist through the PETSc interface, I received the following error: Malloc fails for C[]. at line 614 in file pdgssvx.c This was for a system with ~16 million unknowns on a Bluegene machine run with 512 processors. Has anyone seen this error before? Thanks, Yann From knepley at gmail.com Wed Jun 17 15:09:13 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 Jun 2009 15:09:13 -0500 Subject: memory problem with superlu_dist In-Reply-To: <4A3949C5.4070109@bu.edu> References: <4A3949C5.4070109@bu.edu> Message-ID: It seems clear that MUMPS ran out of memory for the factor. BG is a very memory-light machine. Matt On Wed, Jun 17, 2009 at 2:53 PM, Yann Tambouret wrote: > Hi, > > While trying to use superlu_dist through the PETSc interface, I received > the following error: > > Malloc fails for C[]. at line 614 in file pdgssvx.c > > This was for a system with ~16 million unknowns on a Bluegene machine run > with 512 processors. Has anyone seen this error before? > > Thanks, > > Yann > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Jun 17 15:35:09 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Wed, 17 Jun 2009 15:35:09 -0500 (CDT) Subject: memory problem with superlu_dist In-Reply-To: References: <4A3949C5.4070109@bu.edu> Message-ID: This error is anticipated ;-( Direct solvers require very large memory for numerical factorization. You may consult superlu or mumps developers for feasibility of your application. Hong On Wed, 17 Jun 2009, Matthew Knepley wrote: > It seems clear that MUMPS ran out of memory for the factor. BG is a very > memory-light > machine. > > Matt > > On Wed, Jun 17, 2009 at 2:53 PM, Yann Tambouret wrote: > >> Hi, >> >> While trying to use superlu_dist through the PETSc interface, I received >> the following error: >> >> Malloc fails for C[]. at line 614 in file pdgssvx.c >> >> This was for a system with ~16 million unknowns on a Bluegene machine run >> with 512 processors. Has anyone seen this error before? >> >> Thanks, >> >> Yann >> > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. 
> -- Norbert Wiener > From s.kramer at imperial.ac.uk Thu Jun 18 07:22:06 2009 From: s.kramer at imperial.ac.uk (Stephan Kramer) Date: Thu, 18 Jun 2009 13:22:06 +0100 Subject: Mismatch in explicit fortran interface for MatGetInfo In-Reply-To: References: <4A18016F.6030805@imperial.ac.uk> <0A67546F-4327-4265-B94D-B889B94644E5@mcs.anl.gov> <4A212A19.3090404@imperial.ac.uk> <4A215AB2.2010900@imperial.ac.uk> Message-ID: <4A3A316E.6020206@imperial.ac.uk> Hi Satish, I tried with p6 and it indeed works fine now. Thanks a lot for looking into this. A related question: it seems still quite a lot of very common interfaces (PETScInitialize, KSPSettype, MatView, etc.) are missing. Are there any plans of adding those in the future? Cheers Stephan Satish Balay wrote: > On Sat, 30 May 2009, Stephan Kramer wrote: > >> Satish Balay wrote: >>> On Sat, 30 May 2009, Stephan Kramer wrote: >>> >>>> Thanks a lot for looking into this. The explicit fortran interfaces are in >>>> general very useful. The problem occurred for me with petsc-3.0.0-p1. I'm >>>> happy to try it out with a more recent patch-level or with petsc-dev. >>> Did you configure with '--with-fortran-interfaces=1' or are you >>> directly using '#include "finclude/ftn-auto/petscmat.h90"'? >>> >> Configured with '--with-fortran-interfaces=1', yes, and then using them via >> the fortran modules: "use petscksp", "use petscmat", etc. > > ok. --with-fortran-interfaces was broken in p0, worked in p1,p2,p3,p4 > - broken in curent p5. The next patch update - p6 will have the fix > for this issue [along with the fix for MatGetInfo() interface] > > Satish > From christian.klettner at ucl.ac.uk Thu Jun 18 10:48:32 2009 From: christian.klettner at ucl.ac.uk (Christian Klettner) Date: Thu, 18 Jun 2009 16:48:32 +0100 (BST) Subject: What is the best solver a poisson type eqn. In-Reply-To: References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> <55020.128.40.55.186.1245156021.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: <53469.128.40.55.186.1245340112.squirrel@www.squirrelmail.ucl.ac.uk> Hi Matt, Thank you for the reply. I have now gotten MUMPS to work with the following implementation: MatFactorInfoInitialize(&info); MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&B); MatLUFactorSymbolic(B,A, rows_ck_is, cols_ck_is,&info); MatLUFactorNumeric(B,A,&info); for(){ ///temporal loop /*coding*/ ierr=MatSolve(A,vecr,vecu);CHKERRQ(ierr); } I explicitly use MUMPS to factor the matrix. However, from my implementation I do not see what package is being used to solve Ax=b. Could I be using PETSc to solve the system?? Is there any way of knowing if MUMPS is actually being used to solve the system? I am not passing in any arguments to use MUMPS explicitly from the command line, have not put in any MUMPS header files and I do not use any other MUMPS functions (e.g. MatCreate_AIJMUMPS()). Thank you for any advice, Best regards, Christian Klettner > On Tue, Jun 16, 2009 at 7:40 AM, Christian Klettner < > christian.klettner at ucl.ac.uk> wrote: > >> Hi Matt, >> I have tried to use MUMPS and came across the following problems. My >> application solves Ax=b about 10000-100000 times with A remaining >> constant. >> When I tried to use it through KSP I was not finding good performance. >> Could this be because it was refactoring etc. at each time step? >> With this in mind I have tried to implement the following: > > > 1) I have no idea what you would mean by good performance. Always, always, > always > send the -log_summary. 
> > 2) A matrix is never refactored during the KSP iteration > > >> A is a parallel matrix created with MatCreateMPIAIJ(). >> rows is a list of the global row numbers on the process.cols is a list >> of >> the global columns that are on the process. >> I run the program with ./mpiexec -n 4 ./ex4 >> -pc_factor_mat_solver_package >> mumps >> >> (1) MatFactorInfo *info; >> (2) ierr=MatFactorInfoInitialize(info);CHKERRQ(ierr); > > > You have passed a meaningless pointer here. > > (1) MatFactorInfo info; > (2) ierr=MatFactorInfoInitialize(&info);CHKERRQ(ierr); > > Matt > > >> (3) >> ierr=MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&F);CHKERRQ(ierr); >> (4) ierr=MatLUFactorSymbolic(F, A, rows, cols,info);CHKERRQ(ierr); >> (5) MatLUFactorNumeric(F, A,info);CHKERRQ(ierr); >> >> >> for(){ ///TEMPORAL LOOP >> >> /*CODING*/ >> >> ierr=MatSolve(A,vecr,vecu);CHKERRQ(ierr); >> } >> >> I get the following error messages: >> >> Line (2) above gives the following error message: >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [2]PETSC ERROR: [0]PETSC ERROR: Null argument, when expecting valid >> pointer! >> [0]PETSC ERROR: Trying to zero at a null pointer! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 >> 14:46:08 >> CST 2009 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by >> christian >> Tue Jun 16 13:33:33 2009 >> [0]PETSC ERROR: Libraries linked from >> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib >> [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 >> [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" >> --download-mpich=1 >> --download-f-blas-lapack --download-scalapack --download-blacs >> --download-mumps --download-parmetis --download-hypre >> --download-triangle >> --with-shared=0 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: PetscMemzero() line 189 in src/sys/utils/memc.c >> [0]PETSC ERROR: MatFactorInfoInitialize() line 7123 in >> src/mat/interface/matrix.c >> [0]PETSC ERROR: main() line 1484 in src/dm/ao/examples/tutorials/ex4.c >> application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0[cli_0]: >> aborting job: >> application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 >> --------------------- Error Message ------------------------------------ >> [2]PETSC ERROR: Null argument, when expecting valid pointer! >> [2]PETSC ERROR: Trying to zero at a null pointer! >> [2]PETSC ERROR: >> ------------------------------------------------------------------------ >> [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 >> 14:46:08 >> CST 2009 >> [2]PETSC ERROR: See docs/changes/index.html for recent updates. 
>> [2]PETSC ERROR: See docs/faq.html[0]0:Return code = 85 >> [0]1:Return code = 0, signaled with Interrupt >> [0]2:Return code = 0, signaled with Interrupt >> [0]3:Return code = 0, signaled with Interrupt >> >> ///////////////////////////////////////////////////////////////// >> >> Line (3) gives the following error messages: >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Invalid argument! >> [0]PETSC ERROR: Wrong type of object: Parameter # 2! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [2]PETSC ERROR: Invalid argument! >> [2]PETSC ERROR: Wrong type of object: Parameter # 2! >> [2]PETSC ERROR: >> ------------------------------------------------------------------------ >> [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 >> 14:46:08 >> CST 2009 >> [2]PETSC ERROR: See docs/changes/index.html for recent updates. >> [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [2]PETSC ERROR: See docs/index.html for manual pages. >> [2]PETSC ERROR: >> ------------------------------------------------------------------------ >> [2]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by >> christian >> Tue Jun 16 13:37:36 2009 >> [2]PETSC ERROR: Libraries linked from >> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib >> [2]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 >> [2]PETSC ERROR: Configure options --with-cc="gcc -fPIC" >> --download-mpich=1 >> --download-f-blas-lapack --download-scalapack --download-blacs >> --downlPetsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST >> 2009 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by >> christian >> Tue Jun 16 13:37:36 2009 >> [0]PETSC ERROR: Libraries linked from >> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib >> [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 >> [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" >> --download-mpich=1 >> --download-f-blas-lapack --download-scalapack --download-blacs >> --download-mumps --download-parmetis --download-hypre >> --download-triangle >> --with-shared=0 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: MatLUFactorSymbolic() line 2311 in >> src/mat/interface/matrix.c >> [0]PETSC ERROR: main() line 148oad-mumps --download-parmetis >> --download-hypre --download-triangle --with-shared=0 >> [2]PETSC ERROR: >> ------------------------------------------------------------------------ >> [2]PETSC ERROR: MatLUFactorSymbolic() line 2311 in >> src/mat/interface/matrix.c >> [2]PETSC ERROR: main() line 1488 in src/dm/ao/examples/tutorials/ex4.c >> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2[cli_2]: >> aborting job: >> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2 >> 8 in src/dm/ao/examples/tutorials/ex4.c >> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0[cli_0]: >> aborting job: >> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 >> [0]0:Return code = 62 >> [0]1:Return code = 0, signaled with Interrupt >> [0]2:Return code = 62 >> [0]3:Return code = 0, signaled with Interrupt >> >> Is the matrix (ie parameter 2) in teh wrong state because it's not a >> MUMPS >> matrix? >> >> Any help would be greatly appreciated, >> Best regards, >> Christian Klettner >> >> >> >> > The problem is small enough that you might be able to use MUMPS. >> > >> > Matt >> > >> > On Fri, Jun 12, 2009 at 9:31 AM, Lisandro Dalcin >> > wrote: >> > >> >> On Fri, Jun 12, 2009 at 11:13 AM, Christian >> >> Klettner wrote: >> >> > Sorry that I sent this twice. No subject in the first one. >> >> > >> >> > Dear PETSc Team, >> >> > I am writing a CFD finite element code in C. From the >> discretization >> >> of >> >> > the governing equations I have to solve a Poisson type equation >> which >> >> is >> >> > really killing my performance. Which solver/preconditioner from >> PETSc >> >> or >> >> > any external packages would you recommend? The size of my problem >> is >> >> from >> >> > ~30000-100000 DOF per core. What kind of performance would I be >> able >> >> to >> >> > expect with this solver/preconditioner? >> >> >> >> I would suggest KSPCG. As preconditioner I would use ML or >> >> HYPRE/BoomerAMG (both are external packages) >> >> >> >> > I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the >> domain >> >> > with Parmetis. The mesh is unstructured. >> >> > Also, I am writing a code which studies free surface phenomena so >> the >> >> mesh >> >> > is continually changing. Does this matter when choosing a >> >> > solver/preconditioner? My left hand side matrix (A in Ax=b) does >> not >> >> > change in time. >> >> >> >> ML has a faster setup that BoomerAMG, but the convergence is a bit >> >> slower. If your A matrix do not change, then likely BoomerAMG will be >> >> better for you. In any case, you can try both: just build PETSc with >> >> both packages, then you can change the preconditioner by just passing >> >> a command line option. 
>> >> >> >> > >> >> > Best regards and thank you in advance, >> >> > Christian Klettner >> >> > >> >> >> >> Disclaimer: the convergence of multigrid preconditioners depends a >> lot >> >> on your actual problem. What I've suggested is just my limited >> >> experience in a few problems I've run solving electric potentials. >> >> >> >> >> >> -- >> >> Lisandro Dalc?n >> >> --------------- >> >> Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) >> >> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) >> >> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) >> >> PTLC - G?emes 3450, (3000) Santa Fe, Argentina >> >> Tel/Fax: +54-(0)342-451.1594 >> >> >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments >> > is infinitely more interesting than any results to which their >> experiments >> > lead. >> > -- Norbert Wiener >> > >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > From bsmith at mcs.anl.gov Thu Jun 18 10:56:41 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 18 Jun 2009 10:56:41 -0500 Subject: What is the best solver a poisson type eqn. In-Reply-To: <53469.128.40.55.186.1245340112.squirrel@www.squirrelmail.ucl.ac.uk> References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> <55020.128.40.55.186.1245156021.squirrel@www.squirrelmail.ucl.ac.uk> <53469.128.40.55.186.1245340112.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: <8D7DDF0B-6044-48FE-8067-E45B65E83614@mcs.anl.gov> On Jun 18, 2009, at 10:48 AM, Christian Klettner wrote: > Hi Matt, > Thank you for the reply. I have now gotten MUMPS to work with the > following implementation: > > MatFactorInfoInitialize(&info); > MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&B); > MatLUFactorSymbolic(B,A, rows_ck_is, cols_ck_is,&info); > MatLUFactorNumeric(B,A,&info); > > for(){ ///temporal loop > /*coding*/ > ierr=MatSolve(A,vecr,vecu);CHKERRQ(ierr); > } > > I explicitly use MUMPS to factor the matrix. However, from my > implementation I do not see what package is being used to solve Ax=b. > Could I be using PETSc to solve the system?? Is there any way of > knowing > if MUMPS is actually being used to solve the system? It is using MUMPS. Each external factorization package uses its own data structures and solvers. The PETSc triangular solvers only work for the PETSc data structures and cannot work with the MUMPS factorization. Barry > I am not passing in any arguments to use MUMPS explicitly from the > command > line, have > not put in any MUMPS header files and I do not use any other MUMPS > functions (e.g. MatCreate_AIJMUMPS()). ^^^^^^^^^^^^^^^^^^^^^ This stuff no longer exists since PETSc 3.0.0 > Thank you for any advice, > Best regards, > Christian Klettner > > >> On Tue, Jun 16, 2009 at 7:40 AM, Christian Klettner < >> christian.klettner at ucl.ac.uk> wrote: >> >>> Hi Matt, >>> I have tried to use MUMPS and came across the following problems. My >>> application solves Ax=b about 10000-100000 times with A remaining >>> constant. >>> When I tried to use it through KSP I was not finding good >>> performance. >>> Could this be because it was refactoring etc. at each time step? >>> With this in mind I have tried to implement the following: >> >> >> 1) I have no idea what you would mean by good performance. 
Always, >> always, >> always >> send the -log_summary. >> >> 2) A matrix is never refactored during the KSP iteration >> >> >>> A is a parallel matrix created with MatCreateMPIAIJ(). >>> rows is a list of the global row numbers on the process.cols is a >>> list >>> of >>> the global columns that are on the process. >>> I run the program with ./mpiexec -n 4 ./ex4 >>> -pc_factor_mat_solver_package >>> mumps >>> >>> (1) MatFactorInfo *info; >>> (2) ierr=MatFactorInfoInitialize(info);CHKERRQ(ierr); >> >> >> You have passed a meaningless pointer here. >> >> (1) MatFactorInfo info; >> (2) ierr=MatFactorInfoInitialize(&info);CHKERRQ(ierr); >> >> Matt >> >> >>> (3) >>> ierr >>> =MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&F);CHKERRQ(ierr); >>> (4) ierr=MatLUFactorSymbolic(F, A, rows, >>> cols,info);CHKERRQ(ierr); >>> (5) MatLUFactorNumeric(F, A,info);CHKERRQ(ierr); >>> >>> >>> for(){ ///TEMPORAL LOOP >>> >>> /*CODING*/ >>> >>> ierr=MatSolve(A,vecr,vecu);CHKERRQ(ierr); >>> } >>> >>> I get the following error messages: >>> >>> Line (2) above gives the following error message: >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [2]PETSC ERROR: [0]PETSC ERROR: Null argument, when expecting valid >>> pointer! >>> [0]PETSC ERROR: Trying to zero at a null pointer! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 >>> 14:46:08 >>> CST 2009 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by >>> christian >>> Tue Jun 16 13:33:33 2009 >>> [0]PETSC ERROR: Libraries linked from >>> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib >>> [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 >>> [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" >>> --download-mpich=1 >>> --download-f-blas-lapack --download-scalapack --download-blacs >>> --download-mumps --download-parmetis --download-hypre >>> --download-triangle >>> --with-shared=0 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: PetscMemzero() line 189 in src/sys/utils/memc.c >>> [0]PETSC ERROR: MatFactorInfoInitialize() line 7123 in >>> src/mat/interface/matrix.c >>> [0]PETSC ERROR: main() line 1484 in src/dm/ao/examples/tutorials/ >>> ex4.c >>> application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0[cli_0]: >>> aborting job: >>> application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 >>> --------------------- Error Message >>> ------------------------------------ >>> [2]PETSC ERROR: Null argument, when expecting valid pointer! >>> [2]PETSC ERROR: Trying to zero at a null pointer! >>> [2]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 >>> 14:46:08 >>> CST 2009 >>> [2]PETSC ERROR: See docs/changes/index.html for recent updates. 
>>> [2]PETSC ERROR: See docs/faq.html[0]0:Return code = 85 >>> [0]1:Return code = 0, signaled with Interrupt >>> [0]2:Return code = 0, signaled with Interrupt >>> [0]3:Return code = 0, signaled with Interrupt >>> >>> ///////////////////////////////////////////////////////////////// >>> >>> Line (3) gives the following error messages: >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Invalid argument! >>> [0]PETSC ERROR: Wrong type of object: Parameter # 2! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [2]PETSC ERROR: Invalid argument! >>> [2]PETSC ERROR: Wrong type of object: Parameter # 2! >>> [2]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 >>> 14:46:08 >>> CST 2009 >>> [2]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [2]PETSC ERROR: See docs/index.html for manual pages. >>> [2]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [2]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by >>> christian >>> Tue Jun 16 13:37:36 2009 >>> [2]PETSC ERROR: Libraries linked from >>> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib >>> [2]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 >>> [2]PETSC ERROR: Configure options --with-cc="gcc -fPIC" >>> --download-mpich=1 >>> --download-f-blas-lapack --download-scalapack --download-blacs >>> --downlPetsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST >>> 2009 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by >>> christian >>> Tue Jun 16 13:37:36 2009 >>> [0]PETSC ERROR: Libraries linked from >>> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib >>> [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 >>> [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" >>> --download-mpich=1 >>> --download-f-blas-lapack --download-scalapack --download-blacs >>> --download-mumps --download-parmetis --download-hypre >>> --download-triangle >>> --with-shared=0 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: MatLUFactorSymbolic() line 2311 in >>> src/mat/interface/matrix.c >>> [0]PETSC ERROR: main() line 148oad-mumps --download-parmetis >>> --download-hypre --download-triangle --with-shared=0 >>> [2]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [2]PETSC ERROR: MatLUFactorSymbolic() line 2311 in >>> src/mat/interface/matrix.c >>> [2]PETSC ERROR: main() line 1488 in src/dm/ao/examples/tutorials/ >>> ex4.c >>> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2[cli_2]: >>> aborting job: >>> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2 >>> 8 in src/dm/ao/examples/tutorials/ex4.c >>> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0[cli_0]: >>> aborting job: >>> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 >>> [0]0:Return code = 62 >>> [0]1:Return code = 0, signaled with Interrupt >>> [0]2:Return code = 62 >>> [0]3:Return code = 0, signaled with Interrupt >>> >>> Is the matrix (ie parameter 2) in teh wrong state because it's not a >>> MUMPS >>> matrix? >>> >>> Any help would be greatly appreciated, >>> Best regards, >>> Christian Klettner >>> >>> >>> >>>> The problem is small enough that you might be able to use MUMPS. >>>> >>>> Matt >>>> >>>> On Fri, Jun 12, 2009 at 9:31 AM, Lisandro Dalcin >>>> >>>> wrote: >>>> >>>>> On Fri, Jun 12, 2009 at 11:13 AM, Christian >>>>> Klettner wrote: >>>>>> Sorry that I sent this twice. No subject in the first one. >>>>>> >>>>>> Dear PETSc Team, >>>>>> I am writing a CFD finite element code in C. From the >>> discretization >>>>> of >>>>>> the governing equations I have to solve a Poisson type equation >>> which >>>>> is >>>>>> really killing my performance. Which solver/preconditioner from >>> PETSc >>>>> or >>>>>> any external packages would you recommend? The size of my problem >>> is >>>>> from >>>>>> ~30000-100000 DOF per core. What kind of performance would I be >>> able >>>>> to >>>>>> expect with this solver/preconditioner? >>>>> >>>>> I would suggest KSPCG. As preconditioner I would use ML or >>>>> HYPRE/BoomerAMG (both are external packages) >>>>> >>>>>> I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the >>> domain >>>>>> with Parmetis. The mesh is unstructured. >>>>>> Also, I am writing a code which studies free surface phenomena so >>> the >>>>> mesh >>>>>> is continually changing. Does this matter when choosing a >>>>>> solver/preconditioner? My left hand side matrix (A in Ax=b) does >>> not >>>>>> change in time. >>>>> >>>>> ML has a faster setup that BoomerAMG, but the convergence is a bit >>>>> slower. If your A matrix do not change, then likely BoomerAMG >>>>> will be >>>>> better for you. 
In any case, you can try both: just build PETSc >>>>> with >>>>> both packages, then you can change the preconditioner by just >>>>> passing >>>>> a command line option. >>>>> >>>>>> >>>>>> Best regards and thank you in advance, >>>>>> Christian Klettner >>>>>> >>>>> >>>>> Disclaimer: the convergence of multigrid preconditioners depends a >>> lot >>>>> on your actual problem. What I've suggested is just my limited >>>>> experience in a few problems I've run solving electric potentials. >>>>> >>>>> >>>>> -- >>>>> Lisandro Dalc?n >>>>> --------------- >>>>> Centro Internacional de M?todos Computacionales en Ingenier?a >>>>> (CIMEC) >>>>> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica >>>>> (INTEC) >>>>> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas >>>>> (CONICET) >>>>> PTLC - G?emes 3450, (3000) Santa Fe, Argentina >>>>> Tel/Fax: +54-(0)342-451.1594 >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their >>> experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their >> experiments >> lead. >> -- Norbert Wiener >> > > From knepley at gmail.com Thu Jun 18 10:59:15 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Jun 2009 10:59:15 -0500 Subject: What is the best solver a poisson type eqn. In-Reply-To: <53469.128.40.55.186.1245340112.squirrel@www.squirrelmail.ucl.ac.uk> References: <43338.128.40.55.186.1244816039.squirrel@www.squirrelmail.ucl.ac.uk> <55020.128.40.55.186.1245156021.squirrel@www.squirrelmail.ucl.ac.uk> <53469.128.40.55.186.1245340112.squirrel@www.squirrelmail.ucl.ac.uk> Message-ID: On Thu, Jun 18, 2009 at 10:48 AM, Christian Klettner < christian.klettner at ucl.ac.uk> wrote: > Hi Matt, > Thank you for the reply. I have now gotten MUMPS to work with the > following implementation: > > MatFactorInfoInitialize(&info); > MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&B); > MatLUFactorSymbolic(B,A, rows_ck_is, cols_ck_is,&info); > MatLUFactorNumeric(B,A,&info); > > for(){ ///temporal loop > /*coding*/ > ierr=MatSolve(A,vecr,vecu);CHKERRQ(ierr); > } > > I explicitly use MUMPS to factor the matrix. However, from my > implementation I do not see what package is being used to solve Ax=b. > Could I be using PETSc to solve the system?? Is there any way of knowing > if MUMPS is actually being used to solve the system? > I am not passing in any arguments to use MUMPS explicitly from the command > line, have > not put in any MUMPS header files and I do not use any other MUMPS > functions (e.g. MatCreate_AIJMUMPS()). Yes, it is determined from the type. However, I think this is completely the wrong way to program. Why hardcode yourself into a corner? It is hard to change packages, to wrap this in a Krylov solver, compare with other PCs. This is crazy. 
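(One detail in the code quoted above: once the factorization has been put into B, the
solves also have to be done with B -- a sketch only, reusing the names from that message;
step and nsteps stand in for whatever loop counters the application already has:

  ierr = MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&B);CHKERRQ(ierr);
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(B,A,rows_ck_is,cols_ck_is,&info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(B,A,&info);CHKERRQ(ierr);
  for (step = 0; step < nsteps; step++) {          /* temporal loop */
     /* ... build the new right-hand side vecr ... */
     ierr = MatSolve(B,vecr,vecu);CHKERRQ(ierr);   /* solve with the factor B, not with A */
  }

MatSolve() expects the factored matrix, so passing the original A there will not work.)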
I recommend KSPCreate(comm, &ksp); KSPSetOperators(ksp, A, A, ...); KSPSetFromOptions(ksp); KSPSolve(ksp, b, x); Now with options you can control everything -ksp_type preonly -pc_type lu -mat *-pc_factor_mat_solver_package mumps* will reproduce what you have, but you can test it against -ksp_type gmres -pc_type asm -sub_pc_type ilu To see what is happening you can always use -ksp_view Matt > Thank you for any advice, > Best regards, > Christian Klettner > > > > On Tue, Jun 16, 2009 at 7:40 AM, Christian Klettner < > > christian.klettner at ucl.ac.uk> wrote: > > > >> Hi Matt, > >> I have tried to use MUMPS and came across the following problems. My > >> application solves Ax=b about 10000-100000 times with A remaining > >> constant. > >> When I tried to use it through KSP I was not finding good performance. > >> Could this be because it was refactoring etc. at each time step? > >> With this in mind I have tried to implement the following: > > > > > > 1) I have no idea what you would mean by good performance. Always, > always, > > always > > send the -log_summary. > > > > 2) A matrix is never refactored during the KSP iteration > > > > > >> A is a parallel matrix created with MatCreateMPIAIJ(). > >> rows is a list of the global row numbers on the process.cols is a list > >> of > >> the global columns that are on the process. > >> I run the program with ./mpiexec -n 4 ./ex4 > >> -pc_factor_mat_solver_package > >> mumps > >> > >> (1) MatFactorInfo *info; > >> (2) ierr=MatFactorInfoInitialize(info);CHKERRQ(ierr); > > > > > > You have passed a meaningless pointer here. > > > > (1) MatFactorInfo info; > > (2) ierr=MatFactorInfoInitialize(&info);CHKERRQ(ierr); > > > > Matt > > > > > >> (3) > >> ierr=MatGetFactor(A,MAT_SOLVER_MUMPS,MAT_FACTOR_LU,&F);CHKERRQ(ierr); > >> (4) ierr=MatLUFactorSymbolic(F, A, rows, cols,info);CHKERRQ(ierr); > >> (5) MatLUFactorNumeric(F, A,info);CHKERRQ(ierr); > >> > >> > >> for(){ ///TEMPORAL LOOP > >> > >> /*CODING*/ > >> > >> ierr=MatSolve(A,vecr,vecu);CHKERRQ(ierr); > >> } > >> > >> I get the following error messages: > >> > >> Line (2) above gives the following error message: > >> > >> [0]PETSC ERROR: --------------------- Error Message > >> ------------------------------------ > >> [2]PETSC ERROR: [0]PETSC ERROR: Null argument, when expecting valid > >> pointer! > >> [0]PETSC ERROR: Trying to zero at a null pointer! > >> [0]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 > >> 14:46:08 > >> CST 2009 > >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. > >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > >> [0]PETSC ERROR: See docs/index.html for manual pages. 
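[As a concrete illustration of the KSP-driven setup recommended above, here is a
minimal C sketch, not a drop-in replacement for the code in this thread: it
assumes a Mat A and Vecs b, x that are already created and assembled, uses a
placeholder loop count nsteps, and follows the petsc-3.0.0 calling conventions
discussed here.

#include "petscksp.h"

/* Sketch only: assumes Mat A and Vec b, x already exist and are assembled. */
KSP            ksp;
PetscInt       step;
PetscErrorCode ierr;

ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr); /* set once: A does not change in time */
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* solver and preconditioner picked at run time */

for (step = 0; step < nsteps; step++) {       /* temporal loop; nsteps is a placeholder */
  /* ... build the right-hand side b for this step ... */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);   /* the setup (e.g. an LU factorization) is reused */
}
ierr = KSPDestroy(ksp);CHKERRQ(ierr);         /* petsc-3.0.0 convention; later releases take &ksp */

The same executable can then be run with -ksp_type preonly -pc_type lu
-pc_factor_mat_solver_package mumps, or with -ksp_type gmres -pc_type asm
-sub_pc_type ilu, and -ksp_view reports what was actually used.]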
> >> [0]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by > >> christian > >> Tue Jun 16 13:33:33 2009 > >> [0]PETSC ERROR: Libraries linked from > >> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > >> [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 > >> [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" > >> --download-mpich=1 > >> --download-f-blas-lapack --download-scalapack --download-blacs > >> --download-mumps --download-parmetis --download-hypre > >> --download-triangle > >> --with-shared=0 > >> [0]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [0]PETSC ERROR: PetscMemzero() line 189 in src/sys/utils/memc.c > >> [0]PETSC ERROR: MatFactorInfoInitialize() line 7123 in > >> src/mat/interface/matrix.c > >> [0]PETSC ERROR: main() line 1484 in src/dm/ao/examples/tutorials/ex4.c > >> application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0[cli_0]: > >> aborting job: > >> application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 > >> --------------------- Error Message ------------------------------------ > >> [2]PETSC ERROR: Null argument, when expecting valid pointer! > >> [2]PETSC ERROR: Trying to zero at a null pointer! > >> [2]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 > >> 14:46:08 > >> CST 2009 > >> [2]PETSC ERROR: See docs/changes/index.html for recent updates. > >> [2]PETSC ERROR: See docs/faq.html[0]0:Return code = 85 > >> [0]1:Return code = 0, signaled with Interrupt > >> [0]2:Return code = 0, signaled with Interrupt > >> [0]3:Return code = 0, signaled with Interrupt > >> > >> ///////////////////////////////////////////////////////////////// > >> > >> Line (3) gives the following error messages: > >> > >> [0]PETSC ERROR: --------------------- Error Message > >> ------------------------------------ > >> [0]PETSC ERROR: Invalid argument! > >> [0]PETSC ERROR: Wrong type of object: Parameter # 2! > >> [0]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [0]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message > >> ------------------------------------ > >> [2]PETSC ERROR: Invalid argument! > >> [2]PETSC ERROR: Wrong type of object: Parameter # 2! > >> [2]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [2]PETSC ERROR: Petsc Release Version 3.0.0, Patch 4, Fri Mar 6 > >> 14:46:08 > >> CST 2009 > >> [2]PETSC ERROR: See docs/changes/index.html for recent updates. > >> [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > >> [2]PETSC ERROR: See docs/index.html for manual pages. > >> [2]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [2]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by > >> christian > >> Tue Jun 16 13:37:36 2009 > >> [2]PETSC ERROR: Libraries linked from > >> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > >> [2]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 > >> [2]PETSC ERROR: Configure options --with-cc="gcc -fPIC" > >> --download-mpich=1 > >> --download-f-blas-lapack --download-scalapack --download-blacs > >> --downlPetsc Release Version 3.0.0, Patch 4, Fri Mar 6 14:46:08 CST > >> 2009 > >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. 
> >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > >> [0]PETSC ERROR: See docs/index.html for manual pages. > >> [0]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [0]PETSC ERROR: ./ex4 on a linux-gnu named christian-desktop by > >> christian > >> Tue Jun 16 13:37:36 2009 > >> [0]PETSC ERROR: Libraries linked from > >> /home/christian/Desktop/petsc-3.0.0-p4/linux-gnu-c-debug/lib > >> [0]PETSC ERROR: Configure run at Mon Jun 15 17:05:31 2009 > >> [0]PETSC ERROR: Configure options --with-cc="gcc -fPIC" > >> --download-mpich=1 > >> --download-f-blas-lapack --download-scalapack --download-blacs > >> --download-mumps --download-parmetis --download-hypre > >> --download-triangle > >> --with-shared=0 > >> [0]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [0]PETSC ERROR: MatLUFactorSymbolic() line 2311 in > >> src/mat/interface/matrix.c > >> [0]PETSC ERROR: main() line 148oad-mumps --download-parmetis > >> --download-hypre --download-triangle --with-shared=0 > >> [2]PETSC ERROR: > >> ------------------------------------------------------------------------ > >> [2]PETSC ERROR: MatLUFactorSymbolic() line 2311 in > >> src/mat/interface/matrix.c > >> [2]PETSC ERROR: main() line 1488 in src/dm/ao/examples/tutorials/ex4.c > >> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2[cli_2]: > >> aborting job: > >> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 2 > >> 8 in src/dm/ao/examples/tutorials/ex4.c > >> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0[cli_0]: > >> aborting job: > >> application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 > >> [0]0:Return code = 62 > >> [0]1:Return code = 0, signaled with Interrupt > >> [0]2:Return code = 62 > >> [0]3:Return code = 0, signaled with Interrupt > >> > >> Is the matrix (ie parameter 2) in teh wrong state because it's not a > >> MUMPS > >> matrix? > >> > >> Any help would be greatly appreciated, > >> Best regards, > >> Christian Klettner > >> > >> > >> > >> > The problem is small enough that you might be able to use MUMPS. > >> > > >> > Matt > >> > > >> > On Fri, Jun 12, 2009 at 9:31 AM, Lisandro Dalcin > >> > wrote: > >> > > >> >> On Fri, Jun 12, 2009 at 11:13 AM, Christian > >> >> Klettner wrote: > >> >> > Sorry that I sent this twice. No subject in the first one. > >> >> > > >> >> > Dear PETSc Team, > >> >> > I am writing a CFD finite element code in C. From the > >> discretization > >> >> of > >> >> > the governing equations I have to solve a Poisson type equation > >> which > >> >> is > >> >> > really killing my performance. Which solver/preconditioner from > >> PETSc > >> >> or > >> >> > any external packages would you recommend? The size of my problem > >> is > >> >> from > >> >> > ~30000-100000 DOF per core. What kind of performance would I be > >> able > >> >> to > >> >> > expect with this solver/preconditioner? > >> >> > >> >> I would suggest KSPCG. As preconditioner I would use ML or > >> >> HYPRE/BoomerAMG (both are external packages) > >> >> > >> >> > I am using a 2*quad core 2.3 GHz Opteron. I have decomposed the > >> domain > >> >> > with Parmetis. The mesh is unstructured. > >> >> > Also, I am writing a code which studies free surface phenomena so > >> the > >> >> mesh > >> >> > is continually changing. Does this matter when choosing a > >> >> > solver/preconditioner? My left hand side matrix (A in Ax=b) does > >> not > >> >> > change in time. 
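[For completeness, a sketch of the explicit factorization path discussed in
this thread, against the petsc-3.0.0 interfaces. Two details differ from the
code quoted above and are worth flagging: MatFactorInfoInitialize() takes the
address of a MatFactorInfo declared as a value (the fix Matt points out), and
the repeated MatSolve() calls go through the factored matrix F, not the
original A. A, b, x and the index sets isrow/iscol are assumed to exist
already.

#include "petscmat.h"

/* Sketch only: assumes Mat A (assembled), Vec b, x, and IS isrow, iscol exist. */
Mat            F;
MatFactorInfo  info;
PetscErrorCode ierr;

ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);                /* value on the stack, address passed in */
ierr = MatGetFactor(A, MAT_SOLVER_MUMPS, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
ierr = MatLUFactorSymbolic(F, A, isrow, iscol, &info);CHKERRQ(ierr);
ierr = MatLUFactorNumeric(F, A, &info);CHKERRQ(ierr);

/* inside the temporal loop: reuse the factorization for every right-hand side */
ierr = MatSolve(F, b, x);CHKERRQ(ierr);                             /* the factored matrix F, not A */

Wrapping the same thing in a KSP, as in the earlier sketch, gives the identical
MUMPS solve while keeping the choice of package on the command line.]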
> >> >> > >> >> ML has a faster setup that BoomerAMG, but the convergence is a bit > >> >> slower. If your A matrix do not change, then likely BoomerAMG will be > >> >> better for you. In any case, you can try both: just build PETSc with > >> >> both packages, then you can change the preconditioner by just passing > >> >> a command line option. > >> >> > >> >> > > >> >> > Best regards and thank you in advance, > >> >> > Christian Klettner > >> >> > > >> >> > >> >> Disclaimer: the convergence of multigrid preconditioners depends a > >> lot > >> >> on your actual problem. What I've suggested is just my limited > >> >> experience in a few problems I've run solving electric potentials. > >> >> > >> >> > >> >> -- > >> >> Lisandro Dalc?n > >> >> --------------- > >> >> Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > >> >> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > >> >> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > >> >> PTLC - G?emes 3450, (3000) Santa Fe, Argentina > >> >> Tel/Fax: +54-(0)342-451.1594 > >> >> > >> > > >> > > >> > > >> > -- > >> > What most experimenters take for granted before they begin their > >> > experiments > >> > is infinitely more interesting than any results to which their > >> experiments > >> > lead. > >> > -- Norbert Wiener > >> > > >> > >> > >> > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments > > is infinitely more interesting than any results to which their > experiments > > lead. > > -- Norbert Wiener > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jun 18 11:39:12 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 18 Jun 2009 11:39:12 -0500 Subject: Mismatch in explicit fortran interface for MatGetInfo In-Reply-To: <4A3A316E.6020206@imperial.ac.uk> References: <4A18016F.6030805@imperial.ac.uk> <0A67546F-4327-4265-B94D-B889B94644E5@mcs.anl.gov> <4A212A19.3090404@imperial.ac.uk> <4A215AB2.2010900@imperial.ac.uk> <4A3A316E.6020206@imperial.ac.uk> Message-ID: Stephan, Those are ones for which our automatic tools do not work for (mostly because they have character strings in the arguments and our tools cannot handle the translation from Fortran to C automatically). So we are in the position of 1) having better tools that can handle these cases or 2) making the interfaces manually. Neither path is desirable to us :-( Barry On Jun 18, 2009, at 7:22 AM, Stephan Kramer wrote: > Hi Satish, > > I tried with p6 and it indeed works fine now. Thanks a lot for > looking into this. A related question: it seems still quite a lot of > very common interfaces (PETScInitialize, KSPSettype, MatView, etc.) > are missing. Are there any plans of adding those in the future? > > Cheers > Stephan > > Satish Balay wrote: >> On Sat, 30 May 2009, Stephan Kramer wrote: >>> Satish Balay wrote: >>>> On Sat, 30 May 2009, Stephan Kramer wrote: >>>> >>>>> Thanks a lot for looking into this. The explicit fortran >>>>> interfaces are in >>>>> general very useful. The problem occurred for me with >>>>> petsc-3.0.0-p1. I'm >>>>> happy to try it out with a more recent patch-level or with petsc- >>>>> dev. 
>>>> Did you configure with '--with-fortran-interfaces=1' or are you >>>> directly using '#include "finclude/ftn-auto/petscmat.h90"'? >>>> >>> Configured with '--with-fortran-interfaces=1', yes, and then using >>> them via >>> the fortran modules: "use petscksp", "use petscmat", etc. >> ok. --with-fortran-interfaces was broken in p0, worked in p1,p2,p3,p4 >> - broken in curent p5. The next patch update - p6 will have the fix >> for this issue [along with the fix for MatGetInfo() interface] >> Satish > From meganlewis13 at gmail.com Mon Jun 22 08:40:43 2009 From: meganlewis13 at gmail.com (Megan Lewis) Date: Mon, 22 Jun 2009 07:40:43 -0600 Subject: Create CRS matrix Message-ID: <4573ea9f0906220640r34d06672w56ee61900c182e4f@mail.gmail.com> Hi, I know PETSc stores their matrices in CRS format. I have the 3 arrays required for CRS format as pointers (two int pointers and one double pointer). Instead of allocating the memory and running through all of the non-zero elements in the matrix, is there any way to create a matrix in PETSc by passing in the 3 arrays? Thanks! From balay at mcs.anl.gov Mon Jun 22 08:46:57 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 22 Jun 2009 08:46:57 -0500 (CDT) Subject: Create CRS matrix In-Reply-To: <4573ea9f0906220640r34d06672w56ee61900c182e4f@mail.gmail.com> References: <4573ea9f0906220640r34d06672w56ee61900c182e4f@mail.gmail.com> Message-ID: Check the following: http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatCreateSeqAIJWithArrays.html http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatCreateMPIAIJWithArrays.html Satish On Mon, 22 Jun 2009, Megan Lewis wrote: > Hi, > I know PETSc stores their matrices in CRS format. I have the 3 arrays > required for CRS format as pointers (two int pointers and one double > pointer). Instead of allocating the memory and running through all of > the non-zero elements in the matrix, is there any way to create a > matrix in PETSc by passing in the 3 arrays? > > Thanks! > From meganlewis13 at gmail.com Mon Jun 22 09:00:14 2009 From: meganlewis13 at gmail.com (Megan Lewis) Date: Mon, 22 Jun 2009 08:00:14 -0600 Subject: Create CRS matrix In-Reply-To: References: <4573ea9f0906220640r34d06672w56ee61900c182e4f@mail.gmail.com> Message-ID: <4573ea9f0906220700g1eccf1c1w88f58451268f35cb@mail.gmail.com> Great, that was exactly what I was looking for. Thanks! On Mon, Jun 22, 2009 at 7:46 AM, Satish Balay wrote: > Check the following: > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatCreateSeqAIJWithArrays.html > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatCreateMPIAIJWithArrays.html > > Satish > > On Mon, 22 Jun 2009, Megan Lewis wrote: > >> Hi, >> I know PETSc stores their matrices in CRS format. ?I have the 3 arrays >> required for CRS format as pointers (two int pointers and one double >> pointer). ?Instead of allocating the memory and running through all of >> the non-zero elements in the matrix, is there any way to create a >> matrix in PETSc by passing in the 3 arrays? >> >> Thanks! 
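[As a small illustration of the array-based route in the manual pages linked
above, here is a sketch for the sequential case; the 2x2 matrix and its CSR
arrays are made up purely for illustration, and MatCreateMPIAIJWithArrays is
the analogous call for a parallel matrix with local sizes per process.

#include "petscmat.h"

/* Made-up 2x2 example in CSR (0-based row pointers and column indices):
     [  2  -1 ]
     [ -1   2 ]                                                           */
PetscInt       ia[] = {0, 2, 4};            /* row pointer array   */
PetscInt       ja[] = {0, 1, 0, 1};         /* column index array  */
PetscScalar    va[] = {2.0, -1.0, -1.0, 2.0};
Mat            A;
PetscErrorCode ierr;

ierr = MatCreateSeqAIJWithArrays(PETSC_COMM_SELF, 2, 2, ia, ja, va, &A);CHKERRQ(ierr);
/* The sequential variant uses the arrays directly rather than copying them,
   so they must stay allocated for the lifetime of A -- see the man page.   */
]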
>> > > From Andreas.Grassl at student.uibk.ac.at Tue Jun 23 08:56:49 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 23 Jun 2009 15:56:49 +0200 Subject: PCNN-preconditioner and "floating domains" Message-ID: <4A40DF21.1070007@student.uibk.ac.at> Hello again, the issues from my last request regarding VecView and data distribution among the different processors are solved, but I'm experiencing still great problems on the performance of the actual preconditioner. I have an implementation of the BNN-algorithm in Matlab from a previous project which is performing very well (about 5 iterations vs. 200 for plain-cg) for a long linear elastic beam fixed at one end and loaded at the other end, discretized with solid cubic bricks (8 nodes, 24 DOF's). condition of the Matrix: 1.5e7 I now modeled a similar beam in DIANA (a bit shorter, less elements due to restrictions of DIANA-preprocessor) and tried to solve with PETSc-solver. The condition of the Matrix is of the same magnitude: ~3e7 (smallest singular value: ~1e-3, largest sv: ~4e4), number of iterations for plain-cg seems reasonable (437), but for the preconditioned system I get completely unexpected values: condition: ~7e12 (smallest sv: ~1, largest sv: ~7e12) and therefore 612 iterations for cg. The beam is divided in 4 subdomains. For more subdomains ksp ran out of iterations (Converged_Reason -3). I can imagine this is a problem of properly setting the null space, because only the first subdomain is touching the boundary, but I have no idea how to specify the null space. So far I didn't regard this issue at all. Do I have to define a function which calculates the weights of the interface DOF's and applies in a some way to create an orthonormal basis? How do I realize that? Is there anywhere an example? Cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From bsmith at mcs.anl.gov Tue Jun 23 09:34:11 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 23 Jun 2009 09:34:11 -0500 Subject: PCNN-preconditioner and "floating domains" In-Reply-To: <4A40DF21.1070007@student.uibk.ac.at> References: <4A40DF21.1070007@student.uibk.ac.at> Message-ID: As we have said before the BNN in PETSc is ONLY implemented for a scalar PDE with a null space of the constant functions. If you shove in linear elasticity it isn't really going to work. Barry On Jun 23, 2009, at 8:56 AM, Andreas Grassl wrote: > Hello again, > > the issues from my last request regarding VecView and data > distribution among > the different processors are solved, but I'm experiencing still > great problems > on the performance of the actual preconditioner. > > I have an implementation of the BNN-algorithm in Matlab from a > previous project > which is performing very well (about 5 iterations vs. 200 for plain- > cg) for a > long linear elastic beam fixed at one end and loaded at the other end, > discretized with solid cubic bricks (8 nodes, 24 DOF's). condition > of the > Matrix: 1.5e7 > > I now modeled a similar beam in DIANA (a bit shorter, less elements > due to > restrictions of DIANA-preprocessor) and tried to solve with PETSc- > solver. 
> > The condition of the Matrix is of the same magnitude: ~3e7 (smallest > singular > value: ~1e-3, largest sv: ~4e4), number of iterations for plain-cg > seems > reasonable (437), but for the preconditioned system I get completely > unexpected > values: > > condition: ~7e12 (smallest sv: ~1, largest sv: ~7e12) and therefore > 612 > iterations for cg. > > The beam is divided in 4 subdomains. For more subdomains ksp ran out > of > iterations (Converged_Reason -3). I can imagine this is a problem of > properly > setting the null space, because only the first subdomain is touching > the > boundary, but I have no idea how to specify the null space. So far I > didn't > regard this issue at all. > > Do I have to define a function which calculates the weights of the > interface > DOF's and applies in a some way to create an orthonormal basis? How > do I realize > that? Is there anywhere an example? > > Cheers, > > ando > > -- > /"\ Grassl Andreas > \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik > X against HTML email Technikerstr. 13 Zi 709 > / \ +43 (0)512 507 6091 From Andreas.Grassl at student.uibk.ac.at Tue Jun 23 11:15:39 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 23 Jun 2009 18:15:39 +0200 Subject: PCNN-preconditioner and "floating domains" In-Reply-To: References: <4A40DF21.1070007@student.uibk.ac.at> Message-ID: <4A40FFAB.8010806@student.uibk.ac.at> Barry Smith schrieb: > > As we have said before the BNN in PETSc is ONLY implemented for a > scalar PDE with > a null space of the constant functions. If you shove in linear > elasticity it isn't really going to work. Do you have any suggestions to work around this drawback? Do I understand right, that this issue is problematic, if floating subdomains appear? Do I have the possibility to provide the null space from user site? Or how big would be the effort to change the nn-code to work for this problem classes? Is such a work useful at all or do you regard BNN only a rather complicated method to be done much better by other (even simpler) algorithms? Cheers, Ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From bsmith at mcs.anl.gov Tue Jun 23 11:23:21 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 23 Jun 2009 11:23:21 -0500 Subject: PCNN-preconditioner and "floating domains" In-Reply-To: <4A40FFAB.8010806@student.uibk.ac.at> References: <4A40DF21.1070007@student.uibk.ac.at> <4A40FFAB.8010806@student.uibk.ac.at> Message-ID: <093D54A6-0166-4BFC-A8CC-932BD5E3D6DB@mcs.anl.gov> To modify the code for elasticity is a big job. There are two faculty Axel Klawonn and O. Rheinbach at Universit?t Duisburg-Essen who have implemented a variety of these fast methods for elasticity. I suggest you contact them. Barry On Jun 23, 2009, at 11:15 AM, Andreas Grassl wrote: > Barry Smith schrieb: >> >> As we have said before the BNN in PETSc is ONLY implemented for a >> scalar PDE with >> a null space of the constant functions. If you shove in linear >> elasticity it isn't really going to work. > > Do you have any suggestions to work around this drawback? > Do I understand right, that this issue is problematic, if floating > subdomains > appear? > Do I have the possibility to provide the null space from user site? > Or how big would be the effort to change the nn-code to work for > this problem > classes? 
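[On the question above about providing a null space from the user side: for
the global singular (pure Neumann type) operator, the generic KSP mechanism
looks roughly like the sketch below, which attaches the constant null space.
This is only a hedged illustration of that mechanism; it does not by itself
supply the per-subdomain rigid-body modes that elasticity would need inside
PCNN, which is the larger modification discussed in this thread. The ksp
object is assumed to be set up already.

#include "petscksp.h"

/* Sketch only: assumes KSP ksp already carries the (singular) operator. */
MatNullSpace   nullsp;
PetscErrorCode ierr;

/* PETSC_TRUE: null space spanned by the constant vector (the scalar-PDE case).
   For elasticity one would instead pass the rigid-body modes as Vecs in the
   third and fourth arguments.                                               */
ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, &nullsp);CHKERRQ(ierr);
ierr = KSPSetNullSpace(ksp, nullsp);CHKERRQ(ierr);
ierr = MatNullSpaceDestroy(nullsp);CHKERRQ(ierr);   /* ksp keeps its own reference */
]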
> Is such a work useful at all or do you regard BNN only a rather > complicated > method to be done much better by other (even simpler) algorithms? > > Cheers, > > Ando > > -- > /"\ Grassl Andreas > \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik > X against HTML email Technikerstr. 13 Zi 709 > / \ +43 (0)512 507 6091 From tyoung at ippt.gov.pl Wed Jun 24 11:34:47 2009 From: tyoung at ippt.gov.pl (Toby D. Young) Date: Wed, 24 Jun 2009 18:34:47 +0200 (CEST) Subject: cast unsigned int* -> PetscInt* In-Reply-To: References: <4573ea9f0906220640r34d06672w56ee61900c182e4f@mail.gmail.com> Message-ID: Hello PETSc users, I want to convert an unsigned int* to PetscInt*. Is there a simple way to do this? My reason is, that I wish to protect in my application code from asking for a negative number of eigenvalues to be computed (in SLEPc). Thanks. Best, Toby ----- Toby D. Young Philosopher-Physicist Adiunkt (Assistant Professor) Polish Academy of Sciences Warszawa, Polska www: http://www.ippt.gov.pl/~tyoung skype: stenografia From knepley at gmail.com Wed Jun 24 11:49:02 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Jun 2009 11:49:02 -0500 Subject: cast unsigned int* -> PetscInt* In-Reply-To: References: <4573ea9f0906220640r34d06672w56ee61900c182e4f@mail.gmail.com> Message-ID: On Wed, Jun 24, 2009 at 11:34 AM, Toby D. Young wrote: > > > Hello PETSc users, > > I want to convert an unsigned int* to PetscInt*. Is there a simple way to > do this? My reason is, that I wish to protect in my application code from > asking for a negative number of eigenvalues to be computed (in SLEPc). You could cast, but you would have to be careful that they are the same size. Matt > > Thanks. > > Best, > Toby > > > ----- > > Toby D. Young > Philosopher-Physicist > Adiunkt (Assistant Professor) > Polish Academy of Sciences > Warszawa, Polska > > www: http://www.ippt.gov.pl/~tyoung > skype: stenografia > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsjb00 at hotmail.com Wed Jun 24 15:38:22 2009 From: tsjb00 at hotmail.com (tsjb00) Date: Wed, 24 Jun 2009 20:38:22 +0000 Subject: fortran 90 program with PETSc Message-ID: Hi! I am trying to parallelize a fortran 90 program with PETSc. The original program uses a lot of global variables defined using fortran 90 modules. My question is: Can I use 'include PETSc include file' and 'use MyModule' in the program at the same time? If so, in what sequence should I put 'include XXX', 'use XXX' and 'implicit' statements? Another general question is whether I can use fortran 90 features such as 'module' and dynamic allocation on a parallel system of MPI-1 standard? Any reference on such programming work will be also appreciated. Many thanks in advance! _________________________________________________________________ ?????????????????msn????? http://ditu.live.com/?form=TL&swm=1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Wed Jun 24 16:06:57 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Wed, 24 Jun 2009 17:06:57 -0400 Subject: Petsc configure error Message-ID: Hi all, I got configuration errors, when I trying to built PETSC with Tau. Basically, I followed the instruction on the installation webpage of PETSc. In the attached file, please find the configure log. 
I do built TAU and PDT seperately, and TAU with mpi and PDT. If any more information is needed, please let me know. Thanks, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 577860 bytes Desc: not available URL: From balay at mcs.anl.gov Wed Jun 24 16:52:36 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 24 Jun 2009 16:52:36 -0500 (CDT) Subject: Petsc configure error In-Reply-To: References: Message-ID: TAU isn't tested with petsc in the past few years. So thing could be broken. I'll have to check on this. One comment though.. If you have tau built with mpi - you should use that mpi & wrappers with tau and not use --download-mpich. [but this is also likely broken] Satish On Wed, 24 Jun 2009, Ryan Yan wrote: > Hi all, > I got configuration errors, when I trying to built PETSC with Tau. > Basically, I followed the instruction on the installation webpage of PETSc. > > In the attached file, please find the configure log. > > I do built TAU and PDT seperately, and TAU with mpi and PDT. > > If any more information is needed, please let me know. > > Thanks, > > Yan > From balay at mcs.anl.gov Wed Jun 24 16:59:43 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 24 Jun 2009 16:59:43 -0500 (CDT) Subject: fortran 90 program with PETSc In-Reply-To: References: Message-ID: On Wed, 24 Jun 2009, tsjb00 wrote: > > Hi! > I am trying to parallelize a fortran 90 program with PETSc. The > original program uses a lot of global variables defined using > fortran 90 modules. > My question is: Can I use 'include PETSc include file' and 'use > MyModule' in the program at the same time? yes. > If so, in what sequence should I put 'include XXX', 'use XXX' and > 'implicit' statements? The following should work. use foo #include "finclude/petsc.h" implicit none > Another general question is whether I can use fortran 90 features > such as 'module' and dynamic allocation on a parallel system of > MPI-1 standard? Any reference on such programming work will be also > appreciated. I don't see why not. f90 issues you mentioned are unrelated to MPI Satish > Many thanks in advance! > > > _________________________________________________________________ > ?????????????????msn????? > http://ditu.live.com/?form=TL&swm=1 From vyan2000 at gmail.com Wed Jun 24 17:11:22 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Wed, 24 Jun 2009 18:11:22 -0400 Subject: Petsc configure error In-Reply-To: References: Message-ID: Hi Satish, What I am tring to do is using TAU to profile a solver using PETSc function calls on a Sicortex cluster. Now I am just building on a desktop. I tried tauex, it can run. But it did not provide me any explicit profiling about PETSc function calls. Instead, it only gave me profiling on the TAU application and MPI routines. I have send email inquiries to TAU developer. They suggestted me rebuild PETSc with TAU if I want those profiles on PETSc function calls. Frankly, I do not have experience with profiling or performance analysis on a solver. Do you have any suggestion or pointers to other source on doing performance analysis for PETSc solvers, considering the option of using TAU may need lots of workarounds when I rebuilt PETSC with TAU on the cluster. Thank you very much, Yan On Wed, Jun 24, 2009 at 5:52 PM, Satish Balay wrote: > TAU isn't tested with petsc in the past few years. So thing could be > broken. 
I'll have to check on this. > Hi Satish, What I am tring to do is using TAU to profile a solver using PETSc function calls on a Sicortex cluster. I tried tauex, it can run. But it did not provide me any explicit profiling about PETSc function calls. Instead, it just give me an profiling on the TAU application and MPI routines. I have send inquiryies to TAU developer. They suggestted me rebuild PETSc with TAU if I want those profiles on PETSc function calls. Frankly, I do not have experience with profiling or performance analysis on a solver. Do you have any suggestion on doing performance analysis for PETSc solvers? > > One comment though.. If you have tau built with mpi - you should use > that mpi & wrappers with tau and not use --download-mpich. [but this > is also likely broken] > > Satish > > On Wed, 24 Jun 2009, Ryan Yan wrote: > > > Hi all, > > I got configuration errors, when I trying to built PETSC with Tau. > > Basically, I followed the instruction on the installation webpage of > PETSc. > > > > In the attached file, please find the configure log. > > > > I do built TAU and PDT seperately, and TAU with mpi and PDT. > > > > If any more information is needed, please let me know. > > > > Thanks, > > > > Yan > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jun 24 17:16:28 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Jun 2009 17:16:28 -0500 Subject: Petsc configure error In-Reply-To: References: Message-ID: On Wed, Jun 24, 2009 at 5:11 PM, Ryan Yan wrote: > Hi Satish, > What I am tring to do is using TAU to profile a solver using PETSc > function calls on a Sicortex cluster. Now I am just building on a desktop. > > I tried tauex, it can run. But it did not provide me any explicit profiling > about PETSc function calls. Instead, it only gave me profiling on the TAU > application and MPI routines. > > I have send email inquiries to TAU developer. They suggestted me rebuild > PETSc with TAU if I want those profiles on PETSc function calls. > > Frankly, I do not have experience with profiling or performance analysis on > a solver. Do you have any suggestion or pointers to other source on doing > performance analysis for PETSc solvers, considering the option of using TAU > may need lots of workarounds when I rebuilt PETSC with TAU on the cluster. I just use PETSc logging (-log_summary). However, you can also try cachegrind (in valgrind), which has several nice visualization tools, like kcachegrind. Matt > > Thank you very much, > > Yan > > On Wed, Jun 24, 2009 at 5:52 PM, Satish Balay wrote: > >> TAU isn't tested with petsc in the past few years. So thing could be >> broken. I'll have to check on this. >> > > > Hi Satish, > What I am tring to do is using TAU to profile a solver using PETSc > function calls on a Sicortex cluster. > > I tried tauex, it can run. But it did not provide me any explicit profiling > about PETSc function calls. Instead, it just give me an profiling on the TAU > application and MPI routines. > > I have send inquiryies to TAU developer. They suggestted me rebuild PETSc > with TAU if I want those profiles on PETSc function calls. > > Frankly, I do not have experience with profiling or performance analysis on > a solver. Do you have any suggestion on doing performance analysis for PETSc > solvers? > > > >> >> One comment though.. If you have tau built with mpi - you should use >> that mpi & wrappers with tau and not use --download-mpich. 
[but this >> is also likely broken] >> >> Satish >> >> On Wed, 24 Jun 2009, Ryan Yan wrote: >> >> > Hi all, >> > I got configuration errors, when I trying to built PETSC with Tau. >> > Basically, I followed the instruction on the installation webpage of >> PETSc. >> > >> > In the attached file, please find the configure log. >> > >> > I do built TAU and PDT seperately, and TAU with mpi and PDT. >> > >> > If any more information is needed, please let me know. >> > >> > Thanks, >> > >> > Yan >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Jun 24 17:18:26 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 24 Jun 2009 17:18:26 -0500 (CDT) Subject: Petsc configure error In-Reply-To: References: Message-ID: Have you tried running with -log_summary? [I still need to check tau] Satish On Wed, 24 Jun 2009, Ryan Yan wrote: > Hi Satish, > What I am tring to do is using TAU to profile a solver using PETSc function > calls on a Sicortex cluster. Now I am just building on a desktop. > > I tried tauex, it can run. But it did not provide me any explicit profiling > about PETSc function calls. Instead, it only gave me profiling on the TAU > application and MPI routines. > > I have send email inquiries to TAU developer. They suggestted me rebuild > PETSc with TAU if I want those profiles on PETSc function calls. > > Frankly, I do not have experience with profiling or performance analysis on > a solver. Do you have any suggestion or pointers to other source on doing > performance analysis for PETSc solvers, considering the option of using TAU > may need lots of workarounds when I rebuilt PETSC with TAU on the cluster. > > Thank you very much, > > Yan > > On Wed, Jun 24, 2009 at 5:52 PM, Satish Balay wrote: > > > TAU isn't tested with petsc in the past few years. So thing could be > > broken. I'll have to check on this. > > > > > Hi Satish, > What I am tring to do is using TAU to profile a solver using PETSc function > calls on a Sicortex cluster. > > I tried tauex, it can run. But it did not provide me any explicit profiling > about PETSc function calls. Instead, it just give me an profiling on the TAU > application and MPI routines. > > I have send inquiryies to TAU developer. They suggestted me rebuild PETSc > with TAU if I want those profiles on PETSc function calls. > > Frankly, I do not have experience with profiling or performance analysis on > a solver. Do you have any suggestion on doing performance analysis for PETSc > solvers? > > > > > > > One comment though.. If you have tau built with mpi - you should use > > that mpi & wrappers with tau and not use --download-mpich. [but this > > is also likely broken] > > > > Satish > > > > On Wed, 24 Jun 2009, Ryan Yan wrote: > > > > > Hi all, > > > I got configuration errors, when I trying to built PETSC with Tau. > > > Basically, I followed the instruction on the installation webpage of > > PETSc. > > > > > > In the attached file, please find the configure log. > > > > > > I do built TAU and PDT seperately, and TAU with mpi and PDT. > > > > > > If any more information is needed, please let me know. 
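[Since -log_summary comes up several times in this thread, a short sketch of
how application code can add its own stages and events so the -log_summary
table separates application phases from PETSc internals. The stage and event
names here are made up, and the argument order of the registration calls has
changed between PETSc releases, so the man pages of the installed version
should be checked.

#include "petsc.h"   /* newer releases: petscsys.h */

PetscLogStage  solveStage;
PetscLogEvent  assemblyEvent;
PetscErrorCode ierr;

/* Argument order as in recent releases; older releases may reverse it -- check the man pages. */
ierr = PetscLogStageRegister("Solve phase", &solveStage);CHKERRQ(ierr);
ierr = PetscLogEventRegister("MyAssembly", 0, &assemblyEvent);CHKERRQ(ierr);

ierr = PetscLogStagePush(solveStage);CHKERRQ(ierr);
/* ... KSPSolve() and friends; their time and flops are attributed to this stage ... */
ierr = PetscLogStagePop();CHKERRQ(ierr);
/* Run the application with -log_summary to get the per-stage breakdown. */
]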
> > > > > > Thanks, > > > > > > Yan > > > > > > > > From vyan2000 at gmail.com Wed Jun 24 17:26:28 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Wed, 24 Jun 2009 18:26:28 -0400 Subject: Petsc configure error In-Reply-To: References: Message-ID: Hi Matt and Satish, Thank you very much for pointing this out. I almost forgot this. I will pay more attention to -log-summary. Yan On Wed, Jun 24, 2009 at 6:18 PM, Satish Balay wrote: > Have you tried running with -log_summary? > > [I still need to check tau] > > Satish > > On Wed, 24 Jun 2009, Ryan Yan wrote: > > > Hi Satish, > > What I am tring to do is using TAU to profile a solver using PETSc > function > > calls on a Sicortex cluster. Now I am just building on a desktop. > > > > I tried tauex, it can run. But it did not provide me any explicit > profiling > > about PETSc function calls. Instead, it only gave me profiling on the TAU > > application and MPI routines. > > > > I have send email inquiries to TAU developer. They suggestted me rebuild > > PETSc with TAU if I want those profiles on PETSc function calls. > > > > Frankly, I do not have experience with profiling or performance analysis > on > > a solver. Do you have any suggestion or pointers to other source on > doing > > performance analysis for PETSc solvers, considering the option of using > TAU > > may need lots of workarounds when I rebuilt PETSC with TAU on the > cluster. > > > > Thank you very much, > > > > Yan > > > > On Wed, Jun 24, 2009 at 5:52 PM, Satish Balay > wrote: > > > > > TAU isn't tested with petsc in the past few years. So thing could be > > > broken. I'll have to check on this. > > > > > > > > > Hi Satish, > > What I am tring to do is using TAU to profile a solver using PETSc > function > > calls on a Sicortex cluster. > > > > I tried tauex, it can run. But it did not provide me any explicit > profiling > > about PETSc function calls. Instead, it just give me an profiling on the > TAU > > application and MPI routines. > > > > I have send inquiryies to TAU developer. They suggestted me rebuild PETSc > > with TAU if I want those profiles on PETSc function calls. > > > > Frankly, I do not have experience with profiling or performance analysis > on > > a solver. Do you have any suggestion on doing performance analysis for > PETSc > > solvers? > > > > > > > > > > > > One comment though.. If you have tau built with mpi - you should use > > > that mpi & wrappers with tau and not use --download-mpich. [but this > > > is also likely broken] > > > > > > Satish > > > > > > On Wed, 24 Jun 2009, Ryan Yan wrote: > > > > > > > Hi all, > > > > I got configuration errors, when I trying to built PETSC with Tau. > > > > Basically, I followed the instruction on the installation webpage of > > > PETSc. > > > > > > > > In the attached file, please find the configure log. > > > > > > > > I do built TAU and PDT seperately, and TAU with mpi and PDT. > > > > > > > > If any more information is needed, please let me know. > > > > > > > > Thanks, > > > > > > > > Yan > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsjb00 at hotmail.com Wed Jun 24 17:59:06 2009 From: tsjb00 at hotmail.com (tsjb00) Date: Wed, 24 Jun 2009 22:59:06 +0000 Subject: fortran 90 program with PETSc In-Reply-To: References: Message-ID: Many thanks for your help! I've tried the method in your email . 
In the main program called driver.F, it is coded: program main use parm #include "finclude/petsc.h" implicit real*8 (a-h,o-z) ................................. I got the following error information : In file driver.F:280 implicit none 1 In file /home/jinbei/Soft/petsc-3.0.0-p5/include/finclude/petsc.h:209 Included at driver.F:279 2 Error: IMPLICIT NONE statement at (1) cannot follow data declaration statement at (2) Would you please tell me what might be wrong? Many thanks in advance! > Date: Wed, 24 Jun 2009 16:59:43 -0500 > From: balay at mcs.anl.gov > To: petsc-users at mcs.anl.gov > Subject: Re: fortran 90 program with PETSc > > On Wed, 24 Jun 2009, tsjb00 wrote: > > > > > Hi! > > > I am trying to parallelize a fortran 90 program with PETSc. The > > original program uses a lot of global variables defined using > > fortran 90 modules. > > > My question is: Can I use 'include PETSc include file' and 'use > > MyModule' in the program at the same time? > > yes. > > > If so, in what sequence should I put 'include XXX', 'use XXX' and > > 'implicit' statements? > > The following should work. > > use foo > #include "finclude/petsc.h" > implicit none > > > > Another general question is whether I can use fortran 90 features > > such as 'module' and dynamic allocation on a parallel system of > > MPI-1 standard? Any reference on such programming work will be also > > appreciated. > > I don't see why not. f90 issues you mentioned are unrelated to MPI > > Satish > > > Many thanks in advance! > > > > > > _________________________________________________________________ > > ?????????????????msn????? > > http://ditu.live.com/?form=TL&swm=1 _________________________________________________________________ Messenger10????????????? http://10.msn.com.cn -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsjb00 at hotmail.com Wed Jun 24 18:36:27 2009 From: tsjb00 at hotmail.com (tsjb00) Date: Wed, 24 Jun 2009 23:36:27 +0000 Subject: fortran 90 program with PETSc In-Reply-To: References: Message-ID: compiling passed if: use parm implicit real*8 (a-h,o-z) #include "finclude/petsc.h" Is this ok? Many thanks! From: tsjb00 at hotmail.com To: petsc-users at mcs.anl.gov Subject: RE: fortran 90 program with PETSc Date: Wed, 24 Jun 2009 22:59:06 +0000 Many thanks for your help! I've tried the method in your email . In the main program called driver.F, it is coded: program main use parm #include "finclude/petsc.h" implicit real*8 (a-h,o-z) ................................. I got the following error information : In file driver.F:280 implicit none 1 In file /home/jinbei/Soft/petsc-3.0.0-p5/include/finclude/petsc.h:209 Included at driver.F:279 2 Error: IMPLICIT NONE statement at (1) cannot follow data declaration statement at (2) Would you please tell me what might be wrong? Many thanks in advance! > Date: Wed, 24 Jun 2009 16:59:43 -0500 > From: balay at mcs.anl.gov > To: petsc-users at mcs.anl.gov > Subject: Re: fortran 90 program with PETSc > > On Wed, 24 Jun 2009, tsjb00 wrote: > > > > > Hi! > > > I am trying to parallelize a fortran 90 program with PETSc. The > > original program uses a lot of global variables defined using > > fortran 90 modules. > > > My question is: Can I use 'include PETSc include file' and 'use > > MyModule' in the program at the same time? > > yes. > > > If so, in what sequence should I put 'include XXX', 'use XXX' and > > 'implicit' statements? > > The following should work. 
> > use foo > #include "finclude/petsc.h" > implicit none > > > > Another general question is whether I can use fortran 90 features > > such as 'module' and dynamic allocation on a parallel system of > > MPI-1 standard? Any reference on such programming work will be also > > appreciated. > > I don't see why not. f90 issues you mentioned are unrelated to MPI > > Satish > > > Many thanks in advance! > > > > > > _________________________________________________________________ > > ?????????????????msn????? > > http://ditu.live.com/?form=TL&swm=1 ???? MSN ??????Messenger ????? ?????? _________________________________________________________________ Messenger10????????????? http://10.msn.com.cn -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Jun 24 19:27:51 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 24 Jun 2009 19:27:51 -0500 (CDT) Subject: fortran 90 program with PETSc In-Reply-To: References: Message-ID: I guess this is fine. we use 'implicit none' You can check some of the petsc example sources. satish On Wed, 24 Jun 2009, tsjb00 wrote: > > compiling passed if: > use parm > implicit real*8 (a-h,o-z) > #include "finclude/petsc.h" > > Is this ok? Many thanks! > From: tsjb00 at hotmail.com > To: petsc-users at mcs.anl.gov > Subject: RE: fortran 90 program with PETSc > Date: Wed, 24 Jun 2009 22:59:06 +0000 > > > > > > > > > Many thanks for your help! I've tried the method in your email . > In the main program called driver.F, it is coded: > program main > use parm > #include "finclude/petsc.h" > implicit real*8 (a-h,o-z) > ................................. > I got the following error information : > > In file driver.F:280 > > implicit none > 1 > In file /home/jinbei/Soft/petsc-3.0.0-p5/include/finclude/petsc.h:209 > > Included at driver.F:279 > > > 2 > Error: IMPLICIT NONE statement at (1) cannot follow data declaration statement at (2) > > Would you please tell me what might be wrong? > > Many thanks in advance! > > > > > Date: Wed, 24 Jun 2009 16:59:43 -0500 > > From: balay at mcs.anl.gov > > To: petsc-users at mcs.anl.gov > > Subject: Re: fortran 90 program with PETSc > > > > On Wed, 24 Jun 2009, tsjb00 wrote: > > > > > > > > Hi! > > > > > I am trying to parallelize a fortran 90 program with PETSc. The > > > original program uses a lot of global variables defined using > > > fortran 90 modules. > > > > > My question is: Can I use 'include PETSc include file' and 'use > > > MyModule' in the program at the same time? > > > > yes. > > > > > If so, in what sequence should I put 'include XXX', 'use XXX' and > > > 'implicit' statements? > > > > The following should work. > > > > use foo > > #include "finclude/petsc.h" > > implicit none > > > > > > > Another general question is whether I can use fortran 90 features > > > such as 'module' and dynamic allocation on a parallel system of > > > MPI-1 standard? Any reference on such programming work will be also > > > appreciated. > > > > I don't see why not. f90 issues you mentioned are unrelated to MPI > > > > Satish > > > > > Many thanks in advance! > > > > > > > > > _________________________________________________________________ > > > ?????????????????msn????? > > > http://ditu.live.com/?form=TL&swm=1 > > ???? MSN ??????Messenger ????? ?????? > _________________________________________________________________ > Messenger10????????????? 
> http://10.msn.com.cn From enjoywm at cs.wm.edu Thu Jun 25 13:30:41 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Thu, 25 Jun 2009 14:30:41 -0400 Subject: mpiexec Message-ID: <4A43C251.1070306@cs.wm.edu> Hi, My PETSc based application can work correctly, but after system updating when I use the same commands: >lamboot >mpiexec -np 4 application It seems only one processor works. Then I test it using the following code, ********** PetscErrorCode ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); COUT << "This is processor : " << rank << ENDL; Use command: mpiexec -np 4 application, The output is: This is processor : 0 This is processor : 0 This is processor : 0 This is processor : 0 ********* Thanks for your help. Yixun From bsmith at mcs.anl.gov Thu Jun 25 13:41:19 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 25 Jun 2009 13:41:19 -0500 Subject: mpiexec In-Reply-To: <4A43C251.1070306@cs.wm.edu> References: <4A43C251.1070306@cs.wm.edu> Message-ID: Did you recompile the MPI libraries? Did you re configure and compile ALL of PETSc after the change? You will need to do all this. Barry On Jun 25, 2009, at 1:30 PM, Yixun Liu wrote: > Hi, > My PETSc based application can work correctly, but after system > updating > when I use the same commands: >> lamboot >> mpiexec -np 4 application > > It seems only one processor works. Then I test it using the > following code, > > ********** > PetscErrorCode ierr = > MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > > COUT << "This is processor : " << rank << ENDL; > > Use command: mpiexec -np 4 application, > The output is: > This is processor : 0 > This is processor : 0 > This is processor : 0 > This is processor : 0 > ********* > > Thanks for your help. > > Yixun > > From balay at mcs.anl.gov Thu Jun 25 13:52:44 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 25 Jun 2009 13:52:44 -0500 (CDT) Subject: mpiexec In-Reply-To: References: <4A43C251.1070306@cs.wm.edu> Message-ID: Also make sure 'mpiexec' you are using corresponds to the MPI petsc is built with. Satish On Thu, 25 Jun 2009, Barry Smith wrote: > > Did you recompile the MPI libraries? Did you re configure and compile ALL of > PETSc after the change? You will need to do all this. > > Barry > > On Jun 25, 2009, at 1:30 PM, Yixun Liu wrote: > > > Hi, > > My PETSc based application can work correctly, but after system updating > > when I use the same commands: > > > lamboot > > > mpiexec -np 4 application > > > > It seems only one processor works. Then I test it using the following code, > > > > ********** > > PetscErrorCode ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > > > > COUT << "This is processor : " << rank << ENDL; > > > > Use command: mpiexec -np 4 application, > > The output is: > > This is processor : 0 > > This is processor : 0 > > This is processor : 0 > > This is processor : 0 > > ********* > > > > Thanks for your help. 
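[A small aside on the test snippet quoted above: printing the communicator
size alongside the rank, with synchronized output, makes the symptom
unambiguous. A sketch, assuming PetscInitialize() has already been called:

#include "petsc.h"

PetscMPIInt    rank, size;
PetscErrorCode ierr;

ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);
ierr = MPI_Comm_size(PETSC_COMM_WORLD, &size);CHKERRQ(ierr);
ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,
        "This is processor %d of %d\n", rank, size);CHKERRQ(ierr);
ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD);CHKERRQ(ierr);
/* Every rank reporting "0 of 1" means each process is running its own
   single-process job, i.e. the mpiexec being used does not match the MPI
   that PETSc was built with.                                             */
]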
> > > > Yixun > > > > > From enjoywm at cs.wm.edu Thu Jun 25 15:17:58 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Thu, 25 Jun 2009 16:17:58 -0400 Subject: mpiexec In-Reply-To: References: <4A43C251.1070306@cs.wm.edu> Message-ID: <4A43DB76.7030502@cs.wm.edu> I recompile PETSc using the following commands, petsc-3.0.0-p0>setenv PETSC_DIR $PWD >./config/configure.py ================================================================================= Configuring PETSc to compile on your system ================================================================================= /home/scratch/yixun/petsc-3.0.0-p3/config/BuildSystem/config/compilers.py:7: DeprecationWarning: the sets module is deprecated import sets /home/scratch/yixun/petsc-3.0.0-p3/config/PETSc/package.py:7: DeprecationWarning: the md5 module is deprecated; use hashlib instead import md5 /home/scratch/yixun/petsc-3.0.0-p3/config/BuildSystem/script.py:101: DeprecationWarning: The popen2 module is deprecated. Use the subprocess module. import popen2 TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:394) ********************************************************************************* UNABLE to EXECUTE BINARIES for config/configure.py --------------------------------------------------------------------------------------- Cannot run executables created with C. It is likely that you will need to configure using --with-batch which allows configuration without interactive sessions. ********************************************************************************* Satish Balay wrote: > Also make sure 'mpiexec' you are using corresponds to the MPI petsc is built with. > > Satish > > > On Thu, 25 Jun 2009, Barry Smith wrote: > > >> Did you recompile the MPI libraries? Did you re configure and compile ALL of >> PETSc after the change? You will need to do all this. >> >> Barry >> >> On Jun 25, 2009, at 1:30 PM, Yixun Liu wrote: >> >> >>> Hi, >>> My PETSc based application can work correctly, but after system updating >>> when I use the same commands: >>> >>>> lamboot >>>> mpiexec -np 4 application >>>> >>> It seems only one processor works. Then I test it using the following code, >>> >>> ********** >>> PetscErrorCode ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); >>> >>> COUT << "This is processor : " << rank << ENDL; >>> >>> Use command: mpiexec -np 4 application, >>> The output is: >>> This is processor : 0 >>> This is processor : 0 >>> This is processor : 0 >>> This is processor : 0 >>> ********* >>> >>> Thanks for your help. 
>>> >>> Yixun >>> >>> >>> > > From balay at mcs.anl.gov Thu Jun 25 15:37:25 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 25 Jun 2009 15:37:25 -0500 (CDT) Subject: mpiexec In-Reply-To: <4A43DB76.7030502@cs.wm.edu> References: <4A43C251.1070306@cs.wm.edu> <4A43DB76.7030502@cs.wm.edu> Message-ID: Send us the corresponding configure.log to petsc-maint at mcs.anl.gov Satish On Thu, 25 Jun 2009, Yixun Liu wrote: > I recompile PETSc using the following commands, > > petsc-3.0.0-p0>setenv PETSC_DIR $PWD > >./config/configure.py > > ================================================================================= > Configuring PETSc to compile on your system > ================================================================================= > /home/scratch/yixun/petsc-3.0.0-p3/config/BuildSystem/config/compilers.py:7: > DeprecationWarning: the sets module is deprecated > import sets > /home/scratch/yixun/petsc-3.0.0-p3/config/PETSc/package.py:7: > DeprecationWarning: the md5 module is deprecated; use hashlib instead > import md5 > /home/scratch/yixun/petsc-3.0.0-p3/config/BuildSystem/script.py:101: > DeprecationWarning: The popen2 module is deprecated. Use the subprocess > module. > import popen2 > TESTING: checkCCompiler from > config.setCompilers(config/BuildSystem/config/setCompilers.py:394) > ********************************************************************************* > UNABLE to EXECUTE BINARIES for config/configure.py > --------------------------------------------------------------------------------------- > Cannot run executables created with C. It is likely that you will need > to configure using --with-batch which allows configuration without > interactive sessions. > ********************************************************************************* > > > Satish Balay wrote: > > Also make sure 'mpiexec' you are using corresponds to the MPI petsc is built with. > > > > Satish > > > > > > On Thu, 25 Jun 2009, Barry Smith wrote: > > > > > >> Did you recompile the MPI libraries? Did you re configure and compile ALL of > >> PETSc after the change? You will need to do all this. > >> > >> Barry > >> > >> On Jun 25, 2009, at 1:30 PM, Yixun Liu wrote: > >> > >> > >>> Hi, > >>> My PETSc based application can work correctly, but after system updating > >>> when I use the same commands: > >>> > >>>> lamboot > >>>> mpiexec -np 4 application > >>>> > >>> It seems only one processor works. Then I test it using the following code, > >>> > >>> ********** > >>> PetscErrorCode ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > >>> > >>> COUT << "This is processor : " << rank << ENDL; > >>> > >>> Use command: mpiexec -np 4 application, > >>> The output is: > >>> This is processor : 0 > >>> This is processor : 0 > >>> This is processor : 0 > >>> This is processor : 0 > >>> ********* > >>> > >>> Thanks for your help. 
> >>> > >>> Yixun > >>> > >>> > >>> > > > > > > From enjoywm at cs.wm.edu Thu Jun 25 15:55:44 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Thu, 25 Jun 2009 16:55:44 -0400 Subject: mpiexec In-Reply-To: References: <4A43C251.1070306@cs.wm.edu> <4A43DB76.7030502@cs.wm.edu> Message-ID: <4A43E450.3010909@cs.wm.edu> Satish Balay wrote: > Send us the corresponding configure.log to petsc-maint at mcs.anl.gov > > Satish > > On Thu, 25 Jun 2009, Yixun Liu wrote: > > >> I recompile PETSc using the following commands, >> >> petsc-3.0.0-p0>setenv PETSC_DIR $PWD >> >>> ./config/configure.py >>> >> ================================================================================= >> Configuring PETSc to compile on your system >> ================================================================================= >> /home/scratch/yixun/petsc-3.0.0-p3/config/BuildSystem/config/compilers.py:7: >> DeprecationWarning: the sets module is deprecated >> import sets >> /home/scratch/yixun/petsc-3.0.0-p3/config/PETSc/package.py:7: >> DeprecationWarning: the md5 module is deprecated; use hashlib instead >> import md5 >> /home/scratch/yixun/petsc-3.0.0-p3/config/BuildSystem/script.py:101: >> DeprecationWarning: The popen2 module is deprecated. Use the subprocess >> module. >> import popen2 >> TESTING: checkCCompiler from >> config.setCompilers(config/BuildSystem/config/setCompilers.py:394) >> ********************************************************************************* >> UNABLE to EXECUTE BINARIES for config/configure.py >> --------------------------------------------------------------------------------------- >> Cannot run executables created with C. It is likely that you will need >> to configure using --with-batch which allows configuration without >> interactive sessions. >> ********************************************************************************* >> >> >> Satish Balay wrote: >> >>> Also make sure 'mpiexec' you are using corresponds to the MPI petsc is built with. >>> >>> Satish >>> >>> >>> On Thu, 25 Jun 2009, Barry Smith wrote: >>> >>> >>> >>>> Did you recompile the MPI libraries? Did you re configure and compile ALL of >>>> PETSc after the change? You will need to do all this. >>>> >>>> Barry >>>> >>>> On Jun 25, 2009, at 1:30 PM, Yixun Liu wrote: >>>> >>>> >>>> >>>>> Hi, >>>>> My PETSc based application can work correctly, but after system updating >>>>> when I use the same commands: >>>>> >>>>> >>>>>> lamboot >>>>>> mpiexec -np 4 application >>>>>> >>>>>> >>>>> It seems only one processor works. Then I test it using the following code, >>>>> >>>>> ********** >>>>> PetscErrorCode ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); >>>>> >>>>> COUT << "This is processor : " << rank << ENDL; >>>>> >>>>> Use command: mpiexec -np 4 application, >>>>> The output is: >>>>> This is processor : 0 >>>>> This is processor : 0 >>>>> This is processor : 0 >>>>> This is processor : 0 >>>>> ********* >>>>> >>>>> Thanks for your help. >>>>> >>>>> Yixun >>>>> >>>>> >>>>> >>>>> >>> >>> >> > > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: text/x-log Size: 39587 bytes Desc: not available URL: From balay at mcs.anl.gov Thu Jun 25 16:29:20 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 25 Jun 2009 16:29:20 -0500 (CDT) Subject: mpiexec In-Reply-To: <4A43E450.3010909@cs.wm.edu> References: <4A43C251.1070306@cs.wm.edu> <4A43DB76.7030502@cs.wm.edu> <4A43E450.3010909@cs.wm.edu> Message-ID: >> ERROR while running executable: Could not execute './conftest': >> ./conftest: error while loading shared libraries: libopen-rte.so.0: cannot open shared object file: No such file or directory Your OpenMPI compilers are not configured properly and are currently broken. You'll need to setup LD_LIBRARY_PATH to the correct location of the OpenMPI shared libraries. Satish On Thu, 25 Jun 2009, Yixun Liu wrote: > Satish Balay wrote: > > Send us the corresponding configure.log to petsc-maint at mcs.anl.gov > > > > Satish > > > > On Thu, 25 Jun 2009, Yixun Liu wrote: > > > > > >> I recompile PETSc using the following commands, > >> > >> petsc-3.0.0-p0>setenv PETSC_DIR $PWD > >> > >>> ./config/configure.py > >>> > >> ================================================================================= > >> Configuring PETSc to compile on your system > >> ================================================================================= > >> /home/scratch/yixun/petsc-3.0.0-p3/config/BuildSystem/config/compilers.py:7: > >> DeprecationWarning: the sets module is deprecated > >> import sets > >> /home/scratch/yixun/petsc-3.0.0-p3/config/PETSc/package.py:7: > >> DeprecationWarning: the md5 module is deprecated; use hashlib instead > >> import md5 > >> /home/scratch/yixun/petsc-3.0.0-p3/config/BuildSystem/script.py:101: > >> DeprecationWarning: The popen2 module is deprecated. Use the subprocess > >> module. > >> import popen2 > >> TESTING: checkCCompiler from > >> config.setCompilers(config/BuildSystem/config/setCompilers.py:394) > >> ********************************************************************************* > >> UNABLE to EXECUTE BINARIES for config/configure.py > >> --------------------------------------------------------------------------------------- > >> Cannot run executables created with C. It is likely that you will need > >> to configure using --with-batch which allows configuration without > >> interactive sessions. > >> ********************************************************************************* > >> > >> > >> Satish Balay wrote: > >> > >>> Also make sure 'mpiexec' you are using corresponds to the MPI petsc is built with. > >>> > >>> Satish > >>> > >>> > >>> On Thu, 25 Jun 2009, Barry Smith wrote: > >>> > >>> > >>> > >>>> Did you recompile the MPI libraries? Did you re configure and compile ALL of > >>>> PETSc after the change? You will need to do all this. > >>>> > >>>> Barry > >>>> > >>>> On Jun 25, 2009, at 1:30 PM, Yixun Liu wrote: > >>>> > >>>> > >>>> > >>>>> Hi, > >>>>> My PETSc based application can work correctly, but after system updating > >>>>> when I use the same commands: > >>>>> > >>>>> > >>>>>> lamboot > >>>>>> mpiexec -np 4 application > >>>>>> > >>>>>> > >>>>> It seems only one processor works. 
Then I test it using the following code, > >>>>> > >>>>> ********** > >>>>> PetscErrorCode ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > >>>>> > >>>>> COUT << "This is processor : " << rank << ENDL; > >>>>> > >>>>> Use command: mpiexec -np 4 application, > >>>>> The output is: > >>>>> This is processor : 0 > >>>>> This is processor : 0 > >>>>> This is processor : 0 > >>>>> This is processor : 0 > >>>>> ********* > >>>>> > >>>>> Thanks for your help. > >>>>> > >>>>> Yixun > >>>>> > >>>>> > >>>>> > >>>>> > >>> > >>> > >> > > > > > > From Andreas.Grassl at student.uibk.ac.at Mon Jun 29 16:55:28 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Mon, 29 Jun 2009 23:55:28 +0200 Subject: PCNN-preconditioner and "floating domains" In-Reply-To: <093D54A6-0166-4BFC-A8CC-932BD5E3D6DB@mcs.anl.gov> References: <4A40DF21.1070007@student.uibk.ac.at> <4A40FFAB.8010806@student.uibk.ac.at> <093D54A6-0166-4BFC-A8CC-932BD5E3D6DB@mcs.anl.gov> Message-ID: <4A493850.7040209@student.uibk.ac.at> Hello, Barry Smith schrieb: > To modify the code for elasticity is a big job. There are two faculty > Axel Klawonn and O. Rheinbach > at Universit?t Duisburg-Essen who have implemented a variety of these > fast methods for elasticity. > I suggest you contact them. I wrote an email last week, but I still have no answer. Investigating further my code and the source code related to IS I noticed, that there is a flag pure_neumann which I guess should handle the singular Neumann problems giving the problem, but from my understanding of the code flow there is no situation it is set true. Is this flag a remainder from previous implementations or am I just looking at the wrong place? Furthermore I'm wondering about the size of the coarse problem. From my understanding it should include all interface DOF's? But the size I get is the number of subdomains... Last but not least I wanted to thank for the fast help you provide and to apologize for the questions, which may seem rather stupid but help me to find the right understanding. cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 315 bytes Desc: OpenPGP digital signature URL: From bsmith at mcs.anl.gov Mon Jun 29 17:20:39 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 29 Jun 2009 17:20:39 -0500 Subject: PCNN-preconditioner and "floating domains" In-Reply-To: <4A493850.7040209@student.uibk.ac.at> References: <4A40DF21.1070007@student.uibk.ac.at> <4A40FFAB.8010806@student.uibk.ac.at> <093D54A6-0166-4BFC-A8CC-932BD5E3D6DB@mcs.anl.gov> <4A493850.7040209@student.uibk.ac.at> Message-ID: <2F776A80-ED4F-4EF3-ACBA-2696C424792C@mcs.anl.gov> On Jun 29, 2009, at 4:55 PM, Andreas Grassl wrote: > Hello, > > Barry Smith schrieb: >> To modify the code for elasticity is a big job. There are two >> faculty >> Axel Klawonn and O. Rheinbach >> at Universit?t Duisburg-Essen who have implemented a variety of these >> fast methods for elasticity. >> I suggest you contact them. > > I wrote an email last week, but I still have no answer. 
> Investigating further my code and the source code related to IS I noticed,
> that there is a flag pure_neumann which I guess should handle the singular
> Neumann problems giving the problem, but from my understanding of the code
> flow there is no situation it is set true. Is this flag a remainder from
> previous implementations or am I just looking at the wrong place?

    I have no idea what that flag is for.

>
> Furthermore I'm wondering about the size of the coarse problem. From my
> understanding it should include all interface DOF's? But the size I get is
> the number of subdomains...
>

    It should be the number of subdomains times the dimension of the null
space for the subdomains. For Laplacian that is just the number of
subdomains. For 3d linear elasticity it is 6 times the number of subdomains.

> Last but not least I wanted to thank for the fast help you provide and to
> apologize for the questions, which may seem rather stupid but help me to
> find the right understanding.
>
> cheers,
>
> ando
>
>
> --
> /"\   Grassl Andreas
> \ /   ASCII Ribbon Campaign   Uni Innsbruck Institut f. Mathematik
>  X    against HTML email      Technikerstr. 13 Zi 709
> / \   +43 (0)512 507 6091
>
>

From Andreas.Grassl at student.uibk.ac.at Mon Jun 29 18:08:06 2009
From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl)
Date: Tue, 30 Jun 2009 01:08:06 +0200
Subject: PCNN-preconditioner and "floating domains"
In-Reply-To: <2F776A80-ED4F-4EF3-ACBA-2696C424792C@mcs.anl.gov>
References: <4A40DF21.1070007@student.uibk.ac.at> <4A40FFAB.8010806@student.uibk.ac.at> <093D54A6-0166-4BFC-A8CC-932BD5E3D6DB@mcs.anl.gov> <4A493850.7040209@student.uibk.ac.at> <2F776A80-ED4F-4EF3-ACBA-2696C424792C@mcs.anl.gov>
Message-ID: <4A494956.6070809@student.uibk.ac.at>

Barry Smith schrieb:
> On Jun 29, 2009, at 4:55 PM, Andreas Grassl wrote:
>> Furthermore I'm wondering about the size of the coarse problem. From my
>> understanding it should include all interface DOF's? But the size I get is
>> the number of subdomains...
>>
> It should be the number of subdomains times the dimension of the null
> space for the subdomains.
> For Laplacian that is just the number of subdomains. For 3d linear
> elasticity it is 6 times the number of subdomains.

Ok, slowly I get an idea where I have to change the code (at least I want to
give it a try).

cheers,

ando

--
/"\
\ /   ASCII Ribbon
 X    against HTML email
/ \
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 315 bytes
Desc: OpenPGP digital signature
URL: 

From kuiper at mpia-hd.mpg.de Mon Jun 29 19:07:35 2009
From: kuiper at mpia-hd.mpg.de (Rolf Kuiper)
Date: Tue, 30 Jun 2009 02:07:35 +0200
Subject: MPI-layout of PETSc
Message-ID: 

Hi PETSc users,

I ran into trouble combining the PETSc application I developed with another
code (based on another library called "ArrayLib"). The problem is the
parallel layout for MPI: e.g. in 2D with 6 CPUs, the ArrayLib code assigns
the names/ranks of the local CPUs first in the y-direction, then in x (from
last to first, in the same way the MPI arrays are indexed, like
3Darray[z][y][x]):

y
^
| 2-4-6
| 1-3-5
|--------> x

If I call DACreate() from PETSc, it will assume an ordering according to
names/ranks set first in the x-direction, then in y:

y
^
| 4-5-6
| 1-2-3
|--------> x

Of course, if I now communicate the boundary values, I mix up the domain
(built by the other program).

Is there a possibility / a flag to set the name of the ranks? Due to the
fact that my application is written and working in curvilinear coordinates
and not in Cartesian, I cannot just switch the directions.

Thanks a lot for your help,
Rolf
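A minimal sketch of the communicator-based workaround suggested in the reply
that follows: renumber the ranks so that PETSc's x-major DA layout gives each
process the same subdomain the external code already assigned to it, and
create the DA on that reordered communicator. The helper name, the
process-grid sizes px/py, and the assumption that the external code numbers
ranks column-major starting at the origin are illustrative only, not taken
from the thread, and the DACreate2d() argument list shown is the PETSc
3.0-era one:

#include "petscda.h"

/* Illustrative helper (not from the thread): build a communicator whose
   rank order makes PETSc's x-major DA layout coincide with an external
   column-major (y-first) rank numbering on a px-by-py process grid.    */
PetscErrorCode DACreate2dYMajor(MPI_Comm comm, PetscInt px, PetscInt py,
                                PetscInt Mx, PetscInt My, DA *da)
{
  PetscMPIInt    rank, key;
  PetscInt       ix, iy;
  MPI_Comm       permcomm;
  PetscErrorCode ierr;

  ierr = MPI_Comm_rank(comm, &rank);CHKERRQ(ierr);

  /* Assumed external numbering: rank = iy + ix*py (y runs fastest). */
  ix = rank / py;
  iy = rank % py;

  /* PETSc's DA gives subdomain (ix,iy) to the process with rank ix + iy*px,
     so use that value as the sort key; MPI_Comm_split orders the ranks of
     the new communicator by ascending key.                                */
  key  = (PetscMPIInt)(ix + iy*px);
  ierr = MPI_Comm_split(comm, 0, key, &permcomm);CHKERRQ(ierr);

  /* PETSc 3.0-style DACreate2d(); the argument list changed in later
     releases. dof = 1 and stencil width 1 are placeholders.           */
  ierr = DACreate2d(permcomm, DA_NONPERIODIC, DA_STENCIL_STAR, Mx, My,
                    px, py, 1, 1, PETSC_NULL, PETSC_NULL, da);CHKERRQ(ierr);
  return 0;
}

Under that assumed numbering, with px = 3 and py = 2 the reordering makes the
DA's decomposition assign each process the same subdomain it holds in the
first diagram, so no data needs to be moved between the two codes.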
From bsmith at mcs.anl.gov Mon Jun 29 19:24:21 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 29 Jun 2009 19:24:21 -0500
Subject: MPI-layout of PETSc
In-Reply-To: 
References: 
Message-ID: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov>


On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote:

> Hi PETSc users,
>
> I ran into trouble combining the PETSc application I developed with another
> code (based on another library called "ArrayLib"). The problem is the
> parallel layout for MPI: e.g. in 2D with 6 CPUs, the ArrayLib code assigns
> the names/ranks of the local CPUs first in the y-direction, then in x (from
> last to first, in the same way the MPI arrays are indexed, like
> 3Darray[z][y][x]):
>
> y
> ^
> | 2-4-6
> | 1-3-5
> |--------> x
>
> If I call DACreate() from PETSc, it will assume an ordering according to
> names/ranks set first in the x-direction, then in y:
>
> y
> ^
> | 4-5-6
> | 1-2-3
> |--------> x
>
> Of course, if I now communicate the boundary values, I mix up the domain
> (built by the other program).
>
> Is there a possibility / a flag to set the name of the ranks? Due to the
> fact that my application is written and working in curvilinear coordinates
> and not in Cartesian, I cannot just switch the directions.

    What we recommend in this case is to just change the meaning of x, y,
and z when you use the PETSc DA. This does mean changing your code that
uses the PETSc DA. I do not understand why curvilinear coordinates have
anything to do with it.

    Another choice is to create a new MPI communicator that has the
different ordering of the ranks of the processors and then use that comm to
create the PETSc DA objects; then you would not need to change your code
that calls PETSc.

    Unfortunately PETSc doesn't have any way to flip how the DA handles the
layout automatically.

   Barry

>
> Thanks a lot for your help,
> Rolf

From cmay at phys.ethz.ch Tue Jun 30 06:48:49 2009
From: cmay at phys.ethz.ch (Christian May)
Date: Tue, 30 Jun 2009 13:48:49 +0200 (CEST)
Subject: no parallel speedup with slepc krylovschur
Message-ID: 

Dear readers,

I want to solve the following generalized eigenvalue problem in parallel:

http://www.phys.ethz.ch/~cmay/binaryoutput.tgz
(gnu zipped tar of the binaryoutput you get using -eps_view_binary)

I know the approximate value of the lowest eigenvalues I'm interested in, so
I'm using shift-and-invert with the following options:

-eps_type krylovschur -st_type sinvert -st_shift -0.228 -eps_ncv 32
-eps_nev 16 -eps_tol 1e-10

However, when I use multiple CPUs, I cannot see any speedup. I am not sure
whether this is a problem of the matrix or of my system. Can anybody please
have a look?

Thanks
Christian

From tyoung at ippt.gov.pl Tue Jun 30 08:24:13 2009
From: tyoung at ippt.gov.pl (Toby D. Young)
Date: Tue, 30 Jun 2009 15:24:13 +0200 (CEST)
Subject: no parallel speedup with slepc krylovschur
In-Reply-To: 
References: 
Message-ID: 

> I want to solve the following generalized eigenvalue problem in parallel:
>
> However, when I use multiple CPUs, I cannot see any speedup. I am not sure
> whether this is a problem of the matrix or of my system. Can anybody
> please have a look?

This goes against everything I have done with the KrylovSchur solver!

Can you confirm (to yourself and to me) that your system really is using
multiple processors and not multiple threads?
If you can answer that I may take a look at your code. Show me some output please. How do you invoke your code on the command line? mpicc??? Best, Toby ----- Toby D. Young Philosopher-Physicist Adiunkt (Assistant Professor) Polish Academy of Sciences Warszawa, Polska www: http://www.ippt.gov.pl/~tyoung skype: stenografia From cmay at phys.ethz.ch Tue Jun 30 08:57:38 2009 From: cmay at phys.ethz.ch (Christian May) Date: Tue, 30 Jun 2009 15:57:38 +0200 (CEST) Subject: no parallel speedup with slepc krylovschur In-Reply-To: References: Message-ID: On Tue, 30 Jun 2009, Toby D. Young wrote: > >> I want to solve the following generalized eigenvalue problem in parallel: >> >> However, when I use multiple CPUs, I cannot see any speedup. I am not sure >> whether this is a problem of the matrix or of my system. Can anybody >> please have a look? > > This goes against everything I have done with the KrylovSchur solver! > > Can you confirm (to yourself and to me) that your system really is using > multiple processors and not multiple threads? If you can answer that I may > take a look at your code. Show me some output please. > Sure, here's the output for 2 CPUs: ************** Job was submitted from host by user . Job was executed on host(s) <2*a6400>, in queue , as user . was used as the home directory. was used as the working directory. Started at Tue Jun 30 15:32:25 2009 Results reported at Tue Jun 30 15:49:58 2009 Your job looked like: ------------------------------------------------------------ # LSBATCH: User input ompirun ./SPsolver3D-dune ini-files/fuhrer-dot/fuhrer-dot-test.ini -eps_type krylovschur -st_type sinvert -st_shift -0.228 -eps_ncv 32 -eps_tol 1e-10 -ksp_monitor -ksp_type gmres -pc_type bjacobi -ksp_rtol 1e-14 -st_ksp_rtol 1e-14 -ksp_converged_reason ------------------------------------------------------------ Successfully completed. Resource usage summary: CPU time : 2096.54 sec. Max Memory : 1831 MB Max Swap : 2461 MB Max Processes : 2 Max Threads : 2 The output (if any) follows: ********************* On one CPU, the needed time is slepc time: 262.43 On two CPUs, it is slepc time: 247.88 > How do you invoke your code on the command line? mpicc??? I compile with mpiCC and run with ompirun. Thanks Christian > > Best, > Toby > > ----- > > Toby D. Young > Philosopher-Physicist > Adiunkt (Assistant Professor) > Polish Academy of Sciences > Warszawa, Polska > > www: http://www.ippt.gov.pl/~tyoung > skype: stenografia > --------------------------------------------------------------------------- Christian May ETH Zurich Institute for Theoretical Physics, HIT K 31.5 CH-8093 Zurich, Switzerland Tel: + 41 44 633 79 61 Fax: + 41 44 633 11 15 Email: cmay at itp.phys.ethz.ch From jroman at dsic.upv.es Tue Jun 30 09:36:12 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 30 Jun 2009 16:36:12 +0200 Subject: no parallel speedup with slepc krylovschur In-Reply-To: References: Message-ID: On 30/06/2009, Christian May wrote: > Dear readers, > > I want to solve the following generalized eigenvalue problem in > parallel: > > http://www.phys.ethz.ch/~cmay/binaryoutput.tgz > (gnu zipped tar of the binaryoutput you get using -eps_view_binary) > > I know the approximate value of the lowest eigenvalues I'm > interested in, so I'm using shift-and-invert with the following > options: > > -eps_type krylovschur -st_type sinvert -st_shift -0.228 -eps_ncv 32 - > eps_nev 16 -eps_tol 1e-10 > > > However, when I use multiple CPUs, I cannot see any speedup. 
I am > not sure whether this is a problem of the matrix or of my system. > Can anybody please have a look? > > Thanks > Christian With a shift-and-invert scheme, most of the computing time is spent in the solution of linear systems. In SLEPc 3.0.0 the default is to use KSP=redundant+PC=lu for the linear systems, so it is normal that you get little or no speedup with more than one processor. You should either use a truly parallel direct linear solver or an iterative linear solver. Please contact the SLEPc team directly if you need further assistance. Best regards, Jose From tyoung at ippt.gov.pl Tue Jun 30 09:46:29 2009 From: tyoung at ippt.gov.pl (Toby D. Young) Date: Tue, 30 Jun 2009 16:46:29 +0200 (CEST) Subject: no parallel speedup with slepc krylovschur In-Reply-To: References: Message-ID: > Sure, here's the output for 2 CPUs: > > ************** > Job -eps_type krylovschur -st_type sinvert -st_shift -0.228 -eps_ncv 32 > -eps_tol 1 > e-10 -ksp_monitor -ksp_type gmres -pc_type bjacobi -ksp_rtol 1e-14 > -st_ksp_rtol 1e-14 -ksp_converged_reason> was submitted from host > by user > . > Job was executed on host(s) <2*a6400>, in queue , as user . > was used as the home directory. > was used as the working directory. > Started at Tue Jun 30 15:32:25 2009 > Results reported at Tue Jun 30 15:49:58 2009 Something is not right here..... I think you are using SLEPc incorrectly. I suspect that you are not taking advantage of the MPI process or the solvers in an efficient way. How does your code scale with other solvers? My advice: Talk to the SLEPc guys! :-) Sorry I can be of no more help; but I will look at your code tomorrow and report anything if something. Best, Toby ----- Toby D. Young Philosopher-Physicist Adiunkt (Assistant Professor) Polish Academy of Sciences Warszawa, Polska www: http://www.ippt.gov.pl/~tyoung skype: stenografia From bsmith at mcs.anl.gov Tue Jun 30 11:03:01 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 30 Jun 2009 11:03:01 -0500 Subject: no parallel speedup with slepc krylovschur In-Reply-To: References: Message-ID: Running with -log_summary will tell you WHERE the code is taking the time and what parts are faster and slower on two processors, this will help you track down the problem. Absent this information it is impossible to know why it is not speeding up. Barry On Jun 30, 2009, at 6:48 AM, Christian May wrote: > Dear readers, > > I want to solve the following generalized eigenvalue problem in > parallel: > > http://www.phys.ethz.ch/~cmay/binaryoutput.tgz > (gnu zipped tar of the binaryoutput you get using -eps_view_binary) > > I know the approximate value of the lowest eigenvalues I'm > interested in, so I'm using shift-and-invert with the following > options: > > -eps_type krylovschur -st_type sinvert -st_shift -0.228 -eps_ncv 32 - > eps_nev 16 -eps_tol 1e-10 > > > However, when I use multiple CPUs, I cannot see any speedup. I am > not sure whether this is a problem of the matrix or of my system. > Can anybody please have a look? > > Thanks > Christian > > From keita at cray.com Tue Jun 30 11:50:37 2009 From: keita at cray.com (Keita Teranishi) Date: Tue, 30 Jun 2009 11:50:37 -0500 Subject: MUMPS and PETSc Message-ID: <925346A443D4E340BEB20248BAFCDBDF0BA11E85@CFEVS1-IP.americas.cray.com> Hi, Just a quick question, does PETSc call any MUMPS's I/O functions? I believe it just calls MUMPS's solver functions. Thanks, ================================ Keita Teranishi Scientific Library Group Cray, Inc. 
keita at cray.com ================================ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue Jun 30 11:57:05 2009 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Tue, 30 Jun 2009 11:57:05 -0500 (CDT) Subject: MUMPS and PETSc In-Reply-To: <925346A443D4E340BEB20248BAFCDBDF0BA11E85@CFEVS1-IP.americas.cray.com> References: <925346A443D4E340BEB20248BAFCDBDF0BA11E85@CFEVS1-IP.americas.cray.com> Message-ID: On Tue, 30 Jun 2009, Keita Teranishi wrote: > Hi, > > > > Just a quick question, does PETSc call any MUMPS's I/O functions? I > believe it just calls MUMPS's solver functions. No. Hong > > > > Thanks, > > ================================ > Keita Teranishi > Scientific Library Group > Cray, Inc. > keita at cray.com > ================================ > > > > From s.kramer at imperial.ac.uk Tue Jun 30 12:45:13 2009 From: s.kramer at imperial.ac.uk (Stephan Kramer) Date: Tue, 30 Jun 2009 18:45:13 +0100 Subject: MatGetArrayF90 returns 2d array Message-ID: <4A4A4F29.5070105@imperial.ac.uk> Hello, Why is it that MatGetArrayF90 returns a pointer to a 2d scalar array, instead of 1d as stated in the documentation? In fact it seems to always return a nrows x ncolumns array. Is this to deal with the MATDENSE case? Would it not be more elegant to always return a 1d array, so you get what you expect for the sparse matrices, and return a 1d array of length nrows*ncolumns in the case of MATDENSE? Cheers Stephan Kramer -- Stephan Kramer Applied Modelling and Computation Group, Department of Earth Science and Engineering, Imperial College London From bsmith at mcs.anl.gov Tue Jun 30 13:19:54 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 30 Jun 2009 13:19:54 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: <4A4A4F29.5070105@imperial.ac.uk> References: <4A4A4F29.5070105@imperial.ac.uk> Message-ID: Good point. Barry On Jun 30, 2009, at 12:45 PM, Stephan Kramer wrote: > Hello, > > Why is it that MatGetArrayF90 returns a pointer to a 2d scalar > array, instead of 1d as stated in the documentation? In fact it > seems to always return a nrows x ncolumns array. Is this to deal > with the MATDENSE case? Would it not be more elegant to always > return a 1d array, so you get what you expect for the sparse > matrices, and return a 1d array of length nrows*ncolumns in the case > of MATDENSE? > > Cheers > Stephan Kramer > > -- > Stephan Kramer > Applied Modelling and Computation Group, > Department of Earth Science and Engineering, > Imperial College London From knepley at gmail.com Tue Jun 30 17:48:56 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 30 Jun 2009 17:48:56 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: References: <4A4A4F29.5070105@imperial.ac.uk> Message-ID: On Tue, Jun 30, 2009 at 1:19 PM, Barry Smith wrote: I thought the idea was that MatGetArray() never applies to a sparse matrix. No other sparse format supports this, does it? Matt Good point. > > Barry > > > On Jun 30, 2009, at 12:45 PM, Stephan Kramer wrote: > > Hello, >> >> Why is it that MatGetArrayF90 returns a pointer to a 2d scalar array, >> instead of 1d as stated in the documentation? In fact it seems to always >> return a nrows x ncolumns array. Is this to deal with the MATDENSE case? >> Would it not be more elegant to always return a 1d array, so you get what >> you expect for the sparse matrices, and return a 1d array of length >> nrows*ncolumns in the case of MATDENSE? 
>> >> Cheers >> Stephan Kramer >> >> -- >> Stephan Kramer >> Applied Modelling and Computation Group, >> Department of Earth Science and Engineering, >> Imperial College London >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Tue Jun 30 18:37:25 2009 From: jed at 59A2.org (Jed Brown) Date: Tue, 30 Jun 2009 17:37:25 -0600 Subject: MatGetArrayF90 returns 2d array In-Reply-To: References: <4A4A4F29.5070105@imperial.ac.uk> Message-ID: <4A4AA1B5.9030501@59A2.org> Matthew Knepley wrote: > I thought the idea was that MatGetArray() never applies to a sparse > matrix. No other sparse format supports this, does it? That's not true at all, but the result is implementation-dependent. For example, the array for AIJ is different from the array for BAIJ. For this reason, you shouldn't be calling MatGetArray unless you know the matrix type, but of course the F90 interface should agree with the C interface. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL:
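To make the point in this last thread concrete: with the PETSc 3.0-era C
interface (MatCreateSeqDense(), MatGetArray(), MatRestoreArray()), a
MATSEQDENSE matrix hands back its full m*n storage in column-major order,
which is the (m,n) shape a Fortran caller would see as a 2d array, while for
AIJ or BAIJ the same call exposes that format's internal value storage
instead. A minimal sketch, assuming real scalars; the values and sizes are
made up for illustration:

#include "petscmat.h"

int main(int argc, char **argv)
{
  Mat            A;
  PetscScalar    *a, v;
  PetscInt       m = 3, n = 2, i, j;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);

  /* Sequential dense matrix: PETSc stores it as one column-major array. */
  ierr = MatCreateSeqDense(PETSC_COMM_SELF, m, n, PETSC_NULL, &A);CHKERRQ(ierr);
  for (i = 0; i < m; i++) {
    for (j = 0; j < n; j++) {
      v = (PetscScalar)(10*i + j);
      ierr = MatSetValues(A, 1, &i, 1, &j, &v, INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* For MATSEQDENSE the array has m*n entries and a[i + j*m] is entry (i,j);
     for AIJ/BAIJ the same call returns that format's raw value storage, so
     the meaning of the array depends on the matrix type.                   */
  ierr = MatGetArray(A, &a);CHKERRQ(ierr);
  for (j = 0; j < n; j++) {
    for (i = 0; i < m; i++) {
      ierr = PetscPrintf(PETSC_COMM_SELF, "a[%d] = %g\n",
                         (int)(i + j*m), (double)PetscRealPart(a[i + j*m]));CHKERRQ(ierr);
    }
  }
  ierr = MatRestoreArray(A, &a);CHKERRQ(ierr);

  ierr = MatDestroy(A);CHKERRQ(ierr);   /* PETSc 3.0 took Mat, not Mat* */
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}

As the reply above notes, code like this should check the matrix type first
(e.g. with MatGetType()) before interpreting the array it gets back.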