From li76pan at yahoo.com Mon Apr 3 10:42:33 2006 From: li76pan at yahoo.com (li pan) Date: Mon, 3 Apr 2006 08:42:33 -0700 (PDT) Subject: blaslapack Message-ID: <20060403154233.51893.qmail@web36814.mail.mud.yahoo.com> I downloaded blas lapack from the ftp site suggested by the petsc download homepage ftp://ftp.mcs.anl.gov/pub/petsc/fblaslapack.tar.gz and transferred it to a SuSE laptop, since it has no internet connection. I set --with-blas-lapack-dir to the unzipped directory, but got this error message: ..../fblaslapack can not be used Why? best pan __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From balay at mcs.anl.gov Mon Apr 3 10:53:34 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 3 Apr 2006 10:53:34 -0500 (CDT) Subject: blaslapack In-Reply-To: <20060403154233.51893.qmail@web36814.mail.mud.yahoo.com> References: <20060403154233.51893.qmail@web36814.mail.mud.yahoo.com> Message-ID: The option --with-blas-lapack-dir is useful if you already have blaslapack libraries compiled & installed. If you've manually downloaded fblaslapack.tar.gz - then use the option: --download-f-blas-lapack=/home/petsc/fblaslapack.tar.gz [with the correct path to the fblaslapack.tar.gz file] Satish On Mon, 3 Apr 2006, li pan wrote: > I downloaded blas lapack from the ftp site > suggested by the petsc download homepage > ftp://ftp.mcs.anl.gov/pub/petsc/fblaslapack.tar.gz > and transferred it to a SuSE laptop, since it has no > internet connection. I set > --with-blas-lapack-dir to the unzipped directory, > but got this error message: > ..../fblaslapack can not be used > Why? > > best > > pan > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > > From balay at mcs.anl.gov Mon Apr 3 11:20:30 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 3 Apr 2006 11:20:30 -0500 (CDT) Subject: blaslapack In-Reply-To: <20060403160727.63846.qmail@web36814.mail.mud.yahoo.com> References: <20060403160727.63846.qmail@web36814.mail.mud.yahoo.com> Message-ID: Use the latest 2.3.1 release of PETSc. Satish On Mon, 3 Apr 2006, li pan wrote: > hi > but --download-f-blas-lapack only takes "no, yes .." > boolean value. > > pan > > > > The option --with-blas-lapack-dir is useful if you > already have > blaslapack libraries compiled & installed. If you've > manually > downloaded fblaslapack.tar.gz - then use the option: > > --download-f-blas-lapack=/home/petsc/fblaslapack.tar.gz > > [with the correct path to the fblaslapack.tar.gz file] > > Satish > > On Mon, 3 Apr 2006, li pan wrote: > > > I downloaded blas lapack from the ftp site > > suggested by the petsc download homepage > > ftp://ftp.mcs.anl.gov/pub/petsc/fblaslapack.tar.gz > > and transferred it to a SuSE laptop, since it has no > > internet connection. I set > > --with-blas-lapack-dir to the unzipped directory, > > but got this error message: > > ..../fblaslapack can not be used > > Why? > > > > best > > > > pan > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo!
Mail has the best spam protection around > http://mail.yahoo.com > > From li76pan at yahoo.com Mon Apr 3 11:42:59 2006 From: li76pan at yahoo.com (li pan) Date: Mon, 3 Apr 2006 09:42:59 -0700 (PDT) Subject: how to install Message-ID: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com> Dear all, could anybody tell how to install petsc into a pc without internet connection? best pan __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From knepley at gmail.com Mon Apr 3 11:53:52 2006 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Apr 2006 11:53:52 -0500 Subject: how to install In-Reply-To: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com> References: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com> Message-ID: If you can get the tarball to your machine, you should not need the internet, unless you need another package like MPI. Matt On 4/3/06, li pan wrote: > > Dear all, > could anybody tell how to install petsc into a pc > without internet connection? > > best > > pan > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Apr 3 13:00:44 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 3 Apr 2006 13:00:44 -0500 (CDT) Subject: blaslapack In-Reply-To: <20060403162523.22522.qmail@web36802.mail.mud.yahoo.com> References: <20060403162523.22522.qmail@web36802.mail.mud.yahoo.com> Message-ID: Please send replies to the list.. If you are not using 2.3.1 - then do the following: cd petsc-2.3.0 mkdir externalpackages cd externalpackages tar -xzf ~/fblaslapack.tar.gz cd .. ./config/configure.py --download-f-blas-lapack=1 Satish On Mon, 3 Apr 2006, li pan wrote: > hmmmmmmmm, I'm not sure whether my libmesh version > supports new version of petsc. > > pan > > > > Satish > > On Mon, 3 Apr 2006, li pan wrote: > > > hi > > but --download-f-blas-lapack only takes "no, yes .." > > boolean value. > > > > pan > > > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > > From balay at mcs.anl.gov Mon Apr 3 13:03:03 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 3 Apr 2006 13:03:03 -0500 (CDT) Subject: how to install In-Reply-To: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com> References: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com> Message-ID: already responded to this query in the previous thread. Satish On Mon, 3 Apr 2006, li pan wrote: > Dear all, > could anybody tell how to install petsc into a pc > without internet connection? > > best > > pan > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > > From billy at dem.uminho.pt Tue Apr 4 09:20:31 2006 From: billy at dem.uminho.pt (billy at dem.uminho.pt) Date: Tue, 4 Apr 2006 15:20:31 +0100 Subject: Combine ghost updates Message-ID: <1144160431.443280afadef6@serv-g1.ccom.uminho.pt> Hi, I read in a paper that if you combine all updates in a single vector, you can speed up the comunication. They combined gradient x, y, z values in one vector. Does anyone know how this can be done? 
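I guess in practice this would mean keeping one interlaced, ghosted block vector U instead of three scalar ones - a rough sketch of what I have in mind (block size 3; the sizes and the ghost index array here are just placeholders):

VecCreateGhostBlock(PETSC_COMM_WORLD,3,3*nlocal,PETSC_DECIDE,nghost,ghostblocks,&U);
/* values stored interlaced as (ux_i,uy_i,uz_i) for each node i */
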
For example vectors ux, uy, and uz: VecGhostUpdateBegin(ux,INSERT_VALUES,SCATTER_FORWARD); VecGhostUpdateEnd(ux,INSERT_VALUES,SCATTER_FORWARD); VecGhostUpdateBegin(uy,INSERT_VALUES,SCATTER_FORWARD); VecGhostUpdateEnd(uy,INSERT_VALUES,SCATTER_FORWARD); VecGhostUpdateBegin(uz,INSERT_VALUES,SCATTER_FORWARD); VecGhostUpdateEnd(uz,INSERT_VALUES,SCATTER_FORWARD); Could they be combined in a vector U: VecGhostUpdateBegin(U,INSERT_VALUES,SCATTER_FORWARD); VecGhostUpdateEnd(U,INSERT_VALUES,SCATTER_FORWARD); and would it be faster? Billy. From bsmith at mcs.anl.gov Tue Apr 4 13:50:53 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Apr 2006 13:50:53 -0500 (CDT) Subject: Combine ghost updates In-Reply-To: <1144160431.443280afadef6@serv-g1.ccom.uminho.pt> References: <1144160431.443280afadef6@serv-g1.ccom.uminho.pt> Message-ID: Billy, Since the three vectors are independent there is currently no way to do this. We would need to add additional support to the scatter operations to allow packing several scatters together (actually not a bad idea). This problem does not usually come up because we recommend under most circumstances to "interlace" field variables rather than keep them in seperate vectors. For example, in your case you would have a single U vector that looked like (ux_0,uy_0,uz_0,ux_1,uy_1,uz_1,...) The reason to interlace is for efficiency; in most codes when ux_i is loaded the uy_i and uz_i are also needed "at the same time"; by combining them you can get less cache thrashing and less tlb misses: see for example http://www-fp.mcs.anl.gov/petsc-fun3d/Papers/manke.pdf table 3 where performance is more than doubled by simply interlacing. There are many other papers that discuss this at http://www-fp.mcs.anl.gov/petsc-fun3d/Papers/papers.html It is likely that switching to interlaced variables would give much more of a performance boost then combining the scatters. But you could much through the code in src/vec/vec/utils/vpscat.c you see how it would be possible to manage multiple vectors with some hacking. Barry On Tue, 4 Apr 2006, billy at dem.uminho.pt wrote: > > Hi, > > I read in a paper that if you combine all updates in a single vector, you can > speed up the comunication. They combined gradient x, y, z values in one vector. > Does anyone know how this can be done? > > For example vectors ux, uy, and uz: > > VecGhostUpdateBegin(ux,INSERT_VALUES,SCATTER_FORWARD); > VecGhostUpdateEnd(ux,INSERT_VALUES,SCATTER_FORWARD); > > VecGhostUpdateBegin(uy,INSERT_VALUES,SCATTER_FORWARD); > VecGhostUpdateEnd(uy,INSERT_VALUES,SCATTER_FORWARD); > > VecGhostUpdateBegin(uz,INSERT_VALUES,SCATTER_FORWARD); > VecGhostUpdateEnd(uz,INSERT_VALUES,SCATTER_FORWARD); > > Could they be combined in a vector U: > > VecGhostUpdateBegin(U,INSERT_VALUES,SCATTER_FORWARD); > VecGhostUpdateEnd(U,INSERT_VALUES,SCATTER_FORWARD); > > and would it be faster? > > > Billy. > > From buket at be.itu.edu.tr Sat Apr 8 13:30:44 2006 From: buket at be.itu.edu.tr (buket at be.itu.edu.tr) Date: Sat, 8 Apr 2006 21:30:44 +0300 (EEST) Subject: what is default ordering method Message-ID: <1981.160.75.78.141.1144521044.squirrel@www.be.itu.edu.tr> Hi, I have several questions; Which ordering method does petsc use by default before matrix factorization? How can I change the restarted value of Gmres(m)? Is there any parallel overhead when I run petsc program on single processor (with mprun -np 1)? 
Thank you, Best Regards, Buket Benek From knepley at gmail.com Sat Apr 8 13:42:19 2006 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Apr 2006 13:42:19 -0500 Subject: what is default ordering method In-Reply-To: <1981.160.75.78.141.1144521044.squirrel@www.be.itu.edu.tr> References: <1981.160.75.78.141.1144521044.squirrel@www.be.itu.edu.tr> Message-ID: On 4/8/06, buket at be.itu.edu.tr wrote: > > Hi, > > I have several questions; > > Which ordering method does petsc use by default before matrix > factorization? None, but external packages may use a default ordering. How can I change the restarted value of Gmres(m)? -ksp_gmres_restart Is there any parallel overhead when I run petsc program on single > processor (with mprun -np 1)? Not normally, since all classes are Seq by default. Matt Thank you, > Best Regards, > Buket Benek > > > > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From petsc-maint at mcs.anl.gov Sat Apr 8 14:09:36 2006 From: petsc-maint at mcs.anl.gov (Barry Smith) Date: Sat, 8 Apr 2006 14:09:36 -0500 (CDT) Subject: what is default ordering method In-Reply-To: References: <1981.160.75.78.141.1144521044.squirrel@www.be.itu.edu.tr> Message-ID: On Sat, 8 Apr 2006, Matthew Knepley wrote: > On 4/8/06, buket at be.itu.edu.tr wrote: >> >> Hi, >> >> I have several questions; >> >> Which ordering method does petsc use by default before matrix >> factorization? > > For ILU it is none (i.e. natural) for LU it is nested dissection. For ICC and Cholesky it is natural (note the reason it is natural for Cholesky is because of the storage of only the upper triangular reordering is expensive.) Barry > None, but external packages may use a default ordering. > > How can I change the restarted value of Gmres(m)? > > > -ksp_gmres_restart > > Is there any parallel overhead when I run petsc program on single >> processor (with mprun -np 1)? > > > Not normally, since all classes are Seq by default. > > Matt > > Thank you, >> Best Regards, >> Buket Benek >> >> >> >> >> > > > -- > "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec > Guiness > From abdul-rahman at tu-harburg.de Tue Apr 11 11:33:18 2006 From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de) Date: Tue, 11 Apr 2006 18:33:18 +0200 (METDST) Subject: Q about matsingle Message-ID: Dear all, if I were to compute in single precision complex, should I configure --with-precision=single or matsingle, or both? Thank you, Regards, Razi From balay at mcs.anl.gov Tue Apr 11 12:17:46 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 11 Apr 2006 12:17:46 -0500 (CDT) Subject: Q about matsingle In-Reply-To: References: Message-ID: That would be --with-precision=single. However PETSc currently doesn't compile in this mode. Satish On Tue, 11 Apr 2006, abdul-rahman at tu-harburg.de wrote: > Dear all, > > if I were to compute in single precision complex, should I configure > > --with-precision=single or matsingle, or both? > > Thank you, > > Regards, > > Razi > > From abdul-rahman at tu-harburg.de Tue Apr 11 12:26:22 2006 From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de) Date: Tue, 11 Apr 2006 19:26:22 +0200 (METDST) Subject: Q about matsingle In-Reply-To: References: Message-ID: On Tue, 11 Apr 2006, Satish Balay wrote: > That would be --with-precision=single. However PETSc currently doesn't > compile in this mode. > > Satish Thanks for telling me. 
saves me from compiling. Out of curiousity, what is matsingle for? Is it also not usable? Razi > On Tue, 11 Apr 2006, abdul-rahman at tu-harburg.de wrote: > > > Dear all, > > > > if I were to compute in single precision complex, should I configure > > > > --with-precision=single or matsingle, or both? > > > > Thank you, > > > > Regards, > > > > Razi > > > > > > From balay at mcs.anl.gov Tue Apr 11 12:45:59 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 11 Apr 2006 12:45:59 -0500 (CDT) Subject: Q about matsingle In-Reply-To: References: Message-ID: On Tue, 11 Apr 2006, abdul-rahman at tu-harburg.de wrote: > > > On Tue, 11 Apr 2006, Satish Balay wrote: > > > That would be --with-precision=single. However PETSc currently doesn't > > compile in this mode. > > > > Satish > > Thanks for telling me. saves me from compiling. Out of curiousity, what is > matsingle for? Is it also not usable? Even matsingle is not tested for complex mode. [Both --with-precision=single,matsingle should work for the default --with-scalar-type=real mode] matsingle mode stores the matrix in single precision - but all operations are done in double precision. This is a performance optimization mode [the performance gain is from the lower memory bandwidth requirements of this single precision matrix storage]. Satish From letian.wang at ghiocel-tech.com Wed Apr 12 17:31:46 2006 From: letian.wang at ghiocel-tech.com (Letian Wang) Date: Wed, 12 Apr 2006 18:31:46 -0400 Subject: get the preconditioner matrix Message-ID: <000001c65e80$e1e51770$0b00a8c0@lele> Dear all, Is it possible to obtain the PETSc pre-conditioner matrix and output it to a file? How can I do that? Thanks. Letian Wang -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Apr 12 19:55:34 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 12 Apr 2006 19:55:34 -0500 (CDT) Subject: get the preconditioner matrix In-Reply-To: <000001c65e80$e1e51770$0b00a8c0@lele> References: <000001c65e80$e1e51770$0b00a8c0@lele> Message-ID: Letian, What do you mean be "pre-conditioner matrix"? It is very rare that a preconditioner is explicitly represented as a matrix; it is almost always just some code that applies the operator. In general an explicitly represented preconditioner would actually be a dense matrix. If you truly want this dense matrix you can call PCComputeExplicitOperator() and store the resulting matrix to a file. Barry On Wed, 12 Apr 2006, Letian Wang wrote: > Dear all, > > > > Is it possible to obtain the PETSc pre-conditioner matrix and output it to a > file? How can I do that? Thanks. > > > > Letian Wang > > > > From abdul-rahman at tu-harburg.de Thu Apr 13 04:57:32 2006 From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de) Date: Thu, 13 Apr 2006 11:57:32 +0200 (METDST) Subject: petsc-2.3.1 on FC4 with Intel Compilers Message-ID: Dear all, I'd like to know if anyone has successfully built petsc 2.3.1-p12 on Fedora Core 4 with the Intel compilers package (icc and ifort 9.0 Build 20051201Z ). I used to have problems in building the complex type but haven't really looked into it. Real-valued is perfect though. Many thanks, Razi From bsmith at mcs.anl.gov Thu Apr 13 11:50:37 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 13 Apr 2006 11:50:37 -0500 (CDT) Subject: get the preconditioner matrix In-Reply-To: <000601c65f05$154b7620$0b00a8c0@lele> References: <000601c65f05$154b7620$0b00a8c0@lele> Message-ID: Again, this does not make sense. 
Prometheus as a dense matrix will 1) require much much to much memory and 2) take much to long to apply O(n^2) flops. The whole idea behind multilevel methods is to be roughly order O(n) to apply. If Mark has provided a PCView() and PCLoad() for Prometheus then it could be saved and reused (but it would not be saved as a dense matrix), BUT I don't think Mark has done this (plus it would very likely require rerunning on the same number of processors). You just need to calculate the preconditioner for each time you run the program. Barry On Thu, 13 Apr 2006, Letian Wang wrote: > Barry: > > Thank you for your reply. What I want to do is to use PETSc for > optimization. I use Prometheus pre-conditioner to solve the initial problem. > Usually it spends much time on getting the pre-conditioner, then the > iterations are relatively going faster. I'm thinking to save the > pre-conditioner matrix for the initial problem (Now I know I can do that by > PCComputeExplicitOperator), then for other very similar problems, I can pass > the saved pre-conditioner and apply them directly to the new problem. Do you > think it will work? > > Is there any routine to apply the saved matrix as pre-conditioner? Or I have > to program an user-defined PC routine? > > Thanks. > > Letian > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] > On Behalf Of Barry Smith > Sent: Wednesday, April 12, 2006 7:56 PM > To: petsc-users at mcs.anl.gov > Subject: Re: get the preconditioner matrix > > > Letian, > > What do you mean be "pre-conditioner matrix"? It is very rare > that a preconditioner is explicitly represented as a matrix; it is almost > always just some code that applies the operator. In general an explicitly > represented preconditioner would actually be a dense matrix. > > If you truly want this dense matrix you can call > PCComputeExplicitOperator() > and store the resulting matrix to a file. > > Barry > > > On Wed, 12 Apr 2006, Letian Wang wrote: > >> Dear all, >> >> >> >> Is it possible to obtain the PETSc pre-conditioner matrix and output it to > a >> file? How can I do that? Thanks. >> >> >> >> Letian Wang >> >> >> >> > > > From balay at mcs.anl.gov Thu Apr 13 14:06:33 2006 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 13 Apr 2006 14:06:33 -0500 (CDT) Subject: petsc-2.3.1 on FC4 with Intel Compilers In-Reply-To: References: Message-ID: What problems? Send us the logs at petsc-maint at mcs.anl.gov - and we can take a look at them. Satish On Thu, 13 Apr 2006, abdul-rahman at tu-harburg.de wrote: > Dear all, > > I'd like to know if anyone has successfully built petsc 2.3.1-p12 on > Fedora Core 4 with the Intel compilers package (icc and ifort 9.0 Build > 20051201Z ). I used to have problems in building the complex type > but haven't really looked into it. Real-valued is perfect though. > > Many thanks, > > > > Razi > > From adams at pppl.gov Thu Apr 13 19:40:42 2006 From: adams at pppl.gov (Mark Adams) Date: Thu, 13 Apr 2006 20:40:42 -0400 Subject: get the preconditioner matrix In-Reply-To: References: <000601c65f05$154b7620$0b00a8c0@lele> Message-ID: Letian, I think what you are asking for is the capacity to save the state of the PC after setup which I have not implemented. As Barry said, saving the explicit operator would not be practical and I don't know of a linear solver that provides for saving the state in the way that I think you are asking (but it would be more natural and easy for a direct solver to save factors). 
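Within a single run the usual PETSc idiom for this is to keep the same KSP and tell it that the preconditioner has not changed when you move on to the next, similar problem. A minimal sketch (ksp, A, b, x stand for whatever your code already has):

/* first problem: the expensive Prometheus setup happens in this solve */
KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);
KSPSolve(ksp,b,x);

/* later, similar problem: A holds new values, reuse the old setup */
KSPSetOperators(ksp,A,A,SAME_PRECONDITIONER);
KSPSolve(ksp,b,x);
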
But you could keep using the same PC for different problems - at your own risk of course - in one run of your code. Mark On Apr 13, 2006, at 12:50 PM, Barry Smith wrote: > > Again, this does not make sense. Prometheus as a dense matrix > will 1) require much much to much memory and 2) take much to > long to apply O(n^2) flops. The whole idea behind multilevel > methods is to be roughly order O(n) to apply. > > If Mark has provided a PCView() and PCLoad() for Prometheus > then it could be saved and reused (but it would not be saved > as a dense matrix), BUT I don't think Mark has done this (plus > it would very likely require rerunning on the same number of > processors). > > You just need to calculate the preconditioner for each time > you run the program. > > Barry > > On Thu, 13 Apr 2006, Letian Wang wrote: > >> Barry: >> >> Thank you for your reply. What I want to do is to use PETSc for >> optimization. I use Prometheus pre-conditioner to solve the >> initial problem. >> Usually it spends much time on getting the pre-conditioner, then the >> iterations are relatively going faster. I'm thinking to save the >> pre-conditioner matrix for the initial problem (Now I know I can >> do that by >> PCComputeExplicitOperator), then for other very similar problems, >> I can pass >> the saved pre-conditioner and apply them directly to the new >> problem. Do you >> think it will work? >> >> Is there any routine to apply the saved matrix as pre-conditioner? >> Or I have >> to program an user-defined PC routine? >> >> Thanks. >> >> Letian >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc- >> users at mcs.anl.gov] >> On Behalf Of Barry Smith >> Sent: Wednesday, April 12, 2006 7:56 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: get the preconditioner matrix >> >> >> Letian, >> >> What do you mean be "pre-conditioner matrix"? It is very rare >> that a preconditioner is explicitly represented as a matrix; it is >> almost >> always just some code that applies the operator. In general an >> explicitly >> represented preconditioner would actually be a dense matrix. >> >> If you truly want this dense matrix you can call >> PCComputeExplicitOperator() >> and store the resulting matrix to a file. >> >> Barry >> >> >> On Wed, 12 Apr 2006, Letian Wang wrote: >> >>> Dear all, >>> >>> >>> >>> Is it possible to obtain the PETSc pre-conditioner matrix and >>> output it to >> a >>> file? How can I do that? Thanks. >>> >>> >>> >>> Letian Wang >>> >>> >>> >>> >> >> >> ********************************************************************** Mark Adams Ph.D. Columbia University 289 Engineering Terrace MC 4701 New York NY 10027 adams at pppl.gov www.columbia.edu/~ma2325 voice: 212.854.4485 fax: 212.854.8257 ********************************************************************** From abdul-rahman at tu-harburg.de Tue Apr 18 03:26:14 2006 From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de) Date: Tue, 18 Apr 2006 10:26:14 +0200 (METDST) Subject: petsc-2.3.1 on FC4 with Intel Compilers In-Reply-To: References: Message-ID: On Thu, 13 Apr 2006, Satish Balay wrote: > What problems? > > Send us the logs at petsc-maint at mcs.anl.gov - and we can take > a look at them. Thanks Satish. I sent them already. I think the compilation problem has to do with gcc 4.0.2 c++ headers. I don't get it why PETSc insists on gcc headers and not icc 9.0, although I did point to the intel compilers in my config. 
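For reference, the kind of configure line I mean is roughly this (illustrative only, not my exact command):

./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=ifort --with-scalar-type=complex
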
Razi From knepley at gmail.com Tue Apr 18 08:46:22 2006 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Apr 2006 08:46:22 -0500 Subject: petsc-2.3.1 on FC4 with Intel Compilers In-Reply-To: References: Message-ID: I can't find any logs. However, we do not require any specific headers, and use intel compilers on a lot of our machines. This sounds like a compiler installation problem. Matt On 4/18/06, abdul-rahman at tu-harburg.de wrote: > > > > On Thu, 13 Apr 2006, Satish Balay wrote: > > > What problems? > > > > Send us the logs at petsc-maint at mcs.anl.gov - and we can take > > a look at them. > > Thanks Satish. I sent them already. I think the compilation problem has to > do with gcc 4.0.2 c++ headers. I don't get it why PETSc insists on gcc > headers and not icc 9.0, although I did point to the intel compilers in my > config. > > > Razi > > > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From shma7099 at student.uu.se Tue Apr 18 09:09:29 2006 From: shma7099 at student.uu.se (Sh.M) Date: Tue, 18 Apr 2006 16:09:29 +0200 (MEST) Subject: My settings are overriden when I call the solver more than once? Message-ID: Hi all, I am solving a matrix with boomerAMG, and I set the boomerAMG settings thru the console. The matrix(it is a preconditioner matrix) is solved several times... I set the boomerAMG preconditioner/solver to have a convergence tolerance of 1.0e-06 and max number of iterations 4. I print out boomerAMG status to verify that it works as it should.. But for some reason it doesnt.. Here is a print from boomerAMG when it starts, I guess when it constructs the boomerAMG solver/preconditioner: . . . . . . BoomerAMG SOLVER PARAMETERS: Maximum number of cycles: 4 Stopping Tolerance: 1.000000e-06 Cycle type (1 = V, 2 = W, etc.): 1 Relaxation Parameters: Visiting Grid: fine down up coarse Number of partial sweeps: 1 1 1 1 Type 0=Jac, 1=GS, 3=Hybrid 9=GE: 3 3 3 9 Point types, partial sweeps (1=C, -1=F): Finest grid: 1 -1 Pre-CG relaxation (down): 1 -1 Post-CG relaxation (up): -1 1 Coarsest grid: 0 Immediately after the above message, wich is when it starts to solve, it changes its parameters to this: BoomerAMG SOLVER PARAMETERS: Maximum number of cycles: 10000 Stopping Tolerance: 1.000000e-05 Cycle type (1 = V, 2 = W, etc.): 1 Relaxation Parameters: Visiting Grid: fine down up coarse Number of partial sweeps: 1 1 1 1 Type 0=Jac, 1=GS, 3=Hybrid 9=GE: 3 3 3 9 Point types, partial sweeps (1=C, -1=F): Finest grid: 1 -1 Pre-CG relaxation (down): 1 -1 Post-CG relaxation (up): -1 1 Coarsest grid: 0 . . . . . . As you see my settings have been changed. I have used boomerAMG in the past, not exactly this way.,... but my settings were preserved before and after solve. here is what I run from the command line: mprun -np 1 petscSolver -a ../hypre/data/nr_hvl40.csr -b ../hypre/data/nr_hvl40.csr.blockMatrix -inner_ksp_type richardson -inner_pc_type hypre -inner_pc_hypre_type boomeramg -inner_pc_hypre_boomeramg_max_iter 4 -inner_pc_hypre_boomeramg_tol 1.0e-06 -ksp_monitor -ksp_type gmres -inner_pc_hypre_boomeramg_print_statistics And a piece of code: . . . . . . BLOCK_PC user_pc; , , , , , . KSPSetOptionsPrefix(user_pc.block_solver,"inner_"); KSPSetOperators(user_pc.block_solver,user_pc.M,user_pc.M,DIFFERENT_NONZERO_PATTERN); KSPSetFromOptions(user_pc.block_solver); I am running on solaris64. 
With best regards, Shaman Mahmoudi From bsmith at mcs.anl.gov Tue Apr 18 12:16:47 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Apr 2006 12:16:47 -0500 (CDT) Subject: My settings are overriden when I call the solver more than once? In-Reply-To: References: Message-ID: What controls the iteration max and tolerance for Richardson solver is -inner_ksp_max_its and -inner_ksp_rtol This is true for any preconditioner including hypre boomeramg. The defaults are 10000 and 1.e-5. The problem is that the options -inner_pc_hypre_boomeramg_max_iter 4 -inner_pc_hypre_boomeramg_tol 1.0e-06 sure look like THEY should control this. Hence the confusion. Here is the issue, pc_hypre_boomeramg_max_iter and pc_hypre_boomeramg_tol are suppose to control (with PETSc) how many iterations boomeramg does WITHIN a single call to ITS solve. Say we used a ksp of gmres and a pc_hypre_boomeramg_max_iter 4 this means we are using preconditioned gmres with a preconditioner of FOUR V cycles of boomeramg (this is different then using preconditioned gmres with a preconditioner of ONE cycle). Completely possible and maybe a reasonable thing to do. The error in PETSc is that paragraph three above does NOT hold for Richardsons method (but it does hold for all other KSP methods). This is because Richardson's method has a "special method" PCApplyRichardson() that avoids the overhead of explicitly applying Richardson's method (see src/ksp/ksp/impls/rich/rich.c). I did not handle this properly when I wrote the PCApplyRichardson_BoomerAMG. I will fix this error. Anyways, the short answer is when using Richardson with hypre use -inner_ksp_max_its and -inner_ksp_rtol to control the iterations and tolerance, NOT -inner_pc_hypre_boomeramg_max_iter and -inner_pc_hypre_boomeramg_tol I will attempt to make the docs clearer. Barry On Tue, 18 Apr 2006, Sh.M wrote: > Hi all, > > I am solving a matrix with boomerAMG, and I set the boomerAMG settings > thru the console. The matrix(it is a preconditioner matrix) is solved > several times... I set the boomerAMG preconditioner/solver to have a > convergence tolerance of 1.0e-06 and max number of iterations 4. I print out boomerAMG status to > verify that it works as it should.. But for some reason it doesnt.. > > Here is a print from boomerAMG when it starts, I guess when it constructs > the boomerAMG solver/preconditioner: > > . > . > . > . > . > . > BoomerAMG SOLVER PARAMETERS: > > Maximum number of cycles: 4 > Stopping Tolerance: 1.000000e-06 > Cycle type (1 = V, 2 = W, etc.): 1 > > Relaxation Parameters: > Visiting Grid: fine down up coarse > Number of partial sweeps: 1 1 1 1 > Type 0=Jac, 1=GS, 3=Hybrid 9=GE: 3 3 3 9 > Point types, partial sweeps (1=C, -1=F): > Finest grid: 1 -1 > Pre-CG relaxation (down): 1 -1 > Post-CG relaxation (up): -1 1 > Coarsest grid: 0 > > > > Immediately after the above message, wich is when it starts to > solve, it changes its parameters to this: > > > BoomerAMG SOLVER PARAMETERS: > > Maximum number of cycles: 10000 > Stopping Tolerance: 1.000000e-05 > Cycle type (1 = V, 2 = W, etc.): 1 > > Relaxation Parameters: > Visiting Grid: fine down up coarse > Number of partial sweeps: 1 1 1 1 > Type 0=Jac, 1=GS, 3=Hybrid 9=GE: 3 3 3 9 > Point types, partial sweeps (1=C, -1=F): > Finest grid: 1 -1 > Pre-CG relaxation (down): 1 -1 > Post-CG relaxation (up): -1 1 > Coarsest grid: 0 > . > . > . > . > . > . > > > As you see my settings have been changed. > > I have used boomerAMG in the past, not exactly this way.,... 
but my > settings were preserved before and after solve. > > here is what I run from the command line: > > mprun -np 1 petscSolver > -a ../hypre/data/nr_hvl40.csr > -b ../hypre/data/nr_hvl40.csr.blockMatrix -inner_ksp_type richardson > -inner_pc_type hypre -inner_pc_hypre_type boomeramg > -inner_pc_hypre_boomeramg_max_iter 4 > -inner_pc_hypre_boomeramg_tol 1.0e-06 > -ksp_monitor > -ksp_type gmres > -inner_pc_hypre_boomeramg_print_statistics > > > And a piece of code: > > . > . > . > . > . > . > BLOCK_PC user_pc; > , > , > , > , > , > . > KSPSetOptionsPrefix(user_pc.block_solver,"inner_"); > KSPSetOperators(user_pc.block_solver,user_pc.M,user_pc.M,DIFFERENT_NONZERO_PATTERN); > KSPSetFromOptions(user_pc.block_solver); > > > I am running on solaris64. > > With best regards, Shaman Mahmoudi > > From shma7099 at student.uu.se Thu Apr 20 05:21:43 2006 From: shma7099 at student.uu.se (Sh.M) Date: Thu, 20 Apr 2006 12:21:43 +0200 (MEST) Subject: My settings are overriden when I call the solver more than once? In-Reply-To: References: Message-ID: Hi, Thanks for the thorough explanation! With best regards, Shaman Mahmoudi On Tue, 18 Apr 2006, Barry Smith wrote: > > What controls the iteration max and tolerance for Richardson > solver is -inner_ksp_max_its and -inner_ksp_rtol > This is true for any preconditioner including hypre boomeramg. > The defaults are 10000 and 1.e-5. > > The problem is that the options > -inner_pc_hypre_boomeramg_max_iter 4 > -inner_pc_hypre_boomeramg_tol 1.0e-06 > sure look like THEY should control this. Hence the confusion. > > Here is the issue, pc_hypre_boomeramg_max_iter and pc_hypre_boomeramg_tol are > suppose to control (with PETSc) how many iterations boomeramg does WITHIN a > single call to ITS solve. Say we used a ksp of gmres and a pc_hypre_boomeramg_max_iter 4 > this means we are using preconditioned gmres with a preconditioner of FOUR V cycles > of boomeramg (this is different then using preconditioned gmres with a preconditioner of > ONE cycle). Completely possible and maybe a reasonable thing to do. > > The error in PETSc is that paragraph three above does NOT hold for Richardsons method > (but it does hold for all other KSP methods). This is because Richardson's method > has a "special method" PCApplyRichardson() that avoids the overhead of explicitly > applying Richardson's method (see src/ksp/ksp/impls/rich/rich.c). I did not handle > this properly when I wrote the PCApplyRichardson_BoomerAMG. I will fix this error. > > Anyways, the short answer is when using Richardson with hypre > use -inner_ksp_max_its and -inner_ksp_rtol to control > the iterations and tolerance, NOT > -inner_pc_hypre_boomeramg_max_iter > and -inner_pc_hypre_boomeramg_tol > I will attempt to make the docs clearer. > > > Barry > > > > > > On Tue, 18 Apr 2006, Sh.M wrote: > > > Hi all, > > > > I am solving a matrix with boomerAMG, and I set the boomerAMG settings > > thru the console. The matrix(it is a preconditioner matrix) is solved > > several times... I set the boomerAMG preconditioner/solver to have a > > convergence tolerance of 1.0e-06 and max number of iterations 4. I print out boomerAMG status to > > verify that it works as it should.. But for some reason it doesnt.. > > > > Here is a print from boomerAMG when it starts, I guess when it constructs > > the boomerAMG solver/preconditioner: > > > > . > > . > > . > > . > > . > > . 
> > BoomerAMG SOLVER PARAMETERS: > > > > Maximum number of cycles: 4 > > Stopping Tolerance: 1.000000e-06 > > Cycle type (1 = V, 2 = W, etc.): 1 > > > > Relaxation Parameters: > > Visiting Grid: fine down up coarse > > Number of partial sweeps: 1 1 1 1 > > Type 0=Jac, 1=GS, 3=Hybrid 9=GE: 3 3 3 9 > > Point types, partial sweeps (1=C, -1=F): > > Finest grid: 1 -1 > > Pre-CG relaxation (down): 1 -1 > > Post-CG relaxation (up): -1 1 > > Coarsest grid: 0 > > > > > > > > Immediately after the above message, wich is when it starts to > > solve, it changes its parameters to this: > > > > > > BoomerAMG SOLVER PARAMETERS: > > > > Maximum number of cycles: 10000 > > Stopping Tolerance: 1.000000e-05 > > Cycle type (1 = V, 2 = W, etc.): 1 > > > > Relaxation Parameters: > > Visiting Grid: fine down up coarse > > Number of partial sweeps: 1 1 1 1 > > Type 0=Jac, 1=GS, 3=Hybrid 9=GE: 3 3 3 9 > > Point types, partial sweeps (1=C, -1=F): > > Finest grid: 1 -1 > > Pre-CG relaxation (down): 1 -1 > > Post-CG relaxation (up): -1 1 > > Coarsest grid: 0 > > . > > . > > . > > . > > . > > . > > > > > > As you see my settings have been changed. > > > > I have used boomerAMG in the past, not exactly this way.,... but my > > settings were preserved before and after solve. > > > > here is what I run from the command line: > > > > mprun -np 1 petscSolver > > -a ../hypre/data/nr_hvl40.csr > > -b ../hypre/data/nr_hvl40.csr.blockMatrix -inner_ksp_type richardson > > -inner_pc_type hypre -inner_pc_hypre_type boomeramg > > -inner_pc_hypre_boomeramg_max_iter 4 > > -inner_pc_hypre_boomeramg_tol 1.0e-06 > > -ksp_monitor > > -ksp_type gmres > > -inner_pc_hypre_boomeramg_print_statistics > > > > > > And a piece of code: > > > > . > > . > > . > > . > > . > > . > > BLOCK_PC user_pc; > > , > > , > > , > > , > > , > > . > > KSPSetOptionsPrefix(user_pc.block_solver,"inner_"); > > KSPSetOperators(user_pc.block_solver,user_pc.M,user_pc.M,DIFFERENT_NONZERO_PATTERN); > > KSPSetFromOptions(user_pc.block_solver); > > > > > > I am running on solaris64. > > > > With best regards, Shaman Mahmoudi > > > > > > From letian.wang at ghiocel-tech.com Mon Apr 24 11:56:36 2006 From: letian.wang at ghiocel-tech.com (Letian Wang) Date: Mon, 24 Apr 2006 12:56:36 -0400 Subject: How to loop Petsc in Fortran? Message-ID: <000201c667c0$0c716e60$0b00a8c0@lele> Dear All: Question 1): For an optimization task, I need to loop Petsc (I'm using Petsc-2.3.0). But I had problems to reinitialize Petsc after finalize, here is a simple FORTRAN program to explain my problem: program petsc_test # include "include/finclude/petsc.h" call PetscInitialize(PETSC_NULL_CHARACTER, ierr) call PetscFinalize(ierr) print*,'ierr=',ierr call PetscInitialize(PETSC_NULL_CHARACTER, ierr) call PetscFinalize(ierr) end When the program excuted to the second PetscInitial line, it shows error message: "Error encountered before initializing MPICH". Can anyone help me on this? Thanks. Question 2): Follow up my previous question, I also tried to Initiallize and Finalize Petsc only once and perform the do-loop between Petscinitialize and PetscFinalize. I used KSP CR solver with Prometheus PCs to solver large linear equations. After several loops, the program was interrupted by segmentation violation error. I suppose there was a memory leak somewhere. The error message is like this: Any suggestion for this? Thanks *********Doing job -- nosort0001 Task No. 1 Total CPU= 52.3 --------------------------------------------------- *********Doing job -- nosort0002 Task No. 
2 Total CPU= 52.1 --------------------------------------------------- *********Doing job -- nosort0003 -------------------------------------------------------------------------- Petsc Release Version 2.3.0, Patch 44, April, 26, 2005 See docs/changes/index.html for recent updates. See docs/faq.html for hints about trouble shooting. See docs/index.html for manual pages. ----------------------------------------------------------------------- ../feap on a linux-gnu named GPTnode3.cl.ghiocel-tech.com by ltwang Mon Apr 24 15:25:04 2006 Libraries linked from /home/ltwang/Library/petsc-2.3.0/lib/linux-gnu Configure run at Tue Mar 14 11:19:49 2006 Configure options --with-mpi-dir=/usr --with-debugging=0 --download-spooles=1 --download-f-blas-lapack=1 --download-parmetis=1 --download-prometheus=1 --with-shared=0 ----------------------------------------------------------------------- [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [1]PETSC ERROR: to get more information on the crash. [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file [1]PETSC ERROR: Signal received! [1]PETSC ERROR: ! [cli_1]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 [cli_0]: aborting job: Fatal error in MPI_Allgather: Other MPI error, error stack: MPI_Allgather(949)........................: MPI_Allgather(sbuf=0xbffeea14, scount=1, MPI_INT, rbuf=0x8bf0a0c, rcount=1, MPI_INT, comm=0x84000000) failed MPIR_Allgather(180).......................: MPIC_Sendrecv(161)........................: MPIC_Wait(321)............................: MPIDI_CH3_Progress_wait(199)..............: an error occurred while handling an event returned by MPIDU_Sock_Wait() MPIDI_CH3I_Progress_handle_sock_event(422): MPIDU_Socki_handle_read(649)..............: connection failure (set=0,sock=2,errno=104:(strerror() not found)) rank 1 in job 477 GPTMaster_53830 caused collective abort of all ranks exit status of rank 1: return code 59 Letian -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Apr 24 12:01:36 2006 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Apr 2006 12:01:36 -0500 Subject: How to loop Petsc in Fortran? In-Reply-To: <000201c667c0$0c716e60$0b00a8c0@lele> References: <000201c667c0$0c716e60$0b00a8c0@lele> Message-ID: On 4/24/06, Letian Wang wrote: > > Dear All: > > > > Question 1): > > > > For an optimization task, I need to loop Petsc (I'm using Petsc-2.3.0). > But I had problems to reinitialize Petsc after finalize, here is a simple > FORTRAN program to explain my problem: > It is not possible to call MPI_Init() after an MPI_Finalize(). Therefore you should only call PetscInitialize/Finalize() once. > Question 2): > > > > Follow up my previous question, I also tried to Initiallize and Finalize > Petsc only once and perform the do-loop between Petscinitialize and > PetscFinalize. I used KSP CR solver with Prometheus PCs to solver large > linear equations. After several loops, the program was interrupted by > segmentation violation error. I suppose there was a memory leak somewhere. > The error message is like this: Any suggestion for this? Thanks > This is a memory corruption problem. Use the debugger (-start_in_debugger) to get a stack trace so at least we know where the SEGV is occurring. 
Then we can try to fix it. Thanks, Matt > *********Doing job -- nosort0001 > > > > Task No. 1 Total CPU= 52.3 > > --------------------------------------------------- > > > > *********Doing job -- nosort0002 > > > > Task No. 2 Total CPU= 52.1 > > --------------------------------------------------- > > > > *********Doing job -- nosort0003 > > -------------------------------------------------------------------------- > > Petsc Release Version 2.3.0, Patch 44, April, 26, 2005 > > See docs/changes/index.html for recent updates. > > See docs/faq.html for hints about trouble shooting. > > See docs/index.html for manual pages. > > ----------------------------------------------------------------------- > > ../feap on a linux-gnu named GPTnode3.cl.ghiocel-tech.com by ltwang Mon > Apr 24 15:25:04 2006 > > Libraries linked from /home/ltwang/Library/petsc-2.3.0/lib/linux-gnu > > Configure run at Tue Mar 14 11:19:49 2006 > > Configure options --with-mpi-dir=/usr --with-debugging=0 > --download-spooles=1 --download-f-blas-lapack=1 --download-parmetis=1 > --download-prometheus=1 --with-shared=0 > > ----------------------------------------------------------------------- > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [1]PETSC ERROR: to get more information on the crash. > > [1]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > [1]PETSC ERROR: Signal received! > > [1]PETSC ERROR: ! > > [cli_1]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 > > [cli_0]: aborting job: > > Fatal error in MPI_Allgather: Other MPI error, error stack: > > MPI_Allgather(949)........................: MPI_Allgather(sbuf=0xbffeea14, > scount=1, MPI_INT, rbuf=0x8bf0a0c, rcount=1, MPI_INT, comm=0x84000000) > failed > > MPIR_Allgather(180).......................: > > MPIC_Sendrecv(161)........................: > > MPIC_Wait(321)............................: > > MPIDI_CH3_Progress_wait(199)..............: an error occurred while > handling an event returned by MPIDU_Sock_Wait() > > MPIDI_CH3I_Progress_handle_sock_event(422): > > MPIDU_Socki_handle_read(649)..............: connection failure > (set=0,sock=2,errno=104:(strerror() not found)) > > rank 1 in job 477 GPTMaster_53830 caused collective abort of all ranks > > exit status of rank 1: return code 59 > > > > > > > > Letian > -- "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec Guiness -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Apr 24 13:02:16 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 24 Apr 2006 13:02:16 -0500 (CDT) Subject: How to loop Petsc in Fortran? In-Reply-To: References: <000201c667c0$0c716e60$0b00a8c0@lele> Message-ID: Question 2) You should also use http://valgrind.org/ to determine where the memory corruption is taking place. Barry On Mon, 24 Apr 2006, Matthew Knepley wrote: > On 4/24/06, Letian Wang wrote: >> >> Dear All: >> >> >> >> Question 1): >> >> >> >> For an optimization task, I need to loop Petsc (I'm using Petsc-2.3.0). >> But I had problems to reinitialize Petsc after finalize, here is a simple >> FORTRAN program to explain my problem: >> > It is not possible to call MPI_Init() after an MPI_Finalize(). 
Therefore you > should only call PetscInitialize/Finalize() once. > > >> Question 2): >> >> >> >> Follow up my previous question, I also tried to Initiallize and Finalize >> Petsc only once and perform the do-loop between Petscinitialize and >> PetscFinalize. I used KSP CR solver with Prometheus PCs to solver large >> linear equations. After several loops, the program was interrupted by >> segmentation violation error. I suppose there was a memory leak somewhere. >> The error message is like this: Any suggestion for this? Thanks >> > > This is a memory corruption problem. Use the debugger (-start_in_debugger) > to get a stack trace so at > least we know where the SEGV is occurring. Then we can try to fix it. > > Thanks, > > Matt > >> *********Doing job -- nosort0001 >> >> >> >> Task No. 1 Total CPU= 52.3 >> >> --------------------------------------------------- >> >> >> >> *********Doing job -- nosort0002 >> >> >> >> Task No. 2 Total CPU= 52.1 >> >> --------------------------------------------------- >> >> >> >> *********Doing job -- nosort0003 >> >> -------------------------------------------------------------------------- >> >> Petsc Release Version 2.3.0, Patch 44, April, 26, 2005 >> >> See docs/changes/index.html for recent updates. >> >> See docs/faq.html for hints about trouble shooting. >> >> See docs/index.html for manual pages. >> >> ----------------------------------------------------------------------- >> >> ../feap on a linux-gnu named GPTnode3.cl.ghiocel-tech.com by ltwang Mon >> Apr 24 15:25:04 2006 >> >> Libraries linked from /home/ltwang/Library/petsc-2.3.0/lib/linux-gnu >> >> Configure run at Tue Mar 14 11:19:49 2006 >> >> Configure options --with-mpi-dir=/usr --with-debugging=0 >> --download-spooles=1 --download-f-blas-lapack=1 --download-parmetis=1 >> --download-prometheus=1 --with-shared=0 >> >> ----------------------------------------------------------------------- >> >> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and >> run >> >> [1]PETSC ERROR: to get more information on the crash. >> >> [1]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> >> [1]PETSC ERROR: Signal received! >> >> [1]PETSC ERROR: ! >> >> [cli_1]: aborting job: >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 >> >> [cli_0]: aborting job: >> >> Fatal error in MPI_Allgather: Other MPI error, error stack: >> >> MPI_Allgather(949)........................: MPI_Allgather(sbuf=0xbffeea14, >> scount=1, MPI_INT, rbuf=0x8bf0a0c, rcount=1, MPI_INT, comm=0x84000000) >> failed >> >> MPIR_Allgather(180).......................: >> >> MPIC_Sendrecv(161)........................: >> >> MPIC_Wait(321)............................: >> >> MPIDI_CH3_Progress_wait(199)..............: an error occurred while >> handling an event returned by MPIDU_Sock_Wait() >> >> MPIDI_CH3I_Progress_handle_sock_event(422): >> >> MPIDU_Socki_handle_read(649)..............: connection failure >> (set=0,sock=2,errno=104:(strerror() not found)) >> >> rank 1 in job 477 GPTMaster_53830 caused collective abort of all ranks >> >> exit status of rank 1: return code 59 >> >> >> >> >> >> >> >> Letian >> > -- > "Failure has a thousand explanations. 
Success doesn't need one" -- Sir Alec > Guiness > From randy at geosystem.us Wed Apr 26 19:42:10 2006 From: randy at geosystem.us (Randall Mackie) Date: Wed, 26 Apr 2006 17:42:10 -0700 Subject: question on DA's and performance Message-ID: <44501362.6080600@geosystem.us> I've been using Petsc for a few years with reasonably good success. My application is 3D EM forward modeling and inversion. What has been working well is basically an adaptation of what I did in serial mode, by solving the following system of equations: |Mxx Mxy Mxz| |Hx| |bx| |Myx Myy Myz| |Hy| = |by| |Mzx Mzy Mzz| |Hz| |bz| Because this system is very stiff and ill-conditioned, the preconditioner that has been successfully used is the ILU(k) of the diagonal sub-blocks of the coefficient matrix only, not the ILU(k) of the entire matrix. I've tried, with less success, converting to distributed arrays, because in reality my program requires alternating between solving the system above, and solving another system based on the divergences of the magnetic field, and so this requires a lot of message passing between the nodes. I think that using DA's would require passing only the ghost values instead of all the values as I am now doing. So I coded up DA's, and it works, but only okay, and not as well as the case above. The main problem is that it seems to work best for ILU(0), whereas the case above works better and better with the more fill-in specified. I think I've tried every possible option, but I can't get any improvement over just using ILU(0), but the problem is that with ILU(0), it takes too many iterations, and so the total time is more than when I use my original method with, say, ILU(8). The differences between using DA's and the approach above is that the fields are interlaced in DA's, and the boundary values are included as unknowns (with coefficient matrix values set to 1.0), whereas my original way the boundary values are incorporated in the right-hand side. Maybe that hurts the condition of my system. I would really like to use DA's to improve on the efficiency of the code, but I can't figure out how to do that. It was suggested to me last year to use PCFIELDSPLIT, but there are no examples, and I'm not a c programmer so it's hard for me to look at the source code and know what to do. (I'm only just now able to get back to this). Does anyone do any prescaling of their systems? If so, does anyone have examples of how this can be done? Any advice is greatly appreciated. Thanks, Randy -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From bsmith at mcs.anl.gov Wed Apr 26 20:31:55 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Apr 2006 20:31:55 -0500 (CDT) Subject: question on DA's and performance In-Reply-To: <44501362.6080600@geosystem.us> References: <44501362.6080600@geosystem.us> Message-ID: Randy, 1) I'd first change the scaling of the > with coefficient matrix values set to 1.0), to match the other diagonal entries in the matrix. For example, if the diagonal entries are usually 10,000 then multiply the "boundary condition" rows by 10,000. (Note the diagonal entries may be proportional to h, or 1/h etc so make sure you get the same scaling in your boundary condition rows. For example, if the diagonal entries double when you refine the grid make sure your "boundary condition" equations scale the same way.) 2) Are you using AIJ or BAIJ matrices? 
BAIJ is likely much better in your "new" code because it uses point block ILU(k) instead of point ILU(k). 3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do in the previous version?). 4) If you use LU on the blocks does it work better than ILU(0) in terms of iterations? A little, a lot? If just a little I suggest trying -pc_type asm Let us know how things go, you should be able to get similar convergence using the DAs. Barry On Wed, 26 Apr 2006, Randall Mackie wrote: > I've been using Petsc for a few years with reasonably good success. My > application is > 3D EM forward modeling and inversion. > > What has been working well is basically an adaptation of what I did in serial > mode, > by solving the following system of equations: > > |Mxx Mxy Mxz| |Hx| |bx| > |Myx Myy Myz| |Hy| = |by| > |Mzx Mzy Mzz| |Hz| |bz| > > Because this system is very stiff and ill-conditioned, the preconditioner > that has > been successfully used is the ILU(k) of the diagonal sub-blocks of the > coefficient > matrix only, not the ILU(k) of the entire matrix. > > > I've tried, with less success, converting to distributed arrays, because in > reality > my program requires alternating between solving the system above, and solving > another system based on the divergences of the magnetic field, and so this > requires > a lot of message passing between the nodes. I think that using DA's would > require > passing only the ghost values instead of all the values as I am now doing. > > So I coded up DA's, and it works, but only okay, and not as well as the case > above. > The main problem is that it seems to work best for ILU(0), whereas the case > above > works better and better with the more fill-in specified. I think I've tried > every > possible option, but I can't get any improvement over just using ILU(0), but > the problem > is that with ILU(0), it takes too many iterations, and so the total time is > more > than when I use my original method with, say, ILU(8). > > The differences between using DA's and the approach above is that the fields > are > interlaced in DA's, and the boundary values are included as unknowns (with > coefficient > matrix values set to 1.0), whereas my original way the boundary values are > incorporated in the right-hand side. Maybe that hurts the condition of my > system. > > I would really like to use DA's to improve on the efficiency of the code, but > I > can't figure out how to do that. > > It was suggested to me last year to use PCFIELDSPLIT, but there are no > examples, > and I'm not a c programmer so it's hard for me to look at the source code and > know what to do. (I'm only just now able to get back to this). > > Does anyone do any prescaling of their systems? If so, does anyone have > examples > of how this can be done? > > Any advice is greatly appreciated. > > Thanks, Randy > > From randy at geosystem.us Thu Apr 27 00:23:37 2006 From: randy at geosystem.us (Randall Mackie) Date: Wed, 26 Apr 2006 22:23:37 -0700 Subject: question on DA's and performance In-Reply-To: References: <44501362.6080600@geosystem.us> Message-ID: <44505559.9010405@geosystem.us> Barry Smith wrote: > > Randy, > > 1) I'd first change the scaling of the >> with coefficient matrix values set to 1.0), > to match the other diagonal entries in the matrix. For example, if the > diagonal entries are usually 10,000 then multiply the "boundary condition" > rows by 10,000. 
(Note the diagonal entries may be proportional to h, or > 1/h etc > so make sure you get the same scaling in your boundary condition rows. > For example, > if the diagonal entries double when you refine the grid make sure your > "boundary condition" equations scale the same way.) I suspected this might be a problem. The diagonal entries have a wide dynamic range, all the way up to 1e10. I will try this and let you know if it helps. > > 2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better > in your "new" code because it uses point block ILU(k) instead of point > ILU(k). I am using parallel AIJ matrices. Are there any examples that show how to set up the BAIJ matrices? (in fortran?) > > 3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do > in the previous version?). In the DA code, I am using: -em_ksp_type bcgs \ -em_sub_pc_type ilu \ -divh_ksp_type cr \ -divh_sub_pc_type ilu \ In my previous code, I was using: -em_ksp_type bcgs \ -em_sub_pc_type ilu \ -em_sub_pc_factor_levels 8 \ -em_sub_pc_factor_fill 4 \ -divh_ksp_type cr \ -divh_sub_pc_type icc \ > > 4) If you use LU on the blocks does it work better than ILU(0) in > terms of iterations? A little, a lot? If just a little I suggest trying > -pc_type asm I have tried asm, and it helps a little, but not enough to be competitive with my previous code. > > Let us know how things go, you should be able to get similar convergence > using the DAs. I will try the scaling and the BAIJ matrix suggestions. Thanks, Randy > > Barry > > > > > On Wed, 26 Apr 2006, Randall Mackie wrote: > >> I've been using Petsc for a few years with reasonably good success. My >> application is >> 3D EM forward modeling and inversion. >> >> What has been working well is basically an adaptation of what I did in >> serial mode, >> by solving the following system of equations: >> >> |Mxx Mxy Mxz| |Hx| |bx| >> |Myx Myy Myz| |Hy| = |by| >> |Mzx Mzy Mzz| |Hz| |bz| >> >> Because this system is very stiff and ill-conditioned, the >> preconditioner that has >> been successfully used is the ILU(k) of the diagonal sub-blocks of the >> coefficient >> matrix only, not the ILU(k) of the entire matrix. >> >> >> I've tried, with less success, converting to distributed arrays, >> because in reality >> my program requires alternating between solving the system above, and >> solving >> another system based on the divergences of the magnetic field, and so >> this requires >> a lot of message passing between the nodes. I think that using DA's >> would require >> passing only the ghost values instead of all the values as I am now >> doing. >> >> So I coded up DA's, and it works, but only okay, and not as well as >> the case above. >> The main problem is that it seems to work best for ILU(0), whereas the >> case above >> works better and better with the more fill-in specified. I think I've >> tried every >> possible option, but I can't get any improvement over just using >> ILU(0), but the problem >> is that with ILU(0), it takes too many iterations, and so the total >> time is more >> than when I use my original method with, say, ILU(8). >> >> The differences between using DA's and the approach above is that the >> fields are >> interlaced in DA's, and the boundary values are included as unknowns >> (with coefficient >> matrix values set to 1.0), whereas my original way the boundary values >> are >> incorporated in the right-hand side. Maybe that hurts the condition of >> my system. 
>> >> I would really like to use DA's to improve on the efficiency of the >> code, but I >> can't figure out how to do that. >> >> It was suggested to me last year to use PCFIELDSPLIT, but there are no >> examples, >> and I'm not a c programmer so it's hard for me to look at the source >> code and >> know what to do. (I'm only just now able to get back to this). >> >> Does anyone do any prescaling of their systems? If so, does anyone >> have examples >> of how this can be done? >> >> Any advice is greatly appreciated. >> >> Thanks, Randy >> >> > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From randy at geosystem.us Thu Apr 27 11:56:50 2006 From: randy at geosystem.us (Randall Mackie) Date: Thu, 27 Apr 2006 09:56:50 -0700 Subject: question on DA's and performance In-Reply-To: References: <44501362.6080600@geosystem.us> Message-ID: <4450F7D2.4080304@geosystem.us> Barry, Can you give some advice, or do you have any examples, on how to set the block size in MatCreateMPIBAIJ? Randy Barry Smith wrote: > > Randy, > > 1) I'd first change the scaling of the >> with coefficient matrix values set to 1.0), > to match the other diagonal entries in the matrix. For example, if the > diagonal entries are usually 10,000 then multiply the "boundary condition" > rows by 10,000. (Note the diagonal entries may be proportional to h, or > 1/h etc > so make sure you get the same scaling in your boundary condition rows. > For example, > if the diagonal entries double when you refine the grid make sure your > "boundary condition" equations scale the same way.) > > 2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better > in your "new" code because it uses point block ILU(k) instead of point > ILU(k). > > 3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do > in the previous version?). > > 4) If you use LU on the blocks does it work better than ILU(0) in > terms of iterations? A little, a lot? If just a little I suggest trying > -pc_type asm > > Let us know how things go, you should be able to get similar convergence > using the DAs. > > Barry > > > > > On Wed, 26 Apr 2006, Randall Mackie wrote: > >> I've been using Petsc for a few years with reasonably good success. My >> application is >> 3D EM forward modeling and inversion. >> >> What has been working well is basically an adaptation of what I did in >> serial mode, >> by solving the following system of equations: >> >> |Mxx Mxy Mxz| |Hx| |bx| >> |Myx Myy Myz| |Hy| = |by| >> |Mzx Mzy Mzz| |Hz| |bz| >> >> Because this system is very stiff and ill-conditioned, the >> preconditioner that has >> been successfully used is the ILU(k) of the diagonal sub-blocks of the >> coefficient >> matrix only, not the ILU(k) of the entire matrix. >> >> >> I've tried, with less success, converting to distributed arrays, >> because in reality >> my program requires alternating between solving the system above, and >> solving >> another system based on the divergences of the magnetic field, and so >> this requires >> a lot of message passing between the nodes. I think that using DA's >> would require >> passing only the ghost values instead of all the values as I am now >> doing. >> >> So I coded up DA's, and it works, but only okay, and not as well as >> the case above. >> The main problem is that it seems to work best for ILU(0), whereas the >> case above >> works better and better with the more fill-in specified. 
I think I've >> tried every >> possible option, but I can't get any improvement over just using >> ILU(0), but the problem >> is that with ILU(0), it takes too many iterations, and so the total >> time is more >> than when I use my original method with, say, ILU(8). >> >> The differences between using DA's and the approach above is that the >> fields are >> interlaced in DA's, and the boundary values are included as unknowns >> (with coefficient >> matrix values set to 1.0), whereas my original way the boundary values >> are >> incorporated in the right-hand side. Maybe that hurts the condition of >> my system. >> >> I would really like to use DA's to improve on the efficiency of the >> code, but I >> can't figure out how to do that. >> >> It was suggested to me last year to use PCFIELDSPLIT, but there are no >> examples, >> and I'm not a c programmer so it's hard for me to look at the source >> code and >> know what to do. (I'm only just now able to get back to this). >> >> Does anyone do any prescaling of their systems? If so, does anyone >> have examples >> of how this can be done? >> >> Any advice is greatly appreciated. >> >> Thanks, Randy >> >> > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From bsmith at mcs.anl.gov Thu Apr 27 15:07:02 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Apr 2006 15:07:02 -0500 (CDT) Subject: question on DA's and performance In-Reply-To: <4450F7D2.4080304@geosystem.us> References: <44501362.6080600@geosystem.us> <4450F7D2.4080304@geosystem.us> Message-ID: In your case it is 2, since you have 2 degree's of freedom per cell. But note: Aren't you using DAGetMatrix() to get your matrix instead of MatCreate....() directory? You would pass MATBAIJ into DAGetMatrix() to get the BAIJ matrix. Barry On Thu, 27 Apr 2006, Randall Mackie wrote: > Barry, > > Can you give some advice, or do you have any examples, on how to set > the block size in MatCreateMPIBAIJ? > > Randy > > > Barry Smith wrote: >> >> Randy, >> >> 1) I'd first change the scaling of the >>> with coefficient matrix values set to 1.0), >> to match the other diagonal entries in the matrix. For example, if the >> diagonal entries are usually 10,000 then multiply the "boundary condition" >> rows by 10,000. (Note the diagonal entries may be proportional to h, or 1/h >> etc >> so make sure you get the same scaling in your boundary condition rows. For >> example, >> if the diagonal entries double when you refine the grid make sure your >> "boundary condition" equations scale the same way.) >> >> 2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better >> in your "new" code because it uses point block ILU(k) instead of point >> ILU(k). >> >> 3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do in >> the previous version?). >> >> 4) If you use LU on the blocks does it work better than ILU(0) in terms >> of iterations? A little, a lot? If just a little I suggest trying -pc_type >> asm >> >> Let us know how things go, you should be able to get similar convergence >> using the DAs. >> >> Barry >> >> >> >> >> On Wed, 26 Apr 2006, Randall Mackie wrote: >> >>> I've been using Petsc for a few years with reasonably good success. My >>> application is >>> 3D EM forward modeling and inversion. 
>>> >>> What has been working well is basically an adaptation of what I did in >>> serial mode, >>> by solving the following system of equations: >>> >>> |Mxx Mxy Mxz| |Hx| |bx| >>> |Myx Myy Myz| |Hy| = |by| >>> |Mzx Mzy Mzz| |Hz| |bz| >>> >>> Because this system is very stiff and ill-conditioned, the preconditioner >>> that has >>> been successfully used is the ILU(k) of the diagonal sub-blocks of the >>> coefficient >>> matrix only, not the ILU(k) of the entire matrix. >>> >>> >>> I've tried, with less success, converting to distributed arrays, because >>> in reality >>> my program requires alternating between solving the system above, and >>> solving >>> another system based on the divergences of the magnetic field, and so this >>> requires >>> a lot of message passing between the nodes. I think that using DA's would >>> require >>> passing only the ghost values instead of all the values as I am now doing. >>> >>> So I coded up DA's, and it works, but only okay, and not as well as the >>> case above. >>> The main problem is that it seems to work best for ILU(0), whereas the >>> case above >>> works better and better with the more fill-in specified. I think I've >>> tried every >>> possible option, but I can't get any improvement over just using ILU(0), >>> but the problem >>> is that with ILU(0), it takes too many iterations, and so the total time >>> is more >>> than when I use my original method with, say, ILU(8). >>> >>> The differences between using DA's and the approach above is that the >>> fields are >>> interlaced in DA's, and the boundary values are included as unknowns (with >>> coefficient >>> matrix values set to 1.0), whereas my original way the boundary values are >>> incorporated in the right-hand side. Maybe that hurts the condition of my >>> system. >>> >>> I would really like to use DA's to improve on the efficiency of the code, >>> but I >>> can't figure out how to do that. >>> >>> It was suggested to me last year to use PCFIELDSPLIT, but there are no >>> examples, >>> and I'm not a c programmer so it's hard for me to look at the source code >>> and >>> know what to do. (I'm only just now able to get back to this). >>> >>> Does anyone do any prescaling of their systems? If so, does anyone have >>> examples >>> of how this can be done? >>> >>> Any advice is greatly appreciated. >>> >>> Thanks, Randy >>> >>> >> > > From randy at geosystem.us Thu Apr 27 15:20:22 2006 From: randy at geosystem.us (Randall Mackie) Date: Thu, 27 Apr 2006 13:20:22 -0700 Subject: question on DA's and performance In-Reply-To: References: <44501362.6080600@geosystem.us> <4450F7D2.4080304@geosystem.us> Message-ID: <44512786.1070404@geosystem.us> Actually, I have 3 degrees of freedom (Hx, Hy, Hz) per cell. I'm not using DAGetMatrix because as of last year, that returns the full coupling and a lot of structure that I don't use, and I couldn't figure out how to use DASetBlockFills() to get the right amount of coupling I needed, so I just coded it up use MatCreateMPIAIJ, and it works. I suggested last year that the actual coupling could be based on the values actually entered, and you said you'd put that on the todo list. 
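For anyone following along, here is a rough idea of what the block setup being
discussed looks like in code. This is a minimal C sketch, not Randy's actual
code (the Fortran interface takes the same arguments): the grid sizes and the
stencil choice are made-up placeholders, and the routine names are the
PETSc 2.3-era ones (DACreate3d, DAGetMatrix, MatCreateMPIBAIJ), so check them
against your installed version.

#include "petscda.h"
#include "petscmat.h"

/* Hedged sketch: build a matrix with 3x3 point blocks (Hx,Hy,Hz interlaced)
   on a DA-managed structured grid.  nx,ny,nz are placeholder grid sizes and
   error checking is omitted for brevity. */
int main(int argc, char **argv)
{
  DA       da;
  Mat      A;
  PetscInt nx = 64, ny = 64, nz = 32;          /* placeholders */

  PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);

  /* dof = 3 and stencil width 1; a box stencil so cross terms are allowed */
  DACreate3d(PETSC_COMM_WORLD, DA_NONPERIODIC, DA_STENCIL_BOX,
             nx, ny, nz, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
             3, 1, PETSC_NULL, PETSC_NULL, PETSC_NULL, &da);

  /* The block size comes from the DA's dof, so the whole trick is asking
     the DA for a BAIJ matrix instead of the default AIJ. */
  DAGetMatrix(da, MATBAIJ, &A);

  /* If the matrix is created by hand instead, the block size (3 here) is the
     second argument of MatCreateMPIBAIJ, row/column counts are in blocks,
     and values then go in one 3x3 block at a time with
     MatSetValuesBlocked(A, 1, &row, 1, &col, blk, INSERT_VALUES). */

  MatDestroy(A);
  DADestroy(da);
  PetscFinalize();
  return 0;
}

With the matrix stored as BAIJ, -sub_pc_type ilu factors with 3x3 blocks as
its unit, which is the point-block ILU(k) Barry refers to above.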
For example, in my problem, I'm solving the curl curl equations, so the coupling is like >>> >>> Hx(i,j,k) is coupled to Hx(i,j,k-1), Hx(i,j,k+1), Hx(i,j+1,k), >>> Hx(i,j-1,k), Hy(i+1,j,k), Hy(i,j,k), Hy(i+1,j-1,k), Hy(i,j-1,k), >>> Hz(i,j,k-1), Hz(i+1,j,k-1), Hz(i,j,k), Hz(i+1,j,k) >>> >>> >>> Similarly for Hy(i,j,k) and Hz(i,j,k) Randy Barry Smith wrote: > > In your case it is 2, since you have 2 degree's of freedom per cell. > But note: Aren't you using DAGetMatrix() to get your matrix instead of > MatCreate....() directory? You would pass MATBAIJ into DAGetMatrix() to > get the BAIJ matrix. > > Barry > > > On Thu, 27 Apr 2006, Randall Mackie wrote: > >> Barry, >> >> Can you give some advice, or do you have any examples, on how to set >> the block size in MatCreateMPIBAIJ? >> >> Randy >> >> >> Barry Smith wrote: >>> >>> Randy, >>> >>> 1) I'd first change the scaling of the >>>> with coefficient matrix values set to 1.0), >>> to match the other diagonal entries in the matrix. For example, if the >>> diagonal entries are usually 10,000 then multiply the "boundary >>> condition" >>> rows by 10,000. (Note the diagonal entries may be proportional to h, >>> or 1/h etc >>> so make sure you get the same scaling in your boundary condition >>> rows. For example, >>> if the diagonal entries double when you refine the grid make sure your >>> "boundary condition" equations scale the same way.) >>> >>> 2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better >>> in your "new" code because it uses point block ILU(k) instead of >>> point ILU(k). >>> >>> 3) Are you using -pc_type bjacobi or -pc_type asm (and what did you >>> do in the previous version?). >>> >>> 4) If you use LU on the blocks does it work better than ILU(0) in >>> terms of iterations? A little, a lot? If just a little I suggest >>> trying -pc_type asm >>> >>> Let us know how things go, you should be able to get similar >>> convergence >>> using the DAs. >>> >>> Barry >>> >>> >>> >>> >>> On Wed, 26 Apr 2006, Randall Mackie wrote: >>> >>>> I've been using Petsc for a few years with reasonably good success. >>>> My application is >>>> 3D EM forward modeling and inversion. >>>> >>>> What has been working well is basically an adaptation of what I did >>>> in serial mode, >>>> by solving the following system of equations: >>>> >>>> |Mxx Mxy Mxz| |Hx| |bx| >>>> |Myx Myy Myz| |Hy| = |by| >>>> |Mzx Mzy Mzz| |Hz| |bz| >>>> >>>> Because this system is very stiff and ill-conditioned, the >>>> preconditioner that has >>>> been successfully used is the ILU(k) of the diagonal sub-blocks of >>>> the coefficient >>>> matrix only, not the ILU(k) of the entire matrix. >>>> >>>> >>>> I've tried, with less success, converting to distributed arrays, >>>> because in reality >>>> my program requires alternating between solving the system above, >>>> and solving >>>> another system based on the divergences of the magnetic field, and >>>> so this requires >>>> a lot of message passing between the nodes. I think that using DA's >>>> would require >>>> passing only the ghost values instead of all the values as I am now >>>> doing. >>>> >>>> So I coded up DA's, and it works, but only okay, and not as well as >>>> the case above. >>>> The main problem is that it seems to work best for ILU(0), whereas >>>> the case above >>>> works better and better with the more fill-in specified. 
I think >>>> I've tried every >>>> possible option, but I can't get any improvement over just using >>>> ILU(0), but the problem >>>> is that with ILU(0), it takes too many iterations, and so the total >>>> time is more >>>> than when I use my original method with, say, ILU(8). >>>> >>>> The differences between using DA's and the approach above is that >>>> the fields are >>>> interlaced in DA's, and the boundary values are included as unknowns >>>> (with coefficient >>>> matrix values set to 1.0), whereas my original way the boundary >>>> values are >>>> incorporated in the right-hand side. Maybe that hurts the condition >>>> of my system. >>>> >>>> I would really like to use DA's to improve on the efficiency of the >>>> code, but I >>>> can't figure out how to do that. >>>> >>>> It was suggested to me last year to use PCFIELDSPLIT, but there are >>>> no examples, >>>> and I'm not a c programmer so it's hard for me to look at the source >>>> code and >>>> know what to do. (I'm only just now able to get back to this). >>>> >>>> Does anyone do any prescaling of their systems? If so, does anyone >>>> have examples >>>> of how this can be done? >>>> >>>> Any advice is greatly appreciated. >>>> >>>> Thanks, Randy >>>> >>>> >>> >> >> > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From bsmith at mcs.anl.gov Thu Apr 27 15:35:00 2006 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Apr 2006 15:35:00 -0500 (CDT) Subject: question on DA's and performance In-Reply-To: <44512786.1070404@geosystem.us> References: <44501362.6080600@geosystem.us> <4450F7D2.4080304@geosystem.us> <44512786.1070404@geosystem.us> Message-ID: Randy, If the codes are in a form that I could build and run them could you just send me the two codes (petsc-maint at mcs.anl.gov) so I can play around with the convergence of both? Also send sample input data if there is some. Barry On Thu, 27 Apr 2006, Randall Mackie wrote: > Actually, I have 3 degrees of freedom (Hx, Hy, Hz) per cell. I'm not using > DAGetMatrix because as of last year, that returns the full coupling > and a lot of structure that I don't use, and I couldn't figure out > how to use DASetBlockFills() to get the right amount of coupling I needed, > so I just coded it up use MatCreateMPIAIJ, and it works. > > I suggested last year that the actual coupling could be based on the > values actually entered, and you said you'd put that on the todo list. > > For example, in my problem, I'm solving the curl curl equations, so > the coupling is like > >>>> >>>> Hx(i,j,k) is coupled to Hx(i,j,k-1), Hx(i,j,k+1), Hx(i,j+1,k), >>>> Hx(i,j-1,k), Hy(i+1,j,k), Hy(i,j,k), Hy(i+1,j-1,k), > Hy(i,j-1,k), >>>> Hz(i,j,k-1), Hz(i+1,j,k-1), Hz(i,j,k), Hz(i+1,j,k) >>>> >>>> >>>> Similarly for Hy(i,j,k) and Hz(i,j,k) > > > Randy > > > Barry Smith wrote: >> >> In your case it is 2, since you have 2 degree's of freedom per cell. >> But note: Aren't you using DAGetMatrix() to get your matrix instead of >> MatCreate....() directory? You would pass MATBAIJ into DAGetMatrix() to >> get the BAIJ matrix. >> >> Barry >> >> >> On Thu, 27 Apr 2006, Randall Mackie wrote: >> >>> Barry, >>> >>> Can you give some advice, or do you have any examples, on how to set >>> the block size in MatCreateMPIBAIJ? 
>>> >>> Randy >>> >>> >>> Barry Smith wrote: >>>> >>>> Randy, >>>> >>>> 1) I'd first change the scaling of the >>>>> with coefficient matrix values set to 1.0), >>>> to match the other diagonal entries in the matrix. For example, if the >>>> diagonal entries are usually 10,000 then multiply the "boundary >>>> condition" >>>> rows by 10,000. (Note the diagonal entries may be proportional to h, or >>>> 1/h etc >>>> so make sure you get the same scaling in your boundary condition rows. >>>> For example, >>>> if the diagonal entries double when you refine the grid make sure your >>>> "boundary condition" equations scale the same way.) >>>> >>>> 2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better >>>> in your "new" code because it uses point block ILU(k) instead of point >>>> ILU(k). >>>> >>>> 3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do >>>> in the previous version?). >>>> >>>> 4) If you use LU on the blocks does it work better than ILU(0) in terms >>>> of iterations? A little, a lot? If just a little I suggest trying >>>> -pc_type asm >>>> >>>> Let us know how things go, you should be able to get similar >>>> convergence >>>> using the DAs. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> On Wed, 26 Apr 2006, Randall Mackie wrote: >>>> >>>>> I've been using Petsc for a few years with reasonably good success. My >>>>> application is >>>>> 3D EM forward modeling and inversion. >>>>> >>>>> What has been working well is basically an adaptation of what I did in >>>>> serial mode, >>>>> by solving the following system of equations: >>>>> >>>>> |Mxx Mxy Mxz| |Hx| |bx| >>>>> |Myx Myy Myz| |Hy| = |by| >>>>> |Mzx Mzy Mzz| |Hz| |bz| >>>>> >>>>> Because this system is very stiff and ill-conditioned, the >>>>> preconditioner that has >>>>> been successfully used is the ILU(k) of the diagonal sub-blocks of the >>>>> coefficient >>>>> matrix only, not the ILU(k) of the entire matrix. >>>>> >>>>> >>>>> I've tried, with less success, converting to distributed arrays, because >>>>> in reality >>>>> my program requires alternating between solving the system above, and >>>>> solving >>>>> another system based on the divergences of the magnetic field, and so >>>>> this requires >>>>> a lot of message passing between the nodes. I think that using DA's >>>>> would require >>>>> passing only the ghost values instead of all the values as I am now >>>>> doing. >>>>> >>>>> So I coded up DA's, and it works, but only okay, and not as well as the >>>>> case above. >>>>> The main problem is that it seems to work best for ILU(0), whereas the >>>>> case above >>>>> works better and better with the more fill-in specified. I think I've >>>>> tried every >>>>> possible option, but I can't get any improvement over just using ILU(0), >>>>> but the problem >>>>> is that with ILU(0), it takes too many iterations, and so the total time >>>>> is more >>>>> than when I use my original method with, say, ILU(8). >>>>> >>>>> The differences between using DA's and the approach above is that the >>>>> fields are >>>>> interlaced in DA's, and the boundary values are included as unknowns >>>>> (with coefficient >>>>> matrix values set to 1.0), whereas my original way the boundary values >>>>> are >>>>> incorporated in the right-hand side. Maybe that hurts the condition of >>>>> my system. >>>>> >>>>> I would really like to use DA's to improve on the efficiency of the >>>>> code, but I >>>>> can't figure out how to do that. 
>>>>> >>>>> It was suggested to me last year to use PCFIELDSPLIT, but there are no >>>>> examples, >>>>> and I'm not a c programmer so it's hard for me to look at the source >>>>> code and >>>>> know what to do. (I'm only just now able to get back to this). >>>>> >>>>> Does anyone do any prescaling of their systems? If so, does anyone have >>>>> examples >>>>> of how this can be done? >>>>> >>>>> Any advice is greatly appreciated. >>>>> >>>>> Thanks, Randy >>>>> >>>>> >>>> >>> >>> >> > >
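The thread ends here without a code illustration of Barry's first suggestion
(scaling the boundary-condition rows), so below is a small hedged C sketch.
The helper name scale_boundary_rows, the row list, and the boundary values are
all made up for illustration; the only point is that each boundary row and its
matching right-hand-side entry are multiplied by the same factor, chosen to
track the size of the interior diagonal entries.

#include "petscmat.h"

/* Hedged sketch (names and values are illustrative): give each boundary row
   a diagonal entry comparable to the interior diagonal instead of 1.0, and
   scale the corresponding right-hand-side entry by the same factor so the
   boundary condition itself is unchanged.  Error checking omitted. */
void scale_boundary_rows(Mat A, Vec b, PetscInt nbnd,
                         const PetscInt bnd_rows[],
                         const PetscScalar bnd_vals[],
                         PetscScalar scale)
{
  PetscInt i;

  for (i = 0; i < nbnd; i++) {
    /* diagonal goes from 1.0 to 'scale' */
    MatSetValue(A, bnd_rows[i], bnd_rows[i], scale, INSERT_VALUES);
    /* rhs entry is the prescribed boundary value times the same factor */
    VecSetValue(b, bnd_rows[i], scale * bnd_vals[i], INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
  VecAssemblyBegin(b);
  VecAssemblyEnd(b);
}

Rather than hard-coding a factor, one option is to measure the interior
diagonal after assembly (for example with MatGetDiagonal and a max over the
interior rows) and pass that in as 'scale'; since the diagonal entries in this
application reportedly range up to 1e10 and change with grid refinement,
recomputing the factor per grid seems safer than using a fixed constant.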