From cmay at phys.ethz.ch Wed Jul 1 04:07:09 2009 From: cmay at phys.ethz.ch (Christian May) Date: Wed, 1 Jul 2009 11:07:09 +0200 (CEST) Subject: petsc with parmetis configuration problem Message-ID: Hi all, I want to configure petsc with parmetis invoking ./configure --prefix=/cluster/work/phys/cmay/petsc/ --with-c++-support --with-precision=double --with-shared=0 --download-superlu_dist=1 --with-superlu_dist --with-parmetis-lib=/cluster/work/phys/cmay/petsc/petsc-3.0.0-p3/externalpackages/ParMetis-3.1.1/libparmetis.a --with-parmetis-include=/cluster/work/phys/cmay/petsc/petsc-3.0.0-p3/externalpackages/ParMetis-3.1.1/ According to configure.log the error is: Possible ERROR while running linker: /cluster/work/phys/cmay/petsc/petsc-3.0.0-p3/externalpackages/ParMetis-3.1.1/libparmetis.a(kmetis.o): In function `ParMETIS_V3_PartKway': kmetis.c:(.text+0xec5): undefined reference to `METIS_mCPartGraphRecursive2' kmetis.c:(.text+0xf45): undefined reference to `METIS_WPartGraphKway' These symbols are defined in libmetis.a, however there is no configure option for petsc to include metis. What would be the clean solution for this? Thanks Christian From knepley at gmail.com Wed Jul 1 08:26:37 2009 From: knepley at gmail.com (Matt Knepley) Date: Wed, 1 Jul 2009 08:26:37 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: <4A4AA1B5.9030501@59A2.org> References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> Message-ID: <2A14059B-1CA8-49AF-B247-648AB63DDE62@gmail.com> I cannot see the logic in getting the array when we already allow getting the aij structures. I cannot condone a function which changes it's meaning for every matrix type. Matt From the phone On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: > Matthew Knepley wrote: >> I thought the idea was that MatGetArray() never applies to a sparse >> matrix. No other sparse format supports this, does it? > > That's not true at all, but the result is implementation-dependent. > For > example, the array for AIJ is different from the array for BAIJ. For > this reason, you shouldn't be calling MatGetArray unless you know the > matrix type, but of course the F90 interface should agree with the C > interface. > > Jed > From bsmith at mcs.anl.gov Wed Jul 1 13:36:01 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 1 Jul 2009 13:36:01 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: <4A4AA1B5.9030501@59A2.org> References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> Message-ID: <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Satish, Could you please convert these (both the stub code and the F90 interface) to use a 1d array? In both 3.0.0 and petsc-dev. Thanks Barry On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: > Matthew Knepley wrote: >> I thought the idea was that MatGetArray() never applies to a sparse >> matrix. No other sparse format supports this, does it? > > That's not true at all, but the result is implementation-dependent. > For > example, the array for AIJ is different from the array for BAIJ. For > this reason, you shouldn't be calling MatGetArray unless you know the > matrix type, but of course the F90 interface should agree with the C > interface. 
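A small C illustration of the point Jed makes above, hedged and assuming a MATSEQDENSE matrix so that the flat array returned by MatGetArray() can be indexed column-major; the helper name below is made up for illustration and is not a PETSc routine:

#include "petscmat.h"

/* Print entry (i,j) of a MATSEQDENSE matrix through MatGetArray().
   Dense storage is column-major, so (i,j) sits at v[i + j*m]; for
   AIJ/BAIJ the very same call hands back the packed nonzeros instead,
   which is why the meaning of the array is implementation-dependent. */
PetscErrorCode PrintDenseEntry(Mat A,PetscInt i,PetscInt j)
{
  PetscScalar    *v;
  PetscInt       m,n;
  PetscErrorCode ierr;

  ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
  ierr = MatGetArray(A,&v);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF,"A(%d,%d) = %g\n",(int)i,(int)j,
                     (double)PetscRealPart(v[i+j*m]));CHKERRQ(ierr);
  ierr = MatRestoreArray(A,&v);CHKERRQ(ierr);
  return 0;
}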
> > Jed > From knepley at gmail.com Wed Jul 1 17:54:58 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 2 Jul 2009 06:54:58 +0800 Subject: MatGetArrayF90 returns 2d array In-Reply-To: <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Message-ID: On Thu, Jul 2, 2009 at 2:36 AM, Barry Smith wrote: > > Satish, > > Could you please convert these (both the stub code and the F90 > interface) to use a 1d array? In both 3.0.0 and petsc-dev. I am still against this, since it seems much nicer to get a 2D array for the dense matrix, which is the ONLY matrix for which GetArray() makes any sense I think. Matt > > Thanks > > Barry > > On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: > > Matthew Knepley wrote: >> >>> I thought the idea was that MatGetArray() never applies to a sparse >>> matrix. No other sparse format supports this, does it? >>> >> >> That's not true at all, but the result is implementation-dependent. For >> example, the array for AIJ is different from the array for BAIJ. For >> this reason, you shouldn't be calling MatGetArray unless you know the >> matrix type, but of course the F90 interface should agree with the C >> interface. >> >> Jed >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 2 11:47:56 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 2 Jul 2009 11:47:56 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Message-ID: MatGetArray() returns access to the nonzero values of the matrix in a format that depends on the underlying class. It is a perfectly reasonable method and has been there for decades just fine. Barry On Jul 1, 2009, at 5:54 PM, Matthew Knepley wrote: > On Thu, Jul 2, 2009 at 2:36 AM, Barry Smith > wrote: > > Satish, > > Could you please convert these (both the stub code and the F90 > interface) to use a 1d array? In both 3.0.0 and petsc-dev. > > I am still against this, since it seems much nicer to get a 2D array > for the dense matrix, which is the ONLY matrix for which GetArray() > makes any sense I think. > > Matt > > > Thanks > > Barry > > On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: > > Matthew Knepley wrote: > I thought the idea was that MatGetArray() never applies to a sparse > matrix. No other sparse format supports this, does it? > > That's not true at all, but the result is implementation-dependent. > For > example, the array for AIJ is different from the array for BAIJ. For > this reason, you shouldn't be calling MatGetArray unless you know the > matrix type, but of course the F90 interface should agree with the C > interface. > > Jed > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener From dalcinl at gmail.com Thu Jul 2 14:59:41 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 2 Jul 2009 16:59:41 -0300 Subject: MatGetArrayF90 returns 2d array In-Reply-To: References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Message-ID: On Thu, Jul 2, 2009 at 1:47 PM, Barry Smith wrote: > > ?MatGetArray() returns access to the nonzero values of the matrix in a > format that depends on the underlying class. It is a perfectly reasonable > method and has been there for decades just fine. > BTW, Matt commented about PETSc providing support for obtaining the AIJ structure... I know how to get the "IJ" part, MatGetArray() let me get the "A" part, but ... Where is the call that let one to get the whole data, indices and values? > ?Barry > > On Jul 1, 2009, at 5:54 PM, Matthew Knepley wrote: > >> On Thu, Jul 2, 2009 at 2:36 AM, Barry Smith wrote: >> >> Satish, >> >> ?Could you please convert these (both the stub code and the F90 interface) >> to use a 1d array? In both 3.0.0 and petsc-dev. >> >> I am still against this, since it seems much nicer to get a 2D array >> for the dense matrix, which is the ONLY matrix for which GetArray() >> makes any sense I think. >> >> ?Matt >> >> >> ?Thanks >> >> ?Barry >> >> On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: >> >> Matthew Knepley wrote: >> I thought the idea was that MatGetArray() never applies to a sparse >> matrix. ?No other sparse format supports this, does it? >> >> That's not true at all, but the result is implementation-dependent. ?For >> example, the array for AIJ is different from the array for BAIJ. ?For >> this reason, you shouldn't be calling MatGetArray unless you know the >> matrix type, but of course the F90 interface should agree with the C >> interface. >> >> Jed >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bsmith at mcs.anl.gov Thu Jul 2 15:02:04 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 2 Jul 2009 15:02:04 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Message-ID: <76E06F04-2715-410F-A3BB-B86A8398E7BC@mcs.anl.gov> #include "src/mat/impls/aij/seq/aij.h" :-) On Jul 2, 2009, at 2:59 PM, Lisandro Dalcin wrote: > On Thu, Jul 2, 2009 at 1:47 PM, Barry Smith wrote: >> >> MatGetArray() returns access to the nonzero values of the matrix >> in a >> format that depends on the underlying class. It is a perfectly >> reasonable >> method and has been there for decades just fine. >> > > BTW, Matt commented about PETSc providing support for obtaining the > AIJ structure... I know how to get the "IJ" part, MatGetArray() let me > get the "A" part, but ... Where is the call that let one to get the > whole data, indices and values? 
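A rough sketch of what that private-header route looks like, assuming a MATSEQAIJ matrix and the Mat_SeqAIJ fields i, j and a declared in src/mat/impls/aij/seq/aij.h; the walking routine below is hypothetical, not part of the PETSc API:

#include "petscmat.h"
#include "src/mat/impls/aij/seq/aij.h"   /* private header: Mat_SeqAIJ */

/* Walk the CSR triplet of a MATSEQAIJ matrix: row pointers i,
   column indices j, nonzero values a.  This peeks at implementation
   details, so it is only valid for that one matrix type. */
PetscErrorCode WalkSeqAIJ(Mat A)
{
  Mat_SeqAIJ     *aij = (Mat_SeqAIJ*)A->data;
  PetscInt       m,n,row,k;
  PetscErrorCode ierr;

  ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
  for (row=0; row<m; row++) {
    for (k=aij->i[row]; k<aij->i[row+1]; k++) {
      ierr = PetscPrintf(PETSC_COMM_SELF,"A(%d,%d) = %g\n",(int)row,
                         (int)aij->j[k],(double)PetscRealPart(aij->a[k]));CHKERRQ(ierr);
    }
  }
  return 0;
}

MatGetRowIJ() and MatGetArray() give similar access through the public interface, again with type-dependent meaning.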
> > >> Barry >> >> On Jul 1, 2009, at 5:54 PM, Matthew Knepley wrote: >> >>> On Thu, Jul 2, 2009 at 2:36 AM, Barry Smith >>> wrote: >>> >>> Satish, >>> >>> Could you please convert these (both the stub code and the F90 >>> interface) >>> to use a 1d array? In both 3.0.0 and petsc-dev. >>> >>> I am still against this, since it seems much nicer to get a 2D array >>> for the dense matrix, which is the ONLY matrix for which GetArray() >>> makes any sense I think. >>> >>> Matt >>> >>> >>> Thanks >>> >>> Barry >>> >>> On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: >>> >>> Matthew Knepley wrote: >>> I thought the idea was that MatGetArray() never applies to a sparse >>> matrix. No other sparse format supports this, does it? >>> >>> That's not true at all, but the result is implementation- >>> dependent. For >>> example, the array for AIJ is different from the array for BAIJ. >>> For >>> this reason, you shouldn't be calling MatGetArray unless you know >>> the >>> matrix type, but of course the F90 interface should agree with the C >>> interface. >>> >>> Jed >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which their >>> experiments lead. >>> -- Norbert Wiener >> >> > > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 From sekikawa at msi.co.jp Thu Jul 2 23:30:30 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Fri, 03 Jul 2009 13:30:30 +0900 Subject: matrix creation on LAPACK mode Message-ID: <20090703132511.6ACF.SEKIKAWA@msi.co.jp> Hello I made eigenvalue solver program with SLEPc. in my program, to setup matrix, I use MatCreateSeqAIJ() function. void setupMatrix(int m, int n) { PetscErrorCode ierr; ierr=MatCreateSeqAIJ(PETSC_COMM_WORLD, m, n, nz, PETSC_NULL, &g_A); ... } Normally I select solver as KrylovSchur, but sometimes I switched solver to LAPACK. with using LAPACK, result seems to be no problem. but I suspect calculation time takes longer (because of using MatCreateSeqAIJ) Does switching matrix create function to MatCreateSeqDense() give any effect to speed up on LAPACK mode? Thanks, Takuya --------------------------------------------------------------- Takuya Sekikawa Mathematical Systems, Inc sekikawa at msi.co.jp --------------------------------------------------------------- From sekikawa at msi.co.jp Fri Jul 3 00:59:34 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Fri, 03 Jul 2009 14:59:34 +0900 Subject: calculation time Message-ID: <20090703145230.6ADB.SEKIKAWA@msi.co.jp> Dear PETSc/SLEPc users, I have made eigenproblem solver program with SLEPc. Currently it works well, but it takes very long time to solve big problem. with 10000x10000 random matrix, it takes about 34 hours to solve. (solver = KrylovSchur, on 64bit Linux platform, 16G memory, 1 machine) Is this ordinally time to solve problem like these size? or Is there any good way to shorten calculation time? 
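As the replies further down suggest, the usual way to cut this time is to ask for only a few eigenpairs instead of the whole spectrum. A hedged sketch, assuming the SLEPc 3.0-era EPS interface with the three-value EPSSetDimensions(nev,ncv,mpd); the wrapper name and the choice of nev=10 are purely illustrative:

#include "slepceps.h"

/* Compute a handful of eigenpairs of A with Krylov-Schur.  Keeping
   nev small (and letting SLEPc choose ncv/mpd) is what keeps Krylov
   solvers fast; -eps_nev, -eps_ncv and -eps_mpd can still override
   these values at run time via EPSSetFromOptions(). */
PetscErrorCode SolveFewEigenpairs(Mat A)
{
  EPS            eps;
  PetscErrorCode ierr;

  ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps,A,PETSC_NULL);CHKERRQ(ierr);
  ierr = EPSSetProblemType(eps,EPS_NHEP);CHKERRQ(ierr);
  ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRQ(ierr);
  ierr = EPSSetDimensions(eps,10,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr);
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
  ierr = EPSSolve(eps);CHKERRQ(ierr);
  ierr = EPSDestroy(eps);CHKERRQ(ierr);
  return 0;
}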
Thanks
Takuya
---------------------------------------------------------------
  Takuya Sekikawa
        Mathematical Systems, Inc
                  sekikawa at msi.co.jp
---------------------------------------------------------------


From socrates.wei at gmail.com Fri Jul 3 02:50:27 2009
From: socrates.wei at gmail.com (Zi-Hao Wei)
Date: Fri, 3 Jul 2009 15:50:27 +0800
Subject: matrix creation on LAPACK mode
In-Reply-To: <20090703132511.6ACF.SEKIKAWA@msi.co.jp>
References: <20090703132511.6ACF.SEKIKAWA@msi.co.jp>
Message-ID: 

Hi

I remember that when you use LAPACK as eigensolver the SLEPc will
automatically convert sparse matrix into dense matrix by the function
SlepcMatConvertSeqDense.

On Fri, Jul 3, 2009 at 12:30 PM, Takuya Sekikawa wrote:
> Hello
>
> I made eigenvalue solver program with SLEPc. in my program, to
> setup matrix, I use MatCreateSeqAIJ() function.
>
> void setupMatrix(int m, int n)
> {
>         PetscErrorCode ierr;
>
>         ierr=MatCreateSeqAIJ(PETSC_COMM_WORLD, m, n, nz, PETSC_NULL,
> &g_A);
>         ...
> }
>
> Normally I select solver as KrylovSchur, but sometimes I switched solver
> to LAPACK. with using LAPACK, result seems to be no problem. but I
> suspect calculation time takes longer (because of using MatCreateSeqAIJ)
>
> Does switching matrix create function to MatCreateSeqDense() give any
> effect to speed up on LAPACK mode?
>
> Thanks,
> Takuya
> ---------------------------------------------------------------
>   Takuya Sekikawa
>         Mathematical Systems, Inc
>                   sekikawa at msi.co.jp
> ---------------------------------------------------------------
>
>
>

--
Zi-Hao Wei
Department of Mathematics
National Central University, Taiwan
Adrienne Gusoff - "Opportunity knocked. My doorman threw him out." -
http://www.brainyquote.com/quotes/authors/a/adrienne_gusoff.html

From socrates.wei at gmail.com Fri Jul 3 02:56:08 2009
From: socrates.wei at gmail.com (Zi-Hao Wei)
Date: Fri, 3 Jul 2009 15:56:08 +0800
Subject: calculation time
In-Reply-To: <20090703145230.6ADB.SEKIKAWA@msi.co.jp>
References: <20090703145230.6ADB.SEKIKAWA@msi.co.jp>
Message-ID: 

Hi

How many eigenvalues did you compute?
I think that these Krylov subspace methods, such as Krylov-Schur,
Arnoldi, Lanczos, and etc. are not suitable for finding whole spectrum.

On Fri, Jul 3, 2009 at 1:59 PM, Takuya Sekikawa wrote:
> Dear PETSc/SLEPc users,
>
> I have made eigenproblem solver program with SLEPc.
> Currently it works well, but it takes very long time to solve big
> problem.
>
> with 10000x10000 random matrix, it takes about 34 hours to solve.
> (solver = KrylovSchur, on 64bit Linux platform, 16G memory, 1 machine)
>
> Is this ordinally time to solve problem like these size?
> or Is there any good way to shorten calculation time?
>
> Thanks
> Takuya
> ---------------------------------------------------------------
>  Takuya Sekikawa
>         Mathematical Systems, Inc
>                   sekikawa at msi.co.jp
> ---------------------------------------------------------------
>
>
>

--
Zi-Hao Wei
Department of Mathematics
National Central University, Taiwan
Rita Rudner - "I was a vegetarian until I started leaning toward the sunlight."
- http://www.brainyquote.com/quotes/authors/r/rita_rudner.html From kuiper at mpia.de Fri Jul 3 03:52:55 2009 From: kuiper at mpia.de (Rolf Kuiper) Date: Fri, 3 Jul 2009 10:52:55 +0200 Subject: MPI-layout of PETSc In-Reply-To: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> Message-ID: Hi, Am 30.06.2009 um 02:24 schrieb Barry Smith: > > On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: > >> Hi PETSc users, >> >> I ran into trouble in combining my developed PETSc application with >> another code (based on another library called "ArrayLib"). >> The problem is the parallel layout for MPI, e.g. in 2D with 6 cpus >> the ArrayLib code gives the names/ranks of the local cpus first in >> y-direction, than in x (from last to first, in the same way the MPI >> arrays are called, like 3Darray[z][y][x]): >> >> y >> ^ >> | 2-4-6 >> | 1-3-5 >> |--------> x >> >> If I call DACreate() from PETSc, it will assume an ordering >> according to names/ranks first set in x-direction, than in y: >> >> y >> ^ >> | 4-5-6 >> | 1-2-3 >> |--------> x >> >> Of course, if I now communicate the boundary values, I mix up the >> domain (build by the other program). >> >> Is there a possibility / a flag to set the name of the ranks? >> Due to the fact that my application is written and working in >> curvilinear coordinates and not in cartesian, I cannot just switch >> the directions. > > What we recommend in this case is to just change the meaning of x, > y, and z when you use the PETSc DA. This does mean changing your > code that uses the PETSc DA. The code is used as a module for many codes, so I would prefer to not change the code (and the meaning of directions, that's not user- friendly), but 'just' change the communicator. > I do not understand why curvilinear coordinates has anything to do > with it. Another choice is to create a new MPI communicator that has > the different ordering of the ranks of the processors and then using > that comm to create the PETSc DA objects; then you would not need to > change your code that calls PETSc. I tried some time before to use the PetscSetCommWorld() routine, but I can't find it anymore, how can I set a new communicator in PETSc3.0? The communicator, I want to use, is the MPI_COMM_WORLD, which takes the first described ordering. Now I read that the MPI_COMM_WORLD is the default communicator for PETSc. But why is the ordering than different? Sorry for all this question, but (as you can see) I really don't understand this comm problem at the moment, Thanks for all, Rolf > Unfortunately PETSc doesn't have any way to flip how the DA > handles the layout automatically. > > Barry > >> >> Thanks a lot for your help, >> Rolf > From bsmith at mcs.anl.gov Fri Jul 3 10:56:20 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 3 Jul 2009 10:56:20 -0500 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> Message-ID: <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> In designing the PETSc DA I did not (by ignorance) follow the layout approach of the MPI cartesian MPI_Cart_create (that gives the first local cpus first in the y-direction). I had it put the first cpus in the x-direction. What you need to do is create a new communicator that changes the order of the processors so that when used by the PETSc DA they lie out in the ordering that matches the other code. You will need to read up on the MPI_Cart stuff. 
To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = yournewcom BEFORE calling PetscInitialize(). Barry On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: > Hi, > > Am 30.06.2009 um 02:24 schrieb Barry Smith: >> >> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >> >>> Hi PETSc users, >>> >>> I ran into trouble in combining my developed PETSc application >>> with another code (based on another library called "ArrayLib"). >>> The problem is the parallel layout for MPI, e.g. in 2D with 6 cpus >>> the ArrayLib code gives the names/ranks of the local cpus first in >>> y-direction, than in x (from last to first, in the same way the >>> MPI arrays are called, like 3Darray[z][y][x]): >>> >>> y >>> ^ >>> | 2-4-6 >>> | 1-3-5 >>> |--------> x >>> >>> If I call DACreate() from PETSc, it will assume an ordering >>> according to names/ranks first set in x-direction, than in y: >>> >>> y >>> ^ >>> | 4-5-6 >>> | 1-2-3 >>> |--------> x >>> >>> Of course, if I now communicate the boundary values, I mix up the >>> domain (build by the other program). >>> >>> Is there a possibility / a flag to set the name of the ranks? >>> Due to the fact that my application is written and working in >>> curvilinear coordinates and not in cartesian, I cannot just switch >>> the directions. >> >> What we recommend in this case is to just change the meaning of x, >> y, and z when you use the PETSc DA. This does mean changing your >> code that uses the PETSc DA. > > The code is used as a module for many codes, so I would prefer to > not change the code (and the meaning of directions, that's not user- > friendly), but 'just' change the communicator. > >> I do not understand why curvilinear coordinates has anything to do >> with it. Another choice is to create a new MPI communicator that >> has the different ordering of the ranks of the processors and then >> using that comm to create the PETSc DA objects; then you would not >> need to change your code that calls PETSc. > > I tried some time before to use the PetscSetCommWorld() routine, but > I can't find it anymore, how can I set a new communicator in PETSc3.0? > The communicator, I want to use, is the MPI_COMM_WORLD, which takes > the first described ordering. > Now I read that the MPI_COMM_WORLD is the default communicator for > PETSc. But why is the ordering than different? > > Sorry for all this question, but (as you can see) I really don't > understand this comm problem at the moment, > Thanks for all, > Rolf > >> Unfortunately PETSc doesn't have any way to flip how the DA >> handles the layout automatically. >> >> Barry >> >>> >>> Thanks a lot for your help, >>> Rolf >> > From bsmith at mcs.anl.gov Fri Jul 3 11:00:39 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 3 Jul 2009 11:00:39 -0500 Subject: matrix creation on LAPACK mode In-Reply-To: References: <20090703132511.6ACF.SEKIKAWA@msi.co.jp> Message-ID: <649C57BF-E0C5-4D04-9CD0-9C94BFEE562F@mcs.anl.gov> On Jul 3, 2009, at 2:50 AM, Zi-Hao Wei wrote: > Hi > > I remember that when you use LAPACK as eigensolver the SLEPc will > automatically convert sparse matrix into dense matrix by the function > SlepcMatConvertSeqDense. That conversion should be very fast so it won't affect the overall time by much. Especially since the LAPACK eigenvalues computation is order N^3 work which will swamp out any oder N^2 work. Barry > > > On Fri, Jul 3, 2009 at 12:30 PM, Takuya Sekikawa > wrote: >> Hello >> >> I made eigenvalue solver program with SLEPc. in my program, to >> setup matrix, I use MatCreateSeqAIJ() function. 
>> >> void setupMatrix(int m, int n) >> { >> PetscErrorCode ierr; >> >> ierr=MatCreateSeqAIJ(PETSC_COMM_WORLD, m, n, nz, PETSC_NULL, >> &g_A); >> ... >> } >> >> Normally I select solver as KrylovSchur, but sometimes I switched >> solver >> to LAPACK. with using LAPACK, result seems to be no problem. but I >> suspect calculation time takes longer (because of using >> MatCreateSeqAIJ) >> >> Does switching matrix create function to MatCreateSeqDense() give any >> effect to speed up on LAPACK mode? >> >> Thanks, >> Takuya >> --------------------------------------------------------------- >> Takuya Sekikawa >> Mathematical Systems, Inc >> sekikawa at msi.co.jp >> --------------------------------------------------------------- >> >> >> > > > > -- > Zi-Hao Wei > Department of Mathematics > National Central University, Taiwan > Adrienne Gusoff - "Opportunity knocked. My doorman threw him out." - > http://www.brainyquote.com/quotes/authors/a/adrienne_gusoff.html From kuiper at mpia-hd.mpg.de Fri Jul 3 12:09:00 2009 From: kuiper at mpia-hd.mpg.de (Rolf Kuiper) Date: Fri, 3 Jul 2009 19:09:00 +0200 Subject: MPI-layout of PETSc In-Reply-To: <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: Hi Barry, I tried that already with: First way by copying: MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); Second way by creating: int dims[3] = {0,0,0}; int ndims=3; MPI_Dims_create(NumberOfProcessors, ndims, dims); int false = 0; int true = 1; int periods[3] = { false, false, true }; int reorder = true; MPI_Comm MyComm; MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, &MyComm); in the end then: PETSC_COMM_WORLD = MyComm; I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. I can the coordinates of the cpus with MPI_Cart_coords(MyComm, LocalRank, ndims, coords); , but I found no way to set/rearrange these coordinates. Do you can help me in that case or have I to ask a MPI-support? Thanks for all, Rolf Am 03.07.2009 um 17:56 schrieb Barry Smith: > > In designing the PETSc DA I did not (by ignorance) follow the > layout approach of the MPI cartesian MPI_Cart_create (that gives the > first local cpus first in the y-direction). > I had it put the first cpus in the x-direction. > > What you need to do is create a new communicator that changes the > order of the processors so that when used by the PETSc DA they lie > out in the ordering that matches the other code. You will need to > read up on the MPI_Cart stuff. > > To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = > yournewcom BEFORE calling PetscInitialize(). > > Barry > > On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: > >> Hi, >> >> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>> >>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>> >>>> Hi PETSc users, >>>> >>>> I ran into trouble in combining my developed PETSc application >>>> with another code (based on another library called "ArrayLib"). >>>> The problem is the parallel layout for MPI, e.g. 
in 2D with 6 >>>> cpus the ArrayLib code gives the names/ranks of the local cpus >>>> first in y-direction, than in x (from last to first, in the same >>>> way the MPI arrays are called, like 3Darray[z][y][x]): >>>> >>>> y >>>> ^ >>>> | 2-4-6 >>>> | 1-3-5 >>>> |--------> x >>>> >>>> If I call DACreate() from PETSc, it will assume an ordering >>>> according to names/ranks first set in x-direction, than in y: >>>> >>>> y >>>> ^ >>>> | 4-5-6 >>>> | 1-2-3 >>>> |--------> x >>>> >>>> Of course, if I now communicate the boundary values, I mix up the >>>> domain (build by the other program). >>>> >>>> Is there a possibility / a flag to set the name of the ranks? >>>> Due to the fact that my application is written and working in >>>> curvilinear coordinates and not in cartesian, I cannot just >>>> switch the directions. >>> >>> What we recommend in this case is to just change the meaning of x, >>> y, and z when you use the PETSc DA. This does mean changing your >>> code that uses the PETSc DA. >> >> The code is used as a module for many codes, so I would prefer to >> not change the code (and the meaning of directions, that's not user- >> friendly), but 'just' change the communicator. >> >>> I do not understand why curvilinear coordinates has anything to do >>> with it. Another choice is to create a new MPI communicator that >>> has the different ordering of the ranks of the processors and then >>> using that comm to create the PETSc DA objects; then you would not >>> need to change your code that calls PETSc. >> >> I tried some time before to use the PetscSetCommWorld() routine, >> but I can't find it anymore, how can I set a new communicator in >> PETSc3.0? >> The communicator, I want to use, is the MPI_COMM_WORLD, which takes >> the first described ordering. >> Now I read that the MPI_COMM_WORLD is the default communicator for >> PETSc. But why is the ordering than different? >> >> Sorry for all this question, but (as you can see) I really don't >> understand this comm problem at the moment, >> Thanks for all, >> Rolf >> >>> Unfortunately PETSc doesn't have any way to flip how the DA >>> handles the layout automatically. >>> >>> Barry >>> >>>> >>>> Thanks a lot for your help, >>>> Rolf >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Jul 3 12:42:20 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 3 Jul 2009 19:42:20 +0200 Subject: calculation time In-Reply-To: <20090703145230.6ADB.SEKIKAWA@msi.co.jp> References: <20090703145230.6ADB.SEKIKAWA@msi.co.jp> Message-ID: <219AC380-90D4-45DF-8A5E-0E70F2556DF2@dsic.upv.es> On 03/07/2009, Takuya Sekikawa wrote: > Dear PETSc/SLEPc users, > > I have made eigenproblem solver program with SLEPc. > Currently it works well, but it takes very long time to solve big > problem. > > with 10000x10000 random matrix, it takes about 34 hours to solve. > (solver = KrylovSchur, on 64bit Linux platform, 16G memory, 1 machine) > > Is this ordinally time to solve problem like these size? > or Is there any good way to shorten calculation time? > > Thanks > Takuya > --------------------------------------------------------------- > Takuya Sekikawa > Mathematical Systems, Inc > sekikawa at msi.co.jp > --------------------------------------------------------------- SLEPc is intended for computing part of the spectrum of a sparse matrix. If you want to compute a few eigenpairs of a 10000 matrix, it should be very fast. 
If you want to compute a large percentage of the spectrum (30% say) then you can do it with SLEPc but need to be more careful (use appropriate values of nev, ncv and mpd parameters). Finally, if you want to compute all eigenvalues, then you should not use SLEPc. The Lapack solver in SLEPc should be used only for debugging purposes in small problems. Please read the documentation. Jose From bsmith at mcs.anl.gov Fri Jul 3 18:44:39 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 3 Jul 2009 18:44:39 -0500 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: Use MPI_Comm_split() with the same color for all processors, then use the second integer argument to indicate the new rank you want for the process. Choice the new rank so its x,y coordinate in the logical grid will match the y,x coordinate in the cartesian grid. Barry On Jul 3, 2009, at 12:09 PM, Rolf Kuiper wrote: > Hi Barry, > > I tried that already with: > First way by copying: > MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); > > Second way by creating: > int dims[3] = {0,0,0}; > int ndims=3; > MPI_Dims_create(NumberOfProcessors, ndims, dims); > int false = 0; int true = 1; > int periods[3] = { false, false, true }; > int reorder = true; > MPI_Comm MyComm; > MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, > &MyComm); > > in the end then: > PETSC_COMM_WORLD = MyComm; > > I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. > I can the coordinates of the cpus with MPI_Cart_coords(MyComm, > LocalRank, ndims, coords); , but I found no way to set/rearrange > these coordinates. > > Do you can help me in that case or have I to ask a MPI-support? > > Thanks for all, > Rolf > > > Am 03.07.2009 um 17:56 schrieb Barry Smith: >> >> In designing the PETSc DA I did not (by ignorance) follow the >> layout approach of the MPI cartesian MPI_Cart_create (that gives >> the first local cpus first in the y-direction). >> I had it put the first cpus in the x-direction. >> >> What you need to do is create a new communicator that changes the >> order of the processors so that when used by the PETSc DA they lie >> out in the ordering that matches the other code. You will need to >> read up on the MPI_Cart stuff. >> >> To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = >> yournewcom BEFORE calling PetscInitialize(). >> >> Barry >> >> On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: >> >>> Hi, >>> >>> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>>> >>>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>>> >>>>> Hi PETSc users, >>>>> >>>>> I ran into trouble in combining my developed PETSc application >>>>> with another code (based on another library called "ArrayLib"). >>>>> The problem is the parallel layout for MPI, e.g. in 2D with 6 >>>>> cpus the ArrayLib code gives the names/ranks of the local cpus >>>>> first in y-direction, than in x (from last to first, in the same >>>>> way the MPI arrays are called, like 3Darray[z][y][x]): >>>>> >>>>> y >>>>> ^ >>>>> | 2-4-6 >>>>> | 1-3-5 >>>>> |--------> x >>>>> >>>>> If I call DACreate() from PETSc, it will assume an ordering >>>>> according to names/ranks first set in x-direction, than in y: >>>>> >>>>> y >>>>> ^ >>>>> | 4-5-6 >>>>> | 1-2-3 >>>>> |--------> x >>>>> >>>>> Of course, if I now communicate the boundary values, I mix up >>>>> the domain (build by the other program). >>>>> >>>>> Is there a possibility / a flag to set the name of the ranks? 
>>>>> Due to the fact that my application is written and working in >>>>> curvilinear coordinates and not in cartesian, I cannot just >>>>> switch the directions. >>>> >>>> What we recommend in this case is to just change the meaning of >>>> x, y, and z when you use the PETSc DA. This does mean changing >>>> your code that uses the PETSc DA. >>> >>> The code is used as a module for many codes, so I would prefer to >>> not change the code (and the meaning of directions, that's not >>> user-friendly), but 'just' change the communicator. >>> >>>> I do not understand why curvilinear coordinates has anything to >>>> do with it. Another choice is to create a new MPI communicator >>>> that has the different ordering of the ranks of the processors >>>> and then using that comm to create the PETSc DA objects; then you >>>> would not need to change your code that calls PETSc. >>> >>> I tried some time before to use the PetscSetCommWorld() routine, >>> but I can't find it anymore, how can I set a new communicator in >>> PETSc3.0? >>> The communicator, I want to use, is the MPI_COMM_WORLD, which >>> takes the first described ordering. >>> Now I read that the MPI_COMM_WORLD is the default communicator for >>> PETSc. But why is the ordering than different? >>> >>> Sorry for all this question, but (as you can see) I really don't >>> understand this comm problem at the moment, >>> Thanks for all, >>> Rolf >>> >>>> Unfortunately PETSc doesn't have any way to flip how the DA >>>> handles the layout automatically. >>>> >>>> Barry >>>> >>>>> >>>>> Thanks a lot for your help, >>>>> Rolf >>>> >>> >> > From kuiper at mpia-hd.mpg.de Sat Jul 4 06:08:44 2009 From: kuiper at mpia-hd.mpg.de (Rolf Kuiper) Date: Sat, 4 Jul 2009 13:08:44 +0200 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: Thanks Barry! It's working. But by the way: You simply should offer such a second communicator inside the PETSc-library. Thanks for all your help, the support we got from this mailing list is amazing, Rolf Am 04.07.2009 um 01:44 schrieb Barry Smith: > > Use MPI_Comm_split() with the same color for all processors, then > use the second integer argument to indicate the new rank you want > for the process. > Choice the new rank so its x,y coordinate in the logical grid will > match the y,x coordinate in the cartesian grid. > > Barry > > On Jul 3, 2009, at 12:09 PM, Rolf Kuiper wrote: > >> Hi Barry, >> >> I tried that already with: >> First way by copying: >> MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); >> >> Second way by creating: >> int dims[3] = {0,0,0}; >> int ndims=3; >> MPI_Dims_create(NumberOfProcessors, ndims, dims); >> int false = 0; int true = 1; >> int periods[3] = { false, false, true }; >> int reorder = true; >> MPI_Comm MyComm; >> MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, >> &MyComm); >> >> in the end then: >> PETSC_COMM_WORLD = MyComm; >> >> I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. >> I can the coordinates of the cpus with MPI_Cart_coords(MyComm, >> LocalRank, ndims, coords); , but I found no way to set/rearrange >> these coordinates. >> >> Do you can help me in that case or have I to ask a MPI-support? 
>> >> Thanks for all, >> Rolf >> >> >> Am 03.07.2009 um 17:56 schrieb Barry Smith: >>> >>> In designing the PETSc DA I did not (by ignorance) follow the >>> layout approach of the MPI cartesian MPI_Cart_create (that gives >>> the first local cpus first in the y-direction). >>> I had it put the first cpus in the x-direction. >>> >>> What you need to do is create a new communicator that changes the >>> order of the processors so that when used by the PETSc DA they lie >>> out in the ordering that matches the other code. You will need to >>> read up on the MPI_Cart stuff. >>> >>> To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = >>> yournewcom BEFORE calling PetscInitialize(). >>> >>> Barry >>> >>> On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: >>> >>>> Hi, >>>> >>>> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>>>> >>>>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>>>> >>>>>> Hi PETSc users, >>>>>> >>>>>> I ran into trouble in combining my developed PETSc application >>>>>> with another code (based on another library called "ArrayLib"). >>>>>> The problem is the parallel layout for MPI, e.g. in 2D with 6 >>>>>> cpus the ArrayLib code gives the names/ranks of the local cpus >>>>>> first in y-direction, than in x (from last to first, in the >>>>>> same way the MPI arrays are called, like 3Darray[z][y][x]): >>>>>> >>>>>> y >>>>>> ^ >>>>>> | 2-4-6 >>>>>> | 1-3-5 >>>>>> |--------> x >>>>>> >>>>>> If I call DACreate() from PETSc, it will assume an ordering >>>>>> according to names/ranks first set in x-direction, than in y: >>>>>> >>>>>> y >>>>>> ^ >>>>>> | 4-5-6 >>>>>> | 1-2-3 >>>>>> |--------> x >>>>>> >>>>>> Of course, if I now communicate the boundary values, I mix up >>>>>> the domain (build by the other program). >>>>>> >>>>>> Is there a possibility / a flag to set the name of the ranks? >>>>>> Due to the fact that my application is written and working in >>>>>> curvilinear coordinates and not in cartesian, I cannot just >>>>>> switch the directions. >>>>> >>>>> What we recommend in this case is to just change the meaning of >>>>> x, y, and z when you use the PETSc DA. This does mean changing >>>>> your code that uses the PETSc DA. >>>> >>>> The code is used as a module for many codes, so I would prefer to >>>> not change the code (and the meaning of directions, that's not >>>> user-friendly), but 'just' change the communicator. >>>> >>>>> I do not understand why curvilinear coordinates has anything to >>>>> do with it. Another choice is to create a new MPI communicator >>>>> that has the different ordering of the ranks of the processors >>>>> and then using that comm to create the PETSc DA objects; then >>>>> you would not need to change your code that calls PETSc. >>>> >>>> I tried some time before to use the PetscSetCommWorld() routine, >>>> but I can't find it anymore, how can I set a new communicator in >>>> PETSc3.0? >>>> The communicator, I want to use, is the MPI_COMM_WORLD, which >>>> takes the first described ordering. >>>> Now I read that the MPI_COMM_WORLD is the default communicator >>>> for PETSc. But why is the ordering than different? >>>> >>>> Sorry for all this question, but (as you can see) I really don't >>>> understand this comm problem at the moment, >>>> Thanks for all, >>>> Rolf >>>> >>>>> Unfortunately PETSc doesn't have any way to flip how the DA >>>>> handles the layout automatically. 
>>>>> >>>>> Barry >>>>> >>>>>> >>>>>> Thanks a lot for your help, >>>>>> Rolf >>>>> >>>> >>> >> > From bsmith at mcs.anl.gov Sat Jul 4 12:24:43 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 4 Jul 2009 12:24:43 -0500 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: Send us the code to do the conversion and we'll include as a utility. Barry On Jul 4, 2009, at 6:08 AM, Rolf Kuiper wrote: > Thanks Barry! > It's working. But by the way: You simply should offer such a second > communicator inside the PETSc-library. > > Thanks for all your help, the support we got from this mailing list > is amazing, > Rolf > > > Am 04.07.2009 um 01:44 schrieb Barry Smith: >> >> Use MPI_Comm_split() with the same color for all processors, then >> use the second integer argument to indicate the new rank you want >> for the process. >> Choice the new rank so its x,y coordinate in the logical grid will >> match the y,x coordinate in the cartesian grid. >> >> Barry >> >> On Jul 3, 2009, at 12:09 PM, Rolf Kuiper wrote: >> >>> Hi Barry, >>> >>> I tried that already with: >>> First way by copying: >>> MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); >>> >>> Second way by creating: >>> int dims[3] = {0,0,0}; >>> int ndims=3; >>> MPI_Dims_create(NumberOfProcessors, ndims, dims); >>> int false = 0; int true = 1; >>> int periods[3] = { false, false, true }; >>> int reorder = true; >>> MPI_Comm MyComm; >>> MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, >>> &MyComm); >>> >>> in the end then: >>> PETSC_COMM_WORLD = MyComm; >>> >>> I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. >>> I can the coordinates of the cpus with MPI_Cart_coords(MyComm, >>> LocalRank, ndims, coords); , but I found no way to set/rearrange >>> these coordinates. >>> >>> Do you can help me in that case or have I to ask a MPI-support? >>> >>> Thanks for all, >>> Rolf >>> >>> >>> Am 03.07.2009 um 17:56 schrieb Barry Smith: >>>> >>>> In designing the PETSc DA I did not (by ignorance) follow the >>>> layout approach of the MPI cartesian MPI_Cart_create (that gives >>>> the first local cpus first in the y-direction). >>>> I had it put the first cpus in the x-direction. >>>> >>>> What you need to do is create a new communicator that changes the >>>> order of the processors so that when used by the PETSc DA they >>>> lie out in the ordering that matches the other code. You will >>>> need to read up on the MPI_Cart stuff. >>>> >>>> To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = >>>> yournewcom BEFORE calling PetscInitialize(). >>>> >>>> Barry >>>> >>>> On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: >>>> >>>>> Hi, >>>>> >>>>> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>>>>> >>>>>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>>>>> >>>>>>> Hi PETSc users, >>>>>>> >>>>>>> I ran into trouble in combining my developed PETSc application >>>>>>> with another code (based on another library called "ArrayLib"). >>>>>>> The problem is the parallel layout for MPI, e.g. 
in 2D with 6 >>>>>>> cpus the ArrayLib code gives the names/ranks of the local cpus >>>>>>> first in y-direction, than in x (from last to first, in the >>>>>>> same way the MPI arrays are called, like 3Darray[z][y][x]): >>>>>>> >>>>>>> y >>>>>>> ^ >>>>>>> | 2-4-6 >>>>>>> | 1-3-5 >>>>>>> |--------> x >>>>>>> >>>>>>> If I call DACreate() from PETSc, it will assume an ordering >>>>>>> according to names/ranks first set in x-direction, than in y: >>>>>>> >>>>>>> y >>>>>>> ^ >>>>>>> | 4-5-6 >>>>>>> | 1-2-3 >>>>>>> |--------> x >>>>>>> >>>>>>> Of course, if I now communicate the boundary values, I mix up >>>>>>> the domain (build by the other program). >>>>>>> >>>>>>> Is there a possibility / a flag to set the name of the ranks? >>>>>>> Due to the fact that my application is written and working in >>>>>>> curvilinear coordinates and not in cartesian, I cannot just >>>>>>> switch the directions. >>>>>> >>>>>> What we recommend in this case is to just change the meaning of >>>>>> x, y, and z when you use the PETSc DA. This does mean changing >>>>>> your code that uses the PETSc DA. >>>>> >>>>> The code is used as a module for many codes, so I would prefer >>>>> to not change the code (and the meaning of directions, that's >>>>> not user-friendly), but 'just' change the communicator. >>>>> >>>>>> I do not understand why curvilinear coordinates has anything to >>>>>> do with it. Another choice is to create a new MPI communicator >>>>>> that has the different ordering of the ranks of the processors >>>>>> and then using that comm to create the PETSc DA objects; then >>>>>> you would not need to change your code that calls PETSc. >>>>> >>>>> I tried some time before to use the PetscSetCommWorld() routine, >>>>> but I can't find it anymore, how can I set a new communicator in >>>>> PETSc3.0? >>>>> The communicator, I want to use, is the MPI_COMM_WORLD, which >>>>> takes the first described ordering. >>>>> Now I read that the MPI_COMM_WORLD is the default communicator >>>>> for PETSc. But why is the ordering than different? >>>>> >>>>> Sorry for all this question, but (as you can see) I really don't >>>>> understand this comm problem at the moment, >>>>> Thanks for all, >>>>> Rolf >>>>> >>>>>> Unfortunately PETSc doesn't have any way to flip how the DA >>>>>> handles the layout automatically. 
>>>>>> >>>>>> Barry >>>>>> >>>>>>> >>>>>>> Thanks a lot for your help, >>>>>>> Rolf >>>>>> >>>>> >>>> >>> >> > From kuiper at mpia-hd.mpg.de Sat Jul 4 16:33:33 2009 From: kuiper at mpia-hd.mpg.de (Rolf Kuiper) Date: Sat, 4 Jul 2009 23:33:33 +0200 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: No problem, here is the code: // the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively // (no parallelization in direction 'dir' means dir_procs = 1) MPI_Comm NewComm; int MPI_Rank, NewRank, x,y,z; // get rank from MPI ordering: MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank); // calculate coordinates of cpus in MPI ordering: x = MPI_rank / (z_procs*y_procs); y = (MPI_rank % (z_procs*y_procs)) / z_procs; z = (MPI_rank % (z_procs*y_procs)) % z_procs; // set new rank according to PETSc ordering: NewRank = z*y_procs*x_procs + y*x_procs + x; // create communicator with new ranks according to PETSc ordering: MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm); // override the default communicator (was MPI_COMM_WORLD as default) PETSC_COMM_WORLD = NewComm; I hope, this will be useful for some of you. Ciao, Rolf ------------------------------------------------------- Rolf Kuiper Max-Planck Institute for Astronomy K?nigstuhl 17 69117 Heidelberg Office A5, Els?sser Labor Phone: 0049 (0)6221 528 350 Mail: kuiper at mpia.de Homepage: http://www.mpia.de/~kuiper ------------------------------------------------------- Am 04.07.2009 um 19:24 schrieb Barry Smith: > > Send us the code to do the conversion and we'll include as a > utility. > > Barry > > On Jul 4, 2009, at 6:08 AM, Rolf Kuiper wrote: > >> Thanks Barry! >> It's working. But by the way: You simply should offer such a second >> communicator inside the PETSc-library. >> >> Thanks for all your help, the support we got from this mailing list >> is amazing, >> Rolf >> >> >> Am 04.07.2009 um 01:44 schrieb Barry Smith: >>> >>> Use MPI_Comm_split() with the same color for all processors, then >>> use the second integer argument to indicate the new rank you want >>> for the process. >>> Choice the new rank so its x,y coordinate in the logical grid will >>> match the y,x coordinate in the cartesian grid. >>> >>> Barry >>> >>> On Jul 3, 2009, at 12:09 PM, Rolf Kuiper wrote: >>> >>>> Hi Barry, >>>> >>>> I tried that already with: >>>> First way by copying: >>>> MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); >>>> >>>> Second way by creating: >>>> int dims[3] = {0,0,0}; >>>> int ndims=3; >>>> MPI_Dims_create(NumberOfProcessors, ndims, dims); >>>> int false = 0; int true = 1; >>>> int periods[3] = { false, false, true }; >>>> int reorder = true; >>>> MPI_Comm MyComm; >>>> MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, >>>> &MyComm); >>>> >>>> in the end then: >>>> PETSC_COMM_WORLD = MyComm; >>>> >>>> I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. >>>> I can the coordinates of the cpus with MPI_Cart_coords(MyComm, >>>> LocalRank, ndims, coords); , but I found no way to set/rearrange >>>> these coordinates. >>>> >>>> Do you can help me in that case or have I to ask a MPI-support? >>>> >>>> Thanks for all, >>>> Rolf >>>> >>>> >>>> Am 03.07.2009 um 17:56 schrieb Barry Smith: >>>>> >>>>> In designing the PETSc DA I did not (by ignorance) follow the >>>>> layout approach of the MPI cartesian MPI_Cart_create (that gives >>>>> the first local cpus first in the y-direction). 
>>>>> I had it put the first cpus in the x-direction. >>>>> >>>>> What you need to do is create a new communicator that changes >>>>> the order of the processors so that when used by the PETSc DA >>>>> they lie out in the ordering that matches the other code. You >>>>> will need to read up on the MPI_Cart stuff. >>>>> >>>>> To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = >>>>> yournewcom BEFORE calling PetscInitialize(). >>>>> >>>>> Barry >>>>> >>>>> On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>>>>>> >>>>>>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>>>>>> >>>>>>>> Hi PETSc users, >>>>>>>> >>>>>>>> I ran into trouble in combining my developed PETSc >>>>>>>> application with another code (based on another library >>>>>>>> called "ArrayLib"). >>>>>>>> The problem is the parallel layout for MPI, e.g. in 2D with 6 >>>>>>>> cpus the ArrayLib code gives the names/ranks of the local >>>>>>>> cpus first in y-direction, than in x (from last to first, in >>>>>>>> the same way the MPI arrays are called, like 3Darray[z][y][x]): >>>>>>>> >>>>>>>> y >>>>>>>> ^ >>>>>>>> | 2-4-6 >>>>>>>> | 1-3-5 >>>>>>>> |--------> x >>>>>>>> >>>>>>>> If I call DACreate() from PETSc, it will assume an ordering >>>>>>>> according to names/ranks first set in x-direction, than in y: >>>>>>>> >>>>>>>> y >>>>>>>> ^ >>>>>>>> | 4-5-6 >>>>>>>> | 1-2-3 >>>>>>>> |--------> x >>>>>>>> >>>>>>>> Of course, if I now communicate the boundary values, I mix up >>>>>>>> the domain (build by the other program). >>>>>>>> >>>>>>>> Is there a possibility / a flag to set the name of the ranks? >>>>>>>> Due to the fact that my application is written and working in >>>>>>>> curvilinear coordinates and not in cartesian, I cannot just >>>>>>>> switch the directions. >>>>>>> >>>>>>> What we recommend in this case is to just change the meaning >>>>>>> of x, y, and z when you use the PETSc DA. This does mean >>>>>>> changing your code that uses the PETSc DA. >>>>>> >>>>>> The code is used as a module for many codes, so I would prefer >>>>>> to not change the code (and the meaning of directions, that's >>>>>> not user-friendly), but 'just' change the communicator. >>>>>> >>>>>>> I do not understand why curvilinear coordinates has anything >>>>>>> to do with it. Another choice is to create a new MPI >>>>>>> communicator that has the different ordering of the ranks of >>>>>>> the processors and then using that comm to create the PETSc DA >>>>>>> objects; then you would not need to change your code that >>>>>>> calls PETSc. >>>>>> >>>>>> I tried some time before to use the PetscSetCommWorld() >>>>>> routine, but I can't find it anymore, how can I set a new >>>>>> communicator in PETSc3.0? >>>>>> The communicator, I want to use, is the MPI_COMM_WORLD, which >>>>>> takes the first described ordering. >>>>>> Now I read that the MPI_COMM_WORLD is the default communicator >>>>>> for PETSc. But why is the ordering than different? >>>>>> >>>>>> Sorry for all this question, but (as you can see) I really >>>>>> don't understand this comm problem at the moment, >>>>>> Thanks for all, >>>>>> Rolf >>>>>> >>>>>>> Unfortunately PETSc doesn't have any way to flip how the DA >>>>>>> handles the layout automatically. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>>> >>>>>>>> Thanks a lot for your help, >>>>>>>> Rolf >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
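A self-contained restatement of Rolf's snippet above, with the rank variable spelled consistently (the original declares MPI_Rank but then reads MPI_rank) and with MPI_COMM_WORLD split directly, since PETSC_COMM_WORLD is only being assigned here; the wrapper name and the explicit x_procs/y_procs/z_procs arguments are illustrative only:

#include <mpi.h>
#include "petsc.h"

/* Build a communicator whose ranks, laid out x-fastest the way the
   PETSc DA expects, land on the same processes as the z-fastest
   ordering used by the other code.  Call after MPI_Init() and
   before PetscInitialize(). */
void UsePermutedPetscComm(int x_procs,int y_procs,int z_procs)
{
  int      old_rank,x,y,z,new_rank;
  MPI_Comm NewComm;

  MPI_Comm_rank(MPI_COMM_WORLD,&old_rank);
  /* coordinates of this process in the other code's ordering (z fastest) */
  x = old_rank / (z_procs*y_procs);
  y = (old_rank % (z_procs*y_procs)) / z_procs;
  z = (old_rank % (z_procs*y_procs)) % z_procs;
  /* rank this process must get so that the DA's x-fastest ordering
     assigns it the same (x,y,z) block */
  new_rank = z*y_procs*x_procs + y*x_procs + x;
  /* same color for everyone; the key argument dictates the new rank */
  MPI_Comm_split(MPI_COMM_WORLD,0,new_rank,&NewComm);
  PETSC_COMM_WORLD = NewComm;   /* must happen before PetscInitialize() */
}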
URL: From enjoywm at cs.wm.edu Sun Jul 5 12:46:51 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Sun, 05 Jul 2009 13:46:51 -0400 Subject: make test Message-ID: <4A50E70B.7070208@cs.wm.edu> Hi, After making test, I received a lot of warnings and lib load failures. Running test examples to verify correct installation Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,1]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. 
This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. 
-------------------------------------------------------------------------- lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 Possible error running Graphics examples src/snes/examples/tutorials/ex19 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 Error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider
--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider
--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider
--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,0]: uDAPL on host md was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
Number of Newton iterations = 4
Completed test examples

Thanks.

Yixun

From balay at mcs.anl.gov Sun Jul 5 12:57:28 2009
From: balay at mcs.anl.gov (Satish Balay)
Date: Sun, 5 Jul 2009 12:57:28 -0500 (CDT)
Subject: make test
In-Reply-To: <4A50E70B.7070208@cs.wm.edu>
References: <4A50E70B.7070208@cs.wm.edu>
Message-ID:

Looks like some issue with your MPI. You might want to talk with your
sysadmin about it.

Also send us some compile logs - so we know what's happening. For example:

cd src/ksp/ksp/examples/tutorials/
make ex2
mpiexec -n 2 ./ex2 [or however you are supposed to run MPI binaries on this cluster]

BTW: If you are currently doing development - don't bother with a
cluster MPI - and just use --download-mpich=1

Satish

On Sun, 5 Jul 2009, Yixun Liu wrote:

> Hi,
> After making test, I received a lot of warnings and lib load failures.
> > Yixun > From enjoywm at cs.wm.edu Sun Jul 5 13:05:17 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Sun, 05 Jul 2009 14:05:17 -0400 Subject: make test In-Reply-To: References: <4A50E70B.7070208@cs.wm.edu> Message-ID: <4A50EB5D.8090001@cs.wm.edu> I run it on my computer. md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make ex2 mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include -I/home/scratch/yixun/petsc-3.0.0-p3/include -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas -L/usr/lib64/mpi/gcc/openmpi/lib64 -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl /bin/rm -f ex2.o md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec -n 2 ./ex2 DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. 
This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,1]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- Norm of error 0.000411674 iterations 7 Satish Balay wrote: > Looks like some issue with your MPI. You might want to talk with your > sysadmin about it. > > Also send us some compile logs - so we know whats hapenning. 
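For anyone reproducing this, here is a minimal sketch of the route Satish suggests above - rebuild PETSc against its own MPICH so that the system Open MPI (and its uDAPL providers) is taken out of the picture. The petsc-3.0.0-p3 path is taken from the logs in this thread; the compiler options are illustrative assumptions, and --download-mpich=1 is the option that matters:

  cd /home/scratch/yixun/petsc-3.0.0-p3
  # reconfigure; --download-mpich=1 builds a local MPICH instead of using the cluster MPI
  ./configure --with-cc=gcc --with-fc=gfortran --download-mpich=1
  make all
  make test

  # rebuild and rerun the example Satish asked about, using the mpiexec
  # that the PETSc build reports (the system mpiexec may still be Open MPI)
  cd src/ksp/ksp/examples/tutorials
  make ex2
  mpiexec -n 2 ./ex2

With the bundled MPICH the DAT/uDAPL warnings should disappear, since they come from the Open MPI transport layer rather than from PETSc itself.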
Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> [0,1,0]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> lid velocity = 0.0016, prandtl # = 1, grashof # = 1 >> Number of Newton iterations = 2 >> lid velocity = 0.0016, prandtl # = 1, grashof # = 1 >> Number of Newton iterations = 2 >> Possible error running Graphics examples >> src/snes/examples/tutorials/ex19 1 MPI process >> See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mthca0-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mthca0-2" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. 
>> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mlx4_0-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mlx4_0-2" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-iwarp" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> [0,1,0]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> lid velocity = 0.0016, prandtl # = 1, grashof # = 1 >> Number of Newton iterations = 2 >> lid velocity = 0.0016, prandtl # = 1, grashof # = 1 >> Number of Newton iterations = 2 >> Error running Fortran example src/snes/examples/tutorials/ex5f with 1 >> MPI process >> See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. 
>> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mthca0-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mthca0-2" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mlx4_0-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mlx4_0-2" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-iwarp" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> [0,1,0]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> Number of Newton iterations = 4 >> Completed test examples >> >> >> Thanks. 
>> >> Yixun >> >> > > From balay at mcs.anl.gov Sun Jul 5 13:17:24 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 5 Jul 2009 13:17:24 -0500 (CDT) Subject: make test In-Reply-To: <4A50EB5D.8090001@cs.wm.edu> References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> Message-ID: On Sun, 5 Jul 2009, Yixun Liu wrote: > I run it on my computer. > > md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make > ex2 > > mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve > -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include > -I/home/scratch/yixun/petsc-3.0.0-p3/include > -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 > -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c > mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o > -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib > -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp > -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas > -L/usr/lib64/mpi/gcc/openmpi/lib64 > -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 > -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl > -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran > -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin > -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s > -lpthread -ldl > /bin/rm -f ex2.o Did you install this OpenMPI - or did someone-else/sysadmin install it for you? > > md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec > -n 2 ./ex2 > > DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: > dat_registry_add_provider > DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: > dat_registry_add_provider > -------------------------------------------------------------------------- > > WARNING: Failed to open "OpenIB-cma" > [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. > This may be a real error or it may be an invalid entry in the uDAPL > Registry which is contained in the dat.conf file. Contact your local > System Administrator to confirm the availability of the interfaces in > the dat.conf file. Your mpiexec is trying to run on infiniban and failing? > -------------------------------------------------------------------------- > [0,1,1]: uDAPL on host md was unable to find any NICs. > Another transport will be used instead, although this may result in > lower performance. > -------------------------------------------------------------------------- > Norm of error 0.000411674 iterations 7 And then it attempts 'sockets' - and then successfully runs the PETSc example.. So something is wrong with your mpi usage. I guess - you'll have to check with your sysadmin - how to correctly use infiniband.. Satish From enjoywm at cs.wm.edu Sun Jul 5 13:27:52 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Sun, 05 Jul 2009 14:27:52 -0400 Subject: make test In-Reply-To: References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> Message-ID: <4A50F0A8.1010700@cs.wm.edu> Satish Balay wrote: > On Sun, 5 Jul 2009, Yixun Liu wrote: > > >> I run it on my computer. 
>> >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make >> ex2 >> >> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve >> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include >> -I/home/scratch/yixun/petsc-3.0.0-p3/include >> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 >> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c >> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o >> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib >> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp >> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas >> -L/usr/lib64/mpi/gcc/openmpi/lib64 >> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 >> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl >> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran >> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin >> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s >> -lpthread -ldl >> /bin/rm -f ex2.o >> > > Did you install this OpenMPI - or did someone-else/sysadmin install it for you? > Sysadmin install it. They let me set LD_LIBRARY_PATH to /usr/lib64/mpi/gcc/openmpi/lib64, but it still doesn't work. > >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec >> -n 2 ./ex2 >> >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> > > >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> > > Your mpiexec is trying to run on infiniban and failing? > > >> -------------------------------------------------------------------------- >> [0,1,1]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> Norm of error 0.000411674 iterations 7 >> > > And then it attempts 'sockets' - and then successfully runs the PETSc example.. > > So something is wrong with your mpi usage. I guess - you'll have to > check with your sysadmin - how to correctly use infiniband.. > > Satish > > From jed at 59A2.org Sun Jul 5 13:33:28 2009 From: jed at 59A2.org (Jed Brown) Date: Sun, 05 Jul 2009 20:33:28 +0200 Subject: make test In-Reply-To: <4A50F0A8.1010700@cs.wm.edu> References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> <4A50F0A8.1010700@cs.wm.edu> Message-ID: <4A50F1F8.40202@59A2.org> Yixun Liu wrote: > Satish Balay wrote: >> On Sun, 5 Jul 2009, Yixun Liu wrote: >> >> >>> I run it on my computer. 
>>> >>> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make >>> ex2 >>> >>> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >>> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve >>> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include >>> -I/home/scratch/yixun/petsc-3.0.0-p3/include >>> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 >>> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c >>> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o >>> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib >>> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp >>> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas >>> -L/usr/lib64/mpi/gcc/openmpi/lib64 >>> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 >>> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl >>> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran >>> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin >>> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s >>> -lpthread -ldl >>> /bin/rm -f ex2.o >>> >> Did you install this OpenMPI - or did someone-else/sysadmin install it for you? >> > Sysadmin install it. They let me set LD_LIBRARY_PATH to > /usr/lib64/mpi/gcc/openmpi/lib64, but it still doesn't work. How about running with 'make runex2_2' or /usr/lib64/mpi/gcc/openmpi/bin/mpiexec? Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From enjoywm at cs.wm.edu Sun Jul 5 14:01:33 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Sun, 05 Jul 2009 15:01:33 -0400 Subject: make test In-Reply-To: <4A50F1F8.40202@59A2.org> References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> <4A50F0A8.1010700@cs.wm.edu> <4A50F1F8.40202@59A2.org> Message-ID: <4A50F88D.9010200@cs.wm.edu> It has the same errors when I use /usr/lib64/mpi/gcc/openmpi/bin/mpiexec. md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>/usr/lib64/mpi/gcc/openmpi/bin/mpiexec -np 2 ./ex2 DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. 
This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. 
-------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,1]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- Norm of error 0.000411674 iterations 7 Jed Brown wrote: > Yixun Liu wrote: > >> Satish Balay wrote: >> >>> On Sun, 5 Jul 2009, Yixun Liu wrote: >>> >>> >>> >>>> I run it on my computer. >>>> >>>> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make >>>> ex2 >>>> >>>> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >>>> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve >>>> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include >>>> -I/home/scratch/yixun/petsc-3.0.0-p3/include >>>> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 >>>> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c >>>> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o >>>> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib >>>> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp >>>> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas >>>> -L/usr/lib64/mpi/gcc/openmpi/lib64 >>>> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 >>>> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl >>>> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran >>>> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin >>>> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s >>>> -lpthread -ldl >>>> /bin/rm -f ex2.o >>>> >>>> >>> Did you install this OpenMPI - or did someone-else/sysadmin install it for you? >>> >>> >> Sysadmin install it. They let me set LD_LIBRARY_PATH to >> /usr/lib64/mpi/gcc/openmpi/lib64, but it still doesn't work. >> > > How about running with 'make runex2_2' or > /usr/lib64/mpi/gcc/openmpi/bin/mpiexec? > > Jed > > From vyan2000 at gmail.com Mon Jul 6 14:22:51 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 6 Jul 2009 15:22:51 -0400 Subject: PCFIELDSPLIT Message-ID: Hi, All, I am reading a large Block Compressed Row Storage PETSc from an application into PETSc binary files. And I use matload to load this PETSc binar matrix as mpiaij. Since the matrix is resulting from a finite volume discretization with degree of freedom 5 at each cell center, what I am going to is use pcfieldsplit and PCFieldSplitGetSubKSP. For each filed I want to use the pc type hypre, and hypre type euclid. My question is: is there any way to send this euclid information by a function call, instead of command line parameter. The thing is that I want to save some typing, just in case that there are ten fields. The way that I am using now is "-fieldsplit_4_sub_pc_type hypre, -fieldsplit_4_sub_pc_hypre_type euclid". Notice that PCSetType can only pass in the "PCHYPRE". 
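Roughly, the kind of thing I am hoping I can write is sketched below.
This is only an illustration of the call sequence I have in mind;
PetscOptionsSetValue is my guess at the routine to use, and nfields is
made up for the example:

   /* sketch: set hypre/euclid for every split programmatically,
      instead of typing -fieldsplit_i_sub_pc_type hypre etc. by hand;
      assumes the usual petscksp.h include plus <stdio.h> */
   char           key[128];
   PetscInt       i, nfields = 10;
   PetscErrorCode ierr;
   for (i = 0; i < nfields; i++) {
     sprintf(key, "-fieldsplit_%d_sub_pc_type", (int)i);
     ierr = PetscOptionsSetValue(key, "hypre");CHKERRQ(ierr);
     sprintf(key, "-fieldsplit_%d_sub_pc_hypre_type", (int)i);
     ierr = PetscOptionsSetValue(key, "euclid");CHKERRQ(ierr);
   }

Is something like that supported, or is there a cleaner way to set this
per field?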
Thank you very much, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jul 6 14:44:50 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 6 Jul 2009 14:44:50 -0500 Subject: PCFIELDSPLIT In-Reply-To: References: Message-ID: On Jul 6, 2009, at 2:22 PM, Ryan Yan wrote: > Hi, All, > I am reading a large Block Compressed Row Storage PETSc from an > application into PETSc binary files. > > And I use matload to load this PETSc binar matrix as mpiaij. Since > the matrix is resulting from a finite volume discretization with > degree of freedom 5 at each cell center, what I am going to is use > pcfieldsplit and PCFieldSplitGetSubKSP. For each filed I want to > use the pc type hypre, and hypre type euclid. > > My question is: is there any way to send this euclid information by > a function call, instead of command line parameter. The thing is > that I want to save some typing, just in case that there are ten > fields. The way that I am using now is "-fieldsplit_4_sub_pc_type > hypre, -fieldsplit_4_sub_pc_hypre_type euclid". > You can put them in a file called .petscrc or another file and list that filename in PetscInitialize() You can call PetscOptionsSet("- fieldsplit_4_sub_pc_hypre_type","euclid"); in your code right after PetscInitialize(). Barry > > Notice that PCSetType can only pass in the "PCHYPRE". > > Thank you very much, > > Yan From enjoywm at cs.wm.edu Tue Jul 7 12:48:13 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Tue, 07 Jul 2009 13:48:13 -0400 Subject: make test In-Reply-To: References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> Message-ID: <4A538A5D.6000606@cs.wm.edu> Hi, I use the command, ./config/configure.py --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 and make test success. But when I compile my Petsc-based application I got the following errors, Linking CXX executable ../../../bin/PETScSolver /home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib/libpetscvec.a(vpscat.o): In function `VecScatterCreateCommon_PtoS': /home/scratch/yixun/petsc-3.0.0-p3/src/vec/vec/utils/vpscat.c:1770: undefined reference to `MPI_Type_create_indexed_block' /home/scratch/yixun/petsc-3.0.0-p3/src/vec/vec/utils/vpscat.c:1792: undefined reference to `MPI_Type_create_indexed_block' collect2: ld returned 1 exit status /usr/bin/mpiCC: No such file or directory gmake[2]: *** [bin/PETScSolver] Error 1 gmake[1]: *** [PersoPkgs/oclatzPkg/MeshRegister/CMakeFiles/PETScSolver.dir/all] Error 2 gmake: *** [all] Error 2 Does it mean that I need to set LD_LIBRARY_PATH to MPICH2 installation path? Thanks. Satish Balay wrote: > On Sun, 5 Jul 2009, Yixun Liu wrote: > > >> I run it on my computer. 
>> >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make >> ex2 >> >> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve >> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include >> -I/home/scratch/yixun/petsc-3.0.0-p3/include >> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 >> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c >> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o >> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib >> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp >> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas >> -L/usr/lib64/mpi/gcc/openmpi/lib64 >> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 >> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl >> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran >> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin >> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s >> -lpthread -ldl >> /bin/rm -f ex2.o >> > > Did you install this OpenMPI - or did someone-else/sysadmin install it for you? > > >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec >> -n 2 ./ex2 >> >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> > > >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> > > Your mpiexec is trying to run on infiniban and failing? > > >> -------------------------------------------------------------------------- >> [0,1,1]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> Norm of error 0.000411674 iterations 7 >> > > And then it attempts 'sockets' - and then successfully runs the PETSc example.. > > So something is wrong with your mpi usage. I guess - you'll have to > check with your sysadmin - how to correctly use infiniband.. > > Satish > > From balay at mcs.anl.gov Tue Jul 7 12:54:50 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 7 Jul 2009 12:54:50 -0500 (CDT) Subject: make test In-Reply-To: <4A538A5D.6000606@cs.wm.edu> References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> <4A538A5D.6000606@cs.wm.edu> Message-ID: > /usr/bin/mpiCC: No such file or directory You are using --downlod-mpich with PETSc - but compiling your code wiht mpiCC from a different MPI install? It won't work. Is your code c++? If so - sugest building PETSc with additional options: '--with-cxx=g++ --with-clanguage=cxx' And then use PETSc Makefile format for your appliation code [that sets all make variables and targets needed to build PETSc applications]. 
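A minimal application makefile in that format looks roughly like the
sketch below. I am writing this from memory, so treat it only as a
sketch: copy the exact 'include' line from the example makefile
mentioned below, and replace PETScSolver with your own target and
source names.

  CFLAGS   =
  FFLAGS   =
  CPPFLAGS =
  FPPFLAGS =

  include ${PETSC_DIR}/conf/base

  PETScSolver: PETScSolver.o chkopts
  	-${CLINKER} -o PETScSolver PETScSolver.o ${PETSC_KSP_LIB}
  	${RM} PETScSolver.o

[With --with-clanguage=cxx the ${CLINKER} is the C++ compiler, so the
same makefile also works for C++ sources.]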
For eg: check src/ksp/ksp/examples/tutorials/makefile Satish On Tue, 7 Jul 2009, Yixun Liu wrote: > Hi, > I use the command, > ./config/configure.py --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 > > and make test success. > > But when I compile my Petsc-based application I got the following errors, > > > Linking CXX executable ../../../bin/PETScSolver > /home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib/libpetscvec.a(vpscat.o): > In function `VecScatterCreateCommon_PtoS': > /home/scratch/yixun/petsc-3.0.0-p3/src/vec/vec/utils/vpscat.c:1770: > undefined reference to `MPI_Type_create_indexed_block' > /home/scratch/yixun/petsc-3.0.0-p3/src/vec/vec/utils/vpscat.c:1792: > undefined reference to `MPI_Type_create_indexed_block' > collect2: ld returned 1 exit status > /usr/bin/mpiCC: No such file or directory > gmake[2]: *** [bin/PETScSolver] Error 1 > gmake[1]: *** > [PersoPkgs/oclatzPkg/MeshRegister/CMakeFiles/PETScSolver.dir/all] Error 2 > gmake: *** [all] Error 2 > > > > Does it mean that I need to set LD_LIBRARY_PATH to MPICH2 installation path? > > Thanks. > > > > > > > > > > > > > Satish Balay wrote: > > On Sun, 5 Jul 2009, Yixun Liu wrote: > > > > > >> I run it on my computer. > >> > >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make > >> ex2 > >> > >> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > >> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve > >> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include > >> -I/home/scratch/yixun/petsc-3.0.0-p3/include > >> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 > >> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c > >> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o > >> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib > >> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp > >> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas > >> -L/usr/lib64/mpi/gcc/openmpi/lib64 > >> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 > >> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl > >> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran > >> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin > >> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s > >> -lpthread -ldl > >> /bin/rm -f ex2.o > >> > > > > Did you install this OpenMPI - or did someone-else/sysadmin install it for you? > > > > > >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec > >> -n 2 ./ex2 > >> > >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: > >> dat_registry_add_provider > >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: > >> dat_registry_add_provider > >> > > > > > >> -------------------------------------------------------------------------- > >> > >> WARNING: Failed to open "OpenIB-cma" > >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. > >> This may be a real error or it may be an invalid entry in the uDAPL > >> Registry which is contained in the dat.conf file. Contact your local > >> System Administrator to confirm the availability of the interfaces in > >> the dat.conf file. > >> > > > > Your mpiexec is trying to run on infiniban and failing? > > > > > >> -------------------------------------------------------------------------- > >> [0,1,1]: uDAPL on host md was unable to find any NICs. 
> >> Another transport will be used instead, although this may result in > >> lower performance. > >> -------------------------------------------------------------------------- > >> Norm of error 0.000411674 iterations 7 > >> > > > > And then it attempts 'sockets' - and then successfully runs the PETSc example.. > > > > So something is wrong with your mpi usage. I guess - you'll have to > > check with your sysadmin - how to correctly use infiniband.. > > > > Satish > > > > > > From luitjens at cs.utah.edu Tue Jul 7 15:01:39 2009 From: luitjens at cs.utah.edu (Justin Luitjens) Date: Tue, 7 Jul 2009 14:01:39 -0600 Subject: PCILUSetFill in 3.0.0 Message-ID: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> Hi, We are trying to make our code 3.0.0 compliant. We are currently using versions in the 2.3.* range. We currently have a call to PCILUSetFill in order to preallocate memory. What is the equivalent to this call in 3.0.0? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From tyoung at ippt.gov.pl Tue Jul 7 15:18:51 2009 From: tyoung at ippt.gov.pl (Toby D. Young) Date: Tue, 7 Jul 2009 22:18:51 +0200 (CEST) Subject: PCILUSetFill in 3.0.0 In-Reply-To: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> References: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> Message-ID: > We are trying to make our code 3.0.0 compliant. We are currently using > versions in the 2.3.* range. We currently have a call to PCILUSetFill in > order to preallocate memory. What is the equivalent to this call in 3.0.0? Allocating memory for PETSc is a pain in the ass. Check the documentation or (better) ask Barry Smith directly. Then let me know about it and I will write a patch for us deal.ii.ers. I will gladly submit a patch for this. Throw something at me.... like an error message???? ;-) Cheers, Toby ----- Toby D. Young Philosopher-Physicist Adiunkt (Assistant Professor) Polish Academy of Sciences Warszawa, Polska www: http://www.ippt.gov.pl/~tyoung skype: stenografia From bsmith at mcs.anl.gov Tue Jul 7 15:20:21 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 7 Jul 2009 15:20:21 -0500 Subject: PCILUSetFill in 3.0.0 In-Reply-To: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> References: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> Message-ID: <91E99583-E7CB-4756-BEC7-890AB166D8FD@mcs.anl.gov> PCFactorSetFill(). Essentially we introduced a factor class that took all the methods common to the various PCILUXXX, PCICCXXX, PCLUXXX, ... objects and put them together. Barry On Jul 7, 2009, at 3:01 PM, Justin Luitjens wrote: > Hi, > > We are trying to make our code 3.0.0 compliant. We are currently > using versions in the 2.3.* range. We currently have a call to > PCILUSetFill in order to preallocate memory. What is the equivalent > to this call in 3.0.0? > > Thanks, > Justin From john.fettig at gmail.com Tue Jul 7 15:20:50 2009 From: john.fettig at gmail.com (John Fettig) Date: Tue, 7 Jul 2009 15:20:50 -0500 Subject: MatGetSubMatrix performance Message-ID: What kind of performance should one expect with MatGetSubMatrix on a MPIAIJ matrix, and is there anything that I need to know to get the best performance? Or is this routine best avoided? I currently use it, but find that performance varies widely from call to call. One time it will take 0.25 seconds, another time it will take 185 seconds, and I can't figure out what would cause such a disparity. 
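For reference, the call I am timing is essentially the following
(simplified, and I may be misremembering the exact argument list; the
index sets isrow/iscol change from call to call):

   Mat            A, Asub;
   IS             isrow, iscol;
   PetscLogDouble t0, t1;
   PetscErrorCode ierr;
   /* ... A (MPIAIJ), isrow and iscol are already built ... */
   ierr = PetscGetTime(&t0);CHKERRQ(ierr);
   ierr = MatGetSubMatrix(A, isrow, iscol, PETSC_DECIDE,
                          MAT_INITIAL_MATRIX, &Asub);CHKERRQ(ierr);
   ierr = PetscGetTime(&t1);CHKERRQ(ierr);
   ierr = PetscPrintf(PETSC_COMM_WORLD,
                      "MatGetSubMatrix: %g s\n", t1 - t0);CHKERRQ(ierr);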
John From bsmith at mcs.anl.gov Tue Jul 7 15:33:55 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 7 Jul 2009 15:33:55 -0500 Subject: MatGetSubMatrix performance In-Reply-To: References: Message-ID: I've found that it is generally much faster than the numerical parts of the code (for example if you use a MatGetSubmatrix to select a big chunk of the matrix and then solve a linear system on that chunk the get submatrix may take 5 percent of the time to solve the system). So, in general, I don't think there is a reason to avoid it. Are you getting a huge difference in time for the exact same submatrix? This would surprise me. A cluster with gigabyte ethernet will also be slow. The performance will get bad for a poor load balance of the gotten submatrix. For example if some processes get huge chunks of other processes values it will be slow. Generally you want most of the gotten rows to live on the same process they are gotten from. Barry On Jul 7, 2009, at 3:20 PM, John Fettig wrote: > What kind of performance should one expect with MatGetSubMatrix on a > MPIAIJ matrix, and is there anything that I need to know to get the > best performance? Or is this routine best avoided? I currently use > it, but find that performance varies widely from call to call. One > time it will take 0.25 seconds, another time it will take 185 seconds, > and I can't figure out what would cause such a disparity. > > John From yfeng1 at tigers.lsu.edu Wed Jul 8 13:24:05 2009 From: yfeng1 at tigers.lsu.edu (Yin Feng) Date: Wed, 8 Jul 2009 13:24:05 -0500 Subject: A question about parallel computation Message-ID: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> I am a beginner of PETSc. I tried the PETSC example 5(ex5) with 4 nodes, However, it seems every nodes doing the exactly the same things and output the same results again and again. is this the problem of petsc or MPI installation? Thank you in adcance! Sincerely, YIN From balay at mcs.anl.gov Wed Jul 8 13:26:26 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 8 Jul 2009 13:26:26 -0500 (CDT) Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> Message-ID: Perhaps you are using the wrong mpiexec or mpirun. You'll have to use the correspond mpiexec from MPI you've used to build PETSc. Or if the MPI has special instruction on usage - you should follow that [for ex: some clusters require extra options to mpiexec ] Satish On Wed, 8 Jul 2009, Yin Feng wrote: > I am a beginner of PETSc. > I tried the PETSC example 5(ex5) with 4 nodes, > However, it seems every nodes doing the exactly the same things and > output the same results again and again. is this the problem of petsc or > MPI installation? > > Thank you in adcance! > > Sincerely, > YIN > From enjoywm at cs.wm.edu Wed Jul 8 14:49:20 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Wed, 08 Jul 2009 15:49:20 -0400 Subject: rebuild petsc Message-ID: <4A54F840.8060002@cs.wm.edu> Hi, I want to clean the configuration generated at last time. Which command should I use? Thanks. 
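(My guess is that it is enough to remove the old build directory under
PETSC_DIR, e.g. rm -rf linux-gnu-c-debug in my case, and then re-run
config/configure.py, but I want to confirm before deleting anything.)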
Yixun From balay at mcs.anl.gov Wed Jul 8 15:47:30 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 8 Jul 2009 15:47:30 -0500 (CDT) Subject: rebuild petsc In-Reply-To: <4A54F840.8060002@cs.wm.edu> References: <4A54F840.8060002@cs.wm.edu> Message-ID: rm -rf PETSC_ARCH Satish On Wed, 8 Jul 2009, Yixun Liu wrote: > Hi, > I want to clean the configuration generated at last time. Which command > should I use? > > Thanks. > > Yixun > From yfeng1 at tigers.lsu.edu Wed Jul 8 15:48:42 2009 From: yfeng1 at tigers.lsu.edu (Yin Feng) Date: Wed, 8 Jul 2009 15:48:42 -0500 Subject: A question about parallel computation In-Reply-To: References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> Message-ID: <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. But, when I check the load on each node, I found the master node take all the load and others are just free. Did you have any idea about this situation? Thanks in adcance! Sincerely, YIN On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > the correspond mpiexec from MPI you've used to build PETSc. > > Or if the MPI has special instruction on usage - you should follow > that [for ex: some clusters require extra options to mpiexec ] > > Satish > > On Wed, 8 Jul 2009, Yin Feng wrote: > >> I am a beginner of PETSc. >> I tried the PETSC example 5(ex5) with 4 nodes, >> However, it seems every nodes doing the exactly the same things and >> output the same results again and again. is this the problem of petsc or >> MPI installation? >> >> Thank you in adcance! >> >> Sincerely, >> YIN >> > > From balay at mcs.anl.gov Wed Jul 8 16:01:39 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 8 Jul 2009 16:01:39 -0500 (CDT) Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> Message-ID: Sounds like openmpi configuration issue. Perhaps you need to configure hostfile for it? You can try '--default-hostfile' option for mpiexec. Also - you should figure out OpenMPI issues with a simple MPI test code [like cpi.c] - not PETSc. Satish On Wed, 8 Jul 2009, Yin Feng wrote: > I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. > But, when I check the load on each node, I found the master node take > all the load > and others are just free. > > Did you have any idea about this situation? > > Thanks in adcance! > > Sincerely, > YIN > > On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > > the correspond mpiexec from MPI you've used to build PETSc. > > > > Or if the MPI has special instruction on usage - you should follow > > that [for ex: some clusters require extra options to mpiexec ] > > > > Satish > > > > On Wed, 8 Jul 2009, Yin Feng wrote: > > > >> I am a beginner of PETSc. > >> I tried the PETSC example 5(ex5) with 4 nodes, > >> However, it seems every nodes doing the exactly the same things and > >> output the same results again and again. is this the problem of petsc or > >> MPI installation? > >> > >> Thank you in adcance! 
> >> > >> Sincerely, > >> YIN > >> > > > > > From chianshin at gmail.com Wed Jul 8 16:15:18 2009 From: chianshin at gmail.com (Xin Qian) Date: Wed, 8 Jul 2009 17:15:18 -0400 Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> Message-ID: You can try to run sole MPI samples coming with OpenMPI first, make sure the OpenMPI is running all right. Thanks, Xin Qian On Wed, Jul 8, 2009 at 4:48 PM, Yin Feng wrote: > I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. > But, when I check the load on each node, I found the master node take > all the load > and others are just free. > > Did you have any idea about this situation? > > Thanks in adcance! > > Sincerely, > YIN > > On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > > the correspond mpiexec from MPI you've used to build PETSc. > > > > Or if the MPI has special instruction on usage - you should follow > > that [for ex: some clusters require extra options to mpiexec ] > > > > Satish > > > > On Wed, 8 Jul 2009, Yin Feng wrote: > > > >> I am a beginner of PETSc. > >> I tried the PETSC example 5(ex5) with 4 nodes, > >> However, it seems every nodes doing the exactly the same things and > >> output the same results again and again. is this the problem of petsc or > >> MPI installation? > >> > >> Thank you in adcance! > >> > >> Sincerely, > >> YIN > >> > > > > > -- QIAN, Xin (http://pubpages.unh.edu/~xqian/) xqian at unh.edu chianshin at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From yfeng1 at tigers.lsu.edu Thu Jul 9 00:02:37 2009 From: yfeng1 at tigers.lsu.edu (Yin Feng) Date: Thu, 9 Jul 2009 00:02:37 -0500 Subject: A question about parallel computation In-Reply-To: References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> Message-ID: <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> Firstly, thanks for all your replies! I changed compiler to MPICH and tried a sample successfully but the problem is still there. I ran my code in 4 nodes and each node have 8 processors. And the information I saw is like: NODE LOAD 0 32 1 0 2 0 3 0 Normally, in that case, we should see is: NODE LOAD 0 8 1 8 2 8 3 8 So, anyone got any idea about this? Thank you in advance! Sincerely, YIN On Wed, Jul 8, 2009 at 4:15 PM, Xin Qian wrote: > You can try to run sole MPI samples coming with OpenMPI first, make sure the > OpenMPI is running all right. > > Thanks, > > Xin Qian > > On Wed, Jul 8, 2009 at 4:48 PM, Yin Feng wrote: >> >> I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. >> But, when I check the load on each node, I found the master node take >> all the load >> and others are just free. >> >> Did you have any idea about this situation? >> >> Thanks in adcance! >> >> Sincerely, >> YIN >> >> On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: >> > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use >> > the correspond mpiexec from MPI you've used to build PETSc. >> > >> > Or if the MPI has special instruction on usage - you should follow >> > that [for ex: some clusters require extra options to mpiexec ] >> > >> > Satish >> > >> > On Wed, 8 Jul 2009, Yin Feng wrote: >> > >> >> I am a beginner of PETSc. 
>> >> I tried the PETSC example 5(ex5) with 4 nodes, >> >> However, it seems every nodes doing the exactly the same things and >> >> output the same results again and again. is this the problem of petsc >> >> or >> >> MPI installation? >> >> >> >> Thank you in adcance! >> >> >> >> Sincerely, >> >> YIN >> >> >> > >> > > > > > -- > QIAN, Xin (http://pubpages.unh.edu/~xqian/) > xqian at unh.edu chianshin at gmail.com > From sekikawa at msi.co.jp Thu Jul 9 02:50:24 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Thu, 09 Jul 2009 16:50:24 +0900 Subject: PETSc configure with Intel-compiler static linking Message-ID: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> Hello petsc users, I need to know how to configure PETSc with Intel-compiler (icc/icpc) on static linking. shared linking is just fine, but I need to build PETSc with static-linking. so I tried several description. [1] $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} environment variable MKL_DIR is set to intel MKL library directory. this one is fine, (also compiling and running is ok) but is spite of "--with-shared=0" flag, executable still link with .so (libmkl_lapack.so, etc) so I tried another one: [2] $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a this time configure.py failed: ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- You set a value for --with-blas-lapack-lib=, but ['/opt/intel/mkl/10.0.010/lib/em64t/libmkl_lapack.a'] cannot be used ********************************************************************************* Could someone give me good advice? (or examples are greatly appriciated) Thanks in advance Takuya From knepley at gmail.com Thu Jul 9 06:08:00 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 9 Jul 2009 06:08:00 -0500 Subject: PETSc configure with Intel-compiler static linking In-Reply-To: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> References: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> Message-ID: For any configure problem, you MUST send configure.log or we have no idea what happened. Matt On Thu, Jul 9, 2009 at 2:50 AM, Takuya Sekikawa wrote: > Hello petsc users, > > I need to know how to configure PETSc with Intel-compiler (icc/icpc) on > static linking. > > shared linking is just fine, > but I need to build PETSc with static-linking. so I tried several > description. > > [1] > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} > > environment variable MKL_DIR is set to intel MKL library directory. 
> this one is fine, (also compiling and running is ok) > but is spite of "--with-shared=0" flag, executable still link with .so > (libmkl_lapack.so, etc) > > so I tried another one: > > [2] > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a > > this time configure.py failed: > > > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-lib=, but > ['/opt/intel/mkl/10.0.010/lib/em64t/libmkl_lapack.a'] cannot be used > > ********************************************************************************* > > Could someone give me good advice? (or examples are greatly appriciated) > > Thanks in advance > > Takuya > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jul 9 06:15:49 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 9 Jul 2009 06:15:49 -0500 Subject: rebuild petsc In-Reply-To: References: <4A54F840.8060002@cs.wm.edu> Message-ID: cd $PETSC_DIR rm -f $PETSC_ARCH On Wed, Jul 8, 2009 at 3:47 PM, Satish Balay wrote: > rm -rf PETSC_ARCH > > Satish > > On Wed, 8 Jul 2009, Yixun Liu wrote: > > > Hi, > > I want to clean the configuration generated at last time. Which command > > should I use? > > > > Thanks. > > > > Yixun > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jul 9 06:20:13 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 9 Jul 2009 06:20:13 -0500 Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> Message-ID: I think it is time to ask your system administrator for help. Matt On Thu, Jul 9, 2009 at 12:02 AM, Yin Feng wrote: > Firstly, thanks for all your replies! > > I changed compiler to MPICH and tried a sample successfully but the > problem is still there. > I ran my code in 4 nodes and each node have 8 processors. And the > information I saw is like: > NODE LOAD > 0 32 > 1 0 > 2 0 > 3 0 > > Normally, in that case, we should see is: > NODE LOAD > 0 8 > 1 8 > 2 8 > 3 8 > > So, anyone got any idea about this? > > Thank you in advance! > > Sincerely, > YIN > > On Wed, Jul 8, 2009 at 4:15 PM, Xin Qian wrote: > > You can try to run sole MPI samples coming with OpenMPI first, make sure > the > > OpenMPI is running all right. > > > > Thanks, > > > > Xin Qian > > > > On Wed, Jul 8, 2009 at 4:48 PM, Yin Feng wrote: > >> > >> I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. > >> But, when I check the load on each node, I found the master node take > >> all the load > >> and others are just free. > >> > >> Did you have any idea about this situation? > >> > >> Thanks in adcance! 
> >> > >> Sincerely, > >> YIN > >> > >> On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > >> > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > >> > the correspond mpiexec from MPI you've used to build PETSc. > >> > > >> > Or if the MPI has special instruction on usage - you should follow > >> > that [for ex: some clusters require extra options to mpiexec ] > >> > > >> > Satish > >> > > >> > On Wed, 8 Jul 2009, Yin Feng wrote: > >> > > >> >> I am a beginner of PETSc. > >> >> I tried the PETSC example 5(ex5) with 4 nodes, > >> >> However, it seems every nodes doing the exactly the same things and > >> >> output the same results again and again. is this the problem of petsc > >> >> or > >> >> MPI installation? > >> >> > >> >> Thank you in adcance! > >> >> > >> >> Sincerely, > >> >> YIN > >> >> > >> > > >> > > > > > > > > > -- > > QIAN, Xin (http://pubpages.unh.edu/~xqian/ > ) > > xqian at unh.edu chianshin at gmail.com > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Jul 9 09:50:21 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 9 Jul 2009 09:50:21 -0500 (CDT) Subject: PETSc configure with Intel-compiler static linking In-Reply-To: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> References: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> Message-ID: On Thu, 9 Jul 2009, Takuya Sekikawa wrote: > Hello petsc users, > > I need to know how to configure PETSc with Intel-compiler (icc/icpc) on > static linking. Why? > > shared linking is just fine, > but I need to build PETSc with static-linking. so I tried several description. > > [1] > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} > > environment variable MKL_DIR is set to intel MKL library directory. > this one is fine, (also compiling and running is ok) > but is spite of "--with-shared=0" flag, executable still link with .so > (libmkl_lapack.so, etc) --with-shared=0 refers to petsc libraries. It doesn't mean static linking or shared linking. Generally static linking is done by the linker option [with icc/ifort its: -Bstatic]. But since all system libraries might not be available as static libraries - this might not work. Esp with MKL - since the librariry names are different between .so and .a files. [so PETSc configure doesn't explicitly look for tha .a names. > > so I tried another one: > > [2] > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a Generally - you need -lmkl_lapack -lmkl -lpthread -lguide However -lmkl is only available as .so. 
So you'll have to cat libmkl.so to see what the actual libraries it links with: For me I have: [petsc:10.0.2.018/lib/em64t] petsc> cat libmkl.so GROUP (libmkl_intel_lp64.so libmkl_intel_thread.so libmkl_core.so) [petsc:10.0.2.018/lib/em64t] petsc> So you might be able to use: --with-blas-lapack-lib="${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a" Satish > > this time configure.py failed: > > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-lib=, but ['/opt/intel/mkl/10.0.010/lib/em64t/libmkl_lapack.a'] cannot be used > ********************************************************************************* > > Could someone give me good advice? (or examples are greatly appriciated) > > Thanks in advance > > Takuya > From balay at mcs.anl.gov Thu Jul 9 09:52:49 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 9 Jul 2009 09:52:49 -0500 (CDT) Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> Message-ID: You'll have to learn about the MPI you've installed. If its MPICH - how did you install it? Did you install with PETSc or MPICH separately? Did you make sure its install with mpd? [This is the default if its installed separately. However if you've installed with PETSc - you will need additional option: --download-mpich-pm=mpd] And then have you configured mpd correctly across all the nodes you'd like to use? These are all MPI issues - you should figure these out - before attempting PETSc. Satish On Thu, 9 Jul 2009, Yin Feng wrote: > Firstly, thanks for all your replies! > > I changed compiler to MPICH and tried a sample successfully but the > problem is still there. > I ran my code in 4 nodes and each node have 8 processors. And the > information I saw is like: > NODE LOAD > 0 32 > 1 0 > 2 0 > 3 0 > > Normally, in that case, we should see is: > NODE LOAD > 0 8 > 1 8 > 2 8 > 3 8 > > So, anyone got any idea about this? > > Thank you in advance! > > Sincerely, > YIN > > On Wed, Jul 8, 2009 at 4:15 PM, Xin Qian wrote: > > You can try to run sole MPI samples coming with OpenMPI first, make sure the > > OpenMPI is running all right. > > > > Thanks, > > > > Xin Qian > > > > On Wed, Jul 8, 2009 at 4:48 PM, Yin Feng wrote: > >> > >> I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. > >> But, when I check the load on each node, I found the master node take > >> all the load > >> and others are just free. > >> > >> Did you have any idea about this situation? > >> > >> Thanks in adcance! > >> > >> Sincerely, > >> YIN > >> > >> On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > >> > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > >> > the correspond mpiexec from MPI you've used to build PETSc. > >> > > >> > Or if the MPI has special instruction on usage - you should follow > >> > that [for ex: some clusters require extra options to mpiexec ] > >> > > >> > Satish > >> > > >> > On Wed, 8 Jul 2009, Yin Feng wrote: > >> > > >> >> I am a beginner of PETSc. 
> >> >> I tried the PETSC example 5(ex5) with 4 nodes, > >> >> However, it seems every nodes doing the exactly the same things and > >> >> output the same results again and again. is this the problem of petsc > >> >> or > >> >> MPI installation? > >> >> > >> >> Thank you in adcance! > >> >> > >> >> Sincerely, > >> >> YIN > >> >> > >> > > >> > > > > > > > > > -- > > QIAN, Xin (http://pubpages.unh.edu/~xqian/) > > xqian at unh.edu chianshin at gmail.com > > > From balay at mcs.anl.gov Thu Jul 9 21:37:02 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 9 Jul 2009 21:37:02 -0500 (CDT) Subject: PETSc configure with Intel-compiler static linking In-Reply-To: <20090710085311.4E21.SEKIKAWA@msi.co.jp> References: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> <20090710085311.4E21.SEKIKAWA@msi.co.jp> Message-ID: For one - shell is not expanding ${MKL_DIR} for you. Perhaps you used the wrong quotes? Anyway - the current configure interface to --with-blas-lapack-lib prevents listing files as I mentioned before. So you can try the following workarround: - create a different mkl location for just the .a files - and use it with configure - as follows: [choose any convinent location] mkdir /foo/mkl-static cp /opt/intel/mkl/10.0.010/lib/em64t/*.a /foo/mkl-static/ cd $PETSC_DIR ./configure .... --with-blas-lapack-lib=[/foo/mkl-static/libmkl_lapack.a,mkl_intel_lp64.a,libmkl_core.a,libguide.a,libthread.a] Also we prevent flooding the mailing list with configure.log - so such issues [requiring communicating configure.log] can be sent to petsc-maint at mcs.anl.gov Satish On Fri, 10 Jul 2009, Takuya Sekikawa wrote: > Dear Matt and Satish, > > Thank you for quick response. > > On Thu, 9 Jul 2009 06:08:00 -0500 > Matthew Knepley wrote: > > > For any configure problem, you MUST send configure.log or we have no idea > > what happened. > > Oh, Sorry. > I attached latest configure.log. > > On Thu, 9 Jul 2009 09:50:21 -0500 (CDT) > Satish Balay wrote: > > > On Thu, 9 Jul 2009, Takuya Sekikawa wrote: > > > > > Hello petsc users, > > > > > > I need to know how to configure PETSc with Intel-compiler (icc/icpc) on > > > static linking. > > > > Why? > > Mainly because of license problem. > .so version needs target user to purchase licsense. > > > > shared linking is just fine, > > > but I need to build PETSc with static-linking. so I tried several description. > > > > > > [1] > > > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > > > --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} > > > > > > environment variable MKL_DIR is set to intel MKL library directory. > > > this one is fine, (also compiling and running is ok) > > > but is spite of "--with-shared=0" flag, executable still link with .so > > > (libmkl_lapack.so, etc) > > > > --with-shared=0 refers to petsc libraries. It doesn't mean static > > linking or shared linking. > > Ok. I understood. > > > Generally static linking is done by the linker option [with icc/ifort > > its: -Bstatic]. But since all system libraries might not be available > > as static libraries - this might not work. > > > > Esp with MKL - since the librariry names are different between .so and > > .a files. [so PETSc configure doesn't explicitly look for tha .a > > names. 
> > > > > > > > so I tried another one: > > > > > > [2] > > > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > > > --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a > > > > Generally - you need -lmkl_lapack -lmkl -lpthread -lguide > > > > However -lmkl is only available as .so. So you'll have to cat > > libmkl.so to see what the actual libraries it links with: For me I > > have: > > > > [petsc:10.0.2.018/lib/em64t] petsc> cat libmkl.so > > GROUP (libmkl_intel_lp64.so libmkl_intel_thread.so libmkl_core.so) > > [petsc:10.0.2.018/lib/em64t] petsc> > > > > > > So you might be able to use: > > > > --with-blas-lapack-lib="${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a" > > Thank you. > I tried as you wrote, but unsuccessful. > configure.py said: > > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-lib=, but ['${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a'] cannot be used > ********************************************************************************* > > What is the real cause of "cannot be used" ? > I cannot make out why "cannot be used" (.a is collapsed? or simply need > to specify more .a?) > > Takuya > From sekikawa at msi.co.jp Fri Jul 10 03:58:12 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Fri, 10 Jul 2009 17:58:12 +0900 Subject: PETSc configure with Intel-compiler static linking In-Reply-To: References: <20090710085311.4E21.SEKIKAWA@msi.co.jp> Message-ID: <20090710174935.4E36.SEKIKAWA@msi.co.jp> Dear Satish, On Thu, 9 Jul 2009 21:37:02 -0500 (CDT) Satish Balay wrote: > For one - shell is not expanding ${MKL_DIR} for you. Perhaps you used > the wrong quotes? As you wrote I suspected ${MKL_DIR} didn't expand by shell so I changed this part to fullpath, but result was same. > Anyway - the current configure interface to --with-blas-lapack-lib > prevents listing files as I mentioned before. So you can try the > following workarround: > > - create a different mkl location for just the .a files - and use it > with configure - as follows: > > [choose any convinent location] > mkdir /foo/mkl-static > cp /opt/intel/mkl/10.0.010/lib/em64t/*.a /foo/mkl-static/ > cd $PETSC_DIR > ./configure .... --with-blas-lapack-lib=[/foo/mkl-static/libmkl_lapack.a,mkl_intel_lp64.a,libmkl_core.a,libguide.a,libthread.a] Thank you for your advice. but seems that it don't work well... Well, situation was changed. I pursaded my client that we have to use .so version of MKL. so for the time I don't have to compile PETSc with Intel static library. Thank you for assistance. > Also we prevent flooding the mailing list with configure.log - so such > issues [requiring communicating configure.log] can be sent to > petsc-maint at mcs.anl.gov I'm so sorry. next time I'll post configure.log to maintainance address. Takuya > Satish > > On Fri, 10 Jul 2009, Takuya Sekikawa wrote: > > > Dear Matt and Satish, > > > > Thank you for quick response. > > > > On Thu, 9 Jul 2009 06:08:00 -0500 > > Matthew Knepley wrote: > > > > > For any configure problem, you MUST send configure.log or we have no idea > > > what happened. > > > > Oh, Sorry. > > I attached latest configure.log. 
> > > > On Thu, 9 Jul 2009 09:50:21 -0500 (CDT) > > Satish Balay wrote: > > > > > On Thu, 9 Jul 2009, Takuya Sekikawa wrote: > > > > > > > Hello petsc users, > > > > > > > > I need to know how to configure PETSc with Intel-compiler (icc/icpc) on > > > > static linking. > > > > > > Why? > > > > Mainly because of license problem. > > .so version needs target user to purchase licsense. > > > > > > shared linking is just fine, > > > > but I need to build PETSc with static-linking. so I tried several description. > > > > > > > > [1] > > > > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > > > > --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} > > > > > > > > environment variable MKL_DIR is set to intel MKL library directory. > > > > this one is fine, (also compiling and running is ok) > > > > but is spite of "--with-shared=0" flag, executable still link with .so > > > > (libmkl_lapack.so, etc) > > > > > > --with-shared=0 refers to petsc libraries. It doesn't mean static > > > linking or shared linking. > > > > Ok. I understood. > > > > > Generally static linking is done by the linker option [with icc/ifort > > > its: -Bstatic]. But since all system libraries might not be available > > > as static libraries - this might not work. > > > > > > Esp with MKL - since the librariry names are different between .so and > > > .a files. [so PETSc configure doesn't explicitly look for tha .a > > > names. > > > > > > > > > > > so I tried another one: > > > > > > > > [2] > > > > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > > > > --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a > > > > > > Generally - you need -lmkl_lapack -lmkl -lpthread -lguide > > > > > > However -lmkl is only available as .so. So you'll have to cat > > > libmkl.so to see what the actual libraries it links with: For me I > > > have: > > > > > > [petsc:10.0.2.018/lib/em64t] petsc> cat libmkl.so > > > GROUP (libmkl_intel_lp64.so libmkl_intel_thread.so libmkl_core.so) > > > [petsc:10.0.2.018/lib/em64t] petsc> > > > > > > > > > So you might be able to use: > > > > > > --with-blas-lapack-lib="${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a" > > > > Thank you. > > I tried as you wrote, but unsuccessful. > > configure.py said: > > > > ********************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > --------------------------------------------------------------------------------------- > > You set a value for --with-blas-lapack-lib=, but ['${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a'] cannot be used > > ********************************************************************************* > > > > What is the real cause of "cannot be used" ? > > I cannot make out why "cannot be used" (.a is collapsed? or simply need > > to specify more .a?) > > > > Takuya > > --------------------------------------------------------------- ? Takuya Sekikawa ??? Mathematical Systems, Inc ? 
sekikawa at msi.co.jp --------------------------------------------------------------- From w_subber at yahoo.com Fri Jul 10 20:18:49 2009 From: w_subber at yahoo.com (Waad Subber) Date: Fri, 10 Jul 2009 18:18:49 -0700 (PDT) Subject: Matrix transpose Message-ID: <244334.84354.qm@web38207.mail.mud.yahoo.com> Hi all In the function MatMatMultTranspose(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) is A has to be a square matrix ? And what about the function MatTranspose(Mat mat,MatReuse reuse,Mat *B) is mat has to be a square matrix too ? I am trying to use these functions with a rectangular matrix. but it doesn't work for me ! Thanks Waad -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 10 20:42:47 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 10 Jul 2009 20:42:47 -0500 Subject: Matrix transpose In-Reply-To: <244334.84354.qm@web38207.mail.mud.yahoo.com> References: <244334.84354.qm@web38207.mail.mud.yahoo.com> Message-ID: <424F650E-380B-4B6F-B9F5-BA7BBB430755@mcs.anl.gov> They have not been written or tested to work for general rectangular matrices. They may work for some formats and not for others. You may need to debug and modify them yourself to provide the support you need. Or perhaps another PETSc user can generalize them. Barry We haven't had the time to provide all functionality that would be nice to have. On Jul 10, 2009, at 8:18 PM, Waad Subber wrote: > Hi all > > In the function MatMatMultTranspose(Mat A,Mat B,MatReuse > scall,PetscReal fill,Mat *C) > > is A has to be a square matrix ? > > And what about the function MatTranspose(Mat mat,MatReuse reuse,Mat > *B) is mat has to be a square matrix too ? > > I am trying to use these functions with a rectangular matrix. but it > doesn't work for me ! > > Thanks > Waad > From saswata at umd.edu Sun Jul 12 10:34:21 2009 From: saswata at umd.edu (Saswata Hier-Majumder) Date: Sun, 12 Jul 2009 11:34:21 -0400 Subject: VTK output from DA vectors Message-ID: <4A5A027D.60901@umd.edu> Hi, I would like to generate a vtk output from a multicomponent problem. In the vtk file, I would like the DA coordinates as well as all 4 components stored separately as scalar point data. I have been using the VecView_VTK routine from /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to contain only the local coordinates (may be because DAGetCoordinates is not collective?). Is there a way to fix this? Also, using the same routine, all components of the solution corresponding to a node are dumped together. Is there a way to extract each component separately and ouput them separately as scalar point data? Thanks -- www.geol.umd.edu/~saswata From vyan2000 at gmail.com Sun Jul 12 15:30:09 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 12 Jul 2009 16:30:09 -0400 Subject: about src/mat/examples/tutorials/ex5.c.html Message-ID: http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/src/mat/examples/tutorials/ex5.c.html Hi All, I am tring to read through an example about PetscBinaryRead. It looks like the matrix is reading from a CRS matrix object descriptor "fd1, or fd2". 
I have difficulty of understand the line 50: +++++++++++++++++++++++++++++ 50: PetscBinaryRead(fd2,(char *)header,4,PETSC_INT); +++++++++++++++++++++++++++++ >From the context, my guess is: header[0] unknown header[1] contains the info of how many rows of matrix stored on this processor header[2] contains the info of how many global columns of the matrix header[3] unknown and line: ++++++++++++++++++++++++++++ 86: PetscBinaryRead(fd1,ourlens,m,PETSC_INT); 101: PetscBinaryRead(fd1,mycols,ourlens[i],PETSC_INT); +++++++++++++++++++++++++++++++ >From the context, my guess is: ourlens[i] stores the length of the ith local row for the "local" portion of the matrix. mycols is an array storing the column indices of the nonzero entries of the ith local row for the "local" portion of the matrix(include diagonal and off diagonal). Is there any pointer to the definition of the struct descriptor. Can anyone confirm my guess and provide a pointer or example? How does the PetscBinaryRead() switch smoothly between reading different informations with the same parameter list. Thank you very much, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Jul 12 16:41:56 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 12 Jul 2009 16:41:56 -0500 Subject: about src/mat/examples/tutorials/ex5.c.html In-Reply-To: References: Message-ID: <990EBC48-42A5-45C3-B7F2-A7B26537246E@mcs.anl.gov> The manual page for MatLoad() and VecLoad() contain the definitions of those structs. The file binary format is independent of parallel storage of the matrix so has no information about the "diagonal" and "off-diagonal" parts of the matrix. That is all determined when the binary file is read in. Barry On Jul 12, 2009, at 3:30 PM, Ryan Yan wrote: > http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/src/mat/examples/tutorials/ex5.c.html > > Hi All, > I am tring to read through an example about PetscBinaryRead. It > looks like the matrix is reading from a CRS matrix object descriptor > "fd1, or fd2". > > I have difficulty of understand the line 50: > +++++++++++++++++++++++++++++ > 50: PetscBinaryRead(fd2,(char *)header,4,PETSC_INT); > +++++++++++++++++++++++++++++ > From the context, my guess is: > header[0] unknown > > header[1] contains the info of how many rows of matrix stored on > this processor > > header[2] contains the info of how many global columns of the matrix > > header[3] unknown > > and line: > ++++++++++++++++++++++++++++ > 86: PetscBinaryRead(fd1,ourlens,m,PETSC_INT); > 101: PetscBinaryRead(fd1,mycols,ourlens[i],PETSC_INT); > +++++++++++++++++++++++++++++++ > From the context, my guess is: > ourlens[i] stores the length of the ith local row for the "local" > portion of the matrix. > > mycols is an array storing the column indices of the nonzero entries > of the ith local row for the "local" portion of the matrix(include > diagonal and off diagonal). > > > > > > Is there any pointer to the definition of the struct descriptor. > > Can anyone confirm my guess and provide a pointer or example? How > does the PetscBinaryRead() switch smoothly between reading different > informations with the same parameter list. 
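[Editor's note: the four header entries asked about above are the ones documented on the MatLoad() man page that Barry points to. Below is a minimal sketch of reading them; the file name is a placeholder, error checking is omitted, and the calls follow the petsc-3.0-era interface used elsewhere in this thread.]

    /* Sketch: inspect the header of a PETSc binary Mat file.
       Per the MatLoad() man page: header[0] is the Mat cookie/classid,
       header[1] the global number of rows, header[2] the global number
       of columns, header[3] the total number of nonzeros.  These are
       followed by the per-row nonzero counts, then all column indices,
       then all values; nothing in the file describes per-process or
       diagonal/off-diagonal storage -- that split is made only when the
       matrix is read in. */
    #include "petscmat.h"

    int main(int argc,char **argv)
    {
      int      fd;
      PetscInt header[4];

      PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);
      PetscBinaryOpen("matrix.dat",FILE_MODE_READ,&fd);  /* placeholder file name */
      PetscBinaryRead(fd,header,4,PETSC_INT);
      PetscPrintf(PETSC_COMM_SELF,"cookie %d rows %d cols %d nnz %d\n",
                  header[0],header[1],header[2],header[3]);
      PetscBinaryClose(fd);
      PetscFinalize();
      return 0;
    }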
> > Thank you very much, > > Yan > From knepley at gmail.com Sun Jul 12 16:48:25 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 12 Jul 2009 16:48:25 -0500 Subject: VTK output from DA vectors In-Reply-To: <4A5A027D.60901@umd.edu> References: <4A5A027D.60901@umd.edu> Message-ID: 1) I believe the standard VTK viewer (ASCII Viewer with VTK format) gets the coordinates right. Can you verify this? 2) You should be able to split the 4- component field into 4 fields using the Split filter in VTK or whatever viewer you use (I do this in Mayavi2, but Paraview also works). Matt On Sun, Jul 12, 2009 at 10:34 AM, Saswata Hier-Majumder wrote: > Hi, > I would like to generate a vtk output from a multicomponent problem. In > the vtk file, I would like the DA coordinates as well as all 4 components > stored separately as scalar point data. > > I have been using the VecView_VTK routine from > /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to contain > only the local coordinates (may be because DAGetCoordinates is not > collective?). Is there a way to fix this? > > Also, using the same routine, all components of the solution corresponding > to a node are dumped together. Is there a way to extract each component > separately and ouput them separately as scalar point data? > > Thanks > > -- > www.geol.umd.edu/~saswata > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Sun Jul 12 16:56:41 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 12 Jul 2009 17:56:41 -0400 Subject: about src/mat/examples/tutorials/ex5.c.html In-Reply-To: <990EBC48-42A5-45C3-B7F2-A7B26537246E@mcs.anl.gov> References: <990EBC48-42A5-45C3-B7F2-A7B26537246E@mcs.anl.gov> Message-ID: On Sun, Jul 12, 2009 at 5:41 PM, Barry Smith wrote: > > The manual page for MatLoad() and VecLoad() contain the definitions of > those structs. This pointer is great. > > The file binary format is independent of parallel storage of the matrix > so has no information about the "diagonal" and "off-diagonal" parts of the > matrix. That is all determined when the binary file is read in. That's exactly what I get confused with. Thank you very much, Yan > > Barry > > > > On Jul 12, 2009, at 3:30 PM, Ryan Yan wrote: > > >> http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/src/mat/examples/tutorials/ex5.c.html >> >> Hi All, >> I am tring to read through an example about PetscBinaryRead. It looks like >> the matrix is reading from a CRS matrix object descriptor "fd1, or fd2". >> >> I have difficulty of understand the line 50: >> +++++++++++++++++++++++++++++ >> 50: PetscBinaryRead(fd2,(char *)header,4,PETSC_INT); >> +++++++++++++++++++++++++++++ >> From the context, my guess is: >> header[0] unknown >> >> header[1] contains the info of how many rows of matrix stored on this >> processor >> >> header[2] contains the info of how many global columns of the matrix >> >> header[3] unknown >> >> and line: >> ++++++++++++++++++++++++++++ >> 86: PetscBinaryRead(fd1,ourlens,m,PETSC_INT); >> 101: PetscBinaryRead(fd1,mycols,ourlens[i],PETSC_INT); >> +++++++++++++++++++++++++++++++ >> From the context, my guess is: >> ourlens[i] stores the length of the ith local row for the "local" portion >> of the matrix. 
>> >> mycols is an array storing the column indices of the nonzero entries of >> the ith local row for the "local" portion of the matrix(include diagonal and >> off diagonal). >> >> >> >> >> >> Is there any pointer to the definition of the struct descriptor. >> >> Can anyone confirm my guess and provide a pointer or example? How does >> the PetscBinaryRead() switch smoothly between reading different informations >> with the same parameter list. >> >> Thank you very much, >> >> Yan >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From saswata at umd.edu Sun Jul 12 21:09:02 2009 From: saswata at umd.edu (Saswata Hier-Majumder) Date: Sun, 12 Jul 2009 22:09:02 -0400 Subject: VTK output from DA vectors In-Reply-To: References: <4A5A027D.60901@umd.edu> Message-ID: <4A5A973E.7060300@umd.edu> 1) I did. It returns correct values for the x nodes, but fails to do so for the y nodes. Here's a sample output generated from the driven cavity problem. ( I suppressed printing the solutions for brevity). 2) Thanks, I'll try that. Matthew Knepley wrote: > 1) I believe the standard VTK viewer (ASCII Viewer with VTK format) gets the > coordinates > right. Can you verify this? > > 2) You should be able to split the 4- component field into 4 fields using > the Split filter in VTK > or whatever viewer you use (I do this in Mayavi2, but Paraview also > works). > > Matt > > On Sun, Jul 12, 2009 at 10:34 AM, Saswata Hier-Majumder wrote: > > >> Hi, >> I would like to generate a vtk output from a multicomponent problem. In >> the vtk file, I would like the DA coordinates as well as all 4 components >> stored separately as scalar point data. >> >> I have been using the VecView_VTK routine from >> /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to contain >> only the local coordinates (may be because DAGetCoordinates is not >> collective?). Is there a way to fix this? >> >> Also, using the same routine, all components of the solution corresponding >> to a node are dumped together. Is there a way to extract each component >> separately and ouput them separately as scalar point data? >> >> Thanks >> >> -- >> www.geol.umd.edu/~saswata >> >> >> > > > -- www.geol.umd.edu/~saswata -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: solution.vtk URL: From knepley at gmail.com Sun Jul 12 21:13:02 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 12 Jul 2009 21:13:02 -0500 Subject: VTK output from DA vectors In-Reply-To: <4A5A973E.7060300@umd.edu> References: <4A5A027D.60901@umd.edu> <4A5A973E.7060300@umd.edu> Message-ID: On Sun, Jul 12, 2009 at 9:09 PM, Saswata Hier-Majumder wrote: > 1) I did. It returns correct values for the x nodes, but fails to do so for > the y nodes. Here's a sample output generated from the driven cavity > problem. ( I suppressed printing the solutions for brevity). I did not realize you mean as fields (rather than as the mesh). Give me your exact calling sequence. I can output parallel fields fine, so something else must be going on. If you modified the code, send that too. Matt > > 2) Thanks, I'll try that. > > Matthew Knepley wrote: > >> 1) I believe the standard VTK viewer (ASCII Viewer with VTK format) gets >> the >> coordinates >> right. Can you verify this? >> >> 2) You should be able to split the 4- component field into 4 fields using >> the Split filter in VTK >> or whatever viewer you use (I do this in Mayavi2, but Paraview also >> works). 
>> >> Matt >> >> On Sun, Jul 12, 2009 at 10:34 AM, Saswata Hier-Majumder > >wrote: >> >> >> >>> Hi, >>> I would like to generate a vtk output from a multicomponent problem. In >>> the vtk file, I would like the DA coordinates as well as all 4 components >>> stored separately as scalar point data. >>> >>> I have been using the VecView_VTK routine from >>> /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to >>> contain >>> only the local coordinates (may be because DAGetCoordinates is not >>> collective?). Is there a way to fix this? >>> >>> Also, using the same routine, all components of the solution >>> corresponding >>> to a node are dumped together. Is there a way to extract each component >>> separately and ouput them separately as scalar point data? >>> >>> Thanks >>> >>> -- >>> www.geol.umd.edu/~saswata < >>> http://www.geol.umd.edu/%7Esaswata> >>> >>> >>> >>> >> >> >> >> > > -- > www.geol.umd.edu/~saswata > > > # vtk DataFile Version 2.0 > ASCII > DATASET STRUCTURED_POINTS > DIMENSIONS 49 49 1 > ORIGIN 0 0 0 > SPACING 1 1 1 > > POINT_DATA 2401 > SCALARS scalars double 3 > LOOKUP_TABLE default > X_COORDINATES 49 double > 0 0.0208333 0.0416667 0.0625 0.0833333 0.104167 0.125 0.145833 0.166667 > 0.1875 0.208333 0.229167 0.25 0.270833 0.291667 0.3125 0.333333 0.354167 > 0.375 0.395833 0.416667 0.4375 0.458333 0.479167 0.5 0 0.0208333 0.0416667 > 0.0625 0.0833333 0.104167 0.125 0.145833 0.166667 0.1875 0.208333 0.229167 > 0.25 0.270833 0.291667 0.3125 0.333333 0.354167 0.375 0.395833 0.416667 > 0.4375 0.458333 0.479167 > Y_COORDINATES 49 double > 0 0.0208333 0.0625 0.104167 0.145833 0.1875 0.229167 0.270833 0.3125 > 0.354167 0.395833 0.4375 0.479167 5.27367e-317 5.23119e-317 0 5.31253e-317 0 > 0 0 3.41641e-312 3.26575e-311 1.14376e-311 1.54269e-311 1.94163e-311 > 2.34056e-311 5.29014e-311 5.27095e-317 0 0 0 1.82492e-312 1.01431e-311 > 1.84614e-311 1.49455e-320 3.09811e-312 1.17559e-311 2.04136e-311 0 0 0 > 5.23119e-317 0 0 8.06358e-313 5.3312e-317 5.33093e-317 0 5.27392e-317 > Z_COORDINATES 1 double > 0 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From saswata at umd.edu Mon Jul 13 08:27:42 2009 From: saswata at umd.edu (Saswata Hier-Majumder) Date: Mon, 13 Jul 2009 09:27:42 -0400 Subject: VTK output from DA vectors In-Reply-To: References: <4A5A027D.60901@umd.edu> <4A5A973E.7060300@umd.edu> Message-ID: <4A5B364E.9070109@umd.edu> OK, here's the c program and the vtk output. Thanks for your help. Matthew Knepley wrote: > On Sun, Jul 12, 2009 at 9:09 PM, Saswata Hier-Majumder wrote: > > >> 1) I did. It returns correct values for the x nodes, but fails to do so for >> the y nodes. Here's a sample output generated from the driven cavity >> problem. ( I suppressed printing the solutions for brevity). >> > > > I did not realize you mean as fields (rather than as the mesh). Give me your > exact calling sequence. I can output > parallel fields fine, so something else must be going on. If you modified > the code, send that too. > > Matt > > > >> 2) Thanks, I'll try that. >> >> Matthew Knepley wrote: >> >> >>> 1) I believe the standard VTK viewer (ASCII Viewer with VTK format) gets >>> the >>> coordinates >>> right. Can you verify this? 
>>> >>> 2) You should be able to split the 4- component field into 4 fields using >>> the Split filter in VTK >>> or whatever viewer you use (I do this in Mayavi2, but Paraview also >>> works). >>> >>> Matt >>> >>> On Sun, Jul 12, 2009 at 10:34 AM, Saswata Hier-Majumder >> >>>> wrote: >>>> >>> >>> >>>> Hi, >>>> I would like to generate a vtk output from a multicomponent problem. In >>>> the vtk file, I would like the DA coordinates as well as all 4 components >>>> stored separately as scalar point data. >>>> >>>> I have been using the VecView_VTK routine from >>>> /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to >>>> contain >>>> only the local coordinates (may be because DAGetCoordinates is not >>>> collective?). Is there a way to fix this? >>>> >>>> Also, using the same routine, all components of the solution >>>> corresponding >>>> to a node are dumped together. Is there a way to extract each component >>>> separately and ouput them separately as scalar point data? >>>> >>>> Thanks >>>> >>>> -- >>>> www.geol.umd.edu/~saswata < >>>> http://www.geol.umd.edu/%7Esaswata> >>>> >>>> >>>> >>>> >>>> >>> >>> >>> >> -- >> www.geol.umd.edu/~saswata >> >> >> # vtk DataFile Version 2.0 >> ASCII >> DATASET STRUCTURED_POINTS >> DIMENSIONS 49 49 1 >> ORIGIN 0 0 0 >> SPACING 1 1 1 >> >> POINT_DATA 2401 >> SCALARS scalars double 3 >> LOOKUP_TABLE default >> X_COORDINATES 49 double >> 0 0.0208333 0.0416667 0.0625 0.0833333 0.104167 0.125 0.145833 0.166667 >> 0.1875 0.208333 0.229167 0.25 0.270833 0.291667 0.3125 0.333333 0.354167 >> 0.375 0.395833 0.416667 0.4375 0.458333 0.479167 0.5 0 0.0208333 0.0416667 >> 0.0625 0.0833333 0.104167 0.125 0.145833 0.166667 0.1875 0.208333 0.229167 >> 0.25 0.270833 0.291667 0.3125 0.333333 0.354167 0.375 0.395833 0.416667 >> 0.4375 0.458333 0.479167 >> Y_COORDINATES 49 double >> 0 0.0208333 0.0625 0.104167 0.145833 0.1875 0.229167 0.270833 0.3125 >> 0.354167 0.395833 0.4375 0.479167 5.27367e-317 5.23119e-317 0 5.31253e-317 0 >> 0 0 3.41641e-312 3.26575e-311 1.14376e-311 1.54269e-311 1.94163e-311 >> 2.34056e-311 5.29014e-311 5.27095e-317 0 0 0 1.82492e-312 1.01431e-311 >> 1.84614e-311 1.49455e-320 3.09811e-312 1.17559e-311 2.04136e-311 0 0 0 >> 5.23119e-317 0 0 8.06358e-313 5.3312e-317 5.33093e-317 0 5.27392e-317 >> Z_COORDINATES 1 double >> 0 >> >> >> > > > -- www.geol.umd.edu/~saswata -------------- next part -------------- A non-text attachment was scrubbed... Name: dmmg.c Type: text/x-csrc Size: 15473 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: solution.vtk URL: From C.Klaij at marin.nl Tue Jul 14 03:36:15 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 14 Jul 2009 10:36:15 +0200 Subject: hypre preconditioners Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F780@MAR150CV1.marin.local> I'm solving the steady incompressible Navier-Stokes equations (discretized with FV on unstructured grids) using the SIMPLE Pressure Correction method. I'm using Picard linearization and solve the system for the momentum equations with BICG and for the pressure equation with CG. Currently, for parallel runs, I'm using JACOBI as a preconditioner. My grids typically have a few million cells and I use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux cluster). A significant portion of the CPU time goes into solving the pressure equation. To reach the relative tolerance I need, CG with JACOBI takes about 100 iterations per outer loop for these problems. 
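[Editor's note: for readers following along, a minimal sketch of how a pressure-correction solve like the one described above can be set up so the preconditioner stays switchable at run time. This is not the poster's code: A, b, x are assumed names, and the tolerances simply mirror the rtol=0.05 / maxits=500 visible in the logs later in this thread.]

    #include "petscksp.h"

    /* Sketch: CG with a Jacobi baseline for the pressure equation;
       KSPSetFromOptions() lets the hypre preconditioners discussed below
       be selected at run time with
         -pc_type hypre -pc_hypre_type boomeramg   (or euclid). */
    PetscErrorCode SolvePressure(Mat A,Vec b,Vec x)
    {
      KSP ksp;
      PC  pc;

      KSPCreate(PETSC_COMM_WORLD,&ksp);
      KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);
      KSPSetType(ksp,KSPCG);
      KSPGetPC(ksp,&pc);
      PCSetType(pc,PCJACOBI);
      /* hypre alternative, set in code instead of via options:
         PCSetType(pc,PCHYPRE); PCHYPRESetType(pc,"boomeramg"); */
      KSPSetTolerances(ksp,0.05,PETSC_DEFAULT,PETSC_DEFAULT,500);
      KSPSetFromOptions(ksp);
      KSPSolve(ksp,b,x);
      KSPDestroy(ksp);   /* in practice the KSP would be kept across outer iterations */
      return 0;
    }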
In order to reduce CPU time, I've compiled PETSc with support for Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a preconditioner for the pressure equation. With default settings, both BoomerAMG and Euclid greatly reduce the number of iterations: with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. However, I do not get any reduction in CPU time. With Euclid, CPU time is similar to JACOBI and with BoomerAMG it is approximately doubled. Is this what one can expect? Are BoomerAMG and Euclid meant for much larger problems? I understand Hypre uses a different matrix storage format, is CPU time 'lost in translation' between PETSc and Hypre for these small problems? Are there maybe any settings I should change? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/ http://www.marin.nl/web/show/id=46836/contentid=2324 First AMT'09 conference, Nantes, France, September 1-2 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 1069 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 1622 bytes Desc: not available URL: From Andreas.Grassl at student.uibk.ac.at Tue Jul 14 10:42:33 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 14 Jul 2009 17:42:33 +0200 Subject: ifort -i8 -r8 options Message-ID: <4A5CA769.4090101@student.uibk.ac.at> Hello, trying external packages (especially MUMPS and HYPRE) I noticed, that PETSc has to be compiled with-32-bit-indices and this is giving me some problems because all Diana-routines from which I'm reading out my data are compiled with ifort -i8 -r8 flags and I run into trouble matching together 64-bit integers from Diana to 32-bit PetscInt's. Recompiling Diana without the ugly flags is no alternative. Wrapping some casting routines around the arrays seems doable but doesn't seem to me the cleanest solution. Does anybody have an advice how to handle the problem? Cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From bsmith at mcs.anl.gov Tue Jul 14 10:42:58 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 14 Jul 2009 10:42:58 -0500 Subject: hypre preconditioners In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F780@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F780@MAR150CV1.marin.local> Message-ID: First run the three cases with -log_summary (also -ksp_view to see exact solver options that are being used) and send those files. This will tell us where the time is being spent; without this information any comments are pure speculation. (For example, the "copy" time to hypre format is trivial compared to the time to build a hypre preconditioner and not the problem). What you report is not uncommon; the setup and per iteration cost of the hypre preconditioners will be much larger than the simpler Jacobi preconditioner. 
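[Editor's note: a small illustration of where that setup cost is paid. It shows up as the PCSetUp event in the -log_summary output mentioned above, and whether it is repeated each outer iteration is controlled by the MatStructure flag passed to KSPSetOperators() in the petsc-3.0-era interface. The names ksp, A, b, x and the matrix_changed test are assumptions for illustration, not code from this thread.]

    /* If the pressure matrix really changes every outer SIMPLE iteration,
       the preconditioner must be rebuilt (PCSetUp runs again); if it does
       not, SAME_PRECONDITIONER keeps the existing Jacobi/Euclid/BoomerAMG
       setup and only the per-iteration application cost remains. */
    if (matrix_changed) {
      KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);   /* preconditioner rebuilt */
    } else {
      KSPSetOperators(ksp,A,A,SAME_PRECONDITIONER);    /* existing one reused    */
    }
    KSPSolve(ksp,b,x);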
Barry On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote: > > I'm solving the steady incompressible Navier-Stokes equations > (discretized with FV on unstructured grids) using the SIMPLE > Pressure Correction method. I'm using Picard linearization and solve > the system for the momentum equations with BICG and for the pressure > equation with CG. Currently, for parallel runs, I'm using JACOBI as > a preconditioner. My grids typically have a few million cells and I > use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux > cluster). A significant portion of the CPU time goes into solving > the pressure equation. To reach the relative tolerance I need, CG > with JACOBI takes about 100 iterations per outer loop for these > problems. > > In order to reduce CPU time, I've compiled PETSc with support for > Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a > preconditioner for the pressure equation. With default settings, > both BoomerAMG and Euclid greatly reduce the number of iterations: > with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. > However, I do not get any reduction in CPU time. With Euclid, CPU > time is similar to JACOBI and with BoomerAMG it is approximately > doubled. > > Is this what one can expect? Are BoomerAMG and Euclid meant for much > larger problems? I understand Hypre uses a different matrix storage > format, is CPU time 'lost in translation' between PETSc and Hypre > for these small problems? Are there maybe any settings I should > change? > > Chris > > > > > > > > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > MARIN > 2, Haagsteeg > c.klaij at marin.nl > P.O. Box 28 > T +31 317 49 39 11 > 6700 AA Wageningen > F +31 317 49 32 45 > T +31 317 49 33 44 > The Netherlands > I www.marin.nl > > > MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2 > > > This e-mail may be confidential, privileged and/or protected by > copyright. If you are not the intended recipient, you should return > it to the sender immediately and delete your copy from your system. > From bsmith at mcs.anl.gov Tue Jul 14 10:51:23 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 14 Jul 2009 10:51:23 -0500 Subject: ifort -i8 -r8 options In-Reply-To: <4A5CA769.4090101@student.uibk.ac.at> References: <4A5CA769.4090101@student.uibk.ac.at> Message-ID: <4B624EC9-98C8-4F70-BAD1-26DBC63FD218@mcs.anl.gov> Mumps and hypre do not currently support using 32 bit integer indices, though the MUMPS folks say they plan to support it eventually. Changing PETSc to convert all 64 bit integers to 32 bit before passing to MUMPS and hypre is a huge project and we will not be doing that. You need to lobby the MUMPS and hypre to properly support 64 bit integers if you want to use them in that mode. Unless you are solving very large problems it seems you should be able to use the -r8 flag but not the -i8 flag. Barry On Jul 14, 2009, at 10:42 AM, Andreas Grassl wrote: > Hello, > > trying external packages (especially MUMPS and HYPRE) I noticed, > that PETSc has > to be compiled with-32-bit-indices and this is giving me some > problems because > all Diana-routines from which I'm reading out my data are compiled > with ifort > -i8 -r8 flags and I run into trouble matching together 64-bit > integers from > Diana to 32-bit PetscInt's. > > Recompiling Diana without the ugly flags is no alternative. > > Wrapping some casting routines around the arrays seems doable but > doesn't seem > to me the cleanest solution. 
> > Does anybody have an advice how to handle the problem? > > Cheers, > > ando > > -- > /"\ Grassl Andreas > \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik > X against HTML email Technikerstr. 13 Zi 709 > / \ +43 (0)512 507 6091 From Andreas.Grassl at student.uibk.ac.at Tue Jul 14 11:18:07 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 14 Jul 2009 18:18:07 +0200 Subject: ifort -i8 -r8 options In-Reply-To: <4B624EC9-98C8-4F70-BAD1-26DBC63FD218@mcs.anl.gov> References: <4A5CA769.4090101@student.uibk.ac.at> <4B624EC9-98C8-4F70-BAD1-26DBC63FD218@mcs.anl.gov> Message-ID: <4A5CAFBF.6020103@student.uibk.ac.at> Barry Smith schrieb: > > Mumps and hypre do not currently support using 32 bit integer indices, ^^^^^^ here you mean 64?! > though the MUMPS folks say they plan to support it eventually. > > Changing PETSc to convert all 64 bit integers to 32 bit before passing > to MUMPS and hypre is a huge project and we will not be doing that. > You need to lobby the MUMPS and hypre to properly support 64 bit > integers if you want to use them in that mode. > > Unless you are solving very large problems it seems you should be able > to use the -r8 flag but not the -i8 flag. For my needs, this is certainly true, but I don't have the whole sourcecode and I am not able to get a working Diana-program if I omit the -i8 flag. so you suggest casting the input data from Diana to PetscInt which is defined 32-bit?! Cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From bsmith at mcs.anl.gov Tue Jul 14 11:42:07 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 14 Jul 2009 11:42:07 -0500 Subject: ifort -i8 -r8 options In-Reply-To: <4A5CAFBF.6020103@student.uibk.ac.at> References: <4A5CA769.4090101@student.uibk.ac.at> <4B624EC9-98C8-4F70-BAD1-26DBC63FD218@mcs.anl.gov> <4A5CAFBF.6020103@student.uibk.ac.at> Message-ID: <60071B53-CF30-47B6-A7B4-B9729F5572DA@mcs.anl.gov> On Jul 14, 2009, at 11:18 AM, Andreas Grassl wrote: > Barry Smith schrieb: >> >> Mumps and hypre do not currently support using 32 bit integer >> indices, > ^^^^^^ > here you mean 64?! > >> though the MUMPS folks say they plan to support it eventually. >> >> Changing PETSc to convert all 64 bit integers to 32 bit before >> passing >> to MUMPS and hypre is a huge project and we will not be doing that. >> You need to lobby the MUMPS and hypre to properly support 64 bit >> integers if you want to use them in that mode. >> >> Unless you are solving very large problems it seems you should be >> able >> to use the -r8 flag but not the -i8 flag. > > For my needs, this is certainly true, but I don't have the whole > sourcecode and > I am not able to get a working Diana-program if I omit the -i8 flag. > > so you suggest casting the input data from Diana to PetscInt which > is defined > 32-bit?! If you can do that. But it means copying any integer arrays from 64 bit integer arrays to 32 bit integer arrays. Barry > > Cheers, > > ando > > -- > /"\ Grassl Andreas > \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik > X against HTML email Technikerstr. 13 Zi 709 > / \ +43 (0)512 507 6091 From C.Klaij at marin.nl Wed Jul 15 03:58:36 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 15 Jul 2009 10:58:36 +0200 Subject: hypre preconditioners References: Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F783@MAR150CV1.marin.local> Barry, Thanks for your reply! 
Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners. Chris ----------------------------- --- Jacobi preconditioner --- ----------------------------- KSP Object: type: cg maximum iterations=500 tolerances: relative=0.05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: jacobi linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=256576, cols=256576 total: nonzeros=1769552, allocated nonzeros=1769552 not using I-node (on process 0) routines ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:22:04 2009 Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 Max Max/Min Avg Total Time (sec): 6.037e+02 1.00000 6.037e+02 Objects: 9.270e+02 1.00000 9.270e+02 Flops: 5.671e+10 1.00065 5.669e+10 1.134e+11 Flops/sec: 9.393e+07 1.00065 9.390e+07 1.878e+08 MPI Messages: 1.780e+04 1.00000 1.780e+04 3.561e+04 MPI Message Lengths: 5.239e+08 1.00000 2.943e+04 1.048e+09 MPI Reductions: 2.651e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.0374e+02 100.0% 1.1338e+11 100.0% 3.561e+04 100.0% 2.943e+04 100.0% 5.302e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. 
# ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 31370 1.0 1.2887e+01 1.0 6.28e+08 1.0 0.0e+00 0.0e+00 3.1e+04 2 14 0 0 59 2 14 0 0 59 1249 VecNorm 16235 1.0 2.3343e+00 1.0 1.79e+09 1.0 0.0e+00 0.0e+00 1.6e+04 0 7 0 0 31 0 7 0 0 31 3569 VecCopy 1600 1.0 9.4822e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 3732 1.0 8.7824e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 32836 1.0 1.9510e+01 1.0 4.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 15 0 0 0 3 15 0 0 0 864 VecAYPX 16701 1.0 7.4898e+00 1.0 5.73e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 8 0 0 0 1 8 0 0 0 1144 VecAssemblyBegin 1200 1.0 3.3916e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 7 0 0 0 0 7 0 VecAssemblyEnd 1200 1.0 1.6778e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 18301 1.0 1.4524e+01 1.0 1.62e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 4 0 0 0 2 4 0 0 0 323 VecScatterBegin 17801 1.0 5.8999e-01 1.0 0.00e+00 0.0 3.6e+04 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 VecScatterEnd 17801 1.0 3.3189e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetup 600 1.0 6.7541e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 600 1.0 1.6520e+02 1.0 3.43e+08 1.0 3.6e+04 2.9e+04 4.8e+04 27100100100 90 27100100100 90 686 PCSetUp 600 1.0 4.4189e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 PCApply 18301 1.0 1.4579e+01 1.0 1.62e+08 1.0 0.0e+00 0.0e+00 1.0e+00 2 4 0 0 0 2 4 0 0 0 322 MatMult 16235 1.0 9.3444e+01 1.0 2.86e+08 1.0 3.2e+04 2.9e+04 0.0e+00 15 47 91 91 0 15 47 91 91 0 570 MatMultTranspose 1566 1.0 8.8825e+00 1.0 3.12e+08 1.0 3.1e+03 2.9e+04 0.0e+00 1 5 9 9 0 1 5 9 9 0 624 MatAssemblyBegin 600 1.0 6.0139e-0125.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 2 0 0 0 0 2 0 MatAssemblyEnd 600 1.0 2.5127e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 1 0 0 0 0 1 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage Index Set 4 4 30272 0 Vec 913 902 926180816 0 Vec Scatter 2 0 0 0 Krylov Solver 1 0 0 0 Preconditioner 1 0 0 0 Matrix 6 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 2.14577e-07 Average time for MPI_Barrier(): 8.10623e-07 Average time for zero size MPI_Send(): 2.0504e-05 ----------------------------------- --- Hypre Euclid preconditioner --- ----------------------------------- KSP Object: type: cg maximum iterations=500 tolerances: relative=0.05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: hypre HYPRE Euclid preconditioning HYPRE Euclid: number of levels 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=256576, cols=256576 total: nonzeros=1769552, allocated nonzeros=1769552 not using I-node (on process 0) routines ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:10:05 2009 Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 Max Max/Min Avg Total Time (sec): 6.961e+02 1.00000 6.961e+02 Objects: 1.227e+03 1.00000 1.227e+03 Flops: 1.340e+10 1.00073 1.340e+10 2.679e+10 Flops/sec: 1.925e+07 1.00073 1.924e+07 3.848e+07 MPI Messages: 4.748e+03 1.00000 4.748e+03 9.496e+03 MPI Message Lengths: 1.397e+08 1.00000 2.943e+04 2.794e+08 MPI Reductions: 7.192e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.9614e+02 100.0% 2.6790e+10 100.0% 9.496e+03 100.0% 2.943e+04 100.0% 1.438e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. 
# ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 5410 1.0 1.1865e+01 4.5 5.26e+08 4.5 0.0e+00 0.0e+00 5.4e+03 1 10 0 0 38 1 10 0 0 38 234 VecNorm 3255 1.0 7.8095e-01 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 3.3e+03 0 6 0 0 23 0 6 0 0 23 2139 VecCopy 1600 1.0 9.5096e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 4746 1.0 8.9868e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 6801 1.0 4.8778e+00 1.0 3.59e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 13 0 0 0 1 13 0 0 0 715 VecAYPX 3646 1.0 2.2348e+00 1.0 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 837 VecAssemblyBegin 1200 1.0 2.7152e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 25 0 0 0 0 25 0 VecAssemblyEnd 1200 1.0 1.7414e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 3982 1.0 4.0871e+00 1.0 1.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 250 VecScatterBegin 4746 1.0 1.8000e-01 1.0 0.00e+00 0.0 9.5e+03 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 VecScatterEnd 4746 1.0 4.6870e+00 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetup 600 1.0 6.8991e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 600 1.0 2.5931e+02 1.0 5.17e+07 1.0 9.5e+03 2.9e+04 9.0e+03 37100100100 62 37100100100 62 103 PCSetUp 600 1.0 1.8337e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02 26 0 0 0 1 26 0 0 0 1 0 PCApply 5246 1.0 3.6440e+01 1.3 1.88e+07 1.3 0.0e+00 0.0e+00 1.0e+02 5 4 0 0 1 5 4 0 0 1 28 MatMult 3255 1.0 2.3031e+01 1.2 2.85e+08 1.2 6.5e+03 2.9e+04 0.0e+00 3 40 69 69 0 3 40 69 69 0 464 MatMultTranspose 1491 1.0 8.4907e+00 1.0 3.11e+08 1.0 3.0e+03 2.9e+04 0.0e+00 1 20 31 31 0 1 20 31 31 0 621 MatConvert 100 1.0 1.2686e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatAssemblyBegin 600 1.0 2.3702e+0042.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 8 0 0 0 0 8 0 MatAssemblyEnd 600 1.0 2.5303e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 4 0 0 0 0 4 0 MatGetRow 12828800 1.0 5.2074e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 200 1.0 1.6284e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. 
--- Event Stage 0: Main Stage Index Set 4 4 30272 0 Vec 1213 1202 1234223216 0 Vec Scatter 2 0 0 0 Krylov Solver 1 0 0 0 Preconditioner 1 0 0 0 Matrix 6 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 2.14577e-07 Average time for MPI_Barrier(): 3.8147e-07 Average time for zero size MPI_Send(): 1.39475e-05 -------------------------------------- --- Hypre BoomerAMG preconditioner --- -------------------------------------- KSP Object: type: cg maximum iterations=500 tolerances: relative=0.05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.25 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type Falgout HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=256576, cols=256576 total: nonzeros=1769552, allocated nonzeros=1769552 not using I-node (on process 0) routines ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 09:53:07 2009 Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 Max Max/Min Avg Total Time (sec): 7.080e+02 1.00000 7.080e+02 Objects: 1.227e+03 1.00000 1.227e+03 Flops: 1.054e+10 1.00076 1.054e+10 2.107e+10 Flops/sec: 1.489e+07 1.00076 1.488e+07 2.977e+07 MPI Messages: 3.857e+03 1.00000 3.857e+03 7.714e+03 MPI Message Lengths: 1.135e+08 1.00000 2.942e+04 2.270e+08 MPI Reductions: 5.800e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 7.0799e+02 100.0% 2.1075e+10 100.0% 7.714e+03 100.0% 2.942e+04 100.0% 1.160e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. 
# ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 3554 1.0 1.8220e+00 1.0 5.03e+08 1.0 0.0e+00 0.0e+00 3.6e+03 0 9 0 0 31 0 9 0 0 31 1001 VecNorm 2327 1.0 6.7031e-01 1.0 9.34e+08 1.0 0.0e+00 0.0e+00 2.3e+03 0 6 0 0 20 0 6 0 0 20 1781 VecCopy 1600 1.0 9.4440e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 3855 1.0 8.0550e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 4982 1.0 3.7953e+00 1.0 3.39e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 12 0 0 0 1 12 0 0 0 674 VecAYPX 2755 1.0 1.8270e+00 1.0 3.89e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 774 VecAssemblyBegin 1200 1.0 1.8679e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 31 0 0 0 0 31 0 VecAssemblyEnd 1200 1.0 1.7717e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 4056 1.0 4.1344e+00 1.0 1.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 252 VecScatterBegin 3855 1.0 1.5116e-01 1.0 0.00e+00 0.0 7.7e+03 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 VecScatterEnd 3855 1.0 7.3828e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetup 600 1.0 5.1192e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 600 1.0 2.7194e+02 1.0 3.88e+07 1.0 7.7e+03 2.9e+04 6.2e+03 38100100100 53 38100100100 53 77 PCSetUp 600 1.0 1.6630e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02 23 0 0 0 2 23 0 0 0 2 0 PCApply 4355 1.0 7.3735e+01 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 1.0e+02 10 5 0 0 1 10 5 0 0 1 14 MatMult 2327 1.0 1.3706e+01 1.0 2.79e+08 1.0 4.7e+03 2.9e+04 0.0e+00 2 36 60 60 0 2 36 60 60 0 557 MatMultTranspose 1528 1.0 8.6412e+00 1.0 3.13e+08 1.0 3.1e+03 2.9e+04 0.0e+00 1 26 40 40 0 1 26 40 40 0 626 MatConvert 100 1.0 1.2962e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatAssemblyBegin 600 1.0 2.4579e+0096.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 10 0 0 0 0 10 0 MatAssemblyEnd 600 1.0 2.5257e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 5 0 0 0 0 5 0 MatGetRow 12828800 1.0 5.2907e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 200 1.0 1.7476e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage Index Set 4 4 30272 0 Vec 1213 1202 1234223216 0 Vec Scatter 2 0 0 0 Krylov Solver 1 0 0 0 Preconditioner 1 0 0 0 Matrix 6 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 1.90735e-07 Average time for MPI_Barrier(): 8.10623e-07 Average time for zero size MPI_Send(): 1.95503e-05 OptionTable: -log_summary -----Original Message----- Date: Tue, 14 Jul 2009 10:42:58 -0500 From: Barry Smith Subject: Re: hypre preconditioners To: PETSc users list Message-ID: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes First run the three cases with -log_summary (also -ksp_view to see exact solver options that are being used) and send those files. 
This will tell us where the time is being spent; without this information any comments are pure speculation. (For example, the "copy" time to hypre format is trivial compared to the time to build a hypre preconditioner and not the problem). What you report is not uncommon; the setup and per iteration cost of the hypre preconditioners will be much larger than the simpler Jacobi preconditioner. Barry On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote: > > I'm solving the steady incompressible Navier-Stokes equations > (discretized with FV on unstructured grids) using the SIMPLE > Pressure Correction method. I'm using Picard linearization and solve > the system for the momentum equations with BICG and for the pressure > equation with CG. Currently, for parallel runs, I'm using JACOBI as > a preconditioner. My grids typically have a few million cells and I > use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux > cluster). A significant portion of the CPU time goes into solving > the pressure equation. To reach the relative tolerance I need, CG > with JACOBI takes about 100 iterations per outer loop for these > problems. > > In order to reduce CPU time, I've compiled PETSc with support for > Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a > preconditioner for the pressure equation. With default settings, > both BoomerAMG and Euclid greatly reduce the number of iterations: > with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. > However, I do not get any reduction in CPU time. With Euclid, CPU > time is similar to JACOBI and with BoomerAMG it is approximately > doubled. > > Is this what one can expect? Are BoomerAMG and Euclid meant for much > larger problems? I understand Hypre uses a different matrix storage > format, is CPU time 'lost in translation' between PETSc and Hypre > for these small problems? Are there maybe any settings I should > change? > > Chris > > > > > > > > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > MARIN > 2, Haagsteeg > c.klaij at marin.nl > P.O. Box 28 > T +31 317 49 39 11 > 6700 AA Wageningen > F +31 317 49 32 45 > T +31 317 49 33 44 > The Netherlands > I www.marin.nl > > > MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2 > > > This e-mail may be confidential, privileged and/or protected by > copyright. If you are not the intended recipient, you should return > it to the sender immediately and delete your copy from your system. > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 14202 bytes Desc: not available URL: From dalcinl at gmail.com Wed Jul 15 11:23:19 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 15 Jul 2009 13:23:19 -0300 Subject: hypre preconditioners In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F783@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F783@MAR150CV1.marin.local> Message-ID: Did you try Block-Jacobi for the velocity problem? If the matrix of your presure problem changes in each solve (is this your case?) could you try to use ML? In my little experience, ML leads to lower setup times, but higher iteration counts (let say twice); perhaps it will be faster than BommerAMG for you use case. On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan wrote: > Barry, > > Thanks for your reply! Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners. 
> [rest of the quoted message trimmed; it repeats the KSPView and -log_summary listings and the earlier exchange already shown above]

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
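All of the preconditioners compared in this thread (Jacobi, hypre Euclid, hypre BoomerAMG, and the ML alternative suggested above) are selected from the PETSc options database, so switching between them needs no code changes. The following is a minimal sketch of an options-driven pressure solver, assuming a PETSc 2.3.3/3.0-era build configured with hypre (and ML, if used); the routine name CreatePressureSolver and the variables A_p and ksp_p are placeholders, not code from this thread.

#include "petscksp.h"

/* Sketch: let the Krylov method and preconditioner for the pressure
   equation be chosen at run time, e.g.
     -ksp_type cg -pc_type jacobi
     -ksp_type cg -pc_type hypre -pc_hypre_type euclid
     -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg
     -ksp_type cg -pc_type ml
   together with -ksp_view -log_summary as in the runs above. */
PetscErrorCode CreatePressureSolver(MPI_Comm comm, Mat A_p, KSP *ksp_p)
{
  PetscErrorCode ierr;

  ierr = KSPCreate(comm, ksp_p);CHKERRQ(ierr);
  /* Use the same matrix as operator and preconditioning matrix. */
  ierr = KSPSetOperators(*ksp_p, A_p, A_p, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  /* Pick up -ksp_type, -pc_type, -pc_hypre_type, tolerances, etc. */
  ierr = KSPSetFromOptions(*ksp_p);CHKERRQ(ierr);
  return 0;
}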
From Andreas.Grassl at student.uibk.ac.at Wed Jul 15 13:18:01 2009
From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl)
Date: Wed, 15 Jul 2009 20:18:01 +0200
Subject: Mumps speedup by passing symmetric matrix
Message-ID: <4A5E1D59.3000905@student.uibk.ac.at>

Hello,

I have solved the 64/32-bit issue; MUMPS now works and gives reasonable results, but I'm wondering whether I could get a speedup by exploiting the symmetry of the matrix. Setting only the option -mat_mumps_sym gives no change in runtime, and INFOG(8) returns 100. Setting MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE) gives no change either. Does MUMPS automatically recognize and exploit the symmetry?

Cheers,

ando

--
/"\  Grassl Andreas
\ /  ASCII Ribbon Campaign     Uni Innsbruck Institut f. Mathematik
 X   against HTML email        Technikerstr. 13 Zi 709
/ \                            +43 (0)512 507 6091

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 315 bytes
Desc: OpenPGP digital signature
URL: 

From bsmith at mcs.anl.gov Wed Jul 15 15:26:17 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 15 Jul 2009 15:26:17 -0500
Subject: hypre preconditioners
In-Reply-To: 
References: <5D9143EF9FADE942BEF6F2A636A861170800F783@MAR150CV1.marin.local>
Message-ID: <83E2B8C2-9475-45C6-A448-502114D4959D@mcs.anl.gov>

On Jul 15, 2009, at 11:23 AM, Lisandro Dalcin wrote:

> Did you try Block-Jacobi for the velocity problem?

You can try -pc_type sor and it will run block Jacobi with one symmetric sweep of SOR for each iteration. This may be faster than your plain Jacobi.

> If the matrix of your presure problem changes in each solve (is this your case?) could
> you try to use ML? In my little experience, ML leads to lower setup
> times, but higher iteration counts (let say twice); perhaps it will be
> faster than BommerAMG for you use case.

ML is worth trying. Also you might try "playing" with the various BoomerAMG options. I don't know them in detail so cannot make suggestions, but the various coarsening strategies control how expensive the setup is. Finally, if the matrix is not changing much for each new solve, you can reuse the same BoomerAMG preconditioner for several linear solves. Just use SAME_PRECONDITIONER as the argument to KSPSetOperators() and it will not create a new preconditioner until you call it with SAME_NONZERO_PATTERN. I am thinking this might work very well for you.

   Barry
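To make the reuse suggestion above concrete, here is a minimal sketch assuming the PETSc 2.3.3/3.0-era KSPSetOperators() signature (the one that takes a MatStructure flag); the routine name SolvePressure and the variables ksp_p, A_p, b, x, and rebuild_pc are placeholders, not code from this thread.

#include "petscksp.h"

/* Sketch: reuse an expensive preconditioner (e.g. BoomerAMG) across several
   pressure solves, rebuilding it only when the caller asks for it. */
PetscErrorCode SolvePressure(KSP ksp_p, Mat A_p, Vec b, Vec x, PetscTruth rebuild_pc)
{
  PetscErrorCode ierr;

  if (rebuild_pc) {
    /* Rebuild the preconditioner from the current matrix values;
       PCSetUp() (and the hypre setup) runs again on the next solve. */
    ierr = KSPSetOperators(ksp_p, A_p, A_p, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  } else {
    /* Keep the previously built preconditioner; only the operator used
       for the matrix-vector products is refreshed. */
    ierr = KSPSetOperators(ksp_p, A_p, A_p, SAME_PRECONDITIONER);CHKERRQ(ierr);
  }
  ierr = KSPSolve(ksp_p, b, x);CHKERRQ(ierr);
  return 0;
}

Whether this pays off depends on how much the pressure matrix changes between outer SIMPLE iterations; if the frozen preconditioner degrades the CG convergence too much, rebuilding it every few outer iterations rather than every iteration may be a better trade-off.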
>
>
> On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan wrote:
>> Barry,
>>
>> Thanks for your reply! Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners.
>>
>> [quoted KSPView and -log_summary listings for the Jacobi and Euclid runs, and the BoomerAMG KSPView, trimmed; they repeat the output shown earlier in this digest. The quoted BoomerAMG performance summary continues below.]
>>
>> ************************************************************************************************************************
>> ***             WIDEN YOUR WINDOW TO 120 CHARACTERS.
Use 'enscript >> -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance >> Summary: ---------------------------------------------- >> >> ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij >> Wed Jul 15 09:53:07 2009 >> Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 >> CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 >> >> Max Max/Min Avg Total >> Time (sec): 7.080e+02 1.00000 7.080e+02 >> Objects: 1.227e+03 1.00000 1.227e+03 >> Flops: 1.054e+10 1.00076 1.054e+10 2.107e+10 >> Flops/sec: 1.489e+07 1.00076 1.488e+07 2.977e+07 >> MPI Messages: 3.857e+03 1.00000 3.857e+03 7.714e+03 >> MPI Message Lengths: 1.135e+08 1.00000 2.942e+04 2.270e+08 >> MPI Reductions: 5.800e+03 1.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of >> length N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- >> Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> 0: Main Stage: 7.0799e+02 100.0% 2.1075e+10 100.0% 7.714e >> +03 100.0% 2.942e+04 100.0% 1.160e+04 100.0% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops/sec: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all >> processors >> Mess: number of messages sent >> Avg. len: average message length >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with >> PetscLogStagePush() and PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in >> this phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >> time over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was run without the PreLoadBegin() # >> # macros. To get timing results we always recommend # >> # preloading. otherwise timing numbers may be # >> # meaningless. 
# >> ########################################################## >> >> >> Event Count Time (sec) Flops/ >> sec --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg >> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> VecDot 3554 1.0 1.8220e+00 1.0 5.03e+08 1.0 0.0e+00 >> 0.0e+00 3.6e+03 0 9 0 0 31 0 9 0 0 31 1001 >> VecNorm 2327 1.0 6.7031e-01 1.0 9.34e+08 1.0 0.0e+00 >> 0.0e+00 2.3e+03 0 6 0 0 20 0 6 0 0 20 1781 >> VecCopy 1600 1.0 9.4440e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 3855 1.0 8.0550e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 4982 1.0 3.7953e+00 1.0 3.39e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 1 12 0 0 0 1 12 0 0 0 674 >> VecAYPX 2755 1.0 1.8270e+00 1.0 3.89e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 774 >> VecAssemblyBegin 1200 1.0 1.8679e-01 1.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 3.6e+03 0 0 0 0 31 0 0 0 0 31 0 >> VecAssemblyEnd 1200 1.0 1.7717e-03 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecPointwiseMult 4056 1.0 4.1344e+00 1.0 1.26e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 252 >> VecScatterBegin 3855 1.0 1.5116e-01 1.0 0.00e+00 0.0 7.7e+03 >> 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 >> VecScatterEnd 3855 1.0 7.3828e-01 2.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetup 600 1.0 5.1192e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 600 1.0 2.7194e+02 1.0 3.88e+07 1.0 7.7e+03 >> 2.9e+04 6.2e+03 38100100100 53 38100100100 53 77 >> PCSetUp 600 1.0 1.6630e+02 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 2.0e+02 23 0 0 0 2 23 0 0 0 2 0 >> PCApply 4355 1.0 7.3735e+01 1.0 7.06e+06 1.0 0.0e+00 >> 0.0e+00 1.0e+02 10 5 0 0 1 10 5 0 0 1 14 >> MatMult 2327 1.0 1.3706e+01 1.0 2.79e+08 1.0 4.7e+03 >> 2.9e+04 0.0e+00 2 36 60 60 0 2 36 60 60 0 557 >> MatMultTranspose 1528 1.0 8.6412e+00 1.0 3.13e+08 1.0 3.1e+03 >> 2.9e+04 0.0e+00 1 26 40 40 0 1 26 40 40 0 626 >> MatConvert 100 1.0 1.2962e+01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> MatAssemblyBegin 600 1.0 2.4579e+0096.9 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.2e+03 0 0 0 0 10 0 0 0 0 10 0 >> MatAssemblyEnd 600 1.0 2.5257e+00 1.0 0.00e+00 0.0 4.0e+00 >> 1.5e+04 6.1e+02 0 0 0 0 5 0 0 0 0 5 0 >> MatGetRow 12828800 1.0 5.2907e+00 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> MatGetRowIJ 200 1.0 1.7476e-04 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory >> Descendants' Mem. 
>> >> --- Event Stage 0: Main Stage >> >> Index Set 4 4 30272 0 >> Vec 1213 1202 1234223216 0 >> Vec Scatter 2 0 0 0 >> Krylov Solver 1 0 0 0 >> Preconditioner 1 0 0 0 >> Matrix 6 0 0 0 >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> Average time to get PetscTime(): 1.90735e-07 >> Average time for MPI_Barrier(): 8.10623e-07 >> Average time for zero size MPI_Send(): 1.95503e-05 >> OptionTable: -log_summary >> >> >> >> >> -----Original Message----- >> Date: Tue, 14 Jul 2009 10:42:58 -0500 >> From: Barry Smith >> Subject: Re: hypre preconditioners >> To: PETSc users list >> Message-ID: >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes >> >> >> First run the three cases with -log_summary (also -ksp_view to see >> exact solver options that are being used) and send those files. This >> will tell us where the time is being spent; without this information >> any comments are pure speculation. (For example, the "copy" time to >> hypre format is trivial compared to the time to build a hypre >> preconditioner and not the problem). >> >> >> What you report is not uncommon; the setup and per iteration cost >> of the hypre preconditioners will be much larger than the simpler >> Jacobi preconditioner. >> >> Barry >> >> On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote: >> >>> >>> I'm solving the steady incompressible Navier-Stokes equations >>> (discretized with FV on unstructured grids) using the SIMPLE >>> Pressure Correction method. I'm using Picard linearization and solve >>> the system for the momentum equations with BICG and for the pressure >>> equation with CG. Currently, for parallel runs, I'm using JACOBI as >>> a preconditioner. My grids typically have a few million cells and I >>> use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux >>> cluster). A significant portion of the CPU time goes into solving >>> the pressure equation. To reach the relative tolerance I need, CG >>> with JACOBI takes about 100 iterations per outer loop for these >>> problems. >>> >>> In order to reduce CPU time, I've compiled PETSc with support for >>> Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a >>> preconditioner for the pressure equation. With default settings, >>> both BoomerAMG and Euclid greatly reduce the number of iterations: >>> with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. >>> However, I do not get any reduction in CPU time. With Euclid, CPU >>> time is similar to JACOBI and with BoomerAMG it is approximately >>> doubled. >>> >>> Is this what one can expect? Are BoomerAMG and Euclid meant for much >>> larger problems? I understand Hypre uses a different matrix storage >>> format, is CPU time 'lost in translation' between PETSc and Hypre >>> for these small problems? Are there maybe any settings I should >>> change? >>> >>> Chris >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> dr. ir. Christiaan Klaij >>> CFD Researcher >>> Research & Development >>> MARIN >>> 2, Haagsteeg >>> c.klaij at marin.nl >>> P.O. 
Box 28 >>> T +31 317 49 39 11 >>> 6700 AA Wageningen >>> F +31 317 49 32 45 >>> T +31 317 49 33 44 >>> The Netherlands >>> I www.marin.nl >>> >>> >>> MARIN webnews: First AMT'09 conference, Nantes, France, September >>> 1-2 >>> >>> >>> This e-mail may be confidential, privileged and/or protected by >>> copyright. If you are not the intended recipient, you should return >>> it to the sender immediately and delete your copy from your system. >>> >> > > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 From C.Klaij at marin.nl Thu Jul 16 01:47:27 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 16 Jul 2009 08:47:27 +0200 Subject: hypre preconditioners References: Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F787@MAR150CV1.marin.local> Lisandro, Thanks for your response! The velocity problem is segregated (I use BICG with Jacobi for the 3 linear systems) but these need (much) less iterations than the pressure problem. The pressure matrix changes at each solve. Also, I did try ML and, like you say, it needs about two times more iterations than boomerAMG. Overall, boomerAMG is a bit faster for my cases than ML. Chris -----Original Message----- Date: Wed, 15 Jul 2009 13:23:19 -0300 From: Lisandro Dalcin Subject: Re: hypre preconditioners To: PETSc users list Message-ID: Content-Type: text/plain; charset=ISO-8859-1 Did you try Block-Jacobi for the velocity problem? If the matrix of your presure problem changes in each solve (is this your case?) could you try to use ML? In my little experience, ML leads to lower setup times, but higher iteration counts (let say twice); perhaps it will be faster than BommerAMG for you use case. On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan wrote: > Barry, > > Thanks for your reply! Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners. > > Chris > > ----------------------------- > --- Jacobi preconditioner --- > ----------------------------- > > KSP Object: > ?type: cg > ?maximum iterations=500 > ?tolerances: ?relative=0.05, absolute=1e-50, divergence=10000 > ?left preconditioning > PC Object: > ?type: jacobi > ?linear system matrix = precond matrix: > ?Matrix Object: > ? ?type=mpiaij, rows=256576, cols=256576 > ? ?total: nonzeros=1769552, allocated nonzeros=1769552 > ? ? ?not using I-node (on process 0) routines > > ************************************************************************************************************************ > *** ? ? ? ? ? ? WIDEN YOUR WINDOW TO 120 CHARACTERS. ?Use 'enscript -r -fCourier9' to print this document ? ? ? ? ? ?*** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:22:04 2009 > Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 > > ? ? ? ? ? ? ? ? ? ? ? ? Max ? ? ? Max/Min ? ? ? ?Avg ? ? ?Total > Time (sec): ? ? ? ? ? 6.037e+02 ? ? ?1.00000 ? 6.037e+02 > Objects: ? ? ? ? ? ? 
9.270e+02      1.00000   9.270e+02
>
> [... the full KSPView and -log_summary listings for the Jacobi, Hypre Euclid
> and Hypre BoomerAMG runs, together with Barry's reply of Tue, 14 Jul 2009,
> were quoted again in full here; they are identical to the copies earlier in
> this thread ...]
>
> On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote:
>
>>
>> I'm solving the steady incompressible Navier-Stokes equations
>> (discretized with FV on unstructured grids) using the SIMPLE
>> Pressure Correction method.
I'm using Picard linearization and solve >> the system for the momentum equations with BICG and for the pressure >> equation with CG. Currently, for parallel runs, I'm using JACOBI as >> a preconditioner. My grids typically have a few million cells and I >> use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux >> cluster). A significant portion of the CPU time goes into solving >> the pressure equation. To reach the relative tolerance I need, CG >> with JACOBI takes about 100 iterations per outer loop for these >> problems. >> >> In order to reduce CPU time, I've compiled PETSc with support for >> Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a >> preconditioner for the pressure equation. With default settings, >> both BoomerAMG and Euclid greatly reduce the number of iterations: >> with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. >> However, I do not get any reduction in CPU time. With Euclid, CPU >> time is similar to JACOBI and with BoomerAMG it is approximately >> doubled. >> >> Is this what one can expect? Are BoomerAMG and Euclid meant for much >> larger problems? I understand Hypre uses a different matrix storage >> format, is CPU time 'lost in translation' between PETSc and Hypre >> for these small problems? Are there maybe any settings I should >> change? >> >> Chris >> >> >> >> >> >> >> >> >> >> dr. ir. Christiaan Klaij >> CFD Researcher >> Research & Development >> MARIN >> 2, Haagsteeg >> c.klaij at marin.nl >> P.O. Box 28 >> T +31 317 49 39 11 >> 6700 AA ?Wageningen >> F +31 317 49 32 45 >> T ?+31 317 49 33 44 >> The Netherlands >> I ?www.marin.nl >> >> >> MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2 >> >> >> This e-mail may be confidential, privileged and/or protected by >> copyright. If you are not the intended recipient, you should return >> it to the sender immediately and delete your copy from your system. >> > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 15358 bytes Desc: not available URL: From jed at 59A2.org Thu Jul 16 03:06:48 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 16 Jul 2009 10:06:48 +0200 Subject: hypre preconditioners In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F787@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F787@MAR150CV1.marin.local> Message-ID: <4A5EDF98.7090508@59A2.org> Klaij, Christiaan wrote: > The velocity problem is segregated (I use BICG with Jacobi for the 3 > linear systems) but these need (much) less iterations than the pressure > problem. The pressure matrix changes at each solve. It may change, but it might still make a good preconditioner for several time steps. How many dofs are in your pressure system? You mentioned a few million cells, but it makes a huge difference whether you are using tets vs. hexes, and what the pressure space is. If the pressure space is around 1M dofs, the system is relatively well-conditioned to converge in only 100 iterations with Jacobi which means that you stand a good chance of getting acceptable performance from a 1-level DD preconditioner (block jacobi or small-overlap additive Schwarz). 
So try Barry's suggestion of -pc_type sor and also -pc_type asm with a
few choices of subdomain solver (-sub_pc_type).

> Also, I did try ML and, like you say, it needs about two times more
> iterations than boomerAMG. Overall, boomerAMG is a bit faster for my
> cases than ML.

To speed up Hypre, I've found these options to be especially useful.

-pc_hypre_boomeramg_strong_threshold
  defaults to 0.25 which is good for 2D scalar problems, change to 0.5
  or above for 3D problems

-pc_hypre_boomeramg_agg_nl
  set this greater than 0 to use aggressive coarsening

However, I almost always find ML to be faster.  By default, it uses way
more levels than you want (often making the coarse level have only 1 dof
instead of around 1000) so try reducing -pc_ml_maxNlevels.

Jed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
URL: 

From tchouanm at msn.com  Thu Jul 16 04:44:41 2009
From: tchouanm at msn.com (STEPHANE TCHOUANMO)
Date: Thu, 16 Jul 2009 11:44:41 +0200
Subject: Petsc command to get the conditioning of matrices
In-Reply-To: 
References: 
Message-ID: 

Dear all,

I solve a non-linear problem in Petsc using the classical Newton method.
I give to Petsc the Jacobian matrix and residuals at each Newton iteration.
Is there a way to get the conditioning of my Jacobian matrices? Like a
command when running my executable (-ksp_.. or -snes_.. or -pc_..)

Thanks.

Stephane

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From knepley at gmail.com  Thu Jul 16 05:39:45 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 16 Jul 2009 05:39:45 -0500
Subject: Petsc command to get the conditioning of matrices
In-Reply-To: 
References: 
Message-ID: 

You can try -ksp_monitor_singular_value.

  Matt

On Thu, Jul 16, 2009 at 4:44 AM, STEPHANE TCHOUANMO wrote:

> Dear all,
>
> I solve a non-linear problem in Petsc using the classical Newton
> method. I give to Petsc the Jacobian matrix and residuals at each Newton
> iteration.
> Is there a way to get the conditioning of my Jacobian matrices? Like a
> command when running my executable (-ksp_.. or -snes_.. or -pc_..)
>
> Thanks.
>
> Stephane
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jed at 59A2.org  Thu Jul 16 05:44:36 2009
From: jed at 59A2.org (Jed Brown)
Date: Thu, 16 Jul 2009 12:44:36 +0200
Subject: Petsc command to get the conditioning of matrices
In-Reply-To: 
References: 
Message-ID: <4A5F0494.2090205@59A2.org>

STEPHANE TCHOUANMO wrote:
> Is there a way to get the conditioning of my Jacobian matrices? Like a
> command when running my executable (-ksp_.. or -snes_.. or -pc_..)

I use

-ksp_monitor_singular_value
  estimate at every Krylov iteration, only works with GMRES and CG

-ksp_compute_eigenvalues
  estimate of a few eigenvalues from the iteration (GMRES and CG)

-ksp_compute_eigenvalues_explicitly
  sometimes useful for very small systems, e.g. to find the size of a
  null space while debugging

-ksp_plot_eigenvalues_explicitly
  again, only for very small problems

Note that these eigenvalues and singular values are not reliable for
eigen-analysis, they are only intended to help understand why iterative
methods are working a certain way.  If you care about accurate
eigen/singular values, use SLEPc.

I noticed a few weeks ago that -ksp_compute_singularvalues didn't work
as advertised.  A few lines needed to be added to KSPSolve(), as for
-ksp_compute_eigenvalues.  I usually use -ksp_monitor_singular_value
instead, so it hadn't bothered me, but it's now in petsc-dev:

http://petsc.cs.iit.edu/petsc/petsc-dev/rev/3180aa7f49b4

Jed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
URL: 
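For readers who prefer the API to the command-line options above, a minimal
sketch of the programmatic route to the same rough estimate follows.  This is
not code from the thread: the routine name is illustrative, and it only gives
the iterative estimate Jed describes (GMRES or CG), not accurate singular
values.

    /* Sketch: condition-number estimate from the Krylov iteration, the API
       analogue of -ksp_monitor_singular_value.  Only an estimate of the
       preconditioned operator, and only meaningful for GMRES/CG. */
    #include "petscksp.h"

    PetscErrorCode EstimateConditionNumber(KSP ksp, Vec b, Vec x)
    {
      PetscErrorCode ierr;
      PetscReal      emax, emin;

      /* ask the KSP to keep the information needed for the estimate */
      ierr = KSPSetComputeSingularValues(ksp, PETSC_TRUE);CHKERRQ(ierr);
      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
      /* extreme singular value estimates after the solve */
      ierr = KSPComputeExtremeSingularValues(ksp, &emax, &emin);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD, "condition number estimate: %g\n",
                         (double)(emax/emin));CHKERRQ(ierr);
      return 0;
    }
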
From C.Klaij at marin.nl  Thu Jul 16 09:20:17 2009
From: C.Klaij at marin.nl (Klaij, Christiaan)
Date: Thu, 16 Jul 2009 16:20:17 +0200
Subject: hypre preconditioners
References: 
Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F789@MAR150CV1.marin.local>

Barry,

Thanks for your suggestions, I especially like the idea of keeping the
same preconditioner for several solves; that's definitely worth a try.

Chris

-----Original Message-----
Date: Wed, 15 Jul 2009 15:26:17 -0500
From: Barry Smith
Subject: Re: hypre preconditioners
To: PETSc users list
Message-ID: <83E2B8C2-9475-45C6-A448-502114D4959D at mcs.anl.gov>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes


On Jul 15, 2009, at 11:23 AM, Lisandro Dalcin wrote:

> Did you try Block-Jacobi for the velocity problem?

    You can try -pc_type sor and it will run block Jacobi with one
symmetric sweep of SOR for each iteration. This may be faster than your
plain Jacobi.

> If the matrix of
> your pressure problem changes in each solve (is this your case?) could
> you try to use ML? In my little experience, ML leads to lower setup
> times, but higher iteration counts (let's say twice); perhaps it will be
> faster than BoomerAMG for your use case.

    ML is worth trying. Also you might try "playing" with the various
boomerAMG options. I don't know them in detail so cannot make
suggestions, but the various ways of coarsening control how quick the
setup is.

    Finally, if the matrix is not changing much for each new solve you
can use the same boomerAMG preconditioner for several linear solves.
Just use SAME_PRECONDITIONER as the argument to KSPSetOperators() and it
will not create a new preconditioner until you call it with
SAME_NONZERO_PATTERN. I am thinking this might work very well for you.

    Barry
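A minimal sketch of how the reuse Barry suggests might look in the calling
code, assuming the 2.3.x/3.0-era KSPSetOperators() signature with a
MatStructure flag; the lag counter and the routine name are illustrative, not
from the thread.  The point is simply that SAME_PRECONDITIONER skips the
expensive PCSetUp (for example the BoomerAMG setup) on that solve.

    /* Sketch: rebuild the preconditioner only every `lag` pressure solves,
       and reuse it (SAME_PRECONDITIONER) in between. */
    #include "petscksp.h"

    PetscErrorCode SolvePressure(KSP ksp, Mat A, Vec rhs, Vec p,
                                 PetscInt outer, PetscInt lag)
    {
      PetscErrorCode ierr;
      MatStructure   flag;

      /* rebuild when the lag interval expires, otherwise keep the old PC */
      flag = (lag <= 1 || outer % lag == 0) ? SAME_NONZERO_PATTERN
                                            : SAME_PRECONDITIONER;
      ierr = KSPSetOperators(ksp, A, A, flag);CHKERRQ(ierr);
      ierr = KSPSolve(ksp, rhs, p);CHKERRQ(ierr);
      return 0;
    }
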
>
>
> On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan
> wrote:
>> Barry,
>>
>> Thanks for your reply! Below is the information from KSPView and
>> -log_summary for the three cases. Indeed PCSetUp takes much more
>> time with the hypre preconditioners.
>>
>> Chris
>>
>> [... the full KSPView and -log_summary listings for the Jacobi and
>> Hypre Euclid runs were quoted again here; they are identical to the
>> copies earlier in this thread ...]
>>
>> Memory usage is given in bytes:
>>
>> Object Type          Creations
Destructions Memory >> Descendants' Mem. >> >> --- Event Stage 0: Main Stage >> >> Index Set 4 4 30272 0 >> Vec 1213 1202 1234223216 0 >> Vec Scatter 2 0 0 0 >> Krylov Solver 1 0 0 0 >> Preconditioner 1 0 0 0 >> Matrix 6 0 0 0 >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> Average time to get PetscTime(): 2.14577e-07 >> Average time for MPI_Barrier(): 3.8147e-07 >> Average time for zero size MPI_Send(): 1.39475e-05 >> >> >> >> >> -------------------------------------- >> --- Hypre BoomerAMG preconditioner --- >> -------------------------------------- >> >> KSP Object: >> type: cg >> maximum iterations=500 >> tolerances: relative=0.05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: hypre >> HYPRE BoomerAMG preconditioning >> HYPRE BoomerAMG: Cycle type V >> HYPRE BoomerAMG: Maximum number of levels 25 >> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 >> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 >> HYPRE BoomerAMG: Threshold for strong coupling 0.25 >> HYPRE BoomerAMG: Interpolation truncation factor 0 >> HYPRE BoomerAMG: Interpolation: max elements per row 0 >> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 >> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 >> HYPRE BoomerAMG: Maximum row sums 0.9 >> HYPRE BoomerAMG: Sweeps down 1 >> HYPRE BoomerAMG: Sweeps up 1 >> HYPRE BoomerAMG: Sweeps on coarse 1 >> HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi >> HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi >> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination >> HYPRE BoomerAMG: Relax weight (all) 1 >> HYPRE BoomerAMG: Outer relax weight (all) 1 >> HYPRE BoomerAMG: Using CF-relaxation >> HYPRE BoomerAMG: Measure type local >> HYPRE BoomerAMG: Coarsen type Falgout >> HYPRE BoomerAMG: Interpolation type classical >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=256576, cols=256576 >> total: nonzeros=1769552, allocated nonzeros=1769552 >> not using I-node (on process 0) routines >> >> ************************************************************************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript >> -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance >> Summary: ---------------------------------------------- >> >> ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij >> Wed Jul 15 09:53:07 2009 >> Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 >> CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 >> >> Max Max/Min Avg Total >> Time (sec): 7.080e+02 1.00000 7.080e+02 >> Objects: 1.227e+03 1.00000 1.227e+03 >> Flops: 1.054e+10 1.00076 1.054e+10 2.107e+10 >> Flops/sec: 1.489e+07 1.00076 1.488e+07 2.977e+07 >> MPI Messages: 3.857e+03 1.00000 3.857e+03 7.714e+03 >> MPI Message Lengths: 1.135e+08 1.00000 2.942e+04 2.270e+08 >> MPI Reductions: 5.800e+03 1.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of >> length N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- >> Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> 0: Main Stage: 7.0799e+02 100.0% 2.1075e+10 100.0% 7.714e >> +03 100.0% 2.942e+04 100.0% 1.160e+04 100.0% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops/sec: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all >> processors >> Mess: number of messages sent >> Avg. len: average message length >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with >> PetscLogStagePush() and PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in >> this phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >> time over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was run without the PreLoadBegin() # >> # macros. To get timing results we always recommend # >> # preloading. otherwise timing numbers may be # >> # meaningless. 
# >> ########################################################## >> >> >> Event Count Time (sec) Flops/ >> sec --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg >> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> VecDot 3554 1.0 1.8220e+00 1.0 5.03e+08 1.0 0.0e+00 >> 0.0e+00 3.6e+03 0 9 0 0 31 0 9 0 0 31 1001 >> VecNorm 2327 1.0 6.7031e-01 1.0 9.34e+08 1.0 0.0e+00 >> 0.0e+00 2.3e+03 0 6 0 0 20 0 6 0 0 20 1781 >> VecCopy 1600 1.0 9.4440e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 3855 1.0 8.0550e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 4982 1.0 3.7953e+00 1.0 3.39e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 1 12 0 0 0 1 12 0 0 0 674 >> VecAYPX 2755 1.0 1.8270e+00 1.0 3.89e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 774 >> VecAssemblyBegin 1200 1.0 1.8679e-01 1.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 3.6e+03 0 0 0 0 31 0 0 0 0 31 0 >> VecAssemblyEnd 1200 1.0 1.7717e-03 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecPointwiseMult 4056 1.0 4.1344e+00 1.0 1.26e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 252 >> VecScatterBegin 3855 1.0 1.5116e-01 1.0 0.00e+00 0.0 7.7e+03 >> 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 >> VecScatterEnd 3855 1.0 7.3828e-01 2.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetup 600 1.0 5.1192e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 600 1.0 2.7194e+02 1.0 3.88e+07 1.0 7.7e+03 >> 2.9e+04 6.2e+03 38100100100 53 38100100100 53 77 >> PCSetUp 600 1.0 1.6630e+02 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 2.0e+02 23 0 0 0 2 23 0 0 0 2 0 >> PCApply 4355 1.0 7.3735e+01 1.0 7.06e+06 1.0 0.0e+00 >> 0.0e+00 1.0e+02 10 5 0 0 1 10 5 0 0 1 14 >> MatMult 2327 1.0 1.3706e+01 1.0 2.79e+08 1.0 4.7e+03 >> 2.9e+04 0.0e+00 2 36 60 60 0 2 36 60 60 0 557 >> MatMultTranspose 1528 1.0 8.6412e+00 1.0 3.13e+08 1.0 3.1e+03 >> 2.9e+04 0.0e+00 1 26 40 40 0 1 26 40 40 0 626 >> MatConvert 100 1.0 1.2962e+01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> MatAssemblyBegin 600 1.0 2.4579e+0096.9 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.2e+03 0 0 0 0 10 0 0 0 0 10 0 >> MatAssemblyEnd 600 1.0 2.5257e+00 1.0 0.00e+00 0.0 4.0e+00 >> 1.5e+04 6.1e+02 0 0 0 0 5 0 0 0 0 5 0 >> MatGetRow 12828800 1.0 5.2907e+00 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> MatGetRowIJ 200 1.0 1.7476e-04 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory >> Descendants' Mem. 
>> >> --- Event Stage 0: Main Stage >> >> Index Set 4 4 30272 0 >> Vec 1213 1202 1234223216 0 >> Vec Scatter 2 0 0 0 >> Krylov Solver 1 0 0 0 >> Preconditioner 1 0 0 0 >> Matrix 6 0 0 0 >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> Average time to get PetscTime(): 1.90735e-07 >> Average time for MPI_Barrier(): 8.10623e-07 >> Average time for zero size MPI_Send(): 1.95503e-05 >> OptionTable: -log_summary >> >> >> >> >> -----Original Message----- >> Date: Tue, 14 Jul 2009 10:42:58 -0500 >> From: Barry Smith >> Subject: Re: hypre preconditioners >> To: PETSc users list >> Message-ID: >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes >> >> >> First run the three cases with -log_summary (also -ksp_view to see >> exact solver options that are being used) and send those files. This >> will tell us where the time is being spent; without this information >> any comments are pure speculation. (For example, the "copy" time to >> hypre format is trivial compared to the time to build a hypre >> preconditioner and not the problem). >> >> >> What you report is not uncommon; the setup and per iteration cost >> of the hypre preconditioners will be much larger than the simpler >> Jacobi preconditioner. >> >> Barry >> >> On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote: >> >>> >>> I'm solving the steady incompressible Navier-Stokes equations >>> (discretized with FV on unstructured grids) using the SIMPLE >>> Pressure Correction method. I'm using Picard linearization and solve >>> the system for the momentum equations with BICG and for the pressure >>> equation with CG. Currently, for parallel runs, I'm using JACOBI as >>> a preconditioner. My grids typically have a few million cells and I >>> use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux >>> cluster). A significant portion of the CPU time goes into solving >>> the pressure equation. To reach the relative tolerance I need, CG >>> with JACOBI takes about 100 iterations per outer loop for these >>> problems. >>> >>> In order to reduce CPU time, I've compiled PETSc with support for >>> Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a >>> preconditioner for the pressure equation. With default settings, >>> both BoomerAMG and Euclid greatly reduce the number of iterations: >>> with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. >>> However, I do not get any reduction in CPU time. With Euclid, CPU >>> time is similar to JACOBI and with BoomerAMG it is approximately >>> doubled. >>> >>> Is this what one can expect? Are BoomerAMG and Euclid meant for much >>> larger problems? I understand Hypre uses a different matrix storage >>> format, is CPU time 'lost in translation' between PETSc and Hypre >>> for these small problems? Are there maybe any settings I should >>> change? >>> >>> Chris >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> dr. ir. Christiaan Klaij >>> CFD Researcher >>> Research & Development >>> MARIN >>> 2, Haagsteeg >>> c.klaij at marin.nl >>> P.O. 
Box 28
>>> T +31 317 49 39 11
>>> 6700 AA Wageningen
>>> F +31 317 49 32 45
>>> T +31 317 49 33 44
>>> The Netherlands
>>> I www.marin.nl
>>>
>>> MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2
>>>
>>> This e-mail may be confidential, privileged and/or protected by
>>> copyright. If you are not the intended recipient, you should return
>>> it to the sender immediately and delete your copy from your system.

From ycollet at freesurf.fr Thu Jul 16 14:54:53 2009
From: ycollet at freesurf.fr (Collette Yann)
Date: Thu, 16 Jul 2009 21:54:53 +0200
Subject: Petsc-3 MPIless
Message-ID: <4A5F858D.1040701@freesurf.fr>

Hello,

I am currently interfacing petsc/snes to scilab (http://www.scilab.org).
I worked using petsc-2.3.3-p15 and everything is nearly fine (I had some
convergence problems, but that's not really important).
My petsc-2.3.3 is configured without MPI: config/configure.py --with-mpi=0 --enable-shared

Now, I would like to switch to petsc-3, so I configured petsc-3 using the
same command line as above.

The problem I meet is that petsc-3 still requires MPI.
Is it possible to compile petsc-3 without MPI?

Cheers,

YC

From bsmith at mcs.anl.gov Thu Jul 16 14:57:19 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 16 Jul 2009 14:57:19 -0500
Subject: Petsc-3 MPIless
In-Reply-To: <4A5F858D.1040701@freesurf.fr>
References: <4A5F858D.1040701@freesurf.fr>
Message-ID: <6E6C11DE-E38D-4342-ABE3-2FE564A7BB19@mcs.anl.gov>

PETSc 3.0 does not require MPI, in the same way that 2.3.3 does not require
MPI. You should be able to use the same configure options as before. If that
does not work, please send the configure.log that is generated to
petsc-maint at mcs.anl.gov (not to this email address).

Barry

On Jul 16, 2009, at 2:54 PM, Collette Yann wrote:

> Hello,
>
> I am currently interfacing petsc/snes to scilab (http://www.scilab.org).
> I worked using petsc-2.3.3-p15 and everything is nearly fine (I had
> some convergenge problem, but that's not really important).
> My petsc-2.3.3 is configure without mpi: config/configure.py --with-mpi=0 --enable-shared
>
> Now, I would like to switch to petsc-3. So I configured petsc-3
> using the same command line as above.
>
> The problem I meet is that petsc-3 still required mpi.
> Is it possible to compilte petsc-3 without mpi ?
>
> Cheers,
>
> YC

From C.Klaij at marin.nl Fri Jul 17 02:54:41 2009
From: C.Klaij at marin.nl (Klaij, Christiaan)
Date: Fri, 17 Jul 2009 09:54:41 +0200
Subject: hypre preconditioners
References:
Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F78A@MAR150CV1.marin.local>

Jed,

I'm using a cell-centered discretization on hexahedral grids. I'll try sor and
asm and also changing the default settings in boomeramg and ML. You say you
find ML almost always faster, by how much? Thanks for your help!

Chris

dr. ir. Christiaan Klaij
CFD Researcher
Research & Development
mailto:C.Klaij at marin.nl
T +31 317 49 33 44
MARIN
2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/
http://www.marin.nl/web/show/id=46836/contentid=2324
First AMT'09 conference, Nantes, France, September 1-2

This e-mail may be confidential, privileged and/or protected by copyright.
If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. -----Original Message----- Date: Thu, 16 Jul 2009 10:06:48 +0200 From: Jed Brown Subject: Re: hypre preconditioners To: PETSc users list Message-ID: <4A5EDF98.7090508 at 59A2.org> Content-Type: text/plain; charset="iso-8859-1" Klaij, Christiaan wrote: > The velocity problem is segregated (I use BICG with Jacobi for the 3 > linear systems) but these need (much) less iterations than the pressure > problem. The pressure matrix changes at each solve. It may change, but it might still make a good preconditioner for several time steps. How many dofs are in your pressure system? You mentioned a few million cells, but it makes a huge difference whether you are using tets vs. hexes, and what the pressure space is. If the pressure space is around 1M dofs, the system is relatively well-conditioned to converge in only 100 iterations with Jacobi which means that you stand a good chance of getting acceptable performance from a 1-level DD preconditioner (block jacobi or small-overlap additive Schwarz). So try Barry's suggestion of -pc_type sor and also -pc_type asm with a few choices of subdomain solver (-sub_pc_type). > Also, I did try ML and, like you say, it needs about two times more > iterations than boomerAMG. Overall, boomerAMG is a bit faster for my > cases than ML. To speed up Hypre, I've found these options to be especially useful. -pc_hypre_boomeramg_strong_threshold defaults to 0.25 which is good for 2D scalar problems, change to 0.5 or above for 3D problems -pc_hypre_boomeramg_agg_nl set this greater than 0 to use aggressive coarsening However, I almost always find ML to be faster. By default, it uses way more levels than you want (often making the coarse level have only 1 dof instead of around 1000) so try reducing -pc_ml_maxNlevels. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 4122 bytes Desc: not available URL: From C.Klaij at marin.nl Fri Jul 17 04:52:48 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 17 Jul 2009 11:52:48 +0200 Subject: call PetscOptionsSetValue in fortran Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F78B@MAR150CV1.marin.local> I'm trying to change the options of the Hypre preconditioner using PetscOptionsSetValue in a fortran program, but I must be doing something wrong, see the session below. It works fine from the command line, though. As an example, I took ex12f from src/ksp/ksp/examples/tests (petsc-2.3.3-p13) and modified it a little. $ cat ex12f.F ! program main implicit none #include "include/finclude/petsc.h" #include "include/finclude/petscvec.h" #include "include/finclude/petscmat.h" #include "include/finclude/petscpc.h" #include "include/finclude/petscksp.h" #include "include/finclude/petscviewer.h" ! ! This example is the Fortran version of ex6.c. The program reads a PETSc matrix ! and vector from a file and solves a linear system. Input arguments are: ! -f : file to load. For a 5X5 example of the 5-pt. stencil ! use the file petsc/src/mat/examples/matbinary.ex ! PetscErrorCode ierr PetscInt its PetscTruth flg PetscScalar norm,none Vec x,b,u Mat A character*(128) f PetscViewer fd MatInfo info(MAT_INFO_SIZE) KSP ksp ! cklaij: adding pc PC pc ! cklaij: adding pc end none = -1.0 call PetscInitialize(PETSC_NULL_CHARACTER,ierr) ! 
Read in matrix and RHS call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',f,flg,ierr) print *,f call PetscViewerBinaryOpen(PETSC_COMM_WORLD,f,FILE_MODE_READ, & & fd,ierr) call MatLoad(fd,MATSEQAIJ,A,ierr) ! Get information about matrix call MatGetInfo(A,MAT_GLOBAL_SUM,info,ierr) write(*,100) info(MAT_INFO_ROWS_GLOBAL), & & info(MAT_INFO_COLUMNS_GLOBAL), & & info(MAT_INFO_ROWS_LOCAL),info(MAT_INFO_COLUMNS_LOCAL), & & info(MAT_INFO_BLOCK_SIZE),info(MAT_INFO_NZ_ALLOCATED), & & info(MAT_INFO_NZ_USED),info(MAT_INFO_NZ_UNNEEDED), & & info(MAT_INFO_MEMORY),info(MAT_INFO_ASSEMBLIES), & & info(MAT_INFO_MALLOCS) 100 format(11(g7.1,1x)) call VecLoad(fd,PETSC_NULL_CHARACTER,b,ierr) call PetscViewerDestroy(fd,ierr) ! Set up solution call VecDuplicate(b,x,ierr) call VecDuplicate(b,u,ierr) ! Solve system call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) call KSPSetFromOptions(ksp,ierr) ! cklaij: try boomeramg call KSPGetPC(ksp,pc,ierr) call PCSetType(pc,PCHYPRE,ierr) call PCHYPRESetType(pc,"boomeramg",ierr) call PetscOptionsSetValue & ("-pc_hypre_boomeramg_strong_threshold","0.5",ierr) ! cklaij: try boomeramg end call KSPSolve(ksp,b,x,ierr) ! Show result call MatMult(A,x,u,ierr) call VecAXPY(u,none,b,ierr) call VecNorm(u,NORM_2,norm,ierr) call KSPGetIterationNumber(ksp,its,ierr) print*, 'Number of iterations = ',its print*, 'Residual norm = ',norm ! Cleanup call KSPDestroy(ksp,ierr) call VecDestroy(b,ierr) call VecDestroy(x,ierr) call VecDestroy(u,ierr) call MatDestroy(A,ierr) call PetscFinalize(ierr) end $ ./ex12f -f ../../../../mat/examples/matbinary.ex -ksp_view | grep Threshold HYPRE BoomerAMG: Threshold for strong coupling 0.25 $ ./ex12f -f ../../../../mat/examples/matbinary.ex -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.5 -ksp_view | grep Threshold HYPRE BoomerAMG: Threshold for strong coupling 0.5 $ dr. ir. Christiaan Klaij CFD Researcher Research & Development mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/ http://www.marin.nl/web/show/id=46836/contentid=2324 First AMT'09 conference, Nantes, France, September 1-2 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 1069 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 1622 bytes Desc: not available URL: From s.kramer at imperial.ac.uk Fri Jul 17 07:44:09 2009 From: s.kramer at imperial.ac.uk (Stephan Kramer) Date: Fri, 17 Jul 2009 13:44:09 +0100 Subject: non-local values being dropped in MatSetValues Message-ID: <4A607219.3070609@imperial.ac.uk> Hello We've spend sometime debugging a problem were in the assembly of a parallel MPIAIJ matrix, some values that were created on a process other than the owner of the row seemed to disappear. I think I narrowed it down to what I think is a bug in MatSetValues_MPIAIJ, but please tell me if I'm wrong. The situation is the following: I'm calling MatSetValues with the flag ADD_VALUES and with matrix option MAT_IGNORE_ZERO_ENTRIES. 
I'm inserting multiple values at once, multiple columns and rows, so I provide a rank-2 matrix of values. As I'm calling this from fortran I'm also using MAT_COLUMN_ORIENTED. Now for provided rows that are not owned by the process, it jumps to mpiaij.c:394 (line numbers as in petsc-dev). On line 399, it checks for zero entries, but only checks the very first entry of the (non-owned) row. If however other entries of that same row are nonzero, the entire row is still dropped. Note that this is independent of row_oriented/column_oriented as line 396 does exactly the same. If I don't set the option MAT_IGNORE_ZERO_ENTRIES the problem disappears. In that case however we would either have to preallocate substantially more nonzeros, or complicate the matrix assembly in our code by taking out the zero entries ourselves and call MatSetValues for each entry seperately. Your help would be much appreciated, Cheers Stephan -- Stephan Kramer Applied Modelling and Computation Group, Department of Earth Science and Engineering, Imperial College London From bsmith at mcs.anl.gov Fri Jul 17 10:51:53 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 17 Jul 2009 10:51:53 -0500 Subject: call PetscOptionsSetValue in fortran In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F78B@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F78B@MAR150CV1.marin.local> Message-ID: <47364ED4-B1E5-4713-A784-CF74F52D16AD@mcs.anl.gov> You are calling KSPSetFromOptions() BEFORE setting the PC type to hypre and setting boomeramg and before you call PetscOptionsSetValue(). You should call PetscOptionsSetValue() then PCSetType() then PCHypreSetType() then KSPSetFromOptions(). Barry On Jul 17, 2009, at 4:52 AM, Klaij, Christiaan wrote: > > I'm trying to change the options of the Hypre preconditioner using > PetscOptionsSetValue in a fortran program, but I must be doing > something wrong, see the session below. It works fine from the > command line, though. As an example, I took ex12f from src/ksp/ksp/ > examples/tests (petsc-2.3.3-p13) and modified it a little. > > > $ cat ex12f.F > ! > program main > implicit none > > #include "include/finclude/petsc.h" > #include "include/finclude/petscvec.h" > #include "include/finclude/petscmat.h" > #include "include/finclude/petscpc.h" > #include "include/finclude/petscksp.h" > #include "include/finclude/petscviewer.h" > ! > ! This example is the Fortran version of ex6.c. The program reads > a PETSc matrix > ! and vector from a file and solves a linear system. Input > arguments are: > ! -f : file to load. For a 5X5 example of the 5- > pt. stencil > ! use the file petsc/src/mat/examples/ > matbinary.ex > ! > > PetscErrorCode ierr > PetscInt its > PetscTruth flg > PetscScalar norm,none > Vec x,b,u > Mat A > character*(128) f > PetscViewer fd > MatInfo info(MAT_INFO_SIZE) > KSP ksp > ! cklaij: adding pc > PC pc > ! cklaij: adding pc end > > none = -1.0 > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > ! Read in matrix and RHS > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',f,flg,ierr) > print *,f > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,f,FILE_MODE_READ, & > & fd,ierr) > > call MatLoad(fd,MATSEQAIJ,A,ierr) > > ! 
Get information about matrix > call MatGetInfo(A,MAT_GLOBAL_SUM,info,ierr) > write(*,100) > info(MAT_INFO_ROWS_GLOBAL), & > & > info(MAT_INFO_COLUMNS_GLOBAL), & > & > info(MAT_INFO_ROWS_LOCAL),info(MAT_INFO_COLUMNS_LOCAL), & > & > info(MAT_INFO_BLOCK_SIZE),info(MAT_INFO_NZ_ALLOCATED), & > & > info(MAT_INFO_NZ_USED),info(MAT_INFO_NZ_UNNEEDED), & > & > info(MAT_INFO_MEMORY),info(MAT_INFO_ASSEMBLIES), & > & info(MAT_INFO_MALLOCS) > > 100 format(11(g7.1,1x)) > call VecLoad(fd,PETSC_NULL_CHARACTER,b,ierr) > call PetscViewerDestroy(fd,ierr) > > ! Set up solution > call VecDuplicate(b,x,ierr) > call VecDuplicate(b,u,ierr) > > ! Solve system > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) > call KSPSetFromOptions(ksp,ierr) > ! cklaij: try boomeramg > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc,PCHYPRE,ierr) > call PCHYPRESetType(pc,"boomeramg",ierr) > call PetscOptionsSetValue > & ("-pc_hypre_boomeramg_strong_threshold","0.5",ierr) > ! cklaij: try boomeramg end > call KSPSolve(ksp,b,x,ierr) > > ! Show result > call MatMult(A,x,u,ierr) > call VecAXPY(u,none,b,ierr) > call VecNorm(u,NORM_2,norm,ierr) > call KSPGetIterationNumber(ksp,its,ierr) > print*, 'Number of iterations = ',its > print*, 'Residual norm = ',norm > > ! Cleanup > call KSPDestroy(ksp,ierr) > call VecDestroy(b,ierr) > call VecDestroy(x,ierr) > call VecDestroy(u,ierr) > call MatDestroy(A,ierr) > > call PetscFinalize(ierr) > end > > $ ./ex12f -f ../../../../mat/examples/matbinary.ex -ksp_view | grep > Threshold > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > $ ./ex12f -f ../../../../mat/examples/matbinary.ex -pc_type hypre - > pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.5 - > ksp_view | grep Threshold > HYPRE BoomerAMG: Threshold for strong coupling 0.5 > $ > > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > MARIN > 2, Haagsteeg > c.klaij at marin.nl > P.O. Box 28 > T +31 317 49 39 11 > 6700 AA Wageningen > F +31 317 49 32 45 > T +31 317 49 33 44 > The Netherlands > I www.marin.nl > > > MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2 > > > This e-mail may be confidential, privileged and/or protected by > copyright. If you are not the intended recipient, you should return > it to the sender immediately and delete your copy from your system. > From vyan2000 at gmail.com Fri Jul 17 17:18:08 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Fri, 17 Jul 2009 18:18:08 -0400 Subject: About src/ksp/pc/impls/fieldsplit/fieldsplit.c Message-ID: Hi All, I have some difficulty of understanding the struct PC_FieldSplit. From the definition I can see that it has a data structure like a "list" and each member in the list is representing an object Field(or PC_Field). Suppose that my matrix has a blocksize of 5. I want to Set 3 Fields, i.e. field_0 as {0,1}, field_1 as{2,3},and field_3 as {4}. Then can anyone please help me to fill in the following parameter. PC_FieldSplit *jac = (PC_FieldSplit*)pc->data; what is jac->nsplit? 3, is my guess, since we have split the matrix into 3 split, namely field_0, field_1, field_2. PC_FieldSplitLink ilink = jac->head; what is ilink->nfields? 2, is my guess, since the fields_0 has 2 fields inside. Then ilink=ilink->next; ilink=ilin->next; ilink->nfileds should be 1 right? Thank you very much in advance. Yan -------------- next part -------------- An HTML attachment was scrubbed... 
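For reference, the three splits described in the question above are normally set up
through the public PCFieldSplit interface rather than by filling PC_FieldSplit or its
linked list by hand; Barry's reply below makes the same point. The following is a
minimal C sketch only: it assumes a KSP whose operator is the blocksize-5 MPIAIJ
matrix from the question, and it uses the petsc-3.0-era signature of
PCFieldSplitSetFields (later releases add a split-name argument, so check the manual
page for the version in use).

    #include "petscksp.h"

    /* Sketch only: split a blocksize-5 problem into fields {0,1}, {2,3} and {4}. */
    PetscErrorCode setup_three_splits(KSP ksp)
    {
      PetscErrorCode ierr;
      PC             pc;
      PetscInt       f01[2] = {0, 1}, f23[2] = {2, 3}, f4[1] = {4};

      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
      ierr = PCFieldSplitSetBlockSize(pc, 5);CHKERRQ(ierr);
      ierr = PCFieldSplitSetFields(pc, 2, f01);CHKERRQ(ierr);   /* split 0: components 0,1 */
      ierr = PCFieldSplitSetFields(pc, 2, f23);CHKERRQ(ierr);   /* split 1: components 2,3 */
      ierr = PCFieldSplitSetFields(pc, 1, f4);CHKERRQ(ierr);    /* split 2: component 4    */
      return 0;
    }

With such a setup, the internal quantities asked about would come out as jac->nsplit
equal to 3 and nfields of 2, 2 and 1 on the successive links, matching the guesses in
the question.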
From bsmith at mcs.anl.gov Fri Jul 17 18:23:50 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 17 Jul 2009 18:23:50 -0500
Subject: About src/ksp/pc/impls/fieldsplit/fieldsplit.c
In-Reply-To:
References:
Message-ID: <1DDB2681-2D29-4373-900B-A3E8BD958BF0@mcs.anl.gov>

You NEVER want to be building these linked lists yourself. You should use the
PCFieldSplit API to construct the fields you want. If you are interested in
seeing what the result is, then run your code in the debugger and just look at
the various links and fields.

Barry

On Jul 17, 2009, at 5:18 PM, Ryan Yan wrote:

> Hi All,
> I have some difficulty of understanding the struct PC_FieldSplit.
> From the definition I can see that it has a data structure like a
> "list" and each member in the list is representing an object
> Field(or PC_Field).
>
> Suppose that my matrix has a blocksize of 5. I want to Set 3
> Fields, i.e. field_0 as {0,1}, field_1 as{2,3},and field_3 as {4}.
> Then can anyone please help me to fill in the following parameter.
>
> PC_FieldSplit *jac = (PC_FieldSplit*)pc->data;
>
> what is jac->nsplit? 3, is my guess, since we have split the
> matrix into 3 split, namely field_0, field_1, field_2.
>
> PC_FieldSplitLink ilink = jac->head;
>
> what is ilink->nfields? 2, is my guess, since the fields_0 has 2
> fields inside.
>
> Then ilink=ilink->next; ilink=ilin->next; ilink->nfileds should be 1
> right?
>
> Thank you very much in advance.
>
> Yan

From bsmith at mcs.anl.gov Fri Jul 17 20:02:02 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 17 Jul 2009 20:02:02 -0500
Subject: non-local values being dropped in MatSetValues
In-Reply-To: <4A607219.3070609@imperial.ac.uk>
References: <4A607219.3070609@imperial.ac.uk>
Message-ID: <80B2A52B-2566-4336-A52C-1744E21E200E@mcs.anl.gov>

Stephan,

I'm sorry for your wasted time finding this bug. I have fixed it in the
Mercurial version of PETSc 3.0.0 and in petsc-dev. It will be fixed in the
next 3.0.0 patch that we release.

Barry

On Jul 17, 2009, at 7:44 AM, Stephan Kramer wrote:

> Hello
>
> We've spend sometime debugging a problem were in the assembly of a
> parallel MPIAIJ matrix, some values that were created on a process
> other than the owner of the row seemed to disappear. I think I
> narrowed it down to what I think is a bug in MatSetValues_MPIAIJ,
> but please tell me if I'm wrong.
>
> The situation is the following: I'm calling MatSetValues with the
> flag ADD_VALUES and with matrix option MAT_IGNORE_ZERO_ENTRIES. I'm
> inserting multiple values at once, multiple columns and rows, so I
> provide a rank-2 matrix of values. As I'm calling this from fortran
> I'm also using MAT_COLUMN_ORIENTED. Now for provided rows that are
> not owned by the process, it jumps to mpiaij.c:394 (line numbers as
> in petsc-dev). On line 399, it checks for zero entries, but only
> checks the very first entry of the (non-owned) row. If however other
> entries of that same row are nonzero, the entire row is still
> dropped. Note that this is independent of row_oriented/
> column_oriented as line 396 does exactly the same.
>
> If I don't set the option MAT_IGNORE_ZERO_ENTRIES the problem
> disappears. In that case however we would either have to preallocate
> substantially more nonzeros, or complicate the matrix assembly in
> our code by taking out the zero entries ourselves and call
> MatSetValues for each entry seperately.
>
> Your help would be much appreciated,
> Cheers
> Stephan
>
> --
> Stephan Kramer
> Applied Modelling and Computation Group,
> Department of Earth Science and Engineering,
> Imperial College London
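For readers skimming the thread, the failure mode Stephan describes can be condensed
into a short C sketch. This is illustrative only: the matrix, the row and column
indices, and the values are hypothetical, and the MatSetOption call uses the
petsc-3.0-era three-argument signature (petsc-2.3.3 takes two arguments). The
essential point is an off-process row whose first provided entry is 0.0 while the
rest are not; with MAT_IGNORE_ZERO_ENTRIES set, the affected versions tested only
that first entry and dropped the whole row.

    #include "petscmat.h"

    /* Sketch of the assembly pattern from the bug report (hypothetical data).
       A is an MPIAIJ matrix and the rows in idxm[] are owned by another process. */
    PetscErrorCode add_offprocess_block(Mat A)
    {
      PetscErrorCode ierr;
      PetscInt    idxm[2] = {10, 11};     /* rows owned by a different rank (hypothetical) */
      PetscInt    idxn[2] = {10, 11};     /* columns (hypothetical)                        */
      PetscScalar v[4]    = {0.0, 2.0,    /* row 10: first entry zero, second nonzero      */
                             3.0, 4.0};   /* row 11: all nonzero                           */

      ierr = MatSetOption(A, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);
      /* Before the fix, all of row 10 could be silently dropped here, because only
         v[0] was checked against zero for the off-process row. */
      ierr = MatSetValues(A, 2, idxm, 2, idxn, v, ADD_VALUES);CHKERRQ(ierr);
      ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      return 0;
    }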
From vyan2000 at gmail.com Sat Jul 18 23:24:46 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Sun, 19 Jul 2009 00:24:46 -0400
Subject: PETSC debugger
Message-ID:

Hi All,
I am trying to use the PETSc runtime option -start_in_debugger.

However, when I attach the debugger at run time to each process, there are
error messages and I only get one gdb window (am I supposed to get as many as
the number of processes?)

vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display :0.0

[1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display :0.0 on machine vyan2000-linux
[0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display :0.0 on machine vyan2000-linux

Then, only a *single* gdb window prompts out. When I run with 3 processes,
there are only *two* gdb windows.

Thank you very much in advance,

Yan

From bsmith at mcs.anl.gov Sat Jul 18 23:30:59 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 18 Jul 2009 23:30:59 -0500
Subject: PETSC debugger
In-Reply-To:
References:
Message-ID:

Try using -display vyan2000-linux:0.0

Shouldn't make any difference but since it appears you are running everything
on the same machine what you have given should work.

Barry

On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote:

> Hi All,
> I am tring to use the PETSc runtime option -start_in_debugger.
>
> However, when I attach the debugger at run time to each process,
> there are error messages and I only get one gdb window(Am I suppose
> to get as many as the number of the processes?)
>
> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display :0.0
>
> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on
> display :0.0 on machine vyan2000-linux
> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on
> display :0.0 on machine vyan2000-linux
>
> Then, only a *single* gdb window prompts out. When I run with 3
> process, there are only *two* gdb windows.
>
> Thank you very much in advance,
>
> Yan

From vyan2000 at gmail.com Sat Jul 18 23:50:55 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Sun, 19 Jul 2009 00:50:55 -0400
Subject: PETSC debugger
In-Reply-To:
References:
Message-ID:

Thank you very much, Barry.

After I use the vyan2000-linux:0.0, I got errors without any gdb window.
vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display vyan2000-linux:0.0 [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on display vyan2000-linux:0.0 on machine vyan2000-linux [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on display vyan2000-linux:0.0 on machine vyan2000-linux xterm Xt error: Can't open display: vyan2000-linux:0.0 xterm Xt error: Can't open display: vyan2000-linux:0.0 Then I changed back, vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display :0.0 Same as before, error messages with only a single gdb window, (and the window shows up at different place at different instances). Yan On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are running > everything on the same machine what you have given should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, there are >> error messages and I only get one gdb window(Am I suppose to get as many as >> the number of the processes?) >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display >> :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display >> :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run with 3 process, >> there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jul 18 23:54:18 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 18 Jul 2009 23:54:18 -0500 Subject: PETSC debugger In-Reply-To: References: Message-ID: Are you sure the other window isn't lurking away somewhere off (or nearly) off the screen? Maybe try shutting down the x server and restarting? Barry On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > Thank you very much, Barry. > > After I use the vyan2000-linux:0.0, I got errors without any gdb > window. > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display vyan2000-linux: > 0.0 > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on > display vyan2000-linux:0.0 on machine vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on > display vyan2000-linux:0.0 on machine vyan2000-linux > xterm Xt error: Can't open display: vyan2000-linux:0.0 > xterm Xt error: Can't open display: vyan2000-linux:0.0 > > Then > I changed back, > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display :0.0 > > Same as before, error messages with only a single gdb window, (and > the window shows up at different place at different instances). 
> > Yan > > > On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith > wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are running > everything on the same machine what you have given should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, > I am tring to use the PETSc runtime option -start_in_debugger. > > However, when I attach the debugger at run time to each process, > there are error messages and I only get one gdb window(Am I suppose > to get as many as the number of the processes?) > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display :0.0 > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on > display :0.0 on machine vyan2000-linux > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on > display :0.0 on machine vyan2000-linux > > Then, only a *single* gdb window prompts out. When I run with 3 > process, there are only *two* gdb windows. > > Thank you very much in advance, > > Yan > > From vyan2000 at gmail.com Sun Jul 19 00:06:33 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 01:06:33 -0400 Subject: PETSC debugger In-Reply-To: References: Message-ID: I do not have acess to linux right now, I will check it as the first thing tomorrow. Yan On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith wrote: > > Are you sure the other window isn't lurking away somewhere off (or nearly) > off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without any gdb window. >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> Same as before, error messages with only a single gdb window, (and the >> window shows up at different place at different instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are running >> everything on the same machine what you have given should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, there are >> error messages and I only get one gdb window(Am I suppose to get as many as >> the number of the processes?) 
>> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display >> :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display >> :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run with 3 process, >> there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From b.van-wachem at imperial.ac.uk Sun Jul 19 01:59:58 2009 From: b.van-wachem at imperial.ac.uk (Berend van Wachem) Date: Sun, 19 Jul 2009 07:59:58 +0100 Subject: PETSC debugger In-Reply-To: References: Message-ID: <4A62C46E.6040300@imperial.ac.uk> Dear Ryan, I had a similar issue as you have. I am using KDE as a desktop manager and found that I have to comment out the line "ServerArgsLocal=-nolisten tcp" in kdm, in the kdmrc file (on my system located at /etc/kde/kdm/kdmrc). After restarting kdm, I get all windows of gdb coming up. Regards, Berend. Ryan Yan wrote: > I do not have acess to linux right now, I will check it as the first > thing tomorrow. > > Yan > > On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith > wrote: > > > Are you sure the other window isn't lurking away somewhere off (or > nearly) off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. > > After I use the vyan2000-linux:0.0, I got errors without any gdb > window. > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display vyan2000-linux:0.0 > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 > on display vyan2000-linux:0.0 on machine vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 > on display vyan2000-linux:0.0 on machine vyan2000-linux > xterm Xt error: Can't open display: vyan2000-linux:0.0 > xterm Xt error: Can't open display: vyan2000-linux:0.0 > > Then > I changed back, > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display :0.0 > > Same as before, error messages with only a single gdb window, > (and the window shows up at different place at different instances). > > Yan > > > On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith > > wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are > running everything on the same machine what you have given > should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, > I am tring to use the PETSc runtime option -start_in_debugger. > > However, when I attach the debugger at run time to each process, > there are error messages and I only get one gdb window(Am I > suppose to get as many as the number of the processes?) 
> > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display :0.0 > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 > on display :0.0 on machine vyan2000-linux > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 > on display :0.0 on machine vyan2000-linux > > Then, only a *single* gdb window prompts out. When I run with 3 > process, there are only *two* gdb windows. > > Thank you very much in advance, > > Yan > > > > From s.kramer at imperial.ac.uk Sun Jul 19 04:11:17 2009 From: s.kramer at imperial.ac.uk (Stephan Kramer) Date: Sun, 19 Jul 2009 10:11:17 +0100 Subject: non-local values being dropped in MatSetValues In-Reply-To: <80B2A52B-2566-4336-A52C-1744E21E200E@mcs.anl.gov> References: <4A607219.3070609@imperial.ac.uk> <80B2A52B-2566-4336-A52C-1744E21E200E@mcs.anl.gov> Message-ID: <4A62E335.5030404@imperial.ac.uk> Barry Smith wrote: > Stephan, > > I'm sorry for your wasted time finding this bug. I have fixed it > in the Mecurial version of PETSc 3.0.0 and in petsc-dev. It will be > fixed in the next 3.0.0 patch that we release. > > Barry Excellent. No problem, thanks a lot for your quick response and fix! Cheers Stephan > > On Jul 17, 2009, at 7:44 AM, Stephan Kramer wrote: > >> Hello >> >> We've spend sometime debugging a problem were in the assembly of a >> parallel MPIAIJ matrix, some values that were created on a process >> other than the owner of the row seemed to disappear. I think I >> narrowed it down to what I think is a bug in MatSetValues_MPIAIJ, >> but please tell me if I'm wrong. >> >> The situation is the following: I'm calling MatSetValues with the >> flag ADD_VALUES and with matrix option MAT_IGNORE_ZERO_ENTRIES. I'm >> inserting multiple values at once, multiple columns and rows, so I >> provide a rank-2 matrix of values. As I'm calling this from fortran >> I'm also using MAT_COLUMN_ORIENTED. Now for provided rows that are >> not owned by the process, it jumps to mpiaij.c:394 (line numbers as >> in petsc-dev). On line 399, it checks for zero entries, but only >> checks the very first entry of the (non-owned) row. If however other >> entries of that same row are nonzero, the entire row is still >> dropped. Note that this is independent of row_oriented/ >> column_oriented as line 396 does exactly the same. >> >> If I don't set the option MAT_IGNORE_ZERO_ENTRIES the problem >> disappears. In that case however we would either have to preallocate >> substantially more nonzeros, or complicate the matrix assembly in >> our code by taking out the zero entries ourselves and call >> MatSetValues for each entry seperately. >> >> Your help would be much appreciated, >> Cheers >> Stephan >> >> >> -- >> Stephan Kramer >> Applied Modelling and Computation Group, >> Department of Earth Science and Engineering, >> Imperial College London > > -- Stephan Kramer Applied Modelling and Computation Group, Department of Earth Science and Engineering, Imperial College London From vyan2000 at gmail.com Sun Jul 19 13:47:23 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 14:47:23 -0400 Subject: PETSC debugger In-Reply-To: <4A62C46E.6040300@imperial.ac.uk> References: <4A62C46E.6040300@imperial.ac.uk> Message-ID: Dear Berend, Thanks for your suggestion. I can see that option right there!! 
However, I am still struggling to find a way to turn this option off as I can see from vyan2000 at vyan2000-linux:/etc/gdm$ ps -ef | grep /usr/bin/X root 5287 5285 3 14:19 tty7 00:00:42 /usr/bin/X :0 -br -audit 0 -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7 I am using Ubuntu *gnome*, but there is no such gdmrc file that I can correct. I tried several other ways, but have not succeed yet. vyan2000 at vyan2000-linux:/etc/gdm$ uname -a Linux vyan2000-linux 2.6.24-24-generic #1 SMP Tue Jul 7 19:46:39 UTC 2009 i686 GNU/Linux I am still searching.... Regards, Yan On Sun, Jul 19, 2009 at 2:59 AM, Berend van Wachem < b.van-wachem at imperial.ac.uk> wrote: > Dear Ryan, > > I had a similar issue as you have. I am using KDE as a desktop manager and > found that I have to comment out the line > > "ServerArgsLocal=-nolisten tcp" > > in kdm, in the kdmrc file (on my system located at /etc/kde/kdm/kdmrc). > > After restarting kdm, I get all windows of gdb coming up. > > Regards, > > Berend. > > > Ryan Yan wrote: > >> I do not have acess to linux right now, I will check it as the first thing >> tomorrow. >> Yan >> >> On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith > bsmith at mcs.anl.gov>> wrote: >> >> >> Are you sure the other window isn't lurking away somewhere off (or >> nearly) off the screen? >> >> Maybe try shutting down the x server and restarting? >> >> Barry >> >> >> On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: >> >> Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without any gdb >> window. >> >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 >> on display vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 >> on display vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display :0.0 >> >> Same as before, error messages with only a single gdb window, >> (and the window shows up at different place at different >> instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith >> > wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are >> running everything on the same machine what you have given >> should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, >> there are error messages and I only get one gdb window(Am I >> suppose to get as many as the number of the processes?) 
>> >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 >> on display :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 >> on display :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run with 3 >> process, there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From b.van-wachem at imperial.ac.uk Sun Jul 19 13:52:22 2009 From: b.van-wachem at imperial.ac.uk (Berend van Wachem) Date: Sun, 19 Jul 2009 19:52:22 +0100 Subject: PETSC debugger In-Reply-To: References: <4A62C46E.6040300@imperial.ac.uk> Message-ID: <4A636B66.4070605@imperial.ac.uk> Dear Ryan, I am not a Gnome user, but a colleague of mine suggested: For Gnome you should edit the file /etc/gdm/gdm.conf and change the settings: DisallowTCP=false Regards, Berend. Ryan Yan wrote: > Dear Berend, > Thanks for your suggestion. I can see that option right there!! > > However, I am still struggling to find a way to turn this option off as > I can see from > > vyan2000 at vyan2000-linux:/etc/gdm$ ps -ef | grep /usr/bin/X > root 5287 5285 3 14:19 tty7 00:00:42 /usr/bin/X :0 -br -audit > 0 -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7 > > I am using Ubuntu *gnome*, but there is no such gdmrc file that I can > correct. I tried several other ways, but have not succeed yet. > vyan2000 at vyan2000-linux:/etc/gdm$ uname -a > Linux vyan2000-linux 2.6.24-24-generic #1 SMP Tue Jul 7 19:46:39 UTC > 2009 i686 GNU/Linux > > I am still searching.... > > Regards, > > Yan > > > On Sun, Jul 19, 2009 at 2:59 AM, Berend van Wachem > > wrote: > > Dear Ryan, > > I had a similar issue as you have. I am using KDE as a desktop > manager and found that I have to comment out the line > > "ServerArgsLocal=-nolisten tcp" > > in kdm, in the kdmrc file (on my system located at /etc/kde/kdm/kdmrc). > > After restarting kdm, I get all windows of gdb coming up. > > Regards, > > Berend. > > > Ryan Yan wrote: > > I do not have acess to linux right now, I will check it as the > first thing tomorrow. > Yan > > On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith > > >> wrote: > > > Are you sure the other window isn't lurking away somewhere > off (or > nearly) off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. > > After I use the vyan2000-linux:0.0, I got errors without > any gdb > window. 
> > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display vyan2000-linux:0.0 > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid > 26518 > on display vyan2000-linux:0.0 on machine vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid > 26519 > on display vyan2000-linux:0.0 on machine vyan2000-linux > xterm Xt error: Can't open display: vyan2000-linux:0.0 > xterm Xt error: Can't open display: vyan2000-linux:0.0 > > Then > I changed back, > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display :0.0 > > Same as before, error messages with only a single gdb window, > (and the window shows up at different place at different > instances). > > Yan > > > On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith > > >> wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are > running everything on the same machine what you have given > should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, > I am tring to use the PETSc runtime option > -start_in_debugger. > > However, when I attach the debugger at run time to each > process, > there are error messages and I only get one gdb window(Am I > suppose to get as many as the number of the processes?) > > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display :0.0 > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid > 26307 > on display :0.0 on machine vyan2000-linux > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid > 26306 > on display :0.0 on machine vyan2000-linux > > Then, only a *single* gdb window prompts out. When I run > with 3 > process, there are only *two* gdb windows. > > Thank you very much in advance, > > Yan > > > > > From vyan2000 at gmail.com Sun Jul 19 14:54:32 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 15:54:32 -0400 Subject: PETSC debugger In-Reply-To: <4A636B66.4070605@imperial.ac.uk> References: <4A62C46E.6040300@imperial.ac.uk> <4A636B66.4070605@imperial.ac.uk> Message-ID: Dear Berend, Thanks to your suggestion, I have turned it off. As it can be shown here: vyan2000 at vyan2000-linux:~$ ps -ef |grep /usr/bin/X root 5273 5271 4 15:45 tty7 00:00:14 /usr/bin/X :0 -br -audit 0 -auth /var/lib/gdm/:0.Xauth vt7 vyan2000 6116 6075 0 15:50 pts/1 00:00:00 grep /usr/bin/X However, There are still errors when PETSc attach the debugger. And sometimes the number of windows is not equal to the number of processes. vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display :0.0 [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 6022 on display :0.0 on machine vyan2000-linux [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 6021 on display :0.0 on machine vyan2000-linux It still need more efforts... Regards, Yan On Sun, Jul 19, 2009 at 2:52 PM, Berend van Wachem < b.van-wachem at imperial.ac.uk> wrote: > Dear Ryan, > > I am not a Gnome user, but a colleague of mine suggested: > > For Gnome you should edit the file /etc/gdm/gdm.conf > and change the settings: > DisallowTCP=false > > Regards, > > Berend. 
> > > Ryan Yan wrote: > >> Dear Berend, >> Thanks for your suggestion. I can see that option right there!! >> >> However, I am still struggling to find a way to turn this option off as I >> can see from >> >> vyan2000 at vyan2000-linux:/etc/gdm$ ps -ef | grep /usr/bin/X >> root 5287 5285 3 14:19 tty7 00:00:42 /usr/bin/X :0 -br -audit 0 >> -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7 >> >> I am using Ubuntu *gnome*, but there is no such gdmrc file that I can >> correct. I tried several other ways, but have not succeed yet. >> vyan2000 at vyan2000-linux:/etc/gdm$ uname -a >> Linux vyan2000-linux 2.6.24-24-generic #1 SMP Tue Jul 7 19:46:39 UTC 2009 >> i686 GNU/Linux >> >> I am still searching.... >> >> Regards, >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 2:59 AM, Berend van Wachem < >> b.van-wachem at imperial.ac.uk > wrote: >> >> Dear Ryan, >> >> I had a similar issue as you have. I am using KDE as a desktop >> manager and found that I have to comment out the line >> >> "ServerArgsLocal=-nolisten tcp" >> >> in kdm, in the kdmrc file (on my system located at /etc/kde/kdm/kdmrc). >> >> After restarting kdm, I get all windows of gdb coming up. >> >> Regards, >> >> Berend. >> >> >> Ryan Yan wrote: >> >> I do not have acess to linux right now, I will check it as the >> first thing tomorrow. >> Yan >> >> On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith >> >> >> wrote: >> >> >> Are you sure the other window isn't lurking away somewhere >> off (or >> nearly) off the screen? >> >> Maybe try shutting down the x server and restarting? >> >> Barry >> >> >> On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: >> >> Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without >> any gdb >> window. >> >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid >> 26518 >> on display vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid >> 26519 >> on display vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display :0.0 >> >> Same as before, error messages with only a single gdb >> window, >> (and the window shows up at different place at different >> instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith >> >> >> wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are >> running everything on the same machine what you have given >> should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option >> -start_in_debugger. >> >> However, when I attach the debugger at run time to each >> process, >> there are error messages and I only get one gdb window(Am I >> suppose to get as many as the number of the processes?) 
>> >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid >> 26307 >> on display :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid >> 26306 >> on display :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run >> with 3 >> process, there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Sun Jul 19 15:49:58 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 16:49:58 -0400 Subject: PETSC debugger In-Reply-To: References: Message-ID: Hi Barry, I restarted x server many times(during turning off the option as ) -nolisten tcp as Berend. And I also checked very carefully each time the bottom margin of the window status panel. The gdb window did not function correctly on my Linux box. Thanks in advance for any suggestions On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith wrote: > > Are you sure the other window isn't lurking away somewhere off (or nearly) > off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without any gdb window. >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> Same as before, error messages with only a single gdb window, (and the >> window shows up at different place at different instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are running >> everything on the same machine what you have given should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, there are >> error messages and I only get one gdb window(Am I suppose to get as many as >> the number of the processes?) >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display >> :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display >> :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. 
When I run with 3 process, >> there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Jul 19 16:00:46 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 19 Jul 2009 16:00:46 -0500 Subject: PETSC debugger In-Reply-To: References: Message-ID: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> This option works by starting up an xterm with gdb running in that xterm. The code is in src/sys/error/adebug.c If you only need to look at one process, you can use the options - start_in_debugger noxterm -debugger_nodes 0 (or 1 or 2) then all the other processes won't use the debugger. Barry On Jul 19, 2009, at 3:49 PM, Ryan Yan wrote: > Hi Barry, > I restarted x server many times(during turning off the option as ) - > nolisten tcp as Berend. And I also checked very carefully each time > the bottom margin of the window status panel. The gdb window did not > function correctly on my Linux box. Thanks in advance for any > suggestions > > > > On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith > wrote: > > Are you sure the other window isn't lurking away somewhere off (or > nearly) off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. > > After I use the vyan2000-linux:0.0, I got errors without any gdb > window. > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display vyan2000-linux: > 0.0 > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on > display vyan2000-linux:0.0 on machine vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on > display vyan2000-linux:0.0 on machine vyan2000-linux > xterm Xt error: Can't open display: vyan2000-linux:0.0 > xterm Xt error: Can't open display: vyan2000-linux:0.0 > > Then > I changed back, > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display :0.0 > > Same as before, error messages with only a single gdb window, (and > the window shows up at different place at different instances). > > Yan > > > On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith > wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are running > everything on the same machine what you have given should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, > I am tring to use the PETSc runtime option -start_in_debugger. > > However, when I attach the debugger at run time to each process, > there are error messages and I only get one gdb window(Am I suppose > to get as many as the number of the processes?) > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display :0.0 > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on > display :0.0 on machine vyan2000-linux > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on > display :0.0 on machine vyan2000-linux > > Then, only a *single* gdb window prompts out. When I run with 3 > process, there are only *two* gdb windows. 
> > Thank you very much in advance, > > Yan > > > > From vyan2000 at gmail.com Sun Jul 19 17:43:21 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 18:43:21 -0400 Subject: PETSC debugger In-Reply-To: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> Message-ID: Thank you very much, barry, Only debugging one processes is a way to go for my current bug(s). Yan On Sun, Jul 19, 2009 at 5:00 PM, Barry Smith wrote: > > This option works by starting up an xterm with gdb running in that xterm. > The code is in src/sys/error/adebug.c > > If you only need to look at one process, you can use the options > -start_in_debugger noxterm -debugger_nodes 0 (or 1 or 2) then all the other > processes won't use the debugger. > > Barry > > > > > On Jul 19, 2009, at 3:49 PM, Ryan Yan wrote: > > Hi Barry, >> I restarted x server many times(during turning off the option as ) >> -nolisten tcp as Berend. And I also checked very carefully each time the >> bottom margin of the window status panel. The gdb window did not function >> correctly on my Linux box. Thanks in advance for any suggestions >> >> >> >> On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith wrote: >> >> Are you sure the other window isn't lurking away somewhere off (or >> nearly) off the screen? >> >> Maybe try shutting down the x server and restarting? >> >> Barry >> >> >> On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: >> >> Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without any gdb window. >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> Same as before, error messages with only a single gdb window, (and the >> window shows up at different place at different instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are running >> everything on the same machine what you have given should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, there are >> error messages and I only get one gdb window(Am I suppose to get as many as >> the number of the processes?) 
>> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display >> :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display >> :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run with 3 process, >> there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Mon Jul 20 04:47:35 2009 From: jed at 59A2.org (Jed Brown) Date: Mon, 20 Jul 2009 11:47:35 +0200 Subject: PETSC debugger In-Reply-To: References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> Message-ID: <4A643D37.10708@59A2.org> Ryan Yan wrote: > Only debugging one processes is a way to go for my current bug(s). There is another option for debugging multiple processes using screen instead of X. For the sake of cleanliness, start a new terminal and open a special screen session for debugging $ screen -S sdebug Now in your original terminal, run the program like $ mpirun -n 4 ./myapp -start_in_debugger -debug_terminal "screen -S sdebug -X screen" (the quotes are important) This opens four new windows within the screen session named "sdebug". Recent versions of MPICH2 have a -gdb option which is a lightweight skin for gdb that sends commands to all processes and collates the results, but also allows you to drop to per-process debugging. I had trouble with it prior to release 1.1. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From vyan2000 at gmail.com Mon Jul 20 12:04:37 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 20 Jul 2009 13:04:37 -0400 Subject: PETSC debugger In-Reply-To: <4A643D37.10708@59A2.org> References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> <4A643D37.10708@59A2.org> Message-ID: Thank you very much, Jed. I will try your suggestion. Yan On Mon, Jul 20, 2009 at 5:47 AM, Jed Brown wrote: > Ryan Yan wrote: > > Only debugging one processes is a way to go for my current bug(s). > > There is another option for debugging multiple processes using screen > instead of X. For the sake of cleanliness, start a new terminal and > open a special screen session for debugging > > $ screen -S sdebug > > Now in your original terminal, run the program like > > $ mpirun -n 4 ./myapp -start_in_debugger -debug_terminal "screen -S sdebug > -X screen" > > (the quotes are important) > > This opens four new windows within the screen session named "sdebug". > > > Recent versions of MPICH2 have a -gdb option which is a lightweight skin > for gdb that sends commands to all processes and collates the results, > but also allows you to drop to per-process debugging. I had trouble > with it prior to release 1.1. > > > Jed > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Mon Jul 20 12:50:21 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 20 Jul 2009 13:50:21 -0400 Subject: PETSC debugger In-Reply-To: <4A643D37.10708@59A2.org> References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> <4A643D37.10708@59A2.org> Message-ID: Hi Jed, My X server set up may be messed up somehow. 
Same error as before. After I creat a new session using, vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ screen -S sdebug Still errors. vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -n 4 ./rpisolve -start_in_debugger -debug_terminal "screen -S sdebug -X screen" [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9758 on vyan2000-linux [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9755 on vyan2000-linux [3]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9757 on vyan2000-linux [2]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9756 on vyan2000-linux The number of the screen for gdb is not equal to the number of processes.(Mostly, less than), as you can see in the following "bottom margin" of my screen session. vyan2000-linux | 0.11 0.09 0.08 | 07-20 13:46 |0-$ shell 1$ gdb 2$ gdb 3$* gdb And I use backtrace as the first command in the survived gdb session, I get: Loaded symbols for /usr/lib/libX11.so.6 Reading symbols from /usr/lib/libstdc++.so.6...done. Loaded symbols for /usr/lib/libstdc++.so.6 Reading symbols from /usr/lib/liblapack.so.3gf...done. Loaded symbols for /usr/lib/liblapack.so.3gf Reading symbols from /usr/lib/libblas.so.3gf...done. Loaded symbols for /usr/lib/libblas.so.3gf Reading symbols from /lib/tls/i686/cmov/libdl.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libdl.so.2 Reading symbols from /lib/tls/i686/cmov/libpthread.so.0...done. [Thread debugging using libthread_db enabled] [New Thread 0xb739c6c0 (LWP 10213)] Loaded symbols for /lib/tls/i686/cmov/libpthread.so.0 Reading symbols from /lib/tls/i686/cmov/librt.so.1...done. Loaded symbols for /lib/tls/i686/cmov/librt.so.1 Reading symbols from /lib/libgcc_s.so.1...done. Loaded symbols for /lib/libgcc_s.so.1 Reading symbols from /usr/lib/libgfortran.so.2...done. Loaded symbols for /usr/lib/libgfortran.so.2 Reading symbols from /lib/tls/i686/cmov/libm.so.6...done. Loaded symbols for /lib/tls/i686/cmov/libm.so.6 Reading symbols from /lib/tls/i686/cmov/libc.so.6...done. Loaded symbols for /lib/tls/i686/cmov/libc.so.6 Reading symbols from /usr/lib/libxcb-xlib.so.0...done. Loaded symbols for /usr/lib/libxcb-xlib.so.0 Reading symbols from /usr/lib/libxcb.so.1...done. Loaded symbols for /usr/lib/libxcb.so.1 Reading symbols from /lib/ld-linux.so.2...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /usr/lib/libXau.so.6...done. Loaded symbols for /usr/lib/libXau.so.6 Reading symbols from /usr/lib/libXdmcp.so.6...done. Loaded symbols for /usr/lib/libXdmcp.so.6 Reading symbols from /lib/tls/i686/cmov/libnss_compat.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libnss_compat.so.2 Reading symbols from /lib/tls/i686/cmov/libnsl.so.1...done. Loaded symbols for /lib/tls/i686/cmov/libnsl.so.1 Reading symbols from /lib/tls/i686/cmov/libnss_nis.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libnss_nis.so.2 Reading symbols from /lib/tls/i686/cmov/libnss_files.so.2...done. 
Loaded symbols for /lib/tls/i686/cmov/libnss_files.so.2 0xb7f5b410 in __kernel_vsyscall () (gdb) backtrace #0 0xb7f5b410 in __kernel_vsyscall () #1 0xb7456c90 in nanosleep () from /lib/tls/i686/cmov/libc.so.6 #2 0xb7456ac7 in sleep () from /lib/tls/i686/cmov/libc.so.6 #3 0x085cd836 in PetscSleep (s=10) at psleep.c:40 #4 0x0860976f in PetscAttachDebugger () at adebug.c:412 #5 0x085b87ee in PetscOptionsCheckInitial_Private () at init.c:378 #6 0x085bd223 in PetscInitialize (argc=0xbfacffe0, args=0xbfacffe4, file=0x0, help=0x89cbac0 "output the matrix A, rhs b.\n exact solution x : check \n", ' ' , "\n\n") at pinit.c:573 #7 0x0804bf43 in main (argc=9, args=0x3a0001c6) at rpisolve.c:71 I have no idea how to debug this. Yan On Mon, Jul 20, 2009 at 5:47 AM, Jed Brown wrote: > Ryan Yan wrote: > > Only debugging one processes is a way to go for my current bug(s). > > There is another option for debugging multiple processes using screen > instead of X. For the sake of cleanliness, start a new terminal and > open a special screen session for debugging > > $ screen -S sdebug > > Now in your original terminal, run the program like > > $ mpirun -n 4 ./myapp -start_in_debugger -debug_terminal "screen -S sdebug > -X screen" > > (the quotes are important) > > This opens four new windows within the screen session named "sdebug". > > > Recent versions of MPICH2 have a -gdb option which is a lightweight skin > for gdb that sends commands to all processes and collates the results, > but also allows you to drop to per-process debugging. I had trouble > with it prior to release 1.1. > > > Jed > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Mon Jul 20 14:17:46 2009 From: jed at 59A2.org (Jed Brown) Date: Mon, 20 Jul 2009 21:17:46 +0200 Subject: PETSC debugger In-Reply-To: References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> <4A643D37.10708@59A2.org> Message-ID: <4A64C2DA.6020807@59A2.org> Ryan Yan wrote: > My X server set up may be messed up somehow. Same error as before. Using screen bypasses X, so that seems an unlikely candidate. > After I creat a new session using, > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > screen -S sdebug > > Still errors. No errors at this point though, right? > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -n 4 ./rpisolve -start_in_debugger -debug_terminal "screen -S sdebug > -X screen" > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9758 on > vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9755 on > vyan2000-linux > [3]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9757 on > vyan2000-linux > [2]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9756 on > vyan2000-linux > > The number of the screen for gdb is not equal to the number of > processes.(Mostly, less than), as you can see in the following "bottom > margin" of my screen session. Are there *ever* more sessions than the number of processes? Are there ever the same number? Is there any consistency to which process is missing? The output above indicates that the debugger is being run and, from the perspective of PetscAttachDebugger, the operation was successful. I have seen this behavior (one missing debug session) on a couple occasions, but it wasn't reproducible so I couldn't debug it. 
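If a window does go missing, one low-tech alternative to the xterm machinery is to pause each rank yourself right after PetscInitialize() and attach gdb by hand to the printed pid. A minimal sketch, assuming a PETSc 3.0-era build; the helper name WaitForDebugger and the holdme flag are made up for illustration and are not part of PETSc:

#include <unistd.h>
#include "petsc.h"

/* Each rank prints its pid, then spins until someone attaches gdb and
   clears the flag.  This is not PETSc's -start_in_debugger code path,
   just a hand-rolled fallback. */
static PetscErrorCode WaitForDebugger(void)
{
  PetscMPIInt  rank;
  volatile int holdme = 1;   /* in gdb: set var holdme = 0, then continue */

  MPI_Comm_rank(PETSC_COMM_WORLD,&rank);
  PetscPrintf(PETSC_COMM_SELF,"[%d] pid %d waiting; attach with: gdb ./rpisolve %d\n",
              rank,(int)getpid(),(int)getpid());
  while (holdme) sleep(1);
  return 0;
}

Call it once right after PetscInitialize(); from another terminal run gdb on the reported pid, set holdme to 0, and continue that rank.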
If this continues to be a problem, I recommend attaching the debugger yourself (put "set breakpoint pending on" and "break PetscError" in your .gdbinit or run with -on_error_abort). Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From vyan2000 at gmail.com Mon Jul 20 14:46:36 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 20 Jul 2009 15:46:36 -0400 Subject: PETSC debugger In-Reply-To: <4A64C2DA.6020807@59A2.org> References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> <4A643D37.10708@59A2.org> <4A64C2DA.6020807@59A2.org> Message-ID: Thanks, Jed. Please see reply below. On Mon, Jul 20, 2009 at 3:17 PM, Jed Brown wrote: > Ryan Yan wrote: > > My X server set up may be messed up somehow. Same error as before. > > Using screen bypasses X, so that seems an unlikely candidate. > > > After I creat a new session using, > > vyan2000 at vyan2000-linux > :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > > screen -S sdebug > > > > Still errors. > > No errors at this point though, right? > Yes, you are right. There is no error at this point. > > > vyan2000 at vyan2000-linux > :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > > mpirun -n 4 ./rpisolve -start_in_debugger -debug_terminal "screen -S > sdebug > > -X screen" > > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9758 on > > vyan2000-linux > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9755 on > > vyan2000-linux > > [3]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9757 on > > vyan2000-linux > > [2]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9756 on > > vyan2000-linux > > > > The number of the screen for gdb is not equal to the number of > > processes.(Mostly, less than), as you can see in the following "bottom > > margin" of my screen session. > > Are there *ever* more sessions than the number of processes? No. > Are there ever the same number? Yes, some time. But the error information *is* always there. > Is there any consistency to which process is > missing? I do not how to check which process is missing... Sorry, I can not answer this question. > The output above indicates that the debugger is being run and, > from the perspective of PetscAttachDebugger, the operation was > successful. I agree with you. > I have seen this behavior (one missing debug session) on a > couple occasions, but it wasn't reproducible so I couldn't debug it. I am using Ubuntu Gnome, maybe the error on this specific linux distribution is reproducible (It may takes you a while). One of my colleague also have the same error, she is using a same linux distribution as mine. vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ uname -a Linux vyan2000-linux 2.6.24-24-generic #1 SMP Tue Jul 7 19:46:39 UTC 2009 i686 GNU/Linux vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 8.04.3 LTS Release: 8.04 Codename: hardy > If this continues to be a problem, I recommend attaching the debugger > yourself (put "set breakpoint pending on" and "break PetscError" in your > .gdbinit or run with -on_error_abort). > I will try it. But I do not have any experience of attaching the debugger to something else. Any pointer to a *reference*? 
Yan > > Jed > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.tabak at tudelft.nl Tue Jul 21 15:16:13 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Tue, 21 Jul 2009 22:16:13 +0200 Subject: Petsc And Slepc, singular system Message-ID: <4A66220D.8010401@tudelft.nl> Dear all, As a fresh user of Petsc libraries, should thank the developers for such a magnificent endeavor and years of work. So the question directly related to Petsc is that if I have a singular system matrix and try to solve for the unknowns(simple enough 3 by 3) (I am using the simple linear system example from the Petsc user manual as a template where a preconditioner is used, I guess it is Jacobi.), I do not get any warnings for zero pivots in LU decomposition which I could not understand why, and the results are on the order of e+16, also the norm of the error. But why is not there some kind of warning. The second part of the question is related to Slepc, this might not find direct answers here perhaps, but let me give it a try. I have a generalized eigenvalue problem, it is a vibration related problem so I will use K and M instead of A and B, respectively. On my problem, K is singular, and if I use slepc to find the solution, petsc warns me about the zero pivot emergence, and breaks down naturally, there after I apply some shift operations that are already implemented in slepc to overcome the problem. The question is what is the effect of preconditioner on a singular matrix for the linear system explained above, somehow, I was thinking in any case that should also warn me but it did not and gave some wrong results. I am a bit weak on the preconditioners, maybe should have done some reading but I know that singular systems can also have solutions by some order tricks, pseudo inverse, temporary links application solutions with respect to rigid body modes(from structural mechanics too specific maybe). Can Petsc handle singular systems as well? I am a bit confused at this point. Best regards, Umut From knepley at gmail.com Tue Jul 21 15:56:11 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Jul 2009 15:56:11 -0500 Subject: Petsc And Slepc, singular system In-Reply-To: <4A66220D.8010401@tudelft.nl> References: <4A66220D.8010401@tudelft.nl> Message-ID: On Tue, Jul 21, 2009 at 3:16 PM, Umut Tabak wrote: > Dear all, > > As a fresh user of Petsc libraries, should thank the developers for such a > magnificent endeavor and years of work. > > So the question directly related to Petsc is that if I have a singular > system matrix and try to solve for the unknowns(simple enough 3 by 3) (I am > using the simple linear system example from the Petsc user manual as a > template where a preconditioner is used, I guess it is Jacobi.), I do not > get any warnings for zero pivots in LU decomposition which I could not > understand why, and the results are on the order of e+16, also the norm of > the error. But why is not there some kind of warning. If your system is badly scaled, roundoff errors could result in a pivot larger than our tolerance. It is also possible that your preconditioner resulted in a badly scaled system. Matt > > The second part of the question is related to Slepc, this might not find > direct answers here perhaps, but let me give it a try. > > I have a generalized eigenvalue problem, it is a vibration related problem > so I will use K and M instead of A and B, respectively. 
On my problem, K is > singular, and if I use slepc to find the solution, petsc warns me about the > zero pivot emergence, and breaks down naturally, there after I apply some > shift operations that are already implemented in slepc to overcome the > problem. > > The question is what is the effect of preconditioner on a singular matrix > for the linear system explained above, somehow, I was thinking in any case > that should also warn me but it did not and gave some wrong results. > > I am a bit weak on the preconditioners, maybe should have done some reading > but I know that singular systems can also have solutions by some order > tricks, pseudo inverse, temporary links application solutions with respect > to rigid body modes(from structural mechanics too specific maybe). > > Can Petsc handle singular systems as well? I am a bit confused at this > point. > > Best regards, > > Umut > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.tabak at tudelft.nl Tue Jul 21 16:09:54 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Tue, 21 Jul 2009 23:09:54 +0200 Subject: Petsc And Slepc, singular system In-Reply-To: References: <4A66220D.8010401@tudelft.nl> Message-ID: <20090721210954.GA31488@dutw689> On Tue, Jul 21, 2009 at 03:56:11PM -0500, Matthew Knepley wrote: > If your system is badly scaled, roundoff errors could result in a pivot > larger than our tolerance.
It is also possible that your preconditioner > resulted in a badly scaled system. > > Matt > Hi, Thanks for the fast reply. since the system is singular condition number will be bad any way, but I was wondering if there were already ways to overcome this problem. And are there ways to prevent the preconditioner to give a badly scaled system, what I mean is that the system is singular so will preconditioning improve that? I definitely read about these. Thanks and best, Umut -- Quote: You can not teach a man anything, you can only help him find it within himself. Source: Galileo Galilei From knepley at gmail.com Tue Jul 21 16:24:30 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Jul 2009 16:24:30 -0500 Subject: Petsc And Slepc, singular system In-Reply-To: <20090721210954.GA31488@dutw689> References: <4A66220D.8010401@tudelft.nl> <20090721210954.GA31488@dutw689> Message-ID: On Tue, Jul 21, 2009 at 4:09 PM, Umut Tabak wrote: > On Tue, Jul 21, 2009 at 03:56:11PM -0500, Matthew Knepley wrote: > > If your system is badly scaled, roundoff errors could result in a pivot > > larger than our tolerance. It is also possible that your preconditioner > > resulted in a badly scaled system. > > > > Matt > > > Hi, > Thanks for the fast reply. > since the system is singular condition number will be bad any way, > but I was wondering if there were already ways to overcome this > problem. And are there ways to prevent the preconditioner to give a > badly scaled system, what I mean is that the system is singular so > will preconditioning improve that? I definitely read about these. > Well, the default is ILU which can do horrendously things. Try running with Jacobi as a test. It should fail like you want. In my opinion, blackbox PCs almost never work, and what you really need is something tailored to your problem (which is linear algebra heresy). Matt > > Thanks and best, > Umut > > -- > Quote: You can not teach a man anything, you can only help him find it > within himself. > Source: Galileo Galilei > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 21 16:29:23 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 21 Jul 2009 16:29:23 -0500 Subject: Petsc And Slepc, singular system In-Reply-To: References: <4A66220D.8010401@tudelft.nl> <20090721210954.GA31488@dutw689> Message-ID: See the manual page for MatNullSpaceCreate() and KSPSetNullSpace() Barry On Jul 21, 2009, at 4:24 PM, Matthew Knepley wrote: > On Tue, Jul 21, 2009 at 4:09 PM, Umut Tabak > wrote: > On Tue, Jul 21, 2009 at 03:56:11PM -0500, Matthew Knepley wrote: > > If your system is badly scaled, roundoff errors could result in a > pivot > > larger than our tolerance. It is also possible that your > preconditioner > > resulted in a badly scaled system. > > > > Matt > > > Hi, > Thanks for the fast reply. > since the system is singular condition number will be bad any way, > but I was wondering if there were already ways to overcome this > problem. And are there ways to prevent the preconditioner to give a > badly scaled system, what I mean is that the system is singular so > will preconditioning improve that? I definitely read about these. > > Well, the default is ILU which can do horrendously things. Try > running with > Jacobi as a test. It should fail like you want. 
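For the linear-solve side of this, a minimal sketch of what Barry's MatNullSpaceCreate()/KSPSetNullSpace() pointer above looks like in code, assuming 'ksp' is your own solver object and the null space of the singular K is just the constant vector (for rigid-body modes you would pass the mode vectors instead); PETSc 3.0-era calling sequence, CHKERRQ error checking left out to keep it short:

MatNullSpace nullsp;
/* PETSC_TRUE: the constant vector spans the null space.  For rigid-body
   modes, pass PETSC_FALSE, the number of modes, and an array of Vecs. */
MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL,&nullsp);
KSPSetNullSpace(ksp,nullsp);   /* the Krylov solver projects the null space out */
/* ... KSPSolve(ksp,b,x) ... */
MatNullSpaceDestroy(nullsp);   /* 3.0-era signature, no address-of */

On the SLEPc side this is what EPSAttachDeflationSpace automates, as Jose notes further down the thread.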
In my opinion, > blackbox PCs > almost never work, and what you really need is something tailored to > your > problem (which is linear algebra heresy). > > Matt > > > Thanks and best, > Umut > > -- > Quote: You can not teach a man anything, you can only help him > find it within himself. > Source: Galileo Galilei > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From jroman at dsic.upv.es Tue Jul 21 17:01:43 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 22 Jul 2009 00:01:43 +0200 Subject: Petsc And Slepc, singular system In-Reply-To: References: <4A66220D.8010401@tudelft.nl> <20090721210954.GA31488@dutw689> Message-ID: <01BBF9CD-0165-473D-B912-24A5F63AB04D@dsic.upv.es> If your eigenproblem is K x = lambda M x and you are solving it as K^-1 M x = 1/lambda x (with shift-and-invert with shift=0 in SLEPc), then if a basis of the nullspace of K is known (rigid-body modes) you can use EPSAttachDeflationSpace. That will make SLEPc automatically do KSPSetNullSpace in the underlying KSP object. Jose On 21/07/2009, Barry Smith wrote: > > See the manual page for MatNullSpaceCreate() and KSPSetNullSpace() > > Barry > > On Jul 21, 2009, at 4:24 PM, Matthew Knepley wrote: > >> On Tue, Jul 21, 2009 at 4:09 PM, Umut Tabak >> wrote: >> On Tue, Jul 21, 2009 at 03:56:11PM -0500, Matthew Knepley wrote: >> > If your system is badly scaled, roundoff errors could result in a >> pivot >> > larger than our tolerance. It is also possible that your >> preconditioner >> > resulted in a badly scaled system. >> > >> > Matt >> > >> Hi, >> Thanks for the fast reply. >> since the system is singular condition number will be bad any way, >> but I was wondering if there were already ways to overcome this >> problem. And are there ways to prevent the preconditioner to give a >> badly scaled system, what I mean is that the system is singular so >> will preconditioning improve that? I definitely read about these. >> >> Well, the default is ILU which can do horrendously things. Try >> running with >> Jacobi as a test. It should fail like you want. In my opinion, >> blackbox PCs >> almost never work, and what you really need is something tailored >> to your >> problem (which is linear algebra heresy). >> >> Matt >> >> >> Thanks and best, >> Umut >> From Andrew.Barker at Colorado.EDU Tue Jul 21 17:44:32 2009 From: Andrew.Barker at Colorado.EDU (Andrew T Barker) Date: Tue, 21 Jul 2009 16:44:32 -0600 (MDT) Subject: set ASM subdomains in sub KSP Message-ID: <20090721164432.AKA01094@batman.int.colorado.edu> I want to do multiple sweeps of ASM to precondition GMRES. So I do something like -ksp_type gmres -pc_type ksp -ksp_ksp_type richardson -ksp_pc_type asm which works fine. Now I want to set my own subdomains with PCASMSetLocalSubdomains(), which isn't accessible from the command line. I can use PCKSPGetKSP() to get the KSP (and then the PC) to use in PCASMSetLocalSubdomains(), but I have to call KSPSetUp() on the parent KSP in order to use PCKSPGetKSP(). And then I'm not allowed to call PCASMSetLocalSubdomains() if KSPSetUp() has already been called - object is in wrong state. I'm at a loss for how to solve this problem - any help would be appreciated. Andrew --- Andrew T. 
Barker andrew.barker at colorado.edu Department of Applied Mathematics University of Colorado, Boulder 526 UCB, Boulder, CO 80309-0526 From knepley at gmail.com Tue Jul 21 17:54:27 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Jul 2009 17:54:27 -0500 Subject: set ASM subdomains in sub KSP In-Reply-To: <20090721164432.AKA01094@batman.int.colorado.edu> References: <20090721164432.AKA01094@batman.int.colorado.edu> Message-ID: On Tue, Jul 21, 2009 at 5:44 PM, Andrew T Barker wrote: > I want to do multiple sweeps of ASM to precondition GMRES. So I do > something like > > -ksp_type gmres -pc_type ksp -ksp_ksp_type richardson -ksp_pc_type asm > > which works fine. Now I want to set my own subdomains with > PCASMSetLocalSubdomains(), which isn't accessible from the command line. > > I can use PCKSPGetKSP() to get the KSP (and then the PC) to use in > PCASMSetLocalSubdomains(), but I have to call KSPSetUp() on the parent KSP > in order to use PCKSPGetKSP(). And then I'm not allowed to call > PCASMSetLocalSubdomains() if KSPSetUp() has already been called - object is > in wrong state. > > I'm at a loss for how to solve this problem - any help would be > appreciated. I will think about it, but the easiest way for now is just to create the inner PC yourself:

PCCreate(&inner_pc)
PCSetType()
PCSetFromOptions()
PCASMSetLocalSubdomains()
KSPSetPC(inner_ksp, inner_pc)

Matt

> Andrew > --- > Andrew T. Barker > andrew.barker at colorado.edu > Department of Applied Mathematics > University of Colorado, Boulder > 526 UCB, Boulder, CO 80309-0526 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From michel.cancelliere at polito.it Wed Jul 22 06:28:15 2009 From: michel.cancelliere at polito.it (Michel Cancelliere) Date: Wed, 22 Jul 2009 13:28:15 +0200 Subject: SNES Convergence test Message-ID: <7f18de3b0907220428u6b2690bay9efd09181d7bbada@mail.gmail.com> Hi there, I am having problems with the SNES: it seems that changing atol or rtol has no effect on the number of Newton iterations. Do you think it could be a problem with the settings of the SNES solver, or maybe a problem in the routines for the Jacobian matrix evaluation? The linear solver does converge at each nonlinear iteration.
*My Code* ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr); /********************************************************/ /* Creation of the matrix and vector data structures*/ /********************************************************/ ierr = VecCreate(PETSC_COMM_WORLD,&x); CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,2*input.grid.N); CHKERRQ(ierr); ierr = VecSetFromOptions(x); CHKERRQ(ierr); ierr = VecDuplicate(x,&R);CHKERRQ(ierr); ierr = MatCreate(PETSC_COMM_WORLD,&J);CHKERRQ(ierr); ierr = MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,2*input.grid.N,2*input.grid.N);CHKERRQ(ierr); ierr = MatSetFromOptions(J);CHKERRQ(ierr); ///Set function evaluation routine and vector // Assign global variable which is used in the static wrapper function pt2Object = (void*) &system; pt2Object2 = (void*) &system; ierr = SNESSetFunction(snes,R,CSystem::Wrapper_to_FormFunction,&input);CHKERRQ(ierr); // Set Jacobian matrix structure and Jacobian evaluation routine ierr = SNESSetJacobian(snes,J,J,CSystem::Wrapper_to_FormJacobian,&input);CHKERRQ(ierr); /** Customizr non linear solver; set runtime options ***/ /* Set linear solver defaults for this problem. By extracting the KSP,and PC contexts from the SNES context, we can then directly call any KSP and PC routines to set various options*/ ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); ierr = PCSetType(pc,PCILU);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-10,PETSC_DEFAULT,PETSC_DEFAULT,30);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* Set SNES/KSP/PC rountime options, e.g., -snes_view -snes_monitor -ksp_type -pc_type These options will override thos specified above as lon as SNESSetFromOptoons is called _after_ any other customization routines. */ ierr =SNESSetTolerances(snes,1e-100,1e-8,1e-1000,100,1000);CHKERRQ(ierr); * No matter what I put here I am getting the same results (Residual Norm and number of iterations)* ierr = SNESSetFromOptions(snes);CHKERRQ(ierr); /*--------------------------------------------------------------- Evaluate initial guess; then solve nonlinear system -----------------------------------------------------------------*/ ierr = CSystem::Cells2Vec(x,input);CHKERRQ(ierr); for (int i=0;i 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 4.6669e-001 100.0% 1.4174e+004 100.0% 0.000e+000 0.0% 0.000e+000 0.0% 0.000e+000 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run config/configure.py # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage SNESSolve 1 1.0 2.7082e-001 1.0 1.42e+004 1.0 0.0e+000 0.0e+000 0.0e+000 58100 0 0 0 58100 0 0 0 0 SNESLineSearch 2 1.0 2.9807e-002 1.0 4.94e+003 1.0 0.0e+000 0.0e+000 0.0e+000 6 35 0 0 0 6 35 0 0 0 0 SNESFunctionEval 13 1.0 3.2095e-002 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 7 0 0 0 0 7 0 0 0 0 0 SNESJacobianEval 2 1.0 2.5123e-003 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 1 0 0 0 0 1 0 0 0 0 0 VecView 2 1.0 2.3364e-001 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 50 0 0 0 0 50 0 0 0 0 0 VecDot 2 1.0 1.2292e-005 1.0 2.54e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 21 VecMDot 2 1.0 1.0337e-005 1.0 2.54e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 VecNorm 20 1.0 1.1873e-004 1.0 2.54e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 18 0 0 0 0 18 0 0 0 21 VecScale 4 1.0 1.7321e-005 1.0 2.56e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 VecCopy 6 1.0 2.6819e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecSet 5 1.0 1.9835e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 2 1.0 1.0337e-005 1.0 2.56e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 VecWAXPY 12 1.0 5.2521e-005 1.0 1.41e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 10 0 0 0 0 10 0 0 0 27 VecMAXPY 4 1.0 1.7321e-005 1.0 5.12e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 30 VecAssemblyBegin 13 1.0 3.4641e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 13 1.0 3.1848e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 2 1.0 1.7321e-005 1.0 2.54e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 VecReduceComm 1 1.0 3.0730e-006 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 4 1.0 8.4368e-005 1.0 7.64e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 9 MatMult 4 1.0 4.4978e-005 1.0 2.75e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 61 MatMultTranspose 1 1.0 2.3467e-005 1.0 7.52e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 32 MatSolve 4 1.0 5.6711e-005 1.0 2.75e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 49 MatLUFactorNum 2 1.0 7.7943e-005 1.0 2.06e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 26 MatILUFactorSym 2 1.0 1.9975e-004 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 6.7048e-006 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 4.1346e-005 1.0 0.00e+000 0.0 0.0e+000 
0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 2 1.0 1.0895e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 2 1.0 2.0254e-004 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatView 2 1.0 5.7549e-004 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 2 1.0 5.6152e-005 1.0 5.10e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 9 KSPSetup 2 1.0 1.5170e-004 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 2 1.0 1.8486e-003 1.0 7.97e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 56 0 0 0 0 56 0 0 0 4 PCSetUp 2 1.0 1.2032e-003 1.0 2.06e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 2 PCApply 4 1.0 8.7441e-005 1.0 2.75e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 31 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage SNES 1 1 668 0 Vec 11 11 13596 0 Matrix 3 3 7820 0 Krylov Solver 1 1 17392 0 Preconditioner 1 1 500 0 Viewer 2 2 680 0 Draw 1 1 444 0 Axis 1 1 308 0 Line Graph 1 1 1908 0 Index Set 6 6 3360 0 ======================================================================================================================== Average time to get PetscTime(): 2.03937e-006 #PETSc Option Table entries: -log_summary -mat_type baij -snes_max_it 20 -snes_monitor_residual -snes_view #End o PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8 Configure run at: Mon Jul 6 16:28:41 2009 Configure options: --with-cc="win32fe cl --nodetect" --download-c-blas-lapack=1 --with-fc=0 --with-mpi=0 --useThreads=0 --with-shared=0 ----------------------------------------- Libraries compiled on Mon Jul 6 16:39:57 2009 on Idrocp03 Machine characteristics: CYGWIN_NT-5.1 Idrocp03 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin Using PETSc directory: /home/Administrator/petsc Using PETSc arch: cygwin-c-debug ----------------------------------------- Using C compiler: /home/Administrator/petsc/bin/win32fe/win32fe cl --nodetect -MT -wd4996 -Z7 Using Fortran compiler: ----------------------------------------- Using include paths: -I/home/Administrator/petsc/cygwin-c-debug/include -I/home/Administrator/petsc/include -I/home/Administrator/petsc/include/mpiuni ------------------------------------------ Using C linker: /home/Administrator/petsc/bin/win32fe/win32fe cl --nodetect -MT -wd4996 -Z7 Using Fortran linker: Using libraries: -L/home/Administrator/petsc/cygwin-c-debug/lib -L/home/Administrator/petsc/cygwin-c-debug/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -L/home/Administrator/petsc/cygwin-c-debug/lib -lf2clapack -lf2cblas -lmpiuni Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib ------------------------------------------ Thank you in advance for your help, Michel Cancelliere Politecnico di Torino -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Wed Jul 22 08:26:39 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 22 Jul 2009 08:26:39 -0500 Subject: SNES Convergence test In-Reply-To: <7f18de3b0907220428u6b2690bay9efd09181d7bbada@mail.gmail.com> References: <7f18de3b0907220428u6b2690bay9efd09181d7bbada@mail.gmail.com> Message-ID: After SNESSolve() you should call SNESGetConvergedReason() and see if the value is negative. Use -snes_monitor and -snes_converged_reason to see why SNES is ending. Barry On Jul 22, 2009, at 6:28 AM, Michel Cancelliere wrote: > Hi there, > > I am having problems with the SNES, it seems that changing atol or > rtol have no effect on the numbers of Newton Iterations. Do you > think that it can be a problem of settings in snes solver or maybe a > problem on the routines for Jacobian matrix evaluation? The linear > solver do converge at each nonlinear iterations. > > My Code > > ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr); > > /********************************************************/ > /* Creation of the matrix and vector data structures*/ > /********************************************************/ > ierr = VecCreate(PETSC_COMM_WORLD,&x); CHKERRQ(ierr); > ierr = VecSetSizes(x,PETSC_DECIDE,2*input.grid.N); CHKERRQ(ierr); > ierr = VecSetFromOptions(x); CHKERRQ(ierr); > ierr = VecDuplicate(x,&R);CHKERRQ(ierr); > > ierr = MatCreate(PETSC_COMM_WORLD,&J);CHKERRQ(ierr); > ierr = MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,2*input.grid.N, > 2*input.grid.N);CHKERRQ(ierr); > ierr = MatSetFromOptions(J);CHKERRQ(ierr); > > > ///Set function evaluation routine and vector > > // Assign global variable which is used in the static wrapper > function > pt2Object = (void*) &system; > pt2Object2 = (void*) &system; > ierr = > SNESSetFunction > (snes,R,CSystem::Wrapper_to_FormFunction,&input);CHKERRQ(ierr); > > // Set Jacobian matrix structure and Jacobian evaluation routine > ierr = > SNESSetJacobian > (snes,J,J,CSystem::Wrapper_to_FormJacobian,&input);CHKERRQ(ierr); > > /** Customizr non linear solver; set runtime options ***/ > /* Set linear solver defaults for this problem. By extracting the > KSP,and PC contexts from > the SNES context, we can then directly call any KSP and PC routines > to set various options*/ > > ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > ierr = PCSetType(pc,PCILU);CHKERRQ(ierr); > ierr = KSPSetTolerances(ksp,1.e-10,PETSC_DEFAULT,PETSC_DEFAULT, > 30);CHKERRQ(ierr); > ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); > > > > /* > Set SNES/KSP/PC rountime options, e.g., > -snes_view -snes_monitor -ksp_type -pc_type > These options will override thos specified above as lon as > SNESSetFromOptoons is called _after_ any other customization routines. 
> */ > > ierr =SNESSetTolerances(snes, > 1e-100,1e-8,1e-1000,100,1000);CHKERRQ(ierr); No matter what I put > here I am getting the same results (Residual Norm and number of > iterations) > ierr = SNESSetFromOptions(snes);CHKERRQ(ierr); > > /*--------------------------------------------------------------- > Evaluate initial guess; then solve nonlinear system > -----------------------------------------------------------------*/ > > > ierr = CSystem::Cells2Vec(x,input);CHKERRQ(ierr); > > > for (int i=0;i +) > system.delta_t = system.delta_t_V[i]; > system.t = system.t_V[i]; > input.t = system.t; > input.delta_t = system.delta_t; > /*\\\\\\\\\\\\\\Boundary conditions\\\\\\\\\\\\\\\\*/ > Rate_tot = input.F_boundary(input.t,input.delta_t); > input.bc.F(input.interfaces,Rate_tot); > > ierr = SNESSolve(snes,PETSC_NULL,x);CHKERRQ(ierr); > gauge.push_back(input.cells[0].water.p); > } > > > > > > > > ierr = VecDestroy(x);CHKERRQ(ierr); > ierr = VecDestroy(R);CHKERRQ(ierr); > ierr = MatDestroy(J);CHKERRQ(ierr); > ierr = SNESDestroy(snes);CHKERRQ(ierr); > ierr = PetscFinalize();CHKERRQ(ierr); > return 0; > } > > > Output: > SNES Object: > type: ls > line search variant: SNESLineSearchCubic > alpha=0.0001, maxstep=1e+008, minlambda=1e-012 > maximum iterations=20, maximum function evaluations=1000 > tolerances: relative=1e-008, absolute=1e-100, solution=0 > total number of linear solver iterations=2 > total number of function evaluations=13 > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-030 > maximum iterations=30, initial guess is zero > tolerances: relative=1e-010, absolute=1e-050, divergence=10000 > left preconditioning > PC Object: > type: ilu > ILU: 0 levels of fill > ILU: factor fill ratio allocated 1 > ILU: tolerance for zero pivot 1e-012 > ILU: using diagonal shift to prevent zero pivot > ILU: using diagonal shift on blocks to prevent zero pivot > out-of-place factorization > matrix ordering: natural > ILU: factor fill ratio needed 0 > Factored matrix follows > Matrix Object: > type=seqbaij, rows=64, cols=64 > package used to perform factorization: petsc > total: nonzeros=376, allocated nonzeros=920 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqbaij, rows=64, cols=64 > total: nonzeros=376, allocated nonzeros=920 > block size is 1 > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript - > r -fCourier9' to print this document *** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > > ex2.exe on a cygwin-c- named IDROCP03 with 1 processor, by > Administrator Wed Jul 22 13:25:47 2009 > Using Petsc Release Version 3.0.0, Patch 6, Fri Jun 5 13:31:12 CDT > 2009 > > Max Max/Min Avg Total > Time (sec): 4.667e-001 1.00000 4.667e-001 > Objects: 2.800e+001 1.00000 2.800e+001 > Flops: 1.417e+004 1.00000 1.417e+004 1.417e+004 > Flops/sec: 3.037e+004 1.00000 3.037e+004 3.037e+004 > Memory: 9.325e+004 1.00000 9.325e+004 > MPI Messages: 0.000e+000 0.00000 0.000e+000 0.000e+000 > MPI Message Lengths: 0.000e+000 0.00000 0.000e+000 0.000e+000 > MPI Reductions: 0.000e+000 0.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of > length N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 4.6669e-001 100.0% 1.4174e+004 100.0% 0.000e > +000 0.0% 0.000e+000 0.0% 0.000e+000 0.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all > processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with > PetscLogStagePush() and PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in > this phase > %M - percent messages in this phase %L - percent message > lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max > time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option, # > # To get timing results run config/configure.py # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. 
# > # # > ########################################################## > > > Event Count Time (sec) > Flops --- Global --- --- Stage --- > Total > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > SNESSolve 1 1.0 2.7082e-001 1.0 1.42e+004 1.0 0.0e+000 > 0.0e+000 0.0e+000 58100 0 0 0 58100 0 0 0 0 > SNESLineSearch 2 1.0 2.9807e-002 1.0 4.94e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 6 35 0 0 0 6 35 0 0 0 0 > SNESFunctionEval 13 1.0 3.2095e-002 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 7 0 0 0 0 7 0 0 0 0 0 > SNESJacobianEval 2 1.0 2.5123e-003 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 1 0 0 0 0 1 0 0 0 0 0 > VecView 2 1.0 2.3364e-001 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 50 0 0 0 0 50 0 0 0 0 0 > VecDot 2 1.0 1.2292e-005 1.0 2.54e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 21 > VecMDot 2 1.0 1.0337e-005 1.0 2.54e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 > VecNorm 20 1.0 1.1873e-004 1.0 2.54e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 18 0 0 0 0 18 0 0 0 21 > VecScale 4 1.0 1.7321e-005 1.0 2.56e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 > VecCopy 6 1.0 2.6819e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecSet 5 1.0 1.9835e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 2 1.0 1.0337e-005 1.0 2.56e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 > VecWAXPY 12 1.0 5.2521e-005 1.0 1.41e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 10 0 0 0 0 10 0 0 0 27 > VecMAXPY 4 1.0 1.7321e-005 1.0 5.12e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 30 > VecAssemblyBegin 13 1.0 3.4641e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecAssemblyEnd 13 1.0 3.1848e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecReduceArith 2 1.0 1.7321e-005 1.0 2.54e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 > VecReduceComm 1 1.0 3.0730e-006 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 4 1.0 8.4368e-005 1.0 7.64e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 9 > MatMult 4 1.0 4.4978e-005 1.0 2.75e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 61 > MatMultTranspose 1 1.0 2.3467e-005 1.0 7.52e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 32 > MatSolve 4 1.0 5.6711e-005 1.0 2.75e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 49 > MatLUFactorNum 2 1.0 7.7943e-005 1.0 2.06e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 26 > MatILUFactorSym 2 1.0 1.9975e-004 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 2 1.0 6.7048e-006 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 2 1.0 4.1346e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 2 1.0 1.0895e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 2 1.0 2.0254e-004 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatView 2 1.0 5.7549e-004 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 2 1.0 5.6152e-005 1.0 5.10e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 9 > KSPSetup 2 1.0 1.5170e-004 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 
0 0 > KSPSolve 2 1.0 1.8486e-003 1.0 7.97e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 56 0 0 0 0 56 0 0 0 4 > PCSetUp 2 1.0 1.2032e-003 1.0 2.06e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 2 > PCApply 4 1.0 8.7441e-005 1.0 2.75e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 31 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' > Mem. > > --- Event Stage 0: Main Stage > > SNES 1 1 668 0 > Vec 11 11 13596 0 > Matrix 3 3 7820 0 > Krylov Solver 1 1 17392 0 > Preconditioner 1 1 500 0 > Viewer 2 2 680 0 > Draw 1 1 444 0 > Axis 1 1 308 0 > Line Graph 1 1 1908 0 > Index Set 6 6 3360 0 > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > ====================================================================== > Average time to get PetscTime(): 2.03937e-006 > #PETSc Option Table entries: > -log_summary > -mat_type baij > -snes_max_it 20 > -snes_monitor_residual > -snes_view > #End o PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 > sizeof(PetscScalar) 8 > Configure run at: Mon Jul 6 16:28:41 2009 > Configure options: --with-cc="win32fe cl --nodetect" --download-c- > blas-lapack=1 --with-fc=0 --with-mpi=0 --useThreads=0 --with-shared=0 > ----------------------------------------- > Libraries compiled on Mon Jul 6 16:39:57 2009 on Idrocp03 > Machine characteristics: CYGWIN_NT-5.1 Idrocp03 1.5.25(0.156/4/2) > 2008-06-12 19:34 i686 Cygwin > Using PETSc directory: /home/Administrator/petsc > Using PETSc arch: cygwin-c-debug > ----------------------------------------- > Using C compiler: /home/Administrator/petsc/bin/win32fe/win32fe cl -- > nodetect -MT -wd4996 -Z7 > Using Fortran compiler: > ----------------------------------------- > Using include paths: -I/home/Administrator/petsc/cygwin-c-debug/ > include -I/home/Administrator/petsc/include -I/home/Administrator/ > petsc/include/mpiuni > ------------------------------------------ > Using C linker: /home/Administrator/petsc/bin/win32fe/win32fe cl -- > nodetect -MT -wd4996 -Z7 > Using Fortran linker: > Using libraries: -L/home/Administrator/petsc/cygwin-c-debug/lib -L/ > home/Administrator/petsc/cygwin-c-debug/lib -lpetscts -lpetscsnes - > lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -L/home/ > Administrator/petsc/cygwin-c-debug/lib -lf2clapack -lf2cblas - > lmpiuni Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib > ------------------------------------------ > > > > Thank you in advance for your help, > > Michel Cancelliere > Politecnico di Torino > From knepley at gmail.com Wed Jul 22 08:42:50 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 22 Jul 2009 08:42:50 -0500 Subject: SNES Convergence test In-Reply-To: References: <7f18de3b0907220428u6b2690bay9efd09181d7bbada@mail.gmail.com> Message-ID: On Wed, Jul 22, 2009 at 8:26 AM, Barry Smith wrote: > > After SNESSolve() you should call SNESGetConvergedReason() and see if the > value is negative. > > Use -snes_monitor and -snes_converged_reason to see why SNES is ending. Also, -snes_view will print the tolerances it actually used. 
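For instance, a minimal check right after the solve looks roughly like this (a sketch only; the snes/x names follow the code quoted below, and the printed message is illustrative):

  SNESConvergedReason reason;

  ierr = SNESSolve(snes,PETSC_NULL,x);CHKERRQ(ierr);
  ierr = SNESGetConvergedReason(snes,&reason);CHKERRQ(ierr);
  if (reason < 0) {
    /* negative reasons mean the nonlinear solve diverged
       (line search failure, maximum iterations reached, ...) */
    ierr = PetscPrintf(PETSC_COMM_WORLD,"SNES diverged, reason %d\n",(int)reason);CHKERRQ(ierr);
  }

Running with -snes_converged_reason prints the same information without any code change.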
Matt > > Barry > > > On Jul 22, 2009, at 6:28 AM, Michel Cancelliere wrote: > > Hi there, >> >> I am having problems with the SNES, it seems that changing atol or rtol >> have no effect on the numbers of Newton Iterations. Do you think that it can >> be a problem of settings in snes solver or maybe a problem on the routines >> for Jacobian matrix evaluation? The linear solver do converge at each >> nonlinear iterations. >> >> My Code >> >> ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr); >> >> /********************************************************/ >> /* Creation of the matrix and vector data structures*/ >> /********************************************************/ >> ierr = VecCreate(PETSC_COMM_WORLD,&x); CHKERRQ(ierr); >> ierr = VecSetSizes(x,PETSC_DECIDE,2*input.grid.N); CHKERRQ(ierr); >> ierr = VecSetFromOptions(x); CHKERRQ(ierr); >> ierr = VecDuplicate(x,&R);CHKERRQ(ierr); >> >> ierr = MatCreate(PETSC_COMM_WORLD,&J);CHKERRQ(ierr); >> ierr = >> MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,2*input.grid.N,2*input.grid.N);CHKERRQ(ierr); >> ierr = MatSetFromOptions(J);CHKERRQ(ierr); >> >> >> ///Set function evaluation routine and vector >> >> // Assign global variable which is used in the static wrapper function >> pt2Object = (void*) &system; >> pt2Object2 = (void*) &system; >> ierr = >> SNESSetFunction(snes,R,CSystem::Wrapper_to_FormFunction,&input);CHKERRQ(ierr); >> >> // Set Jacobian matrix structure and Jacobian evaluation routine >> ierr = >> SNESSetJacobian(snes,J,J,CSystem::Wrapper_to_FormJacobian,&input);CHKERRQ(ierr); >> >> /** Customizr non linear solver; set runtime options ***/ >> /* Set linear solver defaults for this problem. By extracting the KSP,and >> PC contexts from >> the SNES context, we can then directly call any KSP and PC routines to set >> various options*/ >> >> ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); >> ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); >> ierr = PCSetType(pc,PCILU);CHKERRQ(ierr); >> ierr = >> KSPSetTolerances(ksp,1.e-10,PETSC_DEFAULT,PETSC_DEFAULT,30);CHKERRQ(ierr); >> ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); >> >> >> >> /* >> Set SNES/KSP/PC rountime options, e.g., >> -snes_view -snes_monitor -ksp_type -pc_type >> These options will override thos specified above as lon as >> SNESSetFromOptoons is called _after_ any other customization routines. 
>> */ >> >> ierr =SNESSetTolerances(snes,1e-100,1e-8,1e-1000,100,1000);CHKERRQ(ierr); >> No matter what I put here I am getting the same results (Residual Norm and >> number of iterations) >> ierr = SNESSetFromOptions(snes);CHKERRQ(ierr); >> >> /*--------------------------------------------------------------- >> Evaluate initial guess; then solve nonlinear system >> -----------------------------------------------------------------*/ >> >> >> ierr = CSystem::Cells2Vec(x,input);CHKERRQ(ierr); >> >> >> for (int i=0;i> system.delta_t = system.delta_t_V[i]; >> system.t = system.t_V[i]; >> input.t = system.t; >> input.delta_t = system.delta_t; >> /*\\\\\\\\\\\\\\Boundary conditions\\\\\\\\\\\\\\\\*/ >> Rate_tot = input.F_boundary(input.t,input.delta_t); >> input.bc.F(input.interfaces,Rate_tot); >> >> ierr = SNESSolve(snes,PETSC_NULL,x);CHKERRQ(ierr); >> gauge.push_back(input.cells[0].water.p); >> } >> >> >> >> >> >> >> >> ierr = VecDestroy(x);CHKERRQ(ierr); >> ierr = VecDestroy(R);CHKERRQ(ierr); >> ierr = MatDestroy(J);CHKERRQ(ierr); >> ierr = SNESDestroy(snes);CHKERRQ(ierr); >> ierr = PetscFinalize();CHKERRQ(ierr); >> return 0; >> } >> >> >> Output: >> SNES Object: >> type: ls >> line search variant: SNESLineSearchCubic >> alpha=0.0001, maxstep=1e+008, minlambda=1e-012 >> maximum iterations=20, maximum function evaluations=1000 >> tolerances: relative=1e-008, absolute=1e-100, solution=0 >> total number of linear solver iterations=2 >> total number of function evaluations=13 >> KSP Object: >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-030 >> maximum iterations=30, initial guess is zero >> tolerances: relative=1e-010, absolute=1e-050, divergence=10000 >> left preconditioning >> PC Object: >> type: ilu >> ILU: 0 levels of fill >> ILU: factor fill ratio allocated 1 >> ILU: tolerance for zero pivot 1e-012 >> ILU: using diagonal shift to prevent zero pivot >> ILU: using diagonal shift on blocks to prevent zero pivot >> out-of-place factorization >> matrix ordering: natural >> ILU: factor fill ratio needed 0 >> Factored matrix follows >> Matrix Object: >> type=seqbaij, rows=64, cols=64 >> package used to perform factorization: petsc >> total: nonzeros=376, allocated nonzeros=920 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqbaij, rows=64, cols=64 >> total: nonzeros=376, allocated nonzeros=920 >> block size is 1 >> >> ************************************************************************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r >> -fCourier9' to print this document *** >> >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance Summary: >> ---------------------------------------------- >> >> ex2.exe on a cygwin-c- named IDROCP03 with 1 processor, by Administrator >> Wed Jul 22 13:25:47 2009 >> Using Petsc Release Version 3.0.0, Patch 6, Fri Jun 5 13:31:12 CDT 2009 >> >> Max Max/Min Avg Total >> Time (sec): 4.667e-001 1.00000 4.667e-001 >> Objects: 2.800e+001 1.00000 2.800e+001 >> Flops: 1.417e+004 1.00000 1.417e+004 1.417e+004 >> Flops/sec: 3.037e+004 1.00000 3.037e+004 3.037e+004 >> Memory: 9.325e+004 1.00000 9.325e+004 >> MPI Messages: 0.000e+000 0.00000 0.000e+000 0.000e+000 >> MPI Message Lengths: 0.000e+000 0.00000 0.000e+000 0.000e+000 >> MPI Reductions: 0.000e+000 0.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N >> --> 2N flops >> and VecAXPY() for complex vectors of length N >> --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages >> --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts %Total >> Avg %Total counts %Total >> 0: Main Stage: 4.6669e-001 100.0% 1.4174e+004 100.0% 0.000e+000 >> 0.0% 0.000e+000 0.0% 0.000e+000 0.0% >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> Avg. len: average message length >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and >> PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in this >> phase >> %M - percent messages in this phase %L - percent message lengths >> in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over >> all processors) >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was compiled with a debugging option, # >> # To get timing results run config/configure.py # >> # using --with-debugging=no, the performance will # >> # be generally two or three times faster. 
# >> # # >> ########################################################## >> >> >> Event Count Time (sec) Flops >> --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg len >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> SNESSolve 1 1.0 2.7082e-001 1.0 1.42e+004 1.0 0.0e+000 >> 0.0e+000 0.0e+000 58100 0 0 0 58100 0 0 0 0 >> SNESLineSearch 2 1.0 2.9807e-002 1.0 4.94e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 6 35 0 0 0 6 35 0 0 0 0 >> SNESFunctionEval 13 1.0 3.2095e-002 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 7 0 0 0 0 7 0 0 0 0 0 >> SNESJacobianEval 2 1.0 2.5123e-003 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 1 0 0 0 0 1 0 0 0 0 0 >> VecView 2 1.0 2.3364e-001 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 50 0 0 0 0 50 0 0 0 0 0 >> VecDot 2 1.0 1.2292e-005 1.0 2.54e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 21 >> VecMDot 2 1.0 1.0337e-005 1.0 2.54e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 >> VecNorm 20 1.0 1.1873e-004 1.0 2.54e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 18 0 0 0 0 18 0 0 0 21 >> VecScale 4 1.0 1.7321e-005 1.0 2.56e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 >> VecCopy 6 1.0 2.6819e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 5 1.0 1.9835e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 2 1.0 1.0337e-005 1.0 2.56e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 >> VecWAXPY 12 1.0 5.2521e-005 1.0 1.41e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 10 0 0 0 0 10 0 0 0 27 >> VecMAXPY 4 1.0 1.7321e-005 1.0 5.12e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 30 >> VecAssemblyBegin 13 1.0 3.4641e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecAssemblyEnd 13 1.0 3.1848e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecReduceArith 2 1.0 1.7321e-005 1.0 2.54e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 >> VecReduceComm 1 1.0 3.0730e-006 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecNormalize 4 1.0 8.4368e-005 1.0 7.64e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 9 >> MatMult 4 1.0 4.4978e-005 1.0 2.75e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 61 >> MatMultTranspose 1 1.0 2.3467e-005 1.0 7.52e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 32 >> MatSolve 4 1.0 5.6711e-005 1.0 2.75e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 49 >> MatLUFactorNum 2 1.0 7.7943e-005 1.0 2.06e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 26 >> MatILUFactorSym 2 1.0 1.9975e-004 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatAssemblyBegin 2 1.0 6.7048e-006 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatAssemblyEnd 2 1.0 4.1346e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatGetRowIJ 2 1.0 1.0895e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatGetOrdering 2 1.0 2.0254e-004 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatView 2 1.0 5.7549e-004 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> KSPGMRESOrthog 2 1.0 5.6152e-005 1.0 5.10e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 9 >> KSPSetup 2 1.0 
1.5170e-004 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 2 1.0 1.8486e-003 1.0 7.97e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 56 0 0 0 0 56 0 0 0 4 >> PCSetUp 2 1.0 1.2032e-003 1.0 2.06e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 2 >> PCApply 4 1.0 8.7441e-005 1.0 2.75e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 31 >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> >> --- Event Stage 0: Main Stage >> >> SNES 1 1 668 0 >> Vec 11 11 13596 0 >> Matrix 3 3 7820 0 >> Krylov Solver 1 1 17392 0 >> Preconditioner 1 1 500 0 >> Viewer 2 2 680 0 >> Draw 1 1 444 0 >> Axis 1 1 308 0 >> Line Graph 1 1 1908 0 >> Index Set 6 6 3360 0 >> >> ======================================================================================================================== >> Average time to get PetscTime(): 2.03937e-006 >> #PETSc Option Table entries: >> -log_summary >> -mat_type baij >> -snes_max_it 20 >> -snes_monitor_residual >> -snes_view >> #End o PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 >> sizeof(PetscScalar) 8 >> Configure run at: Mon Jul 6 16:28:41 2009 >> Configure options: --with-cc="win32fe cl --nodetect" >> --download-c-blas-lapack=1 --with-fc=0 --with-mpi=0 --useThreads=0 >> --with-shared=0 >> ----------------------------------------- >> Libraries compiled on Mon Jul 6 16:39:57 2009 on Idrocp03 >> Machine characteristics: CYGWIN_NT-5.1 Idrocp03 1.5.25(0.156/4/2) >> 2008-06-12 19:34 i686 Cygwin >> Using PETSc directory: /home/Administrator/petsc >> Using PETSc arch: cygwin-c-debug >> ----------------------------------------- >> Using C compiler: /home/Administrator/petsc/bin/win32fe/win32fe cl >> --nodetect -MT -wd4996 -Z7 >> Using Fortran compiler: >> ----------------------------------------- >> Using include paths: -I/home/Administrator/petsc/cygwin-c-debug/include >> -I/home/Administrator/petsc/include >> -I/home/Administrator/petsc/include/mpiuni >> ------------------------------------------ >> Using C linker: /home/Administrator/petsc/bin/win32fe/win32fe cl >> --nodetect -MT -wd4996 -Z7 >> Using Fortran linker: >> Using libraries: -L/home/Administrator/petsc/cygwin-c-debug/lib >> -L/home/Administrator/petsc/cygwin-c-debug/lib -lpetscts -lpetscsnes >> -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc >> -L/home/Administrator/petsc/cygwin-c-debug/lib -lf2clapack -lf2cblas >> -lmpiuni Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib >> ------------------------------------------ >> >> >> >> Thank you in advance for your help, >> >> Michel Cancelliere >> Politecnico di Torino >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Wed Jul 22 15:18:05 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Wed, 22 Jul 2009 16:18:05 -0400 Subject: Is DMMGSolve in Petsc a Newton-multigrid? Message-ID: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> Is Newton iteration for the outer iterations and multigrid for the linear inner iterations? 
Thanks -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Wed Jul 22 15:50:55 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 22 Jul 2009 15:50:55 -0500 Subject: Is DMMGSolve in Petsc a Newton-multigrid? In-Reply-To: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> References: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> Message-ID: On Wed, Jul 22, 2009 at 3:18 PM, (Rebecca) Xuefei YUAN wrote: > Is Newton iteration for the outer iterations and multigrid for the linear > inner iterations? This is the usual formulation: Newton to solve F(u) = 0 and Multigrid to precondition F'(u) \delta u = -F(u) Matt > > Thanks > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Wed Jul 22 17:24:52 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Wed, 22 Jul 2009 18:24:52 -0400 Subject: Is DMMGSolve in Petsc a Newton-multigrid? In-Reply-To: References: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> Message-ID: <20090722182452.xcq0hhuyqocows8s@cubmail.cc.columbia.edu> Dear Matt, Does that mean mg is only used as a pc so far? Where could I check on which smoother it uses and options Petsc provided for dmmg? There is not much information on the user manual. Thanks, Rebecca Quoting Matthew Knepley : > On Wed, Jul 22, 2009 at 3:18 PM, (Rebecca) Xuefei YUAN > wrote: > >> Is Newton iteration for the outer iterations and multigrid for the linear >> inner iterations? > > > This is the usual formulation: > > Newton to solve F(u) = 0 > > and > > Multigrid to precondition F'(u) \delta u = -F(u) > > Matt > > >> >> Thanks >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Wed Jul 22 18:41:05 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 22 Jul 2009 18:41:05 -0500 Subject: Is DMMGSolve in Petsc a Newton-multigrid? In-Reply-To: <20090722182452.xcq0hhuyqocows8s@cubmail.cc.columbia.edu> References: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> <20090722182452.xcq0hhuyqocows8s@cubmail.cc.columbia.edu> Message-ID: <0DFE65B3-6AAD-4437-870C-D4EC6AF07D17@mcs.anl.gov> -snes_view will always show the solver options being used. -help will cause all the options that are currently available. Barry On Jul 22, 2009, at 5:24 PM, (Rebecca) Xuefei YUAN wrote: > Dear Matt, > > Does that mean mg is only used as a pc so far? Where could I check > on which smoother it uses and options Petsc provided for dmmg? There > is not much information on the user manual. 
> > Thanks, > > Rebecca > > > Quoting Matthew Knepley : > >> On Wed, Jul 22, 2009 at 3:18 PM, (Rebecca) Xuefei YUAN >> wrote: >> >>> Is Newton iteration for the outer iterations and multigrid for the >>> linear >>> inner iterations? >> >> >> This is the usual formulation: >> >> Newton to solve F(u) = 0 >> >> and >> >> Multigrid to precondition F'(u) \delta u = -F(u) >> >> Matt >> >> >>> >>> Thanks >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their >> experiments >> lead. >> -- Norbert Wiener >> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From Andreas.Grassl at student.uibk.ac.at Thu Jul 23 07:47:28 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Thu, 23 Jul 2009 14:47:28 +0200 Subject: strange behaviour with PetscViewerBinary on MATIS Message-ID: <4A685BE0.1030802@student.uibk.ac.at> Hello, I want to save my Matrix A to disk and process it then with ksp/ksp/ex10. Doing it for type AIJ is working fine. Using type IS, it seems to save only the local matrix from one processor to the disk and dump the others to stdout. PetscViewerBinaryOpen(commw,"matrix.bin",FILE_MODE_WRITE,&viewer1); MatView(A,viewer1); Is the only workaround to save the LocalToGlobalMapping and the local matrices separately and to read in all this information or do you see an easier way? Is there a canonical way to save and restore the LocalToGlobalMapping? Cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From jed at 59A2.org Thu Jul 23 08:43:51 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 23 Jul 2009 15:43:51 +0200 Subject: strange behaviour with PetscViewerBinary on MATIS In-Reply-To: <4A685BE0.1030802@student.uibk.ac.at> References: <4A685BE0.1030802@student.uibk.ac.at> Message-ID: <4A686917.9020504@59A2.org> Andreas Grassl wrote: > Hello, > > I want to save my Matrix A to disk and process it then with ksp/ksp/ex10. Doing > it for type AIJ is working fine. > > Using type IS, it seems to save only the local matrix from one processor to the > disk and dump the others to stdout. > > PetscViewerBinaryOpen(commw,"matrix.bin",FILE_MODE_WRITE,&viewer1); > MatView(A,viewer1); The viewer for MATIS is really simplistic, it doesn't ascribe any parallel structure at all. The technical explanation for the behavior you are seeing (which is bad) is the following. MatView_IS gets a "singleton" viewer which for a Binary viewer is just a binary viewer on PETSC_COMM_SELF for rank 0, with the NULL (0) viewer for all other ranks. It then calls MatView with this viewer which is a proper binary viewer for rank 0, but MatView creates a new viewer when called with viewer 0. > Is the only workaround to save the LocalToGlobalMapping and the local matrices > separately and to read in all this information or do you see an easier way? You can put this in MatView_IS if you really need it, but I doubt it will actually be useful. Unfortunately, you cannot change the domain decomposition with Neumann preconditioners, hence they will have limited use for solving a system with a saved matrix. 
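(For comparison, the plain AIJ round trip that does work is roughly the sketch below; the viewer and matrix names are illustrative, and MatLoad is shown with the petsc-3.0.x calling sequence, which is essentially what ksp/ksp/ex10 does when it reads the file back.)

  PetscViewer viewer;
  Mat         B;

  /* write: for AIJ the whole parallel matrix goes into matrix.bin */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.bin",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
  ierr = MatView(A,viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);

  /* read back, possibly on a different number of processes */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.bin",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
  ierr = MatLoad(viewer,MATAIJ,&B);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);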
Why do you want to save the matrix, it's vastly slower and less useful than a function which assembles that matrix? Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From Andreas.Grassl at student.uibk.ac.at Thu Jul 23 10:08:26 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Thu, 23 Jul 2009 17:08:26 +0200 Subject: strange behaviour with PetscViewerBinary on MATIS In-Reply-To: <4A686917.9020504@59A2.org> References: <4A685BE0.1030802@student.uibk.ac.at> <4A686917.9020504@59A2.org> Message-ID: <4A687CEA.3040403@student.uibk.ac.at> Jed Brown schrieb: > You can put this in MatView_IS if you really need it, but I doubt it > will actually be useful. Unfortunately, you cannot change the domain > decomposition with Neumann preconditioners, hence they will have limited > use for solving a system with a saved matrix. Why do you want to save > the matrix, it's vastly slower and less useful than a function which > assembles that matrix? I assemble the Matrix by reading out from a data structure produced by a proprietary program and just used this easy approach to compare the solvers on different machines, where this program is not installed. Since the implementation of the NN-preconditioner is suboptimal at all, I will not waste much time on this issues and my post at the list was lead mostly by my curiosity. thanks for the explanation cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From Stephen.R.Ball at awe.co.uk Fri Jul 24 05:22:40 2009 From: Stephen.R.Ball at awe.co.uk (Stephen Ball) Date: Fri, 24 Jul 2009 11:22:40 +0100 Subject: Any examples of how to set Spooles LU and CHOLESKY direct solvers using the Fortran API? Message-ID: <97OBPJ025484@awe.co.uk> Hi I have recently moved from using PETSc v2.3.3 to v3.0.0 and am trying to update my Fortran code accordingly. Do you have any examples of how to set Spooles LU and CHOLESKY direct solvers using the Fortran API? I am struggling somewhat to understand the correct sequence of calls for your new API, including the matrix and PC creation and set up stages when using Spooles LU and CHOLESKY direct solvers. What calls are required or optional and in what circumstances? Kindest regards Stephen This e-mail and any attachments may contain confidential and privileged information. If you are not the intended recipient, please notify the sender immediately by return e-mail, delete this e-mail and destroy any copies. Any dissemination or use of this information by a person other than the intended recipient is unauthorized and may be illegal. From bsmith at mcs.anl.gov Fri Jul 24 08:44:34 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 24 Jul 2009 08:44:34 -0500 Subject: Any examples of how to set Spooles LU and CHOLESKY direct solvers using the Fortran API? In-Reply-To: <97OBPJ025484@awe.co.uk> References: <97OBPJ025484@awe.co.uk> Message-ID: <28FA3B3E-46A3-44D6-ADF4-E9BC16E25AFE@mcs.anl.gov> Stephen, 1) You no longer need to set particular Spooles matrix types. 
Just use AIJ or SBAIJ (for symmetric case) 2) call KSPGetPC(ksp,pc,ierr) Call PCSetType(pc,PCLU,ierr) call PCFactorSetMatSolverPackage(pc,MAT_SOLVER_SPOOLES,ierr) call KSPSolve() Barry On Jul 24, 2009, at 5:22 AM, Stephen Ball wrote: > Hi > > I have recently moved from using PETSc v2.3.3 to v3.0.0 and am > trying to > update my Fortran code accordingly. > > Do you have any examples of how to set Spooles LU and CHOLESKY direct > solvers using the Fortran API? > > I am struggling somewhat to understand the correct sequence of calls > for > your new API, including the matrix and PC creation and set up stages > when using Spooles LU and CHOLESKY direct solvers. > > What calls are required or optional and in what circumstances? > > Kindest regards > > Stephen > > This e-mail and any attachments may contain confidential and > privileged information. If you are not the intended recipient, > please notify the sender immediately by return e-mail, delete this > e-mail and destroy any copies. Any dissemination or use of this > information by a person other than the intended recipient is > unauthorized and may be illegal. From xy2102 at columbia.edu Sat Jul 25 13:20:33 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Sat, 25 Jul 2009 14:20:33 -0400 Subject: vecload_block_size Message-ID: <20090725142033.inixoni34cg0scg0@cubmail.cc.columbia.edu> Hi, I am reading from a binary file of the previous solution and would like it to be my initial guess. However, I found that the "-vecload_block_size" is shown in the PetscOptionsTable, I did not explicitly state this one in my options file, why it shows up in it? And the block size has been assigned as "4". Thanks! Rebecca Here is this PetscOptionsTable: (gdb) p *options $21 = {N = 28, argc = 3, Naliases = 0, args = 0xbf8e9564, names = { 0x8867768 "da_grid_x", 0x8867788 "da_grid_y", 0x88678c8 "dmmg_iscoloring_type", 0x88678a8 "dmmg_levels", 0x8867a20 "ksp_converged_reason", 0x8867a00 "ksp_max_it", 0x8847bb8 "loadbin", 0x8847bd8 "mx_grid", 0x8867748 "my", 0x88677d0 "number_of_time_steps", 0x8867a60 "pc_asm_overlap", 0x8867a40 "pc_type", 0x8867910 "snes_converged_reason", 0x8867998 "snes_ksp_ew", 0x8867940 "snes_max_fail", 0x88679d8 "snes_max_funcs", 0x88679b8 "snes_max_it", 0x8867968 "snes_max_linear_solve_fail", 0x8867930 "snes_mf", 0x88678f8 "snes_monitor", 0x8867aa8 "sub_pc_factor_shift_nonzero", 0x8867a88 "sub_pc_type", 0x8867850 "time_accuracy_order", 0x8867800 "time_step_monitor", 0x88677a8 "time_step_size", 0x8867818 "time_step_to_save_solution_text", 0x8867878 "time_to_generate_grid", 0x8855890 "vecload_block_size", 0x0 }, values = {0x8867778 "8", 0x8867798 "8", 0x88678e8 "global", 0x88678b8 "1", 0x0, 0x8867a10 "50", 0x8847bc8 "true", 0x8867738 "9", 0x8867758 "9", 0x88677f0 "1", 0x8867a78 "1", 0x8867a50 "asm", 0x0, 0x88679a8 "true", 0x8867958 "100", 0x88679f0 "1000000", 0x88679c8 "10", 0x8867988 "100", 0x0, 0x0, 0x0, 0x8867a98 "ilu", 0x8867868 "1", 0x0, 0x88677c0 "0.2", 0x8867840 "1", 0x8867898 "0.2", 0x8847ba8 "4", 0x0 }, aliases1 = { 0x0 }, aliases2 = {0x0 }, used = { PETSC_FALSE, PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_FALSE, PETSC_TRUE, PETSC_FALSE, PETSC_FALSE, PETSC_FALSE, ---Type to continue, or q to quit--- PETSC_FALSE, PETSC_FALSE, PETSC_FALSE, PETSC_TRUE, PETSC_FALSE }, namegiven = PETSC_TRUE, programname = 
"/home/rebecca/linux/code/couple/twoway/twoway_oreggt/codes/tworeggt", '\0' , monitor = {0, 0, 0, 0, 0}, monitordestroy = { 0, 0, 0, 0, 0}, monitorcontext = {0x0, 0x0, 0x0, 0x0, 0x0}, numbermonitors = 0} -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Sat Jul 25 13:50:02 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 25 Jul 2009 13:50:02 -0500 Subject: vecload_block_size In-Reply-To: <20090725142033.inixoni34cg0scg0@cubmail.cc.columbia.edu> References: <20090725142033.inixoni34cg0scg0@cubmail.cc.columbia.edu> Message-ID: <226A9ED7-E5A5-4323-951A-957CAF886C40@mcs.anl.gov> Some binary files have another file with the same name and .info on the end. That file contains additional options such as - vecload_block_size. If you don't want that option you can simply remove the file. On Jul 25, 2009, at 1:20 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > I am reading from a binary file of the previous solution and would > like it to be my initial guess. > > However, I found that the "-vecload_block_size" is shown in the > PetscOptionsTable, I did not explicitly state this one in my options > file, why it shows up in it? And the block size has been assigned as > "4". > > Thanks! > > Rebecca > > > Here is this PetscOptionsTable: > > (gdb) p *options > $21 = {N = 28, argc = 3, Naliases = 0, args = 0xbf8e9564, names = { > 0x8867768 "da_grid_x", 0x8867788 "da_grid_y", > 0x88678c8 "dmmg_iscoloring_type", 0x88678a8 "dmmg_levels", > 0x8867a20 "ksp_converged_reason", 0x8867a00 "ksp_max_it", > 0x8847bb8 "loadbin", 0x8847bd8 "mx_grid", 0x8867748 "my", > 0x88677d0 "number_of_time_steps", 0x8867a60 "pc_asm_overlap", > 0x8867a40 "pc_type", 0x8867910 "snes_converged_reason", > 0x8867998 "snes_ksp_ew", 0x8867940 "snes_max_fail", > 0x88679d8 "snes_max_funcs", 0x88679b8 "snes_max_it", > 0x8867968 "snes_max_linear_solve_fail", 0x8867930 "snes_mf", > 0x88678f8 "snes_monitor", 0x8867aa8 "sub_pc_factor_shift_nonzero", > 0x8867a88 "sub_pc_type", 0x8867850 "time_accuracy_order", > 0x8867800 "time_step_monitor", 0x88677a8 "time_step_size", > 0x8867818 "time_step_to_save_solution_text", > 0x8867878 "time_to_generate_grid", 0x8855890 "vecload_block_size", > 0x0 }, values = {0x8867778 "8", 0x8867798 "8", > 0x88678e8 "global", 0x88678b8 "1", 0x0, 0x8867a10 "50", 0x8847bc8 > "true", > 0x8867738 "9", 0x8867758 "9", 0x88677f0 "1", 0x8867a78 "1", > 0x8867a50 "asm", 0x0, 0x88679a8 "true", 0x8867958 "100", > 0x88679f0 "1000000", 0x88679c8 "10", 0x8867988 "100", 0x0, 0x0, > 0x0, > 0x8867a98 "ilu", 0x8867868 "1", 0x0, 0x88677c0 "0.2", 0x8867840 > "1", > 0x8867898 "0.2", 0x8847ba8 "4", 0x0 }, > aliases1 = { > 0x0 }, aliases2 = {0x0 }, > used = { > PETSC_FALSE, PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, > PETSC_TRUE, > PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_FALSE, PETSC_TRUE, > PETSC_TRUE, > PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, > PETSC_TRUE, > PETSC_FALSE, PETSC_TRUE, PETSC_FALSE, PETSC_FALSE, PETSC_FALSE, > ---Type to continue, or q to quit--- > PETSC_FALSE, PETSC_FALSE, PETSC_FALSE, PETSC_TRUE, > PETSC_FALSE }, namegiven = PETSC_TRUE, > programname = "/home/rebecca/linux/code/couple/twoway/twoway_oreggt/ > codes/tworeggt", '\0' , monitor = {0, 0, 0, 0, > 0}, monitordestroy = { > 0, 0, 0, 0, 0}, monitorcontext = {0x0, 0x0, 0x0, 0x0, 0x0}, > numbermonitors = 0} > > > > > > > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia 
University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From xy2102 at columbia.edu Sat Jul 25 16:06:20 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Sat, 25 Jul 2009 17:06:20 -0400 Subject: Any example of assembling a Jacobian matrix of DMComposite object? Message-ID: <20090725170620.ytfi94xgu888gwck@cubmail.cc.columbia.edu> Hi, I have an optimization problem in 2d with some scalar parameter, thus DMComposite is used to manage the date structure. If I am going to write down the Jacobian matrix for the system, I come up with (mx*my+1,mx*my+1) matrix. Is there any example in PETSc of assembling the Jacobian for optimization problem? Thanks very much! -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Sat Jul 25 18:11:01 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 25 Jul 2009 18:11:01 -0500 Subject: Any example of assembling a Jacobian matrix of DMComposite object? In-Reply-To: <20090725170620.ytfi94xgu888gwck@cubmail.cc.columbia.edu> References: <20090725170620.ytfi94xgu888gwck@cubmail.cc.columbia.edu> Message-ID: <453232BD-FE37-4AAB-B9A6-2F1739E7B40C@mcs.anl.gov> On Jul 25, 2009, at 4:06 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > I have an optimization problem in 2d with some scalar parameter, > thus DMComposite is used to manage the date structure. If I am going > to write down the Jacobian matrix for the system, I come up with > (mx*my+1,mx*my+1) matrix. Is there any example in PETSc of > assembling the Jacobian for optimization problem? No > > Thanks very much! > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From xy2102 at columbia.edu Sun Jul 26 18:51:36 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Sun, 26 Jul 2009 19:51:36 -0400 Subject: possible bug in DMCompositeGetMatrix(). Message-ID: <20090726195136.f07iqopggk0skkg0@cubmail.cc.columbia.edu> Hi, I am working on an optimization problem, in which I would like to assemble a Jacobian matrix. Thus DMMGSetSNES(dmmg,FormFunction,FormJacobian) is called. In damgsnes.c:637, in calling DMGetMatrix(), it calls DMCompositeGetMatrix() where the temp matrix Atmp has been freed before it passes any information to J at pack.c:1722 and 1774. So after calling DMGetMatrix() in DMMGSetSNES, the stencil of the dmmg[i]->B has unchanged, i.e., (gdb) p dmmg[0]->B->stencil $107 = {dim = 0, dims = {0, 0, 0, 0}, starts = {0, 0, 0, 0}, noc = PETSC_FALSE} (gdb) where #0 DMMGSetSNES (dmmg=0x8856208, function=0x804c84f , jacobian=0x8052932 ) at damgsnes.c:641 #1 0x0804c246 in main (argc=Cannot access memory at address 0x0 ) at tworeggt.c:126 I compare this with http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/snes/examples/tutorials/ex18.c.html and it shows that the stencil has been carried out and passed to dmmg[0]->B as follows: (gdb) p dmmg[i]->B->stencil $80 = {dim = 2, dims = {5, 5, 1, 0}, starts = {0, 0, 0, 0}, noc = PETSC_TRUE} (gdb) where #0 DMMGSetSNES (dmmg=0x884b530, function=0x804c364 , jacobian=0x804d34d ) at damgsnes.c:642 #1 0x0804b969 in main (argc=Cannot access memory at address 0x2 ) at ex18.c:100 Because of this missing stencil of Jacobian matrix, I get the error code as follows: Program received signal SIGSEGV, Segmentation fault. 
0x082447c2 in ISLocalToGlobalMappingApply (mapping=0x0, N=1, in=0xbff8f250, out=0xbff8ce14) at /home/rebecca/soft/petsc-3.0.0-p1/include/petscis.h:129 129 PetscInt i,*idx = mapping->indices,Nmax = mapping->n; (gdb) where #0 0x082447c2 in ISLocalToGlobalMappingApply (mapping=0x0, N=1, in=0xbff8f250, out=0xbff8ce14) at /home/rebecca/soft/petsc-3.0.0-p1/include/petscis.h:129 #1 0x0824440c in MatSetValuesLocal (mat=0x88825e8, nrow=1, irow=0xbff8f250, ncol=4, icol=0xbff8ee50, y=0xbff8f628, addv=INSERT_VALUES) at matrix.c:1583 #2 0x08240aae in MatSetValuesStencil (mat=0x88825e8, m=1, idxm=0xbff8f6b8, n=4, idxn=0xbff8f4b4, v=0xbff8f628, addv=INSERT_VALUES) at matrix.c:1099 #3 0x08053835 in FormJacobian (snes=0x8874700, X=0x8856778, J=0x88747d0, B=0x88747d4, flg=0xbff8f8d4, ptr=0x8856338) at tworeggt.c:937 #4 0x0805a5cf in DMMGComputeJacobian_Multigrid (snes=0x8874700, X=0x8856778, J=0x88747d0, B=0x88747d4, flag=0xbff8f8d4, ptr=0x8856208) at damgsnes.c:60 #5 0x0806b18a in SNESComputeJacobian (snes=0x8874700, X=0x8856778, A=0x88747d0, B=0x88747d4, flg=0xbff8f8d4) at snes.c:1111 #6 0x08084945 in SNESSolve_LS (snes=0x8874700) at ls.c:189 #7 0x08073198 in SNESSolve (snes=0x8874700, b=0x0, x=0x8856778) at snes.c:2221 #8 0x0805d5f9 in DMMGSolveSNES (dmmg=0x8856208, level=0) at damgsnes.c:510 #9 0x08056e38 in DMMGSolve (dmmg=0x8856208) at damg.c:372 #10 0x0804c3fe in main (argc=128, argv=0xbff90c04) at tworeggt.c:131 I think there might be a bug in DMCompositeGetMatrix(). Thanks very much! Cheers, -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From tim.kroeger at cevis.uni-bremen.de Mon Jul 27 04:35:10 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Mon, 27 Jul 2009 11:35:10 +0200 (CEST) Subject: Solver problem Message-ID: Dear all, In my application, there is a linear system to be solved in every time step. Steps 0 and 1 work well, but in step 2 PETSc fails to converge. I suspected that the system might be unsolvable in that step and checked that by writing matrix and the right hand side to files and loading them into "octave". Surprisingly, "octave" does find a solution to the system without any problems. The problem occurs even on a single core. I am using PETSc version 2.3.3-p11 with the GMRES solver and ILU preconditioner. Can anybody give me a hint which settings would PETSc reliably enable solving systems of the type that I face? I have put matrix and right hand side on my homepage; they can be downloaded from www.mevis.de/~tim/m-and-v.tar.gz (7MB). In octave, I used the following commands to find and check the solution: octave:1> matrix2 octave:2> vector2 octave:3> x=Mat_0\Vec_1; octave:4> res=Mat_0*x-Vec_1; octave:5> norm(res) ans = 1.0032e-12 octave:6> norm(Vec_1) ans = 27.976 octave:7> norm(Mat_0,"fro") ans = 2.5917e+22 octave:8> norm(x) ans = 3855.3 Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From knepley at gmail.com Mon Jul 27 07:47:30 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 07:47:30 -0500 Subject: Solver problem In-Reply-To: References: Message-ID: On Mon, Jul 27, 2009 at 4:35 AM, Tim Kroeger < tim.kroeger at cevis.uni-bremen.de> wrote: > Dear all, > > In my application, there is a linear system to be solved in every time > step. 
Steps 0 and 1 work well, but in step 2 PETSc fails to converge. I > suspected that the system might be unsolvable in that step and checked that > by writing matrix and the right hand side to files and loading them into > "octave". Surprisingly, "octave" does find a solution to the system without > any problems. > > The problem occurs even on a single core. I am using PETSc version > 2.3.3-p11 with the GMRES solver and ILU preconditioner. > > Can anybody give me a hint which settings would PETSc reliably enable > solving systems of the type that I face? If we could, we would already have retired. There are simply no iterative solvers that work for all systems. The best preconditioners are usually tailored to the particular equations being solved. I would suggest a search of the literature for PCs for your equations. Thanks, Matt > > I have put matrix and right hand side on my homepage; they can be > downloaded from www.mevis.de/~tim/m-and-v.tar.gz(7MB). In octave, I used the following commands to find and check the > solution: > > octave:1> matrix2 > octave:2> vector2 > octave:3> x=Mat_0\Vec_1; > octave:4> res=Mat_0*x-Vec_1; > octave:5> norm(res) > ans = 1.0032e-12 > octave:6> norm(Vec_1) > ans = 27.976 > octave:7> norm(Mat_0,"fro") > ans = 2.5917e+22 > octave:8> norm(x) > ans = 3855.3 > > > Best Regards, > > Tim > > -- > Dr. Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jul 27 09:30:07 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 09:30:07 -0500 Subject: Solver problem In-Reply-To: References: Message-ID: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> On Jul 27, 2009, at 4:35 AM, Tim Kroeger wrote: > Dear all, > > In my application, there is a linear system to be solved in every > time step. Steps 0 and 1 work well, but in step 2 PETSc fails to > converge. I suspected that the system might be unsolvable in that > step and checked that by writing matrix and the right hand side to > files and loading them into "octave". Surprisingly, "octave" does > find a solution to the system without any problems. Octave is using a direct solver. Did you try PETSc's direct solver using -pc_type lu? Barry > > The problem occurs even on a single core. I am using PETSc version > 2.3.3-p11 with the GMRES solver and ILU preconditioner. > > Can anybody give me a hint which settings would PETSc reliably > enable solving systems of the type that I face? > > I have put matrix and right hand side on my homepage; they can be > downloaded from www.mevis.de/~tim/m-and-v.tar.gz (7MB). In octave, > I used the following commands to find and check the solution: > > octave:1> matrix2 > octave:2> vector2 > octave:3> x=Mat_0\Vec_1; > octave:4> res=Mat_0*x-Vec_1; > octave:5> norm(res) > ans = 1.0032e-12 > octave:6> norm(Vec_1) > ans = 27.976 > octave:7> norm(Mat_0,"fro") > ans = 2.5917e+22 > octave:8> norm(x) > ans = 3855.3 > > > Best Regards, > > Tim > > -- > Dr. 
Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > From fernandez858 at gmail.com Mon Jul 27 09:43:17 2009 From: fernandez858 at gmail.com (Michel Cancelliere) Date: Mon, 27 Jul 2009 16:43:17 +0200 Subject: Solver problem In-Reply-To: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> Message-ID: <7f18de3b0907270743i7fe33021tf3b88c2e4f0d80be@mail.gmail.com> Do you mean steps (iterations) 0 and 1 for SNES or KSP? If the iterations are for SNES probably you have problems with you nonlinear solver for which Octave can find a solution to the linear system but the actual problem is not in there. Do you use -snes_converged_reason? Are you sure that Matrix and right hand side routines are working well? Michel On Mon, Jul 27, 2009 at 4:30 PM, Barry Smith wrote: > > On Jul 27, 2009, at 4:35 AM, Tim Kroeger wrote: > > Dear all, >> >> In my application, there is a linear system to be solved in every time >> step. Steps 0 and 1 work well, but in step 2 PETSc fails to converge. I >> suspected that the system might be unsolvable in that step and checked that >> by writing matrix and the right hand side to files and loading them into >> "octave". Surprisingly, "octave" does find a solution to the system without >> any problems. >> > > Octave is using a direct solver. Did you try PETSc's direct solver using > -pc_type lu? > > Barry > > > >> The problem occurs even on a single core. I am using PETSc version >> 2.3.3-p11 with the GMRES solver and ILU preconditioner. >> >> Can anybody give me a hint which settings would PETSc reliably enable >> solving systems of the type that I face? >> >> I have put matrix and right hand side on my homepage; they can be >> downloaded from www.mevis.de/~tim/m-and-v.tar.gz (7MB). In octave, I >> used the following commands to find and check the solution: >> >> octave:1> matrix2 >> octave:2> vector2 >> octave:3> x=Mat_0\Vec_1; >> octave:4> res=Mat_0*x-Vec_1; >> octave:5> norm(res) >> ans = 1.0032e-12 >> octave:6> norm(Vec_1) >> ans = 27.976 >> octave:7> norm(Mat_0,"fro") >> ans = 2.5917e+22 >> octave:8> norm(x) >> ans = 3855.3 >> >> >> Best Regards, >> >> Tim >> >> -- >> Dr. Tim Kroeger >> tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 >> tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 >> >> Fraunhofer MEVIS, Institute for Medical Image Computing >> Universitaetsallee 29, 28359 Bremen, Germany >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Harun.BAYRAKTAR at 3ds.com Mon Jul 27 12:46:36 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Mon, 27 Jul 2009 13:46:36 -0400 Subject: Avoiding assembly with MatDiagonalSet call for zeroed matrix to preserve preallocated space Message-ID: Hello, I have a rather simple question. For a Matrix that was preallocated with the correct diagonal and offdiagonal nonzero counts the following operations cause a deallocation of all data except the diagonal which casues later MatSetValues to have to reallocate. MatCreateSeqAIJ MatZeroEntries MatDiagaonalSet Using -info and the debugger I see that MatDiagonalSet ends up calling MatDiagonalSet_Default which forces an assembly. Is there a way to do the same thing and preserve the preallocated storage for future use by MatSetValues? 
I tried MatSetOption with MAT_KEEP_ZEROED_ROWS and MAT_IGNORE_ZERO_ENTRIED but it did not help. Thanks a lot, Harun -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 27 14:38:08 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 14:38:08 -0500 Subject: Avoiding assembly with MatDiagonalSet call for zeroed matrix to preserve preallocated space In-Reply-To: References: Message-ID: On Mon, Jul 27, 2009 at 12:46 PM, BAYRAKTAR Harun wrote: > Hello, > > > > I have a rather simple question. For a Matrix that was preallocated with > the correct diagonal and offdiagonal nonzero counts the following operations > cause a deallocation of all data except the diagonal which casues later > MatSetValues to have to reallocate. > > > > MatCreateSeqAIJ > > MatZeroEntries > > MatDiagaonalSet > > > > Using -info and the debugger I see that MatDiagonalSet ends up calling > MatDiagonalSet_Default which forces an assembly. Is there a way to do the > same thing and preserve the preallocated storage for future use by > MatSetValues? I tried MatSetOption with MAT_KEEP_ZEROED_ROWS and > MAT_IGNORE_ZERO_ENTRIED but it did not help. > Is it possible to set the diagonal after the rest of the entires? Matt > Thanks a lot, > > Harun > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Harun.BAYRAKTAR at 3ds.com Mon Jul 27 15:09:05 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Mon, 27 Jul 2009 16:09:05 -0400 Subject: Avoiding assembly with MatDiagonalSet call for zeroed matrix to preserve preallocated space In-Reply-To: References: Message-ID: Matt, Your suggestion was what we tried as a workaround before I wrote the message and it fixes the problem completely. I just wanted to know if there was a less restrictive way that allows the diagonal set before the entries. Sounds like there isn't. Thanks, Harun From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Monday, July 27, 2009 3:38 PM To: PETSc users list Subject: Re: Avoiding assembly with MatDiagonalSet call for zeroed matrix to preserve preallocated space On Mon, Jul 27, 2009 at 12:46 PM, BAYRAKTAR Harun wrote: Hello, I have a rather simple question. For a Matrix that was preallocated with the correct diagonal and offdiagonal nonzero counts the following operations cause a deallocation of all data except the diagonal which casues later MatSetValues to have to reallocate. MatCreateSeqAIJ MatZeroEntries MatDiagaonalSet Using -info and the debugger I see that MatDiagonalSet ends up calling MatDiagonalSet_Default which forces an assembly. Is there a way to do the same thing and preserve the preallocated storage for future use by MatSetValues? I tried MatSetOption with MAT_KEEP_ZEROED_ROWS and MAT_IGNORE_ZERO_ENTRIED but it did not help. Is it possible to set the diagonal after the rest of the entires? Matt Thanks a lot, Harun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
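A minimal sketch of the workaround the thread above settles on -- keeping the preallocated AIJ structure by inserting the diagonal through MatSetValues() together with (or after) the off-diagonal entries, instead of calling MatDiagonalSet() on the freshly zeroed matrix. The size, the tridiagonal fill, and the numerical values below are illustrative only, and the fragment assumes a program that has included petscmat.h and already called PetscInitialize():

    Mat            A;
    PetscInt       n = 10, i, ncols, cols[3];
    PetscScalar    vals[3];
    PetscErrorCode ierr;

    /* preallocate 3 nonzeros per row up front */
    ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 3, PETSC_NULL, &A);CHKERRQ(ierr);
    for (i = 0; i < n; i++) {
      ncols = 0;
      if (i > 0)   { cols[ncols] = i-1; vals[ncols] = -1.0; ncols++; }
      if (i < n-1) { cols[ncols] = i+1; vals[ncols] = -1.0; ncols++; }
      /* the diagonal goes in through the same MatSetValues() pass, so there is
         no separate MatDiagonalSet() and no forced assembly of the empty matrix */
      cols[ncols] = i; vals[ncols] = 2.0; ncols++;
      ierr = MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatDestroy(A);CHKERRQ(ierr);

Inserting only into locations that were preallocated does not trigger the reallocation that -info was reporting; it is new nonzero locations that force AIJ to allocate again.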
URL: From xy2102 at columbia.edu Mon Jul 27 15:51:47 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 16:51:47 -0400 Subject: memory check of /snes/example/tutorials/ex29.c Message-ID: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> Hi, My own code has some left bytes still reachable according to valgrind, then I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and make the files, it gives me different number of bytes left still reachable. Moreover, I picked up the /snes/example/tutorials/ex29.c as another example, and found that some bytes are still reachable, what is the cause of it? It shows that it is from DACreate2D() and the I use -malloc_dump to get those unfreed informations. I understand that for those 5 loss record, the 2nd, 3rd and 4th are true for all examples, but where do 1st and 5th ones come from? Also, the -malloc_dump information shows that there are "[0]Total space allocated 37780 bytes", but valgrind gives the information as "==26628== still reachable: 132,828 bytes in 323 blocks" Why there is a big difference? Thanks very much! Rebecca Here is the message from valgrind of running ex29: ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) ==26628== by 0x804B796: main (ex29.c:139) ==26628== ==26628== ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x4732FDB: ??? ==26628== by 0x473413C: ??? ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) ==26628== by 0x804B796: main (ex29.c:139) ==26628== ==26628== ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x4732FFB: ??? ==26628== by 0x473413C: ??? ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) ==26628== by 0x804B796: main (ex29.c:139) ==26628== ==26628== ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x4732FFB: ??? ==26628== by 0x473413C: ??? 
==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) ==26628== by 0x804B796: main (ex29.c:139) ==26628== ==26628== ==26628== 132,796 bytes in 321 blocks are still reachable in loss record 5 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) ==26628== by 0x804BAFB: main (ex29.c:153) ==26628== ==26628== LEAK SUMMARY: ==26628== definitely lost: 36 bytes in 1 blocks. ==26628== indirectly lost: 120 bytes in 10 blocks. ==26628== possibly lost: 0 bytes in 0 blocks. ==26628== still reachable: 132,828 bytes in 323 blocks. ==26628== suppressed: 0 bytes in 0 blocks. -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Mon Jul 27 16:22:10 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 16:22:10 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> Message-ID: On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > My own code has some left bytes still reachable according to valgrind, then > I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and > make the files, it gives me different number of bytes left still reachable. > Moreover, I picked up the /snes/example/tutorials/ex29.c as another example, > and found that some bytes are still reachable, what is the cause of it? It > shows that it is from DACreate2D() and the I use -malloc_dump to get those > unfreed informations. > > I understand that for those 5 loss record, the 2nd, 3rd and 4th are true > for all examples, but where do 1st and 5th ones come from? Also, the > -malloc_dump information shows that there are > "[0]Total space allocated 37780 bytes", > but valgrind gives the information as > "==26628== still reachable: 132,828 bytes in 323 blocks" > > Why there is a big difference? 1 is fine. It is from PMPI setup, which has some bytes not freed from setting up the MPI processes. The last one looks like an unfreed header for a DA, which is strange. Matt > > Thanks very much! > > Rebecca > > Here is the message from valgrind of running ex29: > ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) > ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) > ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) > ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) > ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) > ==26628== by 0x804B796: main (ex29.c:139) > ==26628== > ==26628== > ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely > lost in loss record 2 of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ > libc-2.7.so) > ==26628== by 0x4732FDB: ??? > ==26628== by 0x473413C: ??? 
> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) > ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) > ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) > ==26628== by 0x804B796: main (ex29.c:139) > ==26628== > ==26628== > ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ > libc-2.7.so) > ==26628== by 0x4732FFB: ??? > ==26628== by 0x473413C: ??? > ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) > ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) > ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) > ==26628== by 0x804B796: main (ex29.c:139) > ==26628== > ==26628== > ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ > libc-2.7.so) > ==26628== by 0x4732FFB: ??? > ==26628== by 0x473413C: ??? > ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) > ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) > ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) > ==26628== by 0x804B796: main (ex29.c:139) > ==26628== > ==26628== > ==26628== 132,796 bytes in 321 blocks are still reachable in loss record 5 > of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) > ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) > ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) > ==26628== by 0x804BAFB: main (ex29.c:153) > ==26628== > ==26628== LEAK SUMMARY: > ==26628== definitely lost: 36 bytes in 1 blocks. > ==26628== indirectly lost: 120 bytes in 10 blocks. > ==26628== possibly lost: 0 bytes in 0 blocks. > ==26628== still reachable: 132,828 bytes in 323 blocks. > ==26628== suppressed: 0 bytes in 0 blocks. > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Mon Jul 27 16:34:39 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 17:34:39 -0400 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> Message-ID: <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Those unfreed bytes cause "out of memory" when it runs at bigger grid sizes. So I have to find out those unfreed memory and free them... Any suggestions? 
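A general note on the question above: -malloc_dump only lists memory obtained through PetscMalloc() that is still allocated when PetscFinalize() runs, so blocks allocated directly by MPI or libc (such as the MPID_VCRT_Create record) never appear there even though valgrind counts them. Before chasing individual valgrind records it is worth making sure everything the DMMG layer created is destroyed before PetscFinalize(); a minimal teardown sketch, using the dmmg variable name from the examples quoted in this thread and otherwise purely illustrative:

    ierr = DMMGDestroy(dmmg);CHKERRQ(ierr);  /* frees the DA, matrices and vectors DMMG created */
    ierr = PetscFinalize();                  /* with -malloc_dump, anything still allocated is listed here */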
Thanks, R Quoting Matthew Knepley : > On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN > wrote: > >> Hi, >> >> My own code has some left bytes still reachable according to valgrind, then >> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >> make the files, it gives me different number of bytes left still reachable. >> Moreover, I picked up the /snes/example/tutorials/ex29.c as another example, >> and found that some bytes are still reachable, what is the cause of it? It >> shows that it is from DACreate2D() and the I use -malloc_dump to get those >> unfreed informations. >> >> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >> for all examples, but where do 1st and 5th ones come from? Also, the >> -malloc_dump information shows that there are >> "[0]Total space allocated 37780 bytes", >> but valgrind gives the information as >> "==26628== still reachable: 132,828 bytes in 323 blocks" >> >> Why there is a big difference? > > > 1 is fine. It is from PMPI setup, which has some bytes not freed from > setting up the MPI > processes. The last one looks like an unfreed header for a DA, which is > strange. > > Matt > > >> >> Thanks very much! >> >> Rebecca >> >> Here is the message from valgrind of running ex29: >> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >> ==26628== by 0x804B796: main (ex29.c:139) >> ==26628== >> ==26628== >> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >> lost in loss record 2 of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >> libc-2.7.so) >> ==26628== by 0x4732FDB: ??? >> ==26628== by 0x473413C: ??? >> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >> ==26628== by 0x804B796: main (ex29.c:139) >> ==26628== >> ==26628== >> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >> libc-2.7.so) >> ==26628== by 0x4732FFB: ??? >> ==26628== by 0x473413C: ??? 
>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >> ==26628== by 0x804B796: main (ex29.c:139) >> ==26628== >> ==26628== >> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >> libc-2.7.so) >> ==26628== by 0x4732FFB: ??? >> ==26628== by 0x473413C: ??? >> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >> ==26628== by 0x804B796: main (ex29.c:139) >> ==26628== >> ==26628== >> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record 5 >> of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >> ==26628== by 0x804BAFB: main (ex29.c:153) >> ==26628== >> ==26628== LEAK SUMMARY: >> ==26628== definitely lost: 36 bytes in 1 blocks. >> ==26628== indirectly lost: 120 bytes in 10 blocks. >> ==26628== possibly lost: 0 bytes in 0 blocks. >> ==26628== still reachable: 132,828 bytes in 323 blocks. >> ==26628== suppressed: 0 bytes in 0 blocks. >> >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> >> > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Mon Jul 27 16:42:10 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 16:42:10 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Message-ID: On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN wrote: > Those unfreed bytes cause "out of memory" when it runs at bigger grid > sizes. So I have to find out those unfreed memory and free them... Any > suggestions? Not from what you mailed in. On that DA line, I see PetscHeaderCreate(). Is that what you see? Matt > > Thanks, > > R > > > Quoting Matthew Knepley : > > On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >> wrote: >> >> Hi, >>> >>> My own code has some left bytes still reachable according to valgrind, >>> then >>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >>> make the files, it gives me different number of bytes left still >>> reachable. 
>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as another >>> example, >>> and found that some bytes are still reachable, what is the cause of it? >>> It >>> shows that it is from DACreate2D() and the I use -malloc_dump to get >>> those >>> unfreed informations. >>> >>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >>> for all examples, but where do 1st and 5th ones come from? Also, the >>> -malloc_dump information shows that there are >>> "[0]Total space allocated 37780 bytes", >>> but valgrind gives the information as >>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>> >>> Why there is a big difference? >>> >> >> >> 1 is fine. It is from PMPI setup, which has some bytes not freed from >> setting up the MPI >> processes. The last one looks like an unfreed header for a DA, which is >> strange. >> >> Matt >> >> >> >>> Thanks very much! >>> >>> Rebecca >>> >>> Here is the message from valgrind of running ex29: >>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >>> lost in loss record 2 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FDB: ??? >>> ==26628== by 0x473413C: ??? >>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>> ) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FFB: ??? >>> ==26628== by 0x473413C: ??? >>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>> ) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FFB: ??? >>> ==26628== by 0x473413C: ??? 
>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>> ) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record >>> 5 >>> of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>> ==26628== by 0x804BAFB: main (ex29.c:153) >>> ==26628== >>> ==26628== LEAK SUMMARY: >>> ==26628== definitely lost: 36 bytes in 1 blocks. >>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>> ==26628== possibly lost: 0 bytes in 0 blocks. >>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>> ==26628== suppressed: 0 bytes in 0 blocks. >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 < >>> http://www.columbia.edu/%7Exy2102> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> >> > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jul 27 16:53:20 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 16:53:20 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Message-ID: This was due to a small memory leak in DMMGSetNullSpace() of the vectors creating internally to hold the null space. I have pushed a fix to petsc-3.0.0 and petsc-dev It will be fixed in the next 3.0.0 patch. Note this would not cause the "out of memory" for runs at bigger grid sizes. That is likely just coming from trying to run too large a problem for your memory size. Thanks for reporting the memory leak, Barry On Jul 27, 2009, at 4:34 PM, (Rebecca) Xuefei YUAN wrote: > Those unfreed bytes cause "out of memory" when it runs at bigger > grid sizes. So I have to find out those unfreed memory and free > them... Any suggestions? > > Thanks, > > R > > Quoting Matthew Knepley : > >> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >> wrote: >> >>> Hi, >>> >>> My own code has some left bytes still reachable according to >>> valgrind, then >>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to >>> compile and >>> make the files, it gives me different number of bytes left still >>> reachable. >>> Moreover, I picked up the /snes/example/tutorials/ex29.c as >>> another example, >>> and found that some bytes are still reachable, what is the cause >>> of it? 
It >>> shows that it is from DACreate2D() and the I use -malloc_dump to >>> get those >>> unfreed informations. >>> >>> I understand that for those 5 loss record, the 2nd, 3rd and 4th >>> are true >>> for all examples, but where do 1st and 5th ones come from? Also, the >>> -malloc_dump information shows that there are >>> "[0]Total space allocated 37780 bytes", >>> but valgrind gives the information as >>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>> >>> Why there is a big difference? >> >> >> 1 is fine. It is from PMPI setup, which has some bytes not freed from >> setting up the MPI >> processes. The last one looks like an unfreed header for a DA, >> which is >> strange. >> >> Matt >> >> >>> >>> Thanks very much! >>> >>> Rebecca >>> >>> Here is the message from valgrind of running ex29: >>> ==26628== 32 bytes in 2 blocks are still reachable in loss record >>> 1 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are >>> definitely >>> lost in loss record 2 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/ >>> cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FDB: ??? >>> ==26628== by 0x473413C: ??? >>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c: >>> 68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record >>> 3 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/ >>> cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FFB: ??? >>> ==26628== by 0x473413C: ??? >>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c: >>> 68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record >>> 4 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/ >>> cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FFB: ??? >>> ==26628== by 0x473413C: ??? 
>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c: >>> 68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss >>> record 5 >>> of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>> ==26628== by 0x804BAFB: main (ex29.c:153) >>> ==26628== >>> ==26628== LEAK SUMMARY: >>> ==26628== definitely lost: 36 bytes in 1 blocks. >>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>> ==26628== possibly lost: 0 bytes in 0 blocks. >>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>> ==26628== suppressed: 0 bytes in 0 blocks. >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their >> experiments >> lead. >> -- Norbert Wiener >> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From xy2102 at columbia.edu Mon Jul 27 17:04:56 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 18:04:56 -0400 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Message-ID: <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> Dear Matt, I ran the code on a 321*321 grid, with dof=4. The matrix is a sparse matrix with type aij. After set up user defined options calls, the memory status is Mem: 2033752k total, 607456k used, 1426296k free, 4832k buffers ierr = DMMGCreate(comm, parameters.numberOfLevels, &appCtx, &dmmg);CHKERRQ(ierr); ierr = DACreate2d(comm,DA_NONPERIODIC,DA_STENCIL_BOX, -5, -5, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); ierr = DASetFieldName(DMMGGetDA(dmmg), 0, "phi");CHKERRQ(ierr); ierr = DASetFieldName(DMMGGetDA(dmmg), 1, "vz");CHKERRQ(ierr); ierr = DASetFieldName(DMMGGetDA(dmmg), 2, "psi");CHKERRQ(ierr); ierr = DASetFieldName(DMMGGetDA(dmmg), 3, "bz");CHKERRQ(ierr); before DAGetMatrix() called, the memory status is Mem: 2033752k total, 642940k used, 1390812k free, 4972k buffers ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); In gdb, it uses around 500M memory after DAGetMatrix(), which I do not think it is right, since for a sparse matrix with 13 nonzeros per row, the memory it needs should be 321*321*4(dof)*13(nonzeros per row)*8(PetscReal) = 42865056 bytes ~ 40M. It is strange. i.e., after DAGetMatrix() call, the memory status is Mem: 2033752k total, 1152032k used, 881720k free, 5072k buffers Then when it goes the call of DMMGSetSNESLocal(), I found my memory is using till the message of corruption has appeared. 
ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal,0,0);CHKERRQ(ierr); The memory corruption happens. The error message is: 0 SNES Function norm 4.925849247379e-03 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Out of memory. This could be due to allocating [0]PETSC ERROR: too large an object or bleeding by not properly [0]PETSC ERROR: destroying unneeded objects. [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process 1630638080 [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. [0]PETSC ERROR: Memory requested 327270824! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 13:54:27 CST 2009 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 on a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 [0]PETSC ERROR: Libraries linked from /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 [0]PETSC ERROR: Configure options --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ --download-mpich=1 --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c [0]PETSC ERROR: Solve() line 318 in qffxmhd.c [0]PETSC ERROR: main() line 172 in qffxmhd.c application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 Program exited with code 01. 0 SNES Function norm 4.925849247379e-03 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Out of memory. This could be due to allocating [0]PETSC ERROR: too large an object or bleeding by not properly [0]PETSC ERROR: destroying unneeded objects. [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process 1630638080 [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. 
[0]PETSC ERROR: Memory requested 327270824! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 13:54:27 CST 2009 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 on a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 [0]PETSC ERROR: Libraries linked from /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 [0]PETSC ERROR: Configure options --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ --download-mpich=1 --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c [0]PETSC ERROR: Solve() line 318 in qffxmhd.c [0]PETSC ERROR: main() line 172 in qffxmhd.c application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 Program exited with code 01. I thought it might because of the unfreed memory, so I picked up ex29.c as a comparision. Thanks, Rebecca Quoting Matthew Knepley : > On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN > wrote: > >> Those unfreed bytes cause "out of memory" when it runs at bigger grid >> sizes. So I have to find out those unfreed memory and free them... Any >> suggestions? > > > Not from what you mailed in. On that DA line, I see PetscHeaderCreate(). Is > that what you see? > > Matt > > >> >> Thanks, >> >> R >> >> >> Quoting Matthew Knepley : >> >> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>> wrote: >>> >>> Hi, >>>> >>>> My own code has some left bytes still reachable according to valgrind, >>>> then >>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >>>> make the files, it gives me different number of bytes left still >>>> reachable. >>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as another >>>> example, >>>> and found that some bytes are still reachable, what is the cause of it? 
>>>> It >>>> shows that it is from DACreate2D() and the I use -malloc_dump to get >>>> those >>>> unfreed informations. >>>> >>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >>>> for all examples, but where do 1st and 5th ones come from? Also, the >>>> -malloc_dump information shows that there are >>>> "[0]Total space allocated 37780 bytes", >>>> but valgrind gives the information as >>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>> >>>> Why there is a big difference? >>>> >>> >>> >>> 1 is fine. It is from PMPI setup, which has some bytes not freed from >>> setting up the MPI >>> processes. The last one looks like an unfreed header for a DA, which is >>> strange. >>> >>> Matt >>> >>> >>> >>>> Thanks very much! >>>> >>>> Rebecca >>>> >>>> Here is the message from valgrind of running ex29: >>>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >>>> lost in loss record 2 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FDB: ??? >>>> ==26628== by 0x473413C: ??? >>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>> ) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FFB: ??? >>>> ==26628== by 0x473413C: ??? >>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>> ) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FFB: ??? >>>> ==26628== by 0x473413C: ??? 
>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>> ) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record >>>> 5 >>>> of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>> ==26628== >>>> ==26628== LEAK SUMMARY: >>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>> >>>> >>>> -- >>>> (Rebecca) Xuefei YUAN >>>> Department of Applied Physics and Applied Mathematics >>>> Columbia University >>>> Tel:917-399-8032 >>>> www.columbia.edu/~xy2102 < >>>> http://www.columbia.edu/%7Exy2102> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments >>> is infinitely more interesting than any results to which their experiments >>> lead. >>> -- Norbert Wiener >>> >>> >> >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> >> > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From xy2102 at columbia.edu Mon Jul 27 17:08:09 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 18:08:09 -0400 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Message-ID: <20090727180809.sh1t22n8wsgw8ckc@cubmail.cc.columbia.edu> Dear Barry, Do you mean that this small memory leak has been fixed in the current petsc-3.0.0-p7? Since the version I have is petsc-3.0.0-p1. By the way, have you noticed another email about the possible bug in DMCompositeGetMatrix() function? Thanks, Rebecca Quoting Barry Smith : > > This was due to a small memory leak in DMMGSetNullSpace() of the > vectors creating internally to hold the null space. I have pushed a fix > to petsc-3.0.0 and petsc-dev > It will be fixed in the next 3.0.0 patch. > > Note this would not cause the "out of memory" for runs at bigger > grid sizes. That is likely just coming from trying to run too large a > problem for your memory size. > > Thanks for reporting the memory leak, > > Barry > > > On Jul 27, 2009, at 4:34 PM, (Rebecca) Xuefei YUAN wrote: > >> Those unfreed bytes cause "out of memory" when it runs at bigger >> grid sizes. So I have to find out those unfreed memory and free >> them... Any suggestions? 
>> >> Thanks, >> >> R >> >> Quoting Matthew Knepley : >> >>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>> wrote: >>> >>>> Hi, >>>> >>>> My own code has some left bytes still reachable according to >>>> valgrind, then >>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >>>> make the files, it gives me different number of bytes left still >>>> reachable. >>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as >>>> another example, >>>> and found that some bytes are still reachable, what is the cause of it? It >>>> shows that it is from DACreate2D() and the I use -malloc_dump to get those >>>> unfreed informations. >>>> >>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >>>> for all examples, but where do 1st and 5th ones come from? Also, the >>>> -malloc_dump information shows that there are >>>> "[0]Total space allocated 37780 bytes", >>>> but valgrind gives the information as >>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>> >>>> Why there is a big difference? >>> >>> >>> 1 is fine. It is from PMPI setup, which has some bytes not freed from >>> setting up the MPI >>> processes. The last one looks like an unfreed header for a DA, which is >>> strange. >>> >>> Matt >>> >>> >>>> >>>> Thanks very much! >>>> >>>> Rebecca >>>> >>>> Here is the message from valgrind of running ex29: >>>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >>>> lost in loss record 2 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FDB: ??? >>>> ==26628== by 0x473413C: ??? >>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FFB: ??? >>>> ==26628== by 0x473413C: ??? 
>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FFB: ??? >>>> ==26628== by 0x473413C: ??? >>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record 5 >>>> of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>> ==26628== >>>> ==26628== LEAK SUMMARY: >>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>> >>>> >>>> -- >>>> (Rebecca) Xuefei YUAN >>>> Department of Applied Physics and Applied Mathematics >>>> Columbia University >>>> Tel:917-399-8032 >>>> www.columbia.edu/~xy2102 >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments >>> is infinitely more interesting than any results to which their experiments >>> lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Mon Jul 27 17:08:42 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 17:08:42 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> Message-ID: On Mon, Jul 27, 2009 at 5:04 PM, (Rebecca) Xuefei YUAN wrote: > Dear Matt, > > I ran the code on a 321*321 grid, with dof=4. The matrix is a sparse matrix > with type aij. 
> > After set up user defined options calls, the memory status is > Mem: 2033752k total, 607456k used, 1426296k free, 4832k buffers > > ierr = DMMGCreate(comm, parameters.numberOfLevels, &appCtx, > &dmmg);CHKERRQ(ierr); > ierr = DACreate2d(comm,DA_NONPERIODIC,DA_STENCIL_BOX, -5, -5, > PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); > ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 0, "phi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 1, "vz");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 2, "psi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 3, "bz");CHKERRQ(ierr); > > > before DAGetMatrix() called, the memory status is > Mem: 2033752k total, 642940k used, 1390812k free, 4972k buffers > > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, > &jacobian);CHKERRQ(ierr); > > In gdb, it uses around 500M memory after DAGetMatrix(), which I do not > think it is right, since for a sparse matrix with 13 nonzeros per row, the > memory it needs should be 321*321*4(dof)*13(nonzeros per row)*8(PetscReal) = > 42865056 bytes ~ 40M. It is strange. > i.e., after DAGetMatrix() call, the memory status is Are you sure you have stencil width 1? This is not the only memory used by a matrix, but should be close. The number of nonzero is reported by -ksp_view. Check this. > > Mem: 2033752k total, 1152032k used, 881720k free, 5072k buffers > > Then when it goes the call of DMMGSetSNESLocal(), I found my memory is > using till the message of corruption has appeared. > > ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, > FormJacobianLocal,0,0);CHKERRQ(ierr); > > The memory corruption happens. The error message is: This is not corruption, just using up memory. Matt > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process > 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 13:54:27 > CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 on > a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from > /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options > --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ > --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in > src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process > 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 13:54:27 > CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 on > a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from > /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options > --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ > --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in > src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > I thought it might because of the unfreed memory, so I picked up ex29.c as > a comparision. > > Thanks, > > Rebecca > > > > Quoting Matthew Knepley : > > On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN >> wrote: >> >> Those unfreed bytes cause "out of memory" when it runs at bigger grid >>> sizes. So I have to find out those unfreed memory and free them... Any >>> suggestions? >>> >> >> >> Not from what you mailed in. On that DA line, I see PetscHeaderCreate(). >> Is >> that what you see? >> >> Matt >> >> >> >>> Thanks, >>> >>> R >>> >>> >>> Quoting Matthew Knepley : >>> >>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>> >>>> wrote: >>>> >>>> Hi, >>>> >>>>> >>>>> My own code has some left bytes still reachable according to valgrind, >>>>> then >>>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile >>>>> and >>>>> make the files, it gives me different number of bytes left still >>>>> reachable. >>>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as another >>>>> example, >>>>> and found that some bytes are still reachable, what is the cause of it? >>>>> It >>>>> shows that it is from DACreate2D() and the I use -malloc_dump to get >>>>> those >>>>> unfreed informations. 
>>>>> >>>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are >>>>> true >>>>> for all examples, but where do 1st and 5th ones come from? Also, the >>>>> -malloc_dump information shows that there are >>>>> "[0]Total space allocated 37780 bytes", >>>>> but valgrind gives the information as >>>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>>> >>>>> Why there is a big difference? >>>>> >>>>> >>>> >>>> 1 is fine. It is from PMPI setup, which has some bytes not freed from >>>> setting up the MPI >>>> processes. The last one looks like an unfreed header for a DA, which is >>>> strange. >>>> >>>> Matt >>>> >>>> >>>> >>>> Thanks very much! >>>>> >>>>> Rebecca >>>>> >>>>> Here is the message from valgrind of running ex29: >>>>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of >>>>> 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are >>>>> definitely >>>>> lost in loss record 2 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429BC2D: __nss_database_lookup (in >>>>> /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FDB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of >>>>> 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429AFBB: __nss_lookup_function (in >>>>> /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of >>>>> 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429AF7D: __nss_lookup_function (in >>>>> /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? 
>>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss >>>>> record >>>>> 5 >>>>> of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>>> ==26628== >>>>> ==26628== LEAK SUMMARY: >>>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>>> >>>>> >>>>> -- >>>>> (Rebecca) Xuefei YUAN >>>>> Department of Applied Physics and Applied Mathematics >>>>> Columbia University >>>>> Tel:917-399-8032 >>>>> www.columbia.edu/~xy2102 < >>>>> http://www.columbia.edu/%7Exy2102> < >>>>> http://www.columbia.edu/%7Exy2102> >>>>> >>>>> >>>>> >>>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their >>>> experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 < >>> http://www.columbia.edu/%7Exy2102> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> >> > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Mon Jul 27 17:14:32 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 18:14:32 -0400 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> Message-ID: <20090727181432.w0t5xicq804wkw4o@cubmail.cc.columbia.edu> The memory status after running DMMGSetSNESLocal() is Mem: 2033752k total, 1821800k used, 211952k free, 5944k buffers then when it calls DMMGSolve(), the memory has been used up... till corruption. R Quoting "(Rebecca) Xuefei YUAN" : > Dear Matt, > > I ran the code on a 321*321 grid, with dof=4. The matrix is a sparse > matrix with type aij. 
> > After set up user defined options calls, the memory status is > Mem: 2033752k total, 607456k used, 1426296k free, 4832k buffers > > ierr = DMMGCreate(comm, parameters.numberOfLevels, &appCtx, > &dmmg);CHKERRQ(ierr); > ierr = DACreate2d(comm,DA_NONPERIODIC,DA_STENCIL_BOX, -5, -5, > PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); > ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 0, "phi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 1, "vz");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 2, "psi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 3, "bz");CHKERRQ(ierr); > > > before DAGetMatrix() called, the memory status is > Mem: 2033752k total, 642940k used, 1390812k free, 4972k buffers > > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > In gdb, it uses around 500M memory after DAGetMatrix(), which I do not > think it is right, since for a sparse matrix with 13 nonzeros per row, > the memory it needs should be 321*321*4(dof)*13(nonzeros per > row)*8(PetscReal) = 42865056 bytes ~ 40M. It is strange. > i.e., after DAGetMatrix() call, the memory status is > > Mem: 2033752k total, 1152032k used, 881720k free, 5072k buffers > > Then when it goes the call of DMMGSetSNESLocal(), I found my memory is > using till the message of corruption has appeared. > > ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, > FormJacobianLocal,0,0);CHKERRQ(ierr); > > The memory corruption happens. The error message is: > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 > 13:54:27 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 > on a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from > /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options > --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ > --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in > src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 > 13:54:27 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 > on a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from > /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options > --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ > --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in > src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > I thought it might because of the unfreed memory, so I picked up ex29.c > as a comparision. > > Thanks, > > Rebecca > > > Quoting Matthew Knepley : > >> On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN >> wrote: >> >>> Those unfreed bytes cause "out of memory" when it runs at bigger grid >>> sizes. So I have to find out those unfreed memory and free them... Any >>> suggestions? >> >> >> Not from what you mailed in. On that DA line, I see PetscHeaderCreate(). Is >> that what you see? >> >> Matt >> >> >>> >>> Thanks, >>> >>> R >>> >>> >>> Quoting Matthew Knepley : >>> >>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>>> wrote: >>>> >>>> Hi, >>>>> >>>>> My own code has some left bytes still reachable according to valgrind, >>>>> then >>>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >>>>> make the files, it gives me different number of bytes left still >>>>> reachable. >>>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as another >>>>> example, >>>>> and found that some bytes are still reachable, what is the cause of it? >>>>> It >>>>> shows that it is from DACreate2D() and the I use -malloc_dump to get >>>>> those >>>>> unfreed informations. >>>>> >>>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >>>>> for all examples, but where do 1st and 5th ones come from? 
Also, the >>>>> -malloc_dump information shows that there are >>>>> "[0]Total space allocated 37780 bytes", >>>>> but valgrind gives the information as >>>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>>> >>>>> Why there is a big difference? >>>>> >>>> >>>> >>>> 1 is fine. It is from PMPI setup, which has some bytes not freed from >>>> setting up the MPI >>>> processes. The last one looks like an unfreed header for a DA, which is >>>> strange. >>>> >>>> Matt >>>> >>>> >>>> >>>>> Thanks very much! >>>>> >>>>> Rebecca >>>>> >>>>> Here is the message from valgrind of running ex29: >>>>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >>>>> lost in loss record 2 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FDB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? 
>>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record >>>>> 5 >>>>> of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>>> ==26628== >>>>> ==26628== LEAK SUMMARY: >>>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>>> >>>>> >>>>> -- >>>>> (Rebecca) Xuefei YUAN >>>>> Department of Applied Physics and Applied Mathematics >>>>> Columbia University >>>>> Tel:917-399-8032 >>>>> www.columbia.edu/~xy2102 < >>>>> http://www.columbia.edu/%7Exy2102> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>>> >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Mon Jul 27 20:35:37 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 20:35:37 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> Message-ID: <4D7AAA33-B20F-4D55-B3D6-E0E9F507CC6F@mcs.anl.gov> The memory for the matrix will be (321*321*4) * (25* 4) * (12) = 494,596,800 which is 500 megabytes ( # rows) (nonzeros per row) (1 double + 1 int per nonzero) There are 25 * 4 nonzeros per row because you have box stencil of width 2. DAGetMatrix() has no way of knowing that your equations have only 13 nonzeros per row it has to assume a nonzero for each possible couple in the stencil. You can use DASetBlockFills() to indicate which parts of the stencil are truly nonzero and thus greatly reduce the matrix memory usage. Barry On Jul 27, 2009, at 5:04 PM, (Rebecca) Xuefei YUAN wrote: > Dear Matt, > > I ran the code on a 321*321 grid, with dof=4. The matrix is a sparse > matrix with type aij. 
> > After set up user defined options calls, the memory status is > Mem: 2033752k total, 607456k used, 1426296k free, 4832k > buffers > > ierr = DMMGCreate(comm, parameters.numberOfLevels, &appCtx, > &dmmg);CHKERRQ(ierr); > ierr = DACreate2d(comm,DA_NONPERIODIC,DA_STENCIL_BOX, -5, -5, > PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); > ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 0, "phi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 1, "vz");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 2, "psi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 3, "bz");CHKERRQ(ierr); > > > before DAGetMatrix() called, the memory status is > Mem: 2033752k total, 642940k used, 1390812k free, 4972k > buffers > > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > In gdb, it uses around 500M memory after DAGetMatrix(), which I do > not think it is right, since for a sparse matrix with 13 nonzeros > per row, the memory it needs should be 321*321*4(dof)*13(nonzeros > per row)*8(PetscReal) = 42865056 bytes ~ 40M. It is strange. > i.e., after DAGetMatrix() call, the memory status is > > Mem: 2033752k total, 1152032k used, 881720k free, 5072k > buffers > > Then when it goes the call of DMMGSetSNESLocal(), I found my memory > is using till the message of corruption has appeared. > > ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal, > 0,0);CHKERRQ(ierr); > > The memory corruption happens. The error message is: > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process > 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 > 13:54:27 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/ > qffxmhd3 on a linux-gnu named YuanWork by rebecca Mon Jul 27 > 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from /home/rebecca/soft/petsc-3.0.0- > p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options --with-blas-lapack-dir=./ > externalpackages/fblaslapack-3.1.1/ --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/ > mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in src/mat/ > impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in src/mat/ > impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in src/mat/ > interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ > ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process > 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 > 13:54:27 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/ > qffxmhd3 on a linux-gnu named YuanWork by rebecca Mon Jul 27 > 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from /home/rebecca/soft/petsc-3.0.0- > p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options --with-blas-lapack-dir=./ > externalpackages/fblaslapack-3.1.1/ --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/ > mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in src/mat/ > impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in src/mat/ > impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in src/mat/ > interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ > ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > I thought it might because of the unfreed memory, so I picked up > ex29.c as a comparision. > > Thanks, > > Rebecca > > > Quoting Matthew Knepley : > >> On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN >> wrote: >> >>> Those unfreed bytes cause "out of memory" when it runs at bigger >>> grid >>> sizes. So I have to find out those unfreed memory and free them... >>> Any >>> suggestions? >> >> >> Not from what you mailed in. On that DA line, I see >> PetscHeaderCreate(). Is >> that what you see? >> >> Matt >> >> >>> >>> Thanks, >>> >>> R >>> >>> >>> Quoting Matthew Knepley : >>> >>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>>> wrote: >>>> >>>> Hi, >>>>> >>>>> My own code has some left bytes still reachable according to >>>>> valgrind, >>>>> then >>>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to >>>>> compile and >>>>> make the files, it gives me different number of bytes left still >>>>> reachable. >>>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as >>>>> another >>>>> example, >>>>> and found that some bytes are still reachable, what is the cause >>>>> of it? >>>>> It >>>>> shows that it is from DACreate2D() and the I use -malloc_dump to >>>>> get >>>>> those >>>>> unfreed informations. 
>>>>> >>>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th >>>>> are true >>>>> for all examples, but where do 1st and 5th ones come from? Also, >>>>> the >>>>> -malloc_dump information shows that there are >>>>> "[0]Total space allocated 37780 bytes", >>>>> but valgrind gives the information as >>>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>>> >>>>> Why there is a big difference? >>>>> >>>> >>>> >>>> 1 is fine. It is from PMPI setup, which has some bytes not freed >>>> from >>>> setting up the MPI >>>> processes. The last one looks like an unfreed header for a DA, >>>> which is >>>> strange. >>>> >>>> Matt >>>> >>>> >>>> >>>>> Thanks very much! >>>>> >>>>> Rebecca >>>>> >>>>> Here is the message from valgrind of running ex29: >>>>> ==26628== 32 bytes in 2 blocks are still reachable in loss >>>>> record 1 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are >>>>> definitely >>>>> lost in loss record 2 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FDB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss >>>>> record 3 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss >>>>> record 4 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? 
>>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 132,796 bytes in 321 blocks are still reachable in >>>>> loss record >>>>> 5 >>>>> of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>>> ==26628== >>>>> ==26628== LEAK SUMMARY: >>>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>>> >>>>> >>>>> -- >>>>> (Rebecca) Xuefei YUAN >>>>> Department of Applied Physics and Applied Mathematics >>>>> Columbia University >>>>> Tel:917-399-8032 >>>>> www.columbia.edu/~xy2102 < >>>>> http://www.columbia.edu/%7Exy2102> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their >>>> experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>>> >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their >> experiments >> lead. >> -- Norbert Wiener >> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From bsmith at mcs.anl.gov Mon Jul 27 20:36:35 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 20:36:35 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727180809.sh1t22n8wsgw8ckc@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> <20090727180809.sh1t22n8wsgw8ckc@cubmail.cc.columbia.edu> Message-ID: <446098C7-F361-485A-A704-F81AB497C591@mcs.anl.gov> On Jul 27, 2009, at 5:08 PM, (Rebecca) Xuefei YUAN wrote: > Dear Barry, > > Do you mean that this small memory leak has been fixed in the > current petsc-3.0.0-p7? No, it is fixed in the Mecurial version and in petsc-dev. It will be fixed in the next patch. As I said before it is not a serious memory leak. Barry > Since the version I have is petsc-3.0.0-p1. > > By the way, have you noticed another email about the possible bug in > DMCompositeGetMatrix() function? > > Thanks, > > Rebecca > > Quoting Barry Smith : > >> >> This was due to a small memory leak in DMMGSetNullSpace() of the >> vectors creating internally to hold the null space. I have pushed a >> fix >> to petsc-3.0.0 and petsc-dev >> It will be fixed in the next 3.0.0 patch. >> >> Note this would not cause the "out of memory" for runs at bigger >> grid sizes. 
That is likely just coming from trying to run too large a >> problem for your memory size. >> >> Thanks for reporting the memory leak, >> >> Barry >> >> >> On Jul 27, 2009, at 4:34 PM, (Rebecca) Xuefei YUAN wrote: >> >>> Those unfreed bytes cause "out of memory" when it runs at bigger >>> grid sizes. So I have to find out those unfreed memory and free >>> them... Any suggestions? >>> >>> Thanks, >>> >>> R >>> >>> Quoting Matthew Knepley : >>> >>>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> My own code has some left bytes still reachable according to >>>>> valgrind, then >>>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to >>>>> compile and >>>>> make the files, it gives me different number of bytes left >>>>> still reachable. >>>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as >>>>> another example, >>>>> and found that some bytes are still reachable, what is the cause >>>>> of it? It >>>>> shows that it is from DACreate2D() and the I use -malloc_dump to >>>>> get those >>>>> unfreed informations. >>>>> >>>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th >>>>> are true >>>>> for all examples, but where do 1st and 5th ones come from? Also, >>>>> the >>>>> -malloc_dump information shows that there are >>>>> "[0]Total space allocated 37780 bytes", >>>>> but valgrind gives the information as >>>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>>> >>>>> Why there is a big difference? >>>> >>>> >>>> 1 is fine. It is from PMPI setup, which has some bytes not freed >>>> from >>>> setting up the MPI >>>> processes. The last one looks like an unfreed header for a DA, >>>> which is >>>> strange. >>>> >>>> Matt >>>> >>>> >>>>> >>>>> Thanks very much! >>>>> >>>>> Rebecca >>>>> >>>>> Here is the message from valgrind of running ex29: >>>>> ==26628== 32 bytes in 2 blocks are still reachable in loss >>>>> record 1 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are >>>>> definitely >>>>> lost in loss record 2 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FDB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss >>>>> record 3 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? 
>>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss >>>>> record 4 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 132,796 bytes in 321 blocks are still reachable in >>>>> loss record 5 >>>>> of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>>> ==26628== >>>>> ==26628== LEAK SUMMARY: >>>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>>> >>>>> >>>>> -- >>>>> (Rebecca) Xuefei YUAN >>>>> Department of Applied Physics and Applied Mathematics >>>>> Columbia University >>>>> Tel:917-399-8032 >>>>> www.columbia.edu/~xy2102 >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their >>>> experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From bsmith at mcs.anl.gov Mon Jul 27 21:04:16 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 21:04:16 -0500 Subject: possible bug in DMCompositeGetMatrix(). In-Reply-To: <20090726195136.f07iqopggk0skkg0@cubmail.cc.columbia.edu> References: <20090726195136.f07iqopggk0skkg0@cubmail.cc.columbia.edu> Message-ID: You cannot use the stencil operations to put values into a "composite matrix". The numbering of rows and columns of the composite matrix reflect all the different variables (unknowns) sp do not match what they are for a single component. Barry On Jul 26, 2009, at 6:51 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > I am working on an optimization problem, in which I would like to > assemble a Jacobian matrix. 
Thus > DMMGSetSNES(dmmg,FormFunction,FormJacobian) is called. > > In damgsnes.c:637, in calling DMGetMatrix(), it calls > DMCompositeGetMatrix() where the temp matrix Atmp has been freed > before it passes any information to J at pack.c:1722 and 1774. > > So after calling DMGetMatrix() in DMMGSetSNES, the stencil of the > dmmg[i]->B has unchanged, i.e., > > (gdb) p dmmg[0]->B->stencil > $107 = {dim = 0, dims = {0, 0, 0, 0}, starts = {0, 0, 0, 0}, noc = > PETSC_FALSE} > (gdb) where > #0 DMMGSetSNES (dmmg=0x8856208, function=0x804c84f , > jacobian=0x8052932 ) at damgsnes.c:641 > #1 0x0804c246 in main (argc=Cannot access memory at address 0x0 > ) at tworeggt.c:126 > > I compare this with > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/snes/examples/tutorials/ex18.c.html > > and it shows that the stencil has been carried out and passed to > dmmg[0]->B as follows: > > (gdb) p dmmg[i]->B->stencil > $80 = {dim = 2, dims = {5, 5, 1, 0}, starts = {0, 0, 0, 0}, noc = > PETSC_TRUE} > (gdb) where > #0 DMMGSetSNES (dmmg=0x884b530, function=0x804c364 , > jacobian=0x804d34d ) at damgsnes.c:642 > #1 0x0804b969 in main (argc=Cannot access memory at address 0x2 > ) at ex18.c:100 > > Because of this missing stencil of Jacobian matrix, I get the error > code as follows: > Program received signal SIGSEGV, Segmentation fault. > 0x082447c2 in ISLocalToGlobalMappingApply (mapping=0x0, N=1, > in=0xbff8f250, > out=0xbff8ce14) at /home/rebecca/soft/petsc-3.0.0-p1/include/ > petscis.h:129 > 129 PetscInt i,*idx = mapping->indices,Nmax = mapping->n; > (gdb) where > #0 0x082447c2 in ISLocalToGlobalMappingApply (mapping=0x0, N=1, > in=0xbff8f250, out=0xbff8ce14) > at /home/rebecca/soft/petsc-3.0.0-p1/include/petscis.h:129 > #1 0x0824440c in MatSetValuesLocal (mat=0x88825e8, nrow=1, > irow=0xbff8f250, > ncol=4, icol=0xbff8ee50, y=0xbff8f628, addv=INSERT_VALUES) at > matrix.c:1583 > #2 0x08240aae in MatSetValuesStencil (mat=0x88825e8, m=1, > idxm=0xbff8f6b8, > n=4, idxn=0xbff8f4b4, v=0xbff8f628, addv=INSERT_VALUES) at > matrix.c:1099 > #3 0x08053835 in FormJacobian (snes=0x8874700, X=0x8856778, > J=0x88747d0, > B=0x88747d4, flg=0xbff8f8d4, ptr=0x8856338) at tworeggt.c:937 > #4 0x0805a5cf in DMMGComputeJacobian_Multigrid (snes=0x8874700, > X=0x8856778, > J=0x88747d0, B=0x88747d4, flag=0xbff8f8d4, ptr=0x8856208) at > damgsnes.c:60 > #5 0x0806b18a in SNESComputeJacobian (snes=0x8874700, X=0x8856778, > A=0x88747d0, B=0x88747d4, flg=0xbff8f8d4) at snes.c:1111 > #6 0x08084945 in SNESSolve_LS (snes=0x8874700) at ls.c:189 > #7 0x08073198 in SNESSolve (snes=0x8874700, b=0x0, x=0x8856778) at > snes.c:2221 > #8 0x0805d5f9 in DMMGSolveSNES (dmmg=0x8856208, level=0) at > damgsnes.c:510 > #9 0x08056e38 in DMMGSolve (dmmg=0x8856208) at damg.c:372 > #10 0x0804c3fe in main (argc=128, argv=0xbff90c04) at tworeggt.c:131 > > I think there might be a bug in DMCompositeGetMatrix(). > > Thanks very much! > > Cheers, > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From sekikawa at msi.co.jp Tue Jul 28 01:10:44 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Tue, 28 Jul 2009 15:10:44 +0900 Subject: eigenvector on singlar matrix Message-ID: <20090728145459.AEB0.SEKIKAWA@msi.co.jp> Hi I have a question about SLEPc. What are EigenVectors calculated when given matrix is singular? 
(ex1) For example, a matrix like this:

    (1 1 1)
A = (1 1 1)
    (1 1 1)

Matrix A has 2 eigenvalues: one is 0 (a double root) and the other is 3. In this case the eigenvector related to eigenvalue 0 is (z1, z2, -(z1+z2))^t, where z1 and z2 can be any values, i.e. there are 2 degrees of freedom.

(ex2)

B = (0 1)
    (0 0)

In this case B's only eigenvalue is 0 (again a double root), but the eigenvector has only 1 degree of freedom: (z1, 0)^t.

My question is: what will the solution computed by SLEPc be in these cases?

Thanks, Takuya --------------------------------------------------------------- Takuya Sekikawa Mathematical Systems, Inc sekikawa at msi.co.jp ---------------------------------------------------------------

From tim.kroeger at cevis.uni-bremen.de Tue Jul 28 01:22:49 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Tue, 28 Jul 2009 08:22:49 +0200 (CEST) Subject: Solver problem In-Reply-To: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> Message-ID: Dear Barry, On Mon, 27 Jul 2009, Barry Smith wrote: > On Jul 27, 2009, at 4:35 AM, Tim Kroeger wrote: > >> In my application, there is a linear system to be solved in every time >> step. Steps 0 and 1 work well, but in step 2 PETSc fails to converge. I >> suspected that the system might be unsolvable in that step and checked that >> by writing matrix and the right hand side to files and loading them into >> "octave". Surprisingly, "octave" does find a solution to the system >> without any problems. > > Octave is using a direct solver. Did you try PETSc's direct solver using > -pc_type lu? Good idea! It works, and I'm actually surprised that it does. I did try ILU(3) before (i.e., -pc_type ilu -pc_factor_levels 3), which took forever to compute. Hence I thought that full LU would take even longer, but this turned out to be not true; LU is acceptable in performance and solves the problem for that test case. I'll see how it will behave on a larger-scale problem and on multiple cores. Would you recommend to try MUMPS as well? (I.e., will MUMPS have a chance to be faster than ILU?)
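If I read the 3.0 documentation correctly, selecting MUMPS from the code should look roughly like the sketch below; the PCFactorSetMatSolverPackage() call and the plain "mumps" string are only my guess at the intended interface, so please correct me if that is not the right way to request it. (On the command line I assume the equivalent is -pc_type lu -pc_factor_mat_solver_package mumps.)

   KSP ksp;    /* the solver I already create for every time step */
   PC  pc;

   KSPGetPC(ksp,&pc);
   PCSetType(pc,PCLU);                       /* full factorization instead of ILU  */
   PCFactorSetMatSolverPackage(pc,"mumps");  /* ask for the MUMPS factorization    */
   KSPSetFromOptions(ksp);                   /* keep runtime options working on top */

Best Regards, Tim -- Dr.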
Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From Andreas.Grassl at student.uibk.ac.at Tue Jul 28 02:30:46 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 28 Jul 2009 09:30:46 +0200 Subject: Solver problem In-Reply-To: References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> Message-ID: <4A6EA926.3090400@student.uibk.ac.at> Tim Kroeger schrieb: > Would you recommend to try MUMPS as well? (I.e., will MUMPS have a > change to be faster than ILU?) I would highly recommend to give it a try. MUMPS is a direct sparse solver, so it is not comparable to the combination ilu-gmres, because iterative solver have a complexity of O(N) and highly depend on the spectrum/condition of the matrix. Sparse direct solver have a complexity of around O(N^2), but the runtime is not connected to the condition of the matrix (only the accuracy of the result). In my experience MUMPS is much faster than lu, because lu is only sequential and the implementation is only for "verification"-reasons, what i understood. Maybe -ksp_monitor_singular_value is interesting for you? -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From jroman at dsic.upv.es Tue Jul 28 02:57:58 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 28 Jul 2009 09:57:58 +0200 Subject: eigenvector on singlar matrix In-Reply-To: <20090728145459.AEB0.SEKIKAWA@msi.co.jp> References: <20090728145459.AEB0.SEKIKAWA@msi.co.jp> Message-ID: <3B6A7B28-1818-4063-A263-A8852E1D4727@dsic.upv.es> On 28/07/2009, Takuya Sekikawa wrote: > Hi > > I have a question about SLEPc. > > What are EigenVectors calculated when given matrix is singular? > > (ex1) > for example, matrix like this: > > (1 1 1) > A = (1 1 1) > (1 1 1) > > matrix A have 2 eigenvalues, one is 0 (double multiple root), > and other is 3. > > in this case eigenvector related to eigenvalue 0, is > (z1, z2, -(z1+z2))t (z1, z2 can be any value. i.e. freedom degree is > 2) > > > (ex2) > > B = (0 1) > (0 0) > > in this case B's eigenvalue is only 0. (double multiple root) > but eigenvector has only 1 freedom degree. > (z1, 0)t > > My question is, what will be solution by SLEPc in these case? > > Thanks, > Takuya > --------------------------------------------------------------- > Takuya Sekikawa > Mathematical Systems, Inc > sekikawa at msi.co.jp > --------------------------------------------------------------- For such small matrices, the computed solution will be the same as the one provided by Lapack. If your problem matrices are small, use Lapack instead of SLEPc. For large matrices, if the dimension of the nullspace is small then you should have no problems when computing the eigenvectors of zero eigenvalues with SLEPc. But if you have a large nullspace then things may get problematic - I have not tried this case. Please report any problems to the SLEPc maintainance email. 
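In any case, the plain solution loop is enough to inspect what is returned for a given matrix. The following is only a sketch (error checking omitted, A is assumed to be assembled already; (ex1) is real symmetric so I set EPS_HEP, while (ex2) would need EPS_NHEP):

  EPS         eps;
  Vec         xr, xi;
  PetscScalar kr, ki;
  PetscInt    i, nconv;

  MatGetVecs(A,&xr,PETSC_NULL);
  MatGetVecs(A,&xi,PETSC_NULL);
  EPSCreate(PETSC_COMM_WORLD,&eps);
  EPSSetOperators(eps,A,PETSC_NULL);                  /* standard problem A x = k x   */
  EPSSetProblemType(eps,EPS_HEP);                     /* (ex1) is real symmetric      */
  EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE);  /* look at the zero eigenvalues */
  EPSSetFromOptions(eps);
  EPSSolve(eps);
  EPSGetConverged(eps,&nconv);
  for (i=0; i<nconv; i++) {
    EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);
    /* for a repeated eigenvalue the vectors xr returned here are just one
       basis of that eigenspace; any other basis is equally valid, so do not
       expect a particular combination of z1 and z2 */
  }
  EPSDestroy(eps);
  VecDestroy(xr);
  VecDestroy(xi);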
Jose From tim.kroeger at cevis.uni-bremen.de Tue Jul 28 08:42:20 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Tue, 28 Jul 2009 15:42:20 +0200 (CEST) Subject: Solver problem In-Reply-To: <4A6EA926.3090400@student.uibk.ac.at> References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> <4A6EA926.3090400@student.uibk.ac.at> Message-ID: Dear Andreas, On Tue, 28 Jul 2009, Andreas Grassl wrote: > Tim Kroeger schrieb: > >> Would you recommend to try MUMPS as well? (I.e., will MUMPS have a >> change to be faster than ILU?) > > I would highly recommend to give it a try. I can't get it running. )-: I used the chance and updated to petsc-3.0.0-p7. I configured with MUMPS and compiled succesfully. I run my application with -pc_factor_mat_solver_package mumps -pc_type lu, and then it crashes with the following message: symbol lookup error: /home/tkroeger/archives/petsc-3.0.0-p7/linux-gnu/lib/libpetscmat.so: undefined symbol: _gfortran_allocate_array Otherwise, petsc-3.0.0-p7 works fine; that is, if I don't use the above options, it doesn't crash. What did I do wrong? Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From tim.kroeger at cevis.uni-bremen.de Tue Jul 28 09:40:28 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Tue, 28 Jul 2009 16:40:28 +0200 (CEST) Subject: Solver problem In-Reply-To: References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> <4A6EA926.3090400@student.uibk.ac.at> Message-ID: On Tue, 28 Jul 2009, Tim Kroeger wrote: > I used the chance and updated to petsc-3.0.0-p7. I configured with > MUMPS and compiled succesfully. I run my application with > -pc_factor_mat_solver_package mumps -pc_type lu, and then it crashes > with the following message: > > symbol lookup error: /home/tkroeger/archives/petsc-3.0.0-p7/linux-gnu/lib/libpetscmat.so: undefined symbol: _gfortran_allocate_array I should add that the crash occurs inside KSPSolve(). I'm at a loss. Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From rlmackie862 at gmail.com Tue Jul 28 10:17:35 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 28 Jul 2009 08:17:35 -0700 Subject: suggestions for debugging code Message-ID: <4A6F168F.5070208@gmail.com> I have run into a very difficult debugging problem. I have recently made some modifications to my PETSc code, to add some new features. When I compiled the code in debug mode (we are using the Intel compilers and mvapich on Infiniband), the code runs fine with any number of processes. When the code is compiled in optimize mode, it runs fine on, say, up to 32 processes, but not 64, bombing out someplace strange, with a Segmentation Violation. I've tried using Valgrind, but you can't use it with PETSc and my code compiled in Debug mode because the code finishes successfully, and the other problem I have with Valgrind + mvapich is there are about a million messages spewed out, making it extremely difficult to see if there are really any issues in MY code. I've thought to have PETSc download and compile MPICH2, which I would hope would produce less output from Valgrind. 
Anyone have any suggestions on how to debug this tricky situation? Any suggestions would be greatly appreciated. Randy From u.tabak at tudelft.nl Tue Jul 28 10:28:00 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Tue, 28 Jul 2009 17:28:00 +0200 Subject: Normalization options in slepc Message-ID: <4A6F1900.10400@tudelft.nl> Dear all, Is there a normalization selection option in Slepc for eigenvectors, as far as I can see, it normalizes the eigenvectors so that their norm is equal to 1. Can this normalization be customized with respect to B, in a generalized problem context. It is not hard to write a function for this, but I wondered if there is already an option for this. Best regards, Umut From bsmith at mcs.anl.gov Tue Jul 28 10:35:56 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 28 Jul 2009 10:35:56 -0500 Subject: Solver problem In-Reply-To: References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> <4A6EA926.3090400@student.uibk.ac.at> Message-ID: <897733B6-228F-4397-9C49-81CC0BE296F9@mcs.anl.gov> > undefined symbol: _gfortran_allocate_array This is likely a symbol in the gfortran compiler libraries. Are you linking your application code against all the libraries it needs to be linked against? In the PETSc directory run make getlinklibs make sure all the libraries are listed in your makefile or use a PETSc makefile as a template. Or this is coming from the fact that you are using shared or dynamic libraries. If you don't need shared libraries then run PETSc's config/configure.py with --with-shared=0 Barry On Jul 28, 2009, at 8:42 AM, Tim Kroeger wrote: > Dear Andreas, > > On Tue, 28 Jul 2009, Andreas Grassl wrote: > >> Tim Kroeger schrieb: >> >>> Would you recommend to try MUMPS as well? (I.e., will MUMPS have a >>> change to be faster than ILU?) >> >> I would highly recommend to give it a try. > > I can't get it running. )-: > > I used the chance and updated to petsc-3.0.0-p7. I configured with > MUMPS and compiled succesfully. I run my application with > -pc_factor_mat_solver_package mumps -pc_type lu, and then it crashes > with the following message: > > symbol lookup error: /home/tkroeger/archives/petsc-3.0.0-p7/linux- > gnu/lib/libpetscmat.so: undefined symbol: _gfortran_allocate_array > > Otherwise, petsc-3.0.0-p7 works fine; that is, if I don't use the > above options, it doesn't crash. > > What did I do wrong? > > Best Regards, > > Tim > > -- > Dr. Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > From knepley at gmail.com Tue Jul 28 10:41:11 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Jul 2009 10:41:11 -0500 Subject: suggestions for debugging code In-Reply-To: <4A6F168F.5070208@gmail.com> References: <4A6F168F.5070208@gmail.com> Message-ID: On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie wrote: > I have run into a very difficult debugging problem. I have recently made > some > modifications to my PETSc code, to add some new features. When I compiled > the > code in debug mode (we are using the Intel compilers and mvapich on > Infiniband), > the code runs fine with any number of processes. > > When the code is compiled in optimize mode, it runs fine on, say, up to 32 > processes, > but not 64, bombing out someplace strange, with a Segmentation Violation. 
> > I've tried using Valgrind, but you can't use it with PETSc and my code > compiled in > Debug mode because the code finishes successfully, and the other problem I > have with Sometimes valgrind will catch things even when code does not crash. > > Valgrind + mvapich is there are about a million messages spewed out, making > it > extremely difficult to see if there are really any issues in MY code. I've > thought > to have PETSc download and compile MPICH2, which I would hope would produce > less > output from Valgrind. In order to filter these out, you use a "suppressions file" for valgrind. The manual has a good section on this and it should not be hard to wipre out most of them. Satish designed one for our unit tests. Matt > > Anyone have any suggestions on how to debug this tricky situation? Any > suggestions > would be greatly appreciated. > > Randy > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From darach at tchpc.tcd.ie Tue Jul 28 11:28:13 2009 From: darach at tchpc.tcd.ie (darach at tchpc.tcd.ie) Date: Tue, 28 Jul 2009 17:28:13 +0100 Subject: Compiling Boost & Sieve & complex scalar Message-ID: <20090728162813.GH19239@tchpc.tcd.ie> Hi, I have been trying to compile petsc with boost & sieve & complex scalars using the following configuration commands (I also compiled with PetscScalar=real as a comparison). I'm getting an error when the petsc scalar type is complex, but no error with scalar type real (just warnings). What configuration options am I missing? I include longer output below; first the complex case and then the real case. petsc was compiled from the following tar file: petsc-3.0.0-p7.tar.gz Complex: ./config/configure.py --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-complex --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc3p7-complex/externalpackages/boost.tar.gz --with-sieve=1 Real: ./config/configure.py --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-real --with-scalar-type=real --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc3p7-real/externalpackages/boost.tar.gz --with-sieve=1 Darach Longer Output: Complex Scalar: ./config/configure.py --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-complex --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc3p7-complex/externalpackages/boost.tar.gz --with-sieve=1 .... 
================================================================================= Configuring PETSc to compile on your system ================================================================================= TESTING: alternateConfigureLibrary from PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:69) Compilers: C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g Linkers: Static linker: /usr/bin/ar cr PETSc: ** ** Before running "make" your PETSC_ARCH must be specified with: ** ** setenv PETSC_ARCH linux-gnu-cxx-debug (csh/tcsh) ** ** PETSC_ARCH=linux-gnu-cxx-debug; export PETSC_ARCH (sh/bash) ** PETSC_DIR: /home/user/Compile/petsc3p7-complex ** ** Now build the libraries with "make all" ** Clanguage: Cxx PETSc shared libraries: disabled PETSc dynamic libraries: disabled Scalar type:complex MPI: Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib X11: Includes: [''] Library: ['-lX11'] BLAS/LAPACK: -llapack -lblas Sieve: Includes: -I/home/user/Compile/petsc3p7-complex/include/sieve Boost: Includes: -I/home/user/Compile/petsc3p7-complex/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib Errors: ------------------------------------------------------------------------ ...... ...... libfast in: /home/user/Compile/petsc3p7-complex/src/dm/mesh mesh.c: In function ???PetscErrorCode assembleVector(_p_Vec*, PetscInt, PetscScalar*, InsertMode)???: mesh.c:1104: error: no matching function for call to ???ALE::IMesh > > >::update(const ALE::Obj >, ALE::malloc_allocator > > >&, int, PetscScalar*&)??? /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1715: note: candidates are: void ALE::IMesh::update(const ALE::Obj >&, const typename ALE::IBundle >, ALE::IGeneralSection >, ALE::IGeneralSection >, Label_, ALE::UniformSection, int, 1, ALE::malloc_allocator > >::sieve_type::point_type&, const typename Section::value_type*) [with Section = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >] mesh.c:1106: error: no matching function for call to ???ALE::IMesh > > >::updateAdd(const ALE::Obj >, ALE::malloc_allocator > > >&, int, PetscScalar*&)??? /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1741: note: candidates are: void ALE::IMesh::updateAdd(const ALE::Obj >&, const typename ALE::IBundle >, ALE::IGeneralSection >, ALE::IGeneralSection >, Label_, ALE::UniformSection, int, 1, ALE::malloc_allocator > >::sieve_type::point_type&, const typename Section::value_type*) [with Section = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >] mesh.c: In function ???PetscErrorCode MeshRestrictClosure(_p_Mesh*, _p_SectionReal*, PetscInt, PetscInt, PetscScalar*)???: mesh.c:2523: error: no matching function for call to ???ALE::IMesh > > >::restrictClosure(ALE::Obj >, ALE::malloc_allocator > > >&, PetscInt&, PetscScalar*&, PetscInt&)??? 
/home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1691: note: candidates are: const typename Section::value_type* ALE::IMesh::restrictClosure(const ALE::Obj >&, const typename ALE::IBundle >, ALE::IGeneralSection >, ALE::IGeneralSection >, Label_, ALE::UniformSection, int, 1, ALE::malloc_allocator > >::sieve_type::point_type&, typename Section::value_type*, int) [with Section = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >] mesh.c: In function ???PetscErrorCode MeshUpdateClosure(_p_Mesh*, _p_SectionReal*, PetscInt, PetscScalar*)???: mesh.c:2557: error: no matching function for call to ???ALE::IMesh > > >::update(ALE::Obj >, ALE::malloc_allocator > > >&, PetscInt&, PetscScalar*&)??? /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1715: note: candidates are: void ALE::IMesh::update(const ALE::Obj >&, const typename ALE::IBundle >, ALE::IGeneralSection >, ALE::IGeneralSection >, Label_, ALE::UniformSection, int, 1, ALE::malloc_allocator > >::sieve_type::point_type&, const typename Section::value_type*) [with Section = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >] /home/user/Compile/petsc3p7-complex/include/petscmesh.hh: In function ???PetscErrorCode MeshCreateGlobalScatter(const ALE::Obj >&, const ALE::Obj >&, _p_VecScatter**) [with Mesh = ALE::IMesh > > >, Section = ALE::IGeneralSection >]???: mesh.c:815: instantiated from here /home/user/Compile/petsc3p7-complex/include/petscmesh.hh:93: error: no matching function for call to ???VecCreateSeqWithArray(ompi_communicator_t*, int, const double*, _p_Vec**)??? /home/user/Compile/petsc3p7-complex/include/petscvec.h:66: note: candidates are: PetscErrorCode VecCreateSeqWithArray(ompi_communicator_t*, PetscInt, const PetscScalar*, _p_Vec**) /home/user/Compile/petsc3p7-complex/include/petscvec.h:67: note: PetscErrorCode VecCreateSeqWithArray(PetscInt, PetscScalar*, _p_Vec**) /home/user/Compile/petsc3p7-complex/include/petscmesh.hh: In function ???PetscErrorCode updateOperator(_p_Mat*, const Sieve&, Visitor&, const int&, PetscScalar*, InsertMode) [with Sieve = ALE::IFSieve >, Visitor = updateOperator(_p_Mat*, const ALE::Obj > > >, ALE::malloc_allocator > > > > >&, const ALE::Obj >, ALE::malloc_allocator > > >&, const ALE::Obj, ALE::malloc_allocator > >&, const int&, PetscScalar*, InsertMode)::visitor_type]???: mesh.c:1121: instantiated from here ..... ..... ------------------------------------------------------------------------ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Real Scalar ./config/configure.py --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-real --with-scalar-type=real --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc3p7-real/externalpackages/boost.tar.gz --with-sieve=1 .... 
Configuring PETSc to compile on your system ================================================================================= TESTING: alternateConfigureLibrary from PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:69) Compilers: C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g Linkers: Static linker: /usr/bin/ar cr PETSc: ** ** Before running "make" your PETSC_ARCH must be specified with: ** ** setenv PETSC_ARCH linux-gnu-cxx-debug (csh/tcsh) ** ** PETSC_ARCH=linux-gnu-cxx-debug; export PETSC_ARCH (sh/bash) ** PETSC_DIR: /home/user/Compile/petsc3p7-real ** ** Now build the libraries with "make all" ** Clanguage: Cxx PETSc shared libraries: disabled PETSc dynamic libraries: disabled Scalar type:real MPI: Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib X11: Includes: [''] Library: ['-lX11'] BLAS/LAPACK: -llapack -lblas Sieve: Includes: -I/home/user/Compile/petsc3p7-real/include/sieve Boost: Includes: -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib No Errors; warnings: ------------------------------------------------------------------------ ..... ..... libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/sieve /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1347: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1572: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? meshmgsnes.c:63: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? 
has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshmgsnes.c:349: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/snes/f90-mod mpif90 -c -Wall -Wno-unused-variable -g -I/home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc3p7-real/include -I/home/user/Compile/petsc3p7-real/include/sieve -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -o petscsnesmod.o petscsnesmod.F /usr/bin/ar cr /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/lib/libpetscsnes.a petscsnesmod.o /bin/cp -f *.mod /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include libfast in: /home/user/Compile/petsc3p7-real/src/snes/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/ts libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/euler libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/rk libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/rk/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit/beuler libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit/cn libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/pseudo libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/pseudo/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples/tests libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples/tutorials libfast in: /home/user/Compile/petsc3p7-real/src/ts/f90-mod mpif90 -c -Wall -Wno-unused-variable -g -I/home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc3p7-real/include -I/home/user/Compile/petsc3p7-real/include/sieve -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -o petsctsmod.o petsctsmod.F /usr/bin/ar cr /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/lib/libpetscts.a petsctsmod.o /bin/cp -f *.mod /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include libfast in: /home/user/Compile/petsc3p7-real/src/dm libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao libfast in: 
/home/user/Compile/petsc3p7-real/src/dm/ao/interface libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/basic libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/basic/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples/tests libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples/tutorials libfast in: /home/user/Compile/petsc3p7-real/src/dm/da libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/f90-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples/tests libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples/tutorials libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/f90-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1145: instantiated from ???ALE::IFSieve::IFSieve(ompi_communicator_t*, int) [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? mesh.c:1600: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? mesh.c:2641: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: meshpcice.c:387: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: meshpcice.c:388: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? 
has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1347: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1572: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? meshpflotran.c:235: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshpflotran.c:903: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? 
/home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1331: instantiated from ???ALE::IBundle::IBundle(ompi_communicator_t*, int) [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1598: instantiated from ???ALE::IMesh::IMesh(ompi_communicator_t*, int, int) [with Label_ = ALE::LabelSifter > >]??? meshexodus.c:183: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshexodus.c:364: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/petscmesh_viewers.hh:482: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? section.c:1405: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/sieve libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/impls libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/impls/cartesian /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? 
/home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1347: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1572: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? cartesian.c:263: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? cartesian.c:269: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor ------------------------------------------------------------------------ From jroman at dsic.upv.es Tue Jul 28 12:17:56 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 28 Jul 2009 19:17:56 +0200 Subject: Normalization options in slepc In-Reply-To: <4A6F1900.10400@tudelft.nl> References: <4A6F1900.10400@tudelft.nl> Message-ID: <17BE0BBF-66E5-478C-8E5D-2360E34DABA2@dsic.upv.es> On 28/07/2009, Umut Tabak wrote: > Dear all, > > Is there a normalization selection option in Slepc for eigenvectors, > as far as I can see, it normalizes the eigenvectors so that their > norm is equal to 1. Can this normalization be customized with > respect to B, in a generalized problem context. It is not hard to > write a function for this, but I wondered if there is already an > option for this. > > Best regards, > > Umut In symmetric-definite generalized problems, it makes more sense to return B-normalized eigenvectors. Some time ago we considered changing this but for some reason we didn't. We will change this in the next patch (slepc-3.0.0-p5). Jose From knepley at gmail.com Tue Jul 28 17:15:59 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Jul 2009 17:15:59 -0500 Subject: Compiling Boost & Sieve & complex scalar In-Reply-To: <20090728162813.GH19239@tchpc.tcd.ie> References: <20090728162813.GH19239@tchpc.tcd.ie> Message-ID: Sorry, I need to put an error in 3.0. Sieve does not work with the complex type in 3.0. I have fixed this in petsc-dev. 
There are instructions on the website for getting the development version if you need complex scalars. Thanks, Matt On Tue, Jul 28, 2009 at 11:28 AM, wrote: > Hi, > > I have been trying to compile petsc with boost & sieve & complex > scalars using the following configuration commands (I also compiled > with PetscScalar=real as a comparison). I'm getting an error when the > petsc scalar type is complex, but no error with scalar type real (just > warnings). What configuration options am I missing? I include longer > output below; first the complex case and then the real case. > > petsc was compiled from the following tar file: petsc-3.0.0-p7.tar.gz > > Complex: > ./config/configure.py > --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-complex > --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 > --download-boost=/home/user/Compile/petsc3p7-complex/externalpackages/boost.tar.gz > --with-sieve=1 > > Real: > ./config/configure.py > --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-real > --with-scalar-type=real --with-clanguage=cxx --with-boost=1 > --download-boost=/home/user/Compile/petsc3p7-real/externalpackages/boost.tar.gz > --with-sieve=1 > > Darach > > > Longer Output: > > Complex Scalar: > > ./config/configure.py > --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-complex > --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 > --download-boost=/home/user/Compile/petsc3p7-complex/externalpackages/boost.tar.gz > --with-sieve=1 > .... > > ================================================================================= > Configuring PETSc to compile on your system > > ================================================================================= > TESTING: alternateConfigureLibrary from > PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:69) > Compilers: > C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g > Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g > Linkers: > Static linker: /usr/bin/ar cr > PETSc: > ** > ** Before running "make" your PETSC_ARCH must be specified with: > ** ** setenv PETSC_ARCH linux-gnu-cxx-debug (csh/tcsh) > ** ** PETSC_ARCH=linux-gnu-cxx-debug; export PETSC_ARCH (sh/bash) > ** > PETSC_DIR: /home/user/Compile/petsc3p7-complex > ** > ** Now build the libraries with "make all" > ** > Clanguage: Cxx > PETSc shared libraries: disabled > PETSc dynamic libraries: disabled > Scalar type:complex > MPI: > Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib > X11: > Includes: [''] > Library: ['-lX11'] > BLAS/LAPACK: -llapack -lblas > Sieve: > Includes: -I/home/user/Compile/petsc3p7-complex/include/sieve > Boost: > Includes: -I/home/user/Compile/petsc3p7-complex/externalpackages/Boost/ > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib > > > Errors: > ------------------------------------------------------------------------ > ...... > ...... > > libfast in: /home/user/Compile/petsc3p7-complex/src/dm/mesh > mesh.c: In function ? PetscErrorCode assembleVector(_p_Vec*, PetscInt, > PetscScalar*, InsertMode)? : > mesh.c:1104: error: no matching function for call to ? > ALE::IMesh ALE::malloc_allocator > > >::update(const > ALE::Obj >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, int, PetscScalar*&)? 
> /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1715: note: > candidates are: void ALE::IMesh::update(const ALE::Obj ALE::malloc_allocator >&, const typename ALE::IBundle ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, Label_, > ALE::UniformSection, int, 1, > ALE::malloc_allocator > >::sieve_type::point_type&, const typename > Section::value_type*) [with Section = ALE::IGeneralSection ALE::malloc_allocator >, Label_ = ALE::LabelSifter ALE::malloc_allocator > >] > mesh.c:1106: error: no matching function for call to ? > ALE::IMesh ALE::malloc_allocator > > > >::updateAdd(const ALE::Obj ALE::malloc_allocator >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, int, PetscScalar*&)? > /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1741: note: > candidates are: void ALE::IMesh::updateAdd(const ALE::Obj ALE::malloc_allocator >&, const typename ALE::IBundle ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, Label_, > ALE::UniformSection, int, 1, > ALE::malloc_allocator > >::sieve_type::point_type&, const typename > Section::value_type*) [with Section = ALE::IGeneralSection ALE::malloc_allocator >, Label_ = ALE::LabelSifter ALE::malloc_allocator > >] > mesh.c: In function ? PetscErrorCode MeshRestrictClosure(_p_Mesh*, > _p_SectionReal*, PetscInt, PetscInt, PetscScalar*)? : > mesh.c:2523: error: no matching function for call to ? > ALE::IMesh ALE::malloc_allocator > > > >::restrictClosure(ALE::Obj ALE::malloc_allocator >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, PetscInt&, PetscScalar*&, PetscInt&)? > /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1691: note: > candidates are: const typename Section::value_type* > ALE::IMesh::restrictClosure(const ALE::Obj ALE::malloc_allocator >&, const typename ALE::IBundle ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, Label_, > ALE::UniformSection, int, 1, > ALE::malloc_allocator > >::sieve_type::point_type&, typename > Section::value_type*, int) [with Section = ALE::IGeneralSection ALE::malloc_allocator >, Label_ = ALE::LabelSifter ALE::malloc_allocator > >] > mesh.c: In function ? PetscErrorCode MeshUpdateClosure(_p_Mesh*, > _p_SectionReal*, PetscInt, PetscScalar*)? : > mesh.c:2557: error: no matching function for call to ? > ALE::IMesh ALE::malloc_allocator > > > >::update(ALE::Obj ALE::malloc_allocator >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, PetscInt&, PetscScalar*&)? > /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1715: note: > candidates are: void ALE::IMesh::update(const ALE::Obj ALE::malloc_allocator >&, const typename ALE::IBundle ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, Label_, > ALE::UniformSection, int, 1, > ALE::malloc_allocator > >::sieve_type::point_type&, const typename > Section::value_type*) [with Section = ALE::IGeneralSection ALE::malloc_allocator >, Label_ = ALE::LabelSifter ALE::malloc_allocator > >] > /home/user/Compile/petsc3p7-complex/include/petscmesh.hh: In function ? > PetscErrorCode MeshCreateGlobalScatter(const ALE::Obj ALE::malloc_allocator >&, const ALE::Obj ALE::malloc_allocator >&, _p_VecScatter**) [with Mesh = > ALE::IMesh ALE::malloc_allocator > > >, Section = > ALE::IGeneralSection >]? 
: > mesh.c:815: instantiated from here > /home/user/Compile/petsc3p7-complex/include/petscmesh.hh:93: error: no > matching function for call to ? VecCreateSeqWithArray(ompi_communicator_t*, > int, const double*, _p_Vec**)? > /home/user/Compile/petsc3p7-complex/include/petscvec.h:66: note: candidates > are: PetscErrorCode VecCreateSeqWithArray(ompi_communicator_t*, PetscInt, > const PetscScalar*, _p_Vec**) > /home/user/Compile/petsc3p7-complex/include/petscvec.h:67: note: > PetscErrorCode VecCreateSeqWithArray(PetscInt, PetscScalar*, _p_Vec**) > /home/user/Compile/petsc3p7-complex/include/petscmesh.hh: In function ? > PetscErrorCode updateOperator(_p_Mat*, const Sieve&, Visitor&, const int&, > PetscScalar*, InsertMode) [with Sieve = ALE::IFSieve ALE::malloc_allocator >, Visitor = updateOperator(_p_Mat*, const > ALE::Obj ALE::malloc_allocator > > >, > ALE::malloc_allocator ALE::malloc_allocator > > > > >&, const > ALE::Obj >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, const ALE::Obj ALE::Point>, ALE::malloc_allocator > >&, > const int&, PetscScalar*, InsertMode)::visitor_type]? : > mesh.c:1121: instantiated from here > ..... > ..... > ------------------------------------------------------------------------ > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > Real Scalar > > ./config/configure.py > --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-real > --with-scalar-type=real --with-clanguage=cxx --with-boost=1 > --download-boost=/home/user/Compile/petsc3p7-real/externalpackages/boost.tar.gz > --with-sieve=1 > .... > Configuring PETSc to compile on your system > > ================================================================================= > TESTING: alternateConfigureLibrary from > PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:69) > Compilers: > C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g > Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g > Linkers: > Static linker: /usr/bin/ar cr > PETSc: > ** > ** Before running "make" your PETSC_ARCH must be specified with: > ** ** setenv PETSC_ARCH linux-gnu-cxx-debug (csh/tcsh) > ** ** PETSC_ARCH=linux-gnu-cxx-debug; export PETSC_ARCH (sh/bash) > ** > PETSC_DIR: /home/user/Compile/petsc3p7-real > ** > ** Now build the libraries with "make all" > ** > Clanguage: Cxx > PETSc shared libraries: disabled > PETSc dynamic libraries: disabled > Scalar type:real > MPI: > Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib > X11: > Includes: [''] > Library: ['-lX11'] > BLAS/LAPACK: -llapack -lblas > Sieve: > Includes: -I/home/user/Compile/petsc3p7-real/include/sieve > Boost: > Includes: -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/ > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib > > > > > No Errors; warnings: > ------------------------------------------------------------------------ > ..... > ..... > > libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/sieve > /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation > of ? ALE::IFSieveDef::Sequence? : > /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: > instantiated from ? void ALE::Obj::destroy() [with X = > ALE::IFSieveDef::Sequence, A = > ALE::malloc_allocator >]? > /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: > instantiated from ? 
> [template-instantiation backtrace continues through ISieve.hh:1156, ALE_mem.hh:759 and :705,
>  and Mesh.hh:1347 and :1572, with Point_ = int and ALE::malloc_allocator allocators]
> meshmgsnes.c:63: instantiated from here
> /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning:
>   'class ALE::IFSieveDef::Sequence<...>' has virtual functions but non-virtual destructor
> /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from
>   'ALE::IFSieveDef::Sequence<...>::const_iterator ALE::IFSieveDef::Sequence<...>::begin() const [with PointType_ = int]'
> meshmgsnes.c:349: instantiated from here
> /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning:
>   'class ALE::IFSieveDef::Sequence<...>::const_iterator' has virtual functions but non-virtual destructor
> libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/snes/f90-mod
> mpif90 -c -Wall -Wno-unused-variable -g
>   -I/home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc3p7-real/include
>   -I/home/user/Compile/petsc3p7-real/include/sieve -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/
>   -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib
>   -o petscsnesmod.o petscsnesmod.F
> /usr/bin/ar cr /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/lib/libpetscsnes.a petscsnesmod.o
> /bin/cp -f *.mod /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include
> libfast in: /home/user/Compile/petsc3p7-real/src/snes/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/ts
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/euler
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/rk
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/rk/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit/beuler
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit/cn
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/pseudo
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/pseudo/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples/tests
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples/tutorials
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/f90-mod
> mpif90 -c -Wall -Wno-unused-variable -g
>   -I/home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc3p7-real/include
>   -I/home/user/Compile/petsc3p7-real/include/sieve -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/
>   -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib
>   -o petsctsmod.o petsctsmod.F
> /usr/bin/ar cr /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/lib/libpetscts.a petsctsmod.o
> /bin/cp -f *.mod /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include
> libfast in: /home/user/Compile/petsc3p7-real/src/dm
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/basic
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/basic/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples/tests
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples/tutorials
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/f90-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples/tests
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples/tutorials
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/f90-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh
> [while compiling the mesh sources, the same pair of warnings from ISieve.hh:954 and ISieve.hh:957
>  is emitted again, with instantiations triggered from mesh.c:1600 and 2641, meshpcice.c:387 and 388,
>  meshpflotran.c:235 and 903, meshexodus.c:183 and 364, petscmesh_viewers.hh:482 and section.c:1405]
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/sieve
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/impls
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/impls/cartesian
> [and once more from cartesian.c:263 and cartesian.c:269]
>
> ------------------------------------------------------------------------
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vyan2000 at gmail.com Tue Jul 28 20:15:48 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 28 Jul 2009 21:18:22 -0400
Subject: How to set the ksp true residual tolerance on command line
Message-ID:

Hi all,
Is there any way to set a true residual tolerance on command line.
-ksp_atol is for preconditioned residual norm only.

Thanks,

Yan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bsmith at mcs.anl.gov Tue Jul 28 20:28:25 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 28 Jul 2009 20:28:25 -0500
Subject: How to set the ksp true residual tolerance on command line
In-Reply-To:
References:
Message-ID: <019038C4-3647-4580-9F0D-4E323D49C64C@mcs.anl.gov>

   This depends on the Krylov solver and whether you are using left or right
preconditioning.  With right preconditioning it always uses the true residual
norm, so you can do -ksp_type gmres -ksp_pc_right

   Some methods, like CG, we do not have implemented for right preconditioning;
in that case you can use -ksp_type cg -ksp_norm_type unpreconditioned

   For CG many people prefer the energy norm, which you can access with
-ksp_norm_type natural

   -ksp_view should always show which it is using.

   Barry

On Jul 28, 2009, at 8:15 PM, Ryan Yan wrote:

> Hi all,
> Is there any way to set a true residual tolerance on command line.
> -ksp_atol is for preconditioned residual norm only.
>
> Thanks,
>
> Yan

From vyan2000 at gmail.com Tue Jul 28 21:18:22 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 28 Jul 2009 22:18:22 -0400
Subject: How to set the ksp true residual tolerance on command line
In-Reply-To: <019038C4-3647-4580-9F0D-4E323D49C64C@mcs.anl.gov>
References: <019038C4-3647-4580-9F0D-4E323D49C64C@mcs.anl.gov>
Message-ID:

Barry, thank you very much for the suggestion and the clarification.

Yan

PS: It should be -ksp_right_pc, instead of -ksp_pc_right:

-ksp_monitor_draw_true_residual: Monitor graphically true residual norm (KSPMonitorSet)
-ksp_monitor_range_draw: Monitor graphically preconditioned residual norm (KSPMonitorSet)
Pick at most one of -------------
-ksp_left_pc: Use left preconditioning (KSPSetPreconditionerSide)
-ksp_right_pc: Use right preconditioning (KSPSetPreconditionerSide)
-ksp_symmetric_pc: Use symmetric (factorized) preconditioning (KSPSetPreconditionerSide)
-ksp_compute_singularvalues: Compute singular values of preconditioned operator (KSPSetComputeSingularValues)
-ksp_compute_eigenvalues: Compute eigenvalues of preconditioned operator (KSPSetComputeSingularValues)

On Tue, Jul 28, 2009 at 9:28 PM, Barry Smith wrote:

>   This depends on the Krylov solver and whether you are using left or right
> preconditioning.  With right preconditioning it always uses the true
> residual norm, so you can do -ksp_type gmres -ksp_pc_right
>   Some methods, like CG, we do not have implemented for right preconditioning;
> in that case you can use -ksp_type cg -ksp_norm_type unpreconditioned
>   For CG many people prefer the energy norm, which you can access with
> -ksp_norm_type natural
>
>   -ksp_view should always show which it is using.
>
>   Barry
>
> On Jul 28, 2009, at 8:15 PM, Ryan Yan wrote:
>
>> Hi all,
>> Is there any way to set a true residual tolerance on command line.
>> -ksp_atol is for preconditioned residual norm only.
>>
>> Thanks,
>>
>> Yan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
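For reference, a minimal command-line sketch of the options discussed in this thread, assuming a
petsc-3.0.0-style build; the executable name ./myapp and the tolerance value are only illustrative:

    mpiexec -n 4 ./myapp -ksp_type gmres -ksp_right_pc \
        -ksp_rtol 1e-8 -ksp_monitor_true_residual -ksp_view

    mpiexec -n 4 ./myapp -ksp_type cg -ksp_norm_type unpreconditioned \
        -ksp_rtol 1e-8 -ksp_monitor_true_residual -ksp_view

With right preconditioning, or with -ksp_norm_type unpreconditioned, the convergence test itself is
applied to the true residual, so -ksp_rtol and -ksp_atol then act on that norm; -ksp_view reports
which norm and preconditioner side are actually in use.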
From tim.kroeger at cevis.uni-bremen.de Wed Jul 29 03:33:30 2009
From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger)
Date: Wed, 29 Jul 2009 10:33:30 +0200 (CEST)
Subject: Solver problem
In-Reply-To: <897733B6-228F-4397-9C49-81CC0BE296F9@mcs.anl.gov>
References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> <4A6EA926.3090400@student.uibk.ac.at> <897733B6-228F-4397-9C49-81CC0BE296F9@mcs.anl.gov>
Message-ID:

Dear Barry,

On Tue, 28 Jul 2009, Barry Smith wrote:

>> undefined symbol: _gfortran_allocate_array
>
> This is likely a symbol in the gfortran compiler libraries.
>
> Are you linking your application code against all the libraries it needs
> to be linked against?

Thank you for your help.  It seems I have found the reason now.  It has to do
with the cluster I am working on: the operating system on the master (where I
compile my application) does not coincide with the system on the nodes (where
the application is run).  That is, e.g. /usr/lib/libgfortran.so differs between
these installations.  I will ask the cluster's admin for assistance.

Best Regards,

Tim

--
Dr. Tim Kroeger
tim.kroeger at mevis.fraunhofer.de            Phone +49-421-218-7710
tim.kroeger at cevis.uni-bremen.de            Fax   +49-421-218-4236

Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany

From tim.kroeger at cevis.uni-bremen.de Wed Jul 29 08:49:50 2009
From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger)
Date: Wed, 29 Jul 2009 15:49:50 +0200 (CEST)
Subject: Solver problem
In-Reply-To:
References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov>
Message-ID:

Dear all,

On Tue, 28 Jul 2009, Tim Kroeger wrote:

> Would you recommend to try MUMPS as well?  (I.e., will MUMPS have a chance
> to be faster than ILU?)

It seems as if I can't use MUMPS, since the cluster I am working on doesn't
meet some system requirements.  (PETSc otherwise works fine on the cluster.)
However, I understand that PETSc also interfaces a large number of other
sparse direct solvers.  Are there any recommendations about which one might
be a good choice if MUMPS cannot be used?

Best Regards,

Tim

--
Dr. Tim Kroeger
tim.kroeger at mevis.fraunhofer.de            Phone +49-421-218-7710
tim.kroeger at cevis.uni-bremen.de            Fax   +49-421-218-4236

Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany

From knepley at gmail.com Wed Jul 29 11:31:59 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 29 Jul 2009 11:31:59 -0500
Subject: Solver problem
In-Reply-To:
References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov>
Message-ID:

On Wed, Jul 29, 2009 at 8:49 AM, Tim Kroeger <tim.kroeger at cevis.uni-bremen.de> wrote:

> Dear all,
>
> On Tue, 28 Jul 2009, Tim Kroeger wrote:
>
>> Would you recommend to try MUMPS as well?  (I.e., will MUMPS have a chance
>> to be faster than ILU?)
>
> It seems as if I can't use MUMPS, since the cluster I am working on doesn't
> meet some system requirements.  (PETSc otherwise works fine on the cluster.)
> However, I understand that PETSc also interfaces a large number of other
> sparse direct solvers.  Are there any recommendations about which one might
> be a good choice if MUMPS cannot be used?

You can try SuperLU_dist.

  Matt

> Best Regards,
>
> Tim
>
> --
> Dr. Tim Kroeger
> tim.kroeger at mevis.fraunhofer.de            Phone +49-421-218-7710
> tim.kroeger at cevis.uni-bremen.de            Fax   +49-421-218-4236
>
> Fraunhofer MEVIS, Institute for Medical Image Computing
> Universitaetsallee 29, 28359 Bremen, Germany

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
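A quick way to check for the kind of run-time library mismatch Tim describes; the executable name
./myapp and the node name node01 are hypothetical, and this assumes ldd and ssh to the compute
nodes are available:

    ldd ./myapp | grep -i gfortran                   # on the master where the code is built
    ssh node01 ldd $PWD/myapp | grep -i gfortran     # on a compute node

If the two machines resolve libgfortran.so to different files or versions, the undefined
_gfortran_* symbols come from that mismatch rather than from PETSc itself.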
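A sketch of how SuperLU_dist is typically selected at run time, assuming PETSc was configured with
--download-superlu_dist=1 and that this PETSc release selects external direct solvers through the
factor solver package option (the exact option name can differ between releases):

    mpiexec -n 4 ./myapp -ksp_type preonly -pc_type lu \
        -pc_factor_mat_solver_package superlu_dist -ksp_view

Here -ksp_type preonly makes the parallel LU factorization the entire solve instead of a
preconditioner for a Krylov method.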
URL: From rlmackie862 at gmail.com Wed Jul 29 11:36:07 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 29 Jul 2009 09:36:07 -0700 Subject: suggestions for debugging code In-Reply-To: References: <4A6F168F.5070208@gmail.com> Message-ID: <4A707A77.2050403@gmail.com> Matthew, Thanks - it took me the better part of the day yesterday to get the suppression file so that it cut out most of the MPI stuff, and then I was able to eventually zero in and find the offending bug. Randy Matthew Knepley wrote: > On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie > wrote: > > I have run into a very difficult debugging problem. I have recently > made some > modifications to my PETSc code, to add some new features. When I > compiled the > code in debug mode (we are using the Intel compilers and mvapich on > Infiniband), > the code runs fine with any number of processes. > > When the code is compiled in optimize mode, it runs fine on, say, up > to 32 processes, > but not 64, bombing out someplace strange, with a Segmentation > Violation. > > I've tried using Valgrind, but you can't use it with PETSc and my > code compiled in > Debug mode because the code finishes successfully, and the other > problem I have with > > > Sometimes valgrind will catch things even when code does not crash. > > > > Valgrind + mvapich is there are about a million messages spewed out, > making it > extremely difficult to see if there are really any issues in MY > code. I've thought > to have PETSc download and compile MPICH2, which I would hope would > produce less > output from Valgrind. > > > In order to filter these out, you use a "suppressions file" for > valgrind. The manual has a > good section on this and it should not be hard to wipre out most of > them. Satish designed > one for our unit tests. > > Matt > > > > Anyone have any suggestions on how to debug this tricky situation? > Any suggestions > would be greatly appreciated. > > Randy > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From yaakoub at tacc.utexas.edu Wed Jul 29 12:34:53 2009 From: yaakoub at tacc.utexas.edu (Yaakoub El Khamra) Date: Wed, 29 Jul 2009 12:34:53 -0500 Subject: suggestions for debugging code In-Reply-To: <4A707A77.2050403@gmail.com> References: <4A6F168F.5070208@gmail.com> <4A707A77.2050403@gmail.com> Message-ID: <47a831090907291034r68f51517mc080ff408436c2a6@mail.gmail.com> Just my 2c, but if you might want to check out DDT, it is a parallel debugger with built-in memory checking. If you have a teragrid account, you can probably use it on ranger or lonestar. It is licensed, but until the eclipse PTP project comes around with memory checking, it is an alternative. Regards Yaakoub El Khamra On Wed, Jul 29, 2009 at 11:36 AM, Randall Mackie wrote: > Matthew, > > Thanks - it took me the better part of the day yesterday to get the suppression > file so that it cut out most of the MPI stuff, and then I was able to eventually > zero in and find the offending bug. > > > Randy > > > Matthew Knepley wrote: >> On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie > > wrote: >> >> ? ? I have run into a very difficult debugging problem. I have recently >> ? ? made some >> ? ? modifications to my PETSc code, to add some new features. When I >> ? ? compiled the >> ? ? code in debug mode (we are using the Intel compilers and mvapich on >> ? ? Infiniband), >> ? ? the code runs fine with any number of processes. 
>> >> ? ? When the code is compiled in optimize mode, it runs fine on, say, up >> ? ? to 32 processes, >> ? ? but not 64, bombing out someplace strange, with a Segmentation >> ? ? Violation. >> >> ? ? I've tried using Valgrind, but you can't use it with PETSc and my >> ? ? code compiled in >> ? ? Debug mode because the code finishes successfully, and the other >> ? ? problem I have with >> >> >> Sometimes valgrind will catch things even when code does not crash. >> >> >> >> ? ? Valgrind + mvapich is there are about a million messages spewed out, >> ? ? making it >> ? ? extremely difficult to see if there are really any issues in MY >> ? ? code. I've thought >> ? ? to have PETSc download and compile MPICH2, which I would hope would >> ? ? produce less >> ? ? output from Valgrind. >> >> >> In order to filter these out, you use a "suppressions file" for >> valgrind. The manual has a >> good section on this and it should not be hard to wipre out most of >> them. Satish designed >> one for our unit tests. >> >> ? Matt >> >> >> >> ? ? Anyone have any suggestions on how to debug this tricky situation? >> ? ? Any suggestions >> ? ? would be greatly appreciated. >> >> ? ? Randy >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener > From rlmackie862 at gmail.com Wed Jul 29 14:09:32 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 29 Jul 2009 12:09:32 -0700 Subject: suggestions for debugging code In-Reply-To: <47a831090907291034r68f51517mc080ff408436c2a6@mail.gmail.com> References: <4A6F168F.5070208@gmail.com> <4A707A77.2050403@gmail.com> <47a831090907291034r68f51517mc080ff408436c2a6@mail.gmail.com> Message-ID: <4A709E6C.5030500@gmail.com> That looks interesting. Has anyone tried Totalview or PGDBG? Randy Yaakoub El Khamra wrote: > Just my 2c, but if you might want to check out DDT, it is a parallel > debugger with built-in memory checking. If you have a teragrid > account, you can probably use it on ranger or lonestar. It is > licensed, but until the eclipse PTP project comes around with memory > checking, it is an alternative. > > Regards > Yaakoub El Khamra > > > > > On Wed, Jul 29, 2009 at 11:36 AM, Randall Mackie wrote: >> Matthew, >> >> Thanks - it took me the better part of the day yesterday to get the suppression >> file so that it cut out most of the MPI stuff, and then I was able to eventually >> zero in and find the offending bug. >> >> >> Randy >> >> >> Matthew Knepley wrote: >>> On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie >> > wrote: >>> >>> I have run into a very difficult debugging problem. I have recently >>> made some >>> modifications to my PETSc code, to add some new features. When I >>> compiled the >>> code in debug mode (we are using the Intel compilers and mvapich on >>> Infiniband), >>> the code runs fine with any number of processes. >>> >>> When the code is compiled in optimize mode, it runs fine on, say, up >>> to 32 processes, >>> but not 64, bombing out someplace strange, with a Segmentation >>> Violation. >>> >>> I've tried using Valgrind, but you can't use it with PETSc and my >>> code compiled in >>> Debug mode because the code finishes successfully, and the other >>> problem I have with >>> >>> >>> Sometimes valgrind will catch things even when code does not crash. 
>>> >>> >>> >>> Valgrind + mvapich is there are about a million messages spewed out, >>> making it >>> extremely difficult to see if there are really any issues in MY >>> code. I've thought >>> to have PETSc download and compile MPICH2, which I would hope would >>> produce less >>> output from Valgrind. >>> >>> >>> In order to filter these out, you use a "suppressions file" for >>> valgrind. The manual has a >>> good section on this and it should not be hard to wipre out most of >>> them. Satish designed >>> one for our unit tests. >>> >>> Matt >>> >>> >>> >>> Anyone have any suggestions on how to debug this tricky situation? >>> Any suggestions >>> would be greatly appreciated. >>> >>> Randy >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener From yaakoub at tacc.utexas.edu Wed Jul 29 14:19:00 2009 From: yaakoub at tacc.utexas.edu (Yaakoub El Khamra) Date: Wed, 29 Jul 2009 14:19:00 -0500 Subject: suggestions for debugging code In-Reply-To: <4A709E6C.5030500@gmail.com> References: <4A6F168F.5070208@gmail.com> <4A707A77.2050403@gmail.com> <47a831090907291034r68f51517mc080ff408436c2a6@mail.gmail.com> <4A709E6C.5030500@gmail.com> Message-ID: <47a831090907291219q65b5fb4dx43b28dbbad5a78bc@mail.gmail.com> I used totalview heavily, it has memory debugging but at the time it was only on a single core. That restriction might have been lifted, I am not sure. Regards Yaakoub El Khamra On Wed, Jul 29, 2009 at 2:09 PM, Randall Mackie wrote: > That looks interesting. Has anyone tried Totalview or PGDBG? > > Randy > > Yaakoub El Khamra wrote: >> Just my 2c, but if you might want to check out DDT, it is a parallel >> debugger with built-in memory checking. If you have a teragrid >> account, you can probably use it on ranger or lonestar. It is >> licensed, but until the eclipse PTP project comes around with memory >> checking, it is an alternative. >> >> Regards >> Yaakoub El Khamra >> >> >> >> >> On Wed, Jul 29, 2009 at 11:36 AM, Randall Mackie wrote: >>> Matthew, >>> >>> Thanks - it took me the better part of the day yesterday to get the suppression >>> file so that it cut out most of the MPI stuff, and then I was able to eventually >>> zero in and find the offending bug. >>> >>> >>> Randy >>> >>> >>> Matthew Knepley wrote: >>>> On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie >>> > wrote: >>>> >>>> ? ? I have run into a very difficult debugging problem. I have recently >>>> ? ? made some >>>> ? ? modifications to my PETSc code, to add some new features. When I >>>> ? ? compiled the >>>> ? ? code in debug mode (we are using the Intel compilers and mvapich on >>>> ? ? Infiniband), >>>> ? ? the code runs fine with any number of processes. >>>> >>>> ? ? When the code is compiled in optimize mode, it runs fine on, say, up >>>> ? ? to 32 processes, >>>> ? ? but not 64, bombing out someplace strange, with a Segmentation >>>> ? ? Violation. >>>> >>>> ? ? I've tried using Valgrind, but you can't use it with PETSc and my >>>> ? ? code compiled in >>>> ? ? Debug mode because the code finishes successfully, and the other >>>> ? ? problem I have with >>>> >>>> >>>> Sometimes valgrind will catch things even when code does not crash. >>>> >>>> >>>> >>>> ? ? Valgrind + mvapich is there are about a million messages spewed out, >>>> ? ? making it >>>> ? ? extremely difficult to see if there are really any issues in MY >>>> ? ? code. I've thought >>>> ? ? 
>>>>     to have PETSc download and compile MPICH2, which I would hope would
>>>>     produce less output from Valgrind.
>>>>
>>>> In order to filter these out, you use a "suppressions file" for valgrind.
>>>> The manual has a good section on this and it should not be hard to wipe
>>>> out most of them.  Satish designed one for our unit tests.
>>>>
>>>>   Matt
>>>>
>>>>     Anyone have any suggestions on how to debug this tricky situation?
>>>>     Any suggestions would be greatly appreciated.
>>>>
>>>>     Randy
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which
>>>> their experiments lead.
>>>> -- Norbert Wiener
>

From Harun.BAYRAKTAR at 3ds.com Wed Jul 29 15:54:35 2009
From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun)
Date: Wed, 29 Jul 2009 16:54:35 -0400
Subject: Smoother settings for AMG
Message-ID:

Hi,

I am trying to solve a system of equations and I am having difficulty picking
the right smoothers for AMG (using ML as pc_type) in PETSc for parallel
execution.  First, here is what happens in terms of CG (ksp_type) iteration
counts (both columns use block Jacobi):

cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4
------------------------------------------------------
  1  |        43        |      243
  4  |       699        |      379

x1 or x4 means 1 or 4 iterations of smoother application at each AMG level
(all details from ksp_view for the 4 cpu run are below).  The main observation
is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls apart in parallel.
SOR, on the other hand, experiences a 1.5X increase in iteration count, which
is totally expected from the quality of coarsening ML delivers in parallel.

I basically would like to find a way (if possible) to have the number of
iterations in parallel stay within 1-2X of the 1 cpu iteration count for the
AMG w/ ICC case.  Is there a way to achieve this?
Thanks, Harun %%%%%%%%%%%%%%%%%%%%%%%%% AMG w/ ICC(0) x1 ksp_view %%%%%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: 
nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines %%%%%%%%%%%%%%%%%%%%%% AMG w/ SOR x4 ksp_view %%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, 
absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines From knepley at gmail.com Wed Jul 29 16:00:08 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Jul 2009 16:00:08 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: On Wed, Jul 29, 2009 at 3:54 PM, 
BAYRAKTAR Harun wrote: > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG (ksp_type) > iteration counts (both columns use block jacobi): Are you sure you have an elliptic system? These iteration counts are extremely high. Matt > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls > apart in parallel. SOR on the other hand experiences a 1.5X increase in > iteration count which is totally expected from the quality of coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? > > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix 
Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > 
type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > %%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > 
linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 29 16:05:19 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 29 Jul 2009 16:05:19 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: <02E52450-1C28-4451-BAE8-E31563BCF1A3@mcs.anl.gov> Can you save the matrix and right hand side with the option - ksp_view_binary and send the file "output" to petsc-maint at mcs.anl.gov (not this email). Barry If it is too big to email you can ftp it to info.mcs.anl.gov (anonymous login) and put it in the directory incoming then send us email petsc-maint at mcs.anl.gov with the filename. On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG > (ksp_type) > iteration counts (both columns use block jacobi): > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The > main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but > falls > apart in parallel. SOR on the other hand experiences a 1.5X increase > in > iteration count which is totally expected from the quality of > coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the > number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? 
> > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > 
ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > 
%%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega 
= 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > From Harun.BAYRAKTAR at 3ds.com Wed Jul 29 16:17:12 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Wed, 29 Jul 2009 17:17:12 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: Matt, It is from the pressure poisson equation for incompressible navier-stokes so it is elliptic. Also on 1 cpu, I am able to solve it with reason able iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the parallel runs that really concern me. Thanks, Harun From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Wednesday, July 29, 2009 5:00 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun wrote: Hi, I am trying to solve a system of equations and I am having difficulty picking the right smoothers for AMG (using ML as pc_type) in PETSc for parallel execution. First here is what happens in terms of CG (ksp_type) iteration counts (both columns use block jacobi): Are you sure you have an elliptic system? These iteration counts are extremely high. Matt cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 ------------------------------------------------------ 1 | 43 | 243 4 | 699 | 379 x1 or x4 means 1 or 4 iterations of smoother application at each AMG level (all details from ksp view for the 4 cpu run are below). The main observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls apart in parallel. SOR on the other hand experiences a 1.5X increase in iteration count which is totally expected from the quality of coarsening ML delivers in parallel. I basically would like to find a way (if possible) to have the number of iterations in parallel stay with 1-2X of 1 cpu iteration count for the AMG w/ ICC case. Is there a way to achieve this? 
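A sketch of how the two smoother setups compared above might be selected at run time, built from the option prefixes that appear in the -ksp_view output (mg_levels_<level>_ for the level smoother, mg_levels_<level>_sub_ for the block Jacobi sub-solves); the exact prefixes and defaults depend on the PETSc/ML version, so treat this as an illustration rather than the exact option set used in these runs:

    # AMG w/ ICC(0) x1: one damped Richardson sweep per level, block Jacobi + ICC(0) sub-solves (repeat for each level, here 1 and 2)
    -ksp_type cg -pc_type ml -mg_levels_1_ksp_type richardson -mg_levels_1_ksp_richardson_scale 0.9 -mg_levels_1_ksp_max_it 1 -mg_levels_1_pc_type bjacobi -mg_levels_1_sub_pc_type icc

    # AMG w/ SOR x4: four Richardson sweeps per level with local symmetric SOR
    -ksp_type cg -pc_type ml -mg_levels_1_ksp_type richardson -mg_levels_1_ksp_max_it 4 -mg_levels_1_pc_type sor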
Thanks, Harun %%%%%%%%%%%%%%%%%%%%%%%%% AMG w/ ICC(0) x1 ksp_view %%%%%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: 
nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines %%%%%%%%%%%%%%%%%%%%%% AMG w/ SOR x4 ksp_view %%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, 
absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jul 29 16:22:39 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Jul 2009 16:22:39 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun wrote: > Matt, > > It is from the pressure poisson equation for incompressible navier-stokes > so it is elliptic. Also on 1 cpu, I am able to solve it with reason able > iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the > parallel runs that really concern me. > Actually, it was the 43 that really concerned me. In my experience, an MG that is doing what it is supposed to on Poisson takes < 10 iterations. However, if your grid is pretty distorted, maybe it can get this bad. Matt > > Thanks, > > Harun > > > > > > *From:* petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] *On Behalf Of *Matthew Knepley > *Sent:* Wednesday, July 29, 2009 5:00 PM > *To:* PETSc users list > *Subject:* Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun > wrote: > > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG (ksp_type) > iteration counts (both columns use block jacobi): > > > Are you sure you have an elliptic system? These iteration counts are > extremely > high. > > Matt > > > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls > apart in parallel. SOR on the other hand experiences a 1.5X increase in > iteration count which is totally expected from the quality of coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? 
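When comparing the 1 and 4 CPU iteration counts it also helps to have each run report its convergence explicitly; a minimal example of standard KSP monitoring options that print the true residual norm at each iteration and the reason the solve stopped, matching the 1e-5 relative tolerance quoted above:

    -ksp_rtol 1e-5 -ksp_monitor_true_residual -ksp_converged_reason -ksp_view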
> > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > 
ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > 
%%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega 
= 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Harun.BAYRAKTAR at 3ds.com Wed Jul 29 16:48:50 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Wed, 29 Jul 2009 17:48:50 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: Matt, Sorry, I thought you meant the ones with iteration counts in the hundreds. I had a typo it was 46 actually (not 43). That's very interesting that you think it should converge in less than 10 iterations, all I can say is I wish I could get there. This mesh is a graded mesh and does have some poor aspect ratio elements. For a uniform mesh I see 20 iterations or so. Thanks, Harun From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Wednesday, July 29, 2009 5:23 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun wrote: Matt, It is from the pressure poisson equation for incompressible navier-stokes so it is elliptic. Also on 1 cpu, I am able to solve it with reason able iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the parallel runs that really concern me. Actually, it was the 43 that really concerned me. In my experience, an MG that is doing what it is supposed to on Poisson takes < 10 iterations. However, if your grid is pretty distorted, maybe it can get this bad. Matt Thanks, Harun From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Wednesday, July 29, 2009 5:00 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun wrote: Hi, I am trying to solve a system of equations and I am having difficulty picking the right smoothers for AMG (using ML as pc_type) in PETSc for parallel execution. First here is what happens in terms of CG (ksp_type) iteration counts (both columns use block jacobi): Are you sure you have an elliptic system? These iteration counts are extremely high. Matt cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 ------------------------------------------------------ 1 | 43 | 243 4 | 699 | 379 x1 or x4 means 1 or 4 iterations of smoother application at each AMG level (all details from ksp view for the 4 cpu run are below). The main observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls apart in parallel. SOR on the other hand experiences a 1.5X increase in iteration count which is totally expected from the quality of coarsening ML delivers in parallel. I basically would like to find a way (if possible) to have the number of iterations in parallel stay with 1-2X of 1 cpu iteration count for the AMG w/ ICC case. Is there a way to achieve this? 
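A sketch of the comparison this points toward, assuming a hypothetical application binary ./solver and a uniform-mesh test case as a placeholder: rerun the same uniform-mesh problem on 1 and 4 processes and see whether the roughly 20-iteration count survives the parallel coarsening.

    mpiexec -n 1 ./solver <uniform mesh case> -ksp_type cg -pc_type ml -ksp_converged_reason
    mpiexec -n 4 ./solver <uniform mesh case> -ksp_type cg -pc_type ml -ksp_converged_reason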
Thanks, Harun %%%%%%%%%%%%%%%%%%%%%%%%% AMG w/ ICC(0) x1 ksp_view %%%%%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: 
nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines %%%%%%%%%%%%%%%%%%%%%% AMG w/ SOR x4 ksp_view %%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, 
absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Wed Jul 29 17:40:26 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 29 Jul 2009 19:40:26 -0300 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: On Wed, Jul 29, 2009 at 6:48 PM, BAYRAKTAR Harun wrote: > > For a uniform mesh I see 20 iterations or so. > That's around the iteration count I would expect for ML on uniform meshes... Does this iteration count stay at about 20 when you run on 4 CPUs while still using a uniform mesh? > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Wednesday, July 29, 2009 5:23 PM > > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun > wrote: > > Matt, > > It is from the pressure poisson equation for incompressible navier-stokes so > it is elliptic. Also on 1 cpu, I am able to solve it with reasonable > iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the > parallel runs that really concern me. > > Actually, it was the 43 that really concerned me. In my experience, an MG > that is doing what it is supposed > to on Poisson takes < 10 iterations. However, if your grid is pretty > distorted, maybe it can get this bad. > > Matt > > > > Thanks, > > Harun > > > > > > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Wednesday, July 29, 2009 5:00 PM > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun > wrote: > > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG (ksp_type) > iteration counts (both columns use block jacobi): > > Are you sure you have an elliptic system? These iteration counts are > extremely > high. > > Matt > > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls > apart in parallel. SOR on the other hand experiences a 1.5X increase in > iteration count which is totally expected from the quality of coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the number of > iterations in parallel stay within 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? > > Thanks, > Harun
total: nonzeros=376634, allocated nonzeros=376634
>          not using I-node (on process 0) routines
>  Up solver (post-smoother) on level 1 -------------------------------
>    KSP Object:(mg_levels_1_)
>      type: richardson
>        Richardson: damping factor=1
>      maximum iterations=4
>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>      left preconditioning
>    PC Object:(mg_levels_1_)
>      type: sor
>        SOR: type = local_symmetric, iterations = 1, omega = 1
>      linear system matrix = precond matrix:
>      Matrix Object:
>        type=mpiaij, rows=10654, cols=10654
>        total: nonzeros=376634, allocated nonzeros=376634
>          not using I-node (on process 0) routines
>  Down solver (pre-smoother) on level 2 -------------------------------
>    KSP Object:(mg_levels_2_)
>      type: richardson
>        Richardson: damping factor=1
>      maximum iterations=4, initial guess is zero
>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>      left preconditioning
>    PC Object:(mg_levels_2_)
>      type: sor
>        SOR: type = local_symmetric, iterations = 1, omega = 1
>      linear system matrix = precond matrix:
>      Matrix Object:
>        type=mpiaij, rows=411866, cols=411866
>        total: nonzeros=10941434, allocated nonzeros=42010332
>          not using I-node (on process 0) routines
>  Up solver (post-smoother) on level 2 -------------------------------
>    KSP Object:(mg_levels_2_)
>      type: richardson
>        Richardson: damping factor=1
>      maximum iterations=4
>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>      left preconditioning
>    PC Object:(mg_levels_2_)
>      type: sor
>        SOR: type = local_symmetric, iterations = 1, omega = 1
>      linear system matrix = precond matrix:
>      Matrix Object:
>        type=mpiaij, rows=411866, cols=411866
>        total: nonzeros=10941434, allocated nonzeros=42010332
>          not using I-node (on process 0) routines
>  linear system matrix = precond matrix:
>  Matrix Object:
>    type=mpiaij, rows=411866, cols=411866
>    total: nonzeros=10941434, allocated nonzeros=42010332
>      not using I-node (on process 0) routines
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener



--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From tim.kroeger at cevis.uni-bremen.de  Thu Jul 30 04:52:28 2009
From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger)
Date: Thu, 30 Jul 2009 11:52:28 +0200 (CEST)
Subject: Solver problem
In-Reply-To: 
References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov>
Message-ID: 

Dear Matt,

On Wed, 29 Jul 2009, Matthew Knepley wrote:

> On Wed, Jul 29, 2009 at 8:49 AM, Tim Kroeger wrote:
>
>> It seems as if I can't use MUMPS since the cluster I am working on doesn't
>> meet some system requirements. (PETSc otherwise works fine on the cluster.)
>> However, I understand that PETSc also interfaces a large number of other >> sparse direct solvers. Are there any recommendations about which one might >> be a good choice if MUMPS cannot be used? > > You can try SuperLU_dist. Thank you very much. SuperLU_Dist works great! Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From Harun.BAYRAKTAR at 3ds.com Thu Jul 30 06:28:35 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Thu, 30 Jul 2009 07:28:35 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: I always see a jump going from sequential to parallel runs with ML for all types of problems I have tried (structural mechanics, incompressible flow). But once parallel the iteration counts stay roughly constant given I use a true parallel smoother like Chebychev. My experience is that the iteration count jump is somewhere between 1.5-2.0X. -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Lisandro Dalcin Sent: Wednesday, July 29, 2009 6:40 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Wed, Jul 29, 2009 at 6:48 PM, BAYRAKTAR Harun wrote: > > For a uniform mesh I see 20 iterations or so. > That's around the iteration count I would expect for for ML on uniform meshes... Do this iteration count stays at about 20 when you run in 4 CPU's while still using an uniform mesh? > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Wednesday, July 29, 2009 5:23 PM > > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun > wrote: > > Matt, > > It is from the pressure poisson equation for incompressible navier-stokes so > it is elliptic. Also on 1 cpu, I am able to solve it with reason able > iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the > parallel runs that really concern me. > > Actually, it was the 43 that really concerned me. In my experience, an MG > that is doing what it is supposed > to on Poisson takes < 10 iterations. However, if your grid is pretty > distorted, maybe it can get this bad. > > ? Matt > > > > Thanks, > > Harun > > > > > > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Wednesday, July 29, 2009 5:00 PM > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun > wrote: > > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG (ksp_type) > iteration counts (both columns use block jacobi): > > Are you sure you have an elliptic system? These iteration counts are > extremely > high. > > ? Matt > > > cpus ? ?| ? ? ? AMG w/ ICC(0) x1 ? ? ? ?| ? ? ? AMG w/ SOR x4 > ------------------------------------------------------ > 1 ? ? ? | ? ? ? ? ? ? ? 43 ? ? ? ? ? ? ?| ? ? ? ? ? ? ? 243 > 4 ? ? ? | ? ? ? ? ? ? ? 699 ? ? ? ? ? ? | ? ? ? ? ? ? ? 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). 
The main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls > apart in parallel. SOR on the other hand experiences a 1.5X increase in > iteration count which is totally expected from the quality of coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? > > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > ?type: cg > ?maximum iterations=10000 > ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ?left preconditioning > PC Object: > ?type: ml > ? ?MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > ?Coarse gride solver -- level 0 ------------------------------- > ? ?KSP Object:(mg_coarse_) > ? ? ?type: preonly > ? ? ?maximum iterations=1, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_coarse_) > ? ? ?type: redundant > ? ? ? ?Redundant preconditioner: First (color=0) of 4 PCs follows > ? ? ?KSP Object:(mg_coarse_redundant_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_coarse_redundant_) > ? ? ? ?type: lu > ? ? ? ? ?LU: out-of-place factorization > ? ? ? ? ? ?matrix ordering: nd > ? ? ? ? ?LU: tolerance for zero pivot 1e-12 > ? ? ? ? ?LU: factor fill ratio needed 2.17227 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqaij, rows=283, cols=283 > ? ? ? ? ? ? ? ?total: nonzeros=21651, allocated nonzeros=21651 > ? ? ? ? ? ? ? ? ?using I-node routines: found 186 nodes, limit used is > 5 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=283, cols=283 > ? ? ? ? ?total: nonzeros=9967, allocated nonzeros=14150 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=283, cols=283 > ? ? ? ?total: nonzeros=9967, allocated nonzeros=9967 > ? ? ? ? ?not using I-node (on process 0) routines > ?Down solver (pre-smoother) on level 1 ------------------------------- > ? ?KSP Object:(mg_levels_1_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=0.9 > ? ? ?maximum iterations=1, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_1_) > ? ? ?type: bjacobi > ? ? ? ?block Jacobi: number of blocks = 4 > ? ? ? ?Local solve is same for all blocks, in the following KSP and PC > objects: > ? ? ?KSP Object:(mg_levels_1_sub_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_levels_1_sub_) > ? ? ? ?type: icc > ? ? ? ? ?ICC: 0 levels of fill > ? ? ? ? ?ICC: factor fill ratio allocated 1 > ? ? ? ? ?ICC: using Manteuffel shift > ? ? ? ? ?ICC: factor fill ratio needed 0.514899 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqsbaij, rows=2813, cols=2813 > ? ? ? ? ? ? ? ?total: nonzeros=48609, allocated nonzeros=48609 > ? ? ? ? ? ? ? ? ? ?block size is 1 > ? ? ? ?linear system matrix = precond matrix: > ? ? 
? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=2813, cols=2813 > ? ? ? ? ?total: nonzeros=94405, allocated nonzeros=94405 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=10654, cols=10654 > ? ? ? ?total: nonzeros=376634, allocated nonzeros=376634 > ? ? ? ? ?not using I-node (on process 0) routines > ?Up solver (post-smoother) on level 1 ------------------------------- > ? ?KSP Object:(mg_levels_1_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=0.9 > ? ? ?maximum iterations=1 > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_1_) > ? ? ?type: bjacobi > ? ? ? ?block Jacobi: number of blocks = 4 > ? ? ? ?Local solve is same for all blocks, in the following KSP and PC > objects: > ? ? ?KSP Object:(mg_levels_1_sub_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_levels_1_sub_) > ? ? ? ?type: icc > ? ? ? ? ?ICC: 0 levels of fill > ? ? ? ? ?ICC: factor fill ratio allocated 1 > ? ? ? ? ?ICC: using Manteuffel shift > ? ? ? ? ?ICC: factor fill ratio needed 0.514899 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqsbaij, rows=2813, cols=2813 > ? ? ? ? ? ? ? ?total: nonzeros=48609, allocated nonzeros=48609 > ? ? ? ? ? ? ? ? ? ?block size is 1 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=2813, cols=2813 > ? ? ? ? ?total: nonzeros=94405, allocated nonzeros=94405 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=10654, cols=10654 > ? ? ? ?total: nonzeros=376634, allocated nonzeros=376634 > ? ? ? ? ?not using I-node (on process 0) routines > ?Down solver (pre-smoother) on level 2 ------------------------------- > ? ?KSP Object:(mg_levels_2_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=0.9 > ? ? ?maximum iterations=1, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_2_) > ? ? ?type: bjacobi > ? ? ? ?block Jacobi: number of blocks = 4 > ? ? ? ?Local solve is same for all blocks, in the following KSP and PC > objects: > ? ? ?KSP Object:(mg_levels_2_sub_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_levels_2_sub_) > ? ? ? ?type: icc > ? ? ? ? ?ICC: 0 levels of fill > ? ? ? ? ?ICC: factor fill ratio allocated 1 > ? ? ? ? ?ICC: using Manteuffel shift > ? ? ? ? ?ICC: factor fill ratio needed 0.519045 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqsbaij, rows=101164, cols=101164 > ? ? ? ? ? ? ? ?total: nonzeros=1378558, allocated nonzeros=1378558 > ? ? ? ? ? ? ? ? ? ?block size is 1 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=101164, cols=101164 > ? ? ? ? ?total: nonzeros=2655952, allocated nonzeros=5159364 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=411866, cols=411866 > ? ? ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ? ? 
?not using I-node (on process 0) routines > ?Up solver (post-smoother) on level 2 ------------------------------- > ? ?KSP Object:(mg_levels_2_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=0.9 > ? ? ?maximum iterations=1 > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_2_) > ? ? ?type: bjacobi > ? ? ? ?block Jacobi: number of blocks = 4 > ? ? ? ?Local solve is same for all blocks, in the following KSP and PC > objects: > ? ? ?KSP Object:(mg_levels_2_sub_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_levels_2_sub_) > ? ? ? ?type: icc > ? ? ? ? ?ICC: 0 levels of fill > ? ? ? ? ?ICC: factor fill ratio allocated 1 > ? ? ? ? ?ICC: using Manteuffel shift > ? ? ? ? ?ICC: factor fill ratio needed 0.519045 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqsbaij, rows=101164, cols=101164 > ? ? ? ? ? ? ? ?total: nonzeros=1378558, allocated nonzeros=1378558 > ? ? ? ? ? ? ? ? ? ?block size is 1 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=101164, cols=101164 > ? ? ? ? ?total: nonzeros=2655952, allocated nonzeros=5159364 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=411866, cols=411866 > ? ? ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ? ? ?not using I-node (on process 0) routines > ?linear system matrix = precond matrix: > ?Matrix Object: > ? ?type=mpiaij, rows=411866, cols=411866 > ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ?not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > %%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > ?type: cg > ?maximum iterations=10000 > ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ?left preconditioning > PC Object: > ?type: ml > ? ?MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > ?Coarse gride solver -- level 0 ------------------------------- > ? ?KSP Object:(mg_coarse_) > ? ? ?type: preonly > ? ? ?maximum iterations=1, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_coarse_) > ? ? ?type: redundant > ? ? ? ?Redundant preconditioner: First (color=0) of 4 PCs follows > ? ? ?KSP Object:(mg_coarse_redundant_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_coarse_redundant_) > ? ? ? ?type: lu > ? ? ? ? ?LU: out-of-place factorization > ? ? ? ? ? ?matrix ordering: nd > ? ? ? ? ?LU: tolerance for zero pivot 1e-12 > ? ? ? ? ?LU: factor fill ratio needed 2.17227 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqaij, rows=283, cols=283 > ? ? ? ? ? ? ? ?total: nonzeros=21651, allocated nonzeros=21651 > ? ? ? ? ? ? ? ? ?using I-node routines: found 186 nodes, limit used is > 5 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=283, cols=283 > ? ? ? ? ?total: nonzeros=9967, allocated nonzeros=14150 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? 
?Matrix Object: > ? ? ? ?type=mpiaij, rows=283, cols=283 > ? ? ? ?total: nonzeros=9967, allocated nonzeros=9967 > ? ? ? ? ?not using I-node (on process 0) routines > ?Down solver (pre-smoother) on level 1 ------------------------------- > ? ?KSP Object:(mg_levels_1_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=1 > ? ? ?maximum iterations=4, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_1_) > ? ? ?type: sor > ? ? ? ?SOR: type = local_symmetric, iterations = 1, omega = 1 > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=10654, cols=10654 > ? ? ? ?total: nonzeros=376634, allocated nonzeros=376634 > ? ? ? ? ?not using I-node (on process 0) routines > ?Up solver (post-smoother) on level 1 ------------------------------- > ? ?KSP Object:(mg_levels_1_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=1 > ? ? ?maximum iterations=4 > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_1_) > ? ? ?type: sor > ? ? ? ?SOR: type = local_symmetric, iterations = 1, omega = 1 > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=10654, cols=10654 > ? ? ? ?total: nonzeros=376634, allocated nonzeros=376634 > ? ? ? ? ?not using I-node (on process 0) routines > ?Down solver (pre-smoother) on level 2 ------------------------------- > ? ?KSP Object:(mg_levels_2_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=1 > ? ? ?maximum iterations=4, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_2_) > ? ? ?type: sor > ? ? ? ?SOR: type = local_symmetric, iterations = 1, omega = 1 > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=411866, cols=411866 > ? ? ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ? ? ?not using I-node (on process 0) routines > ?Up solver (post-smoother) on level 2 ------------------------------- > ? ?KSP Object:(mg_levels_2_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=1 > ? ? ?maximum iterations=4 > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_2_) > ? ? ?type: sor > ? ? ? ?SOR: type = local_symmetric, iterations = 1, omega = 1 > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=411866, cols=411866 > ? ? ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ? ? ?not using I-node (on process 0) routines > ?linear system matrix = precond matrix: > ?Matrix Object: > ? ?type=mpiaij, rows=411866, cols=411866 > ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ?not using I-node (on process 0) routines > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. 
> -- Norbert Wiener

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From Harun.BAYRAKTAR at 3ds.com  Thu Jul 30 06:30:30 2009
From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun)
Date: Thu, 30 Jul 2009 07:30:30 -0400
Subject: Smoother settings for AMG
In-Reply-To: <02E52450-1C28-4451-BAE8-E31563BCF1A3@mcs.anl.gov>
References: <02E52450-1C28-4451-BAE8-E31563BCF1A3@mcs.anl.gov>
Message-ID: 

Barry,

I sent the matrix and rhs file name yesterday to the petsc-maint address.
Did you get it OK?

Thanks a lot for your help,
Harun

-----Original Message-----
From: petsc-users-bounces at mcs.anl.gov
[mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Wednesday, July 29, 2009 5:05 PM
To: PETSc users list
Subject: Re: Smoother settings for AMG

   Can you save the matrix and right hand side with the option
-ksp_view_binary and send the file "output" to petsc-maint at mcs.anl.gov
(not this email).

   Barry

   If it is too big to email you can ftp it to info.mcs.anl.gov
(anonymous login) and put it in the directory incoming then send us
email petsc-maint at mcs.anl.gov with the filename.

On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote:

> Hi,
>
> I am trying to solve a system of equations and I am having difficulty
> picking the right smoothers for AMG (using ML as pc_type) in PETSc for
> parallel execution. First here is what happens in terms of CG
> (ksp_type)
> iteration counts (both columns use block jacobi):
>
> cpus    |   AMG w/ ICC(0) x1   |   AMG w/ SOR x4
> ------------------------------------------------------
> 1       |          43          |          243
> 4       |          699         |          379
>
> x1 or x4 means 1 or 4 iterations of smoother application at each AMG
> level (all details from ksp view for the 4 cpu run are below). The
> main
> observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but
> falls
> apart in parallel. SOR on the other hand experiences a 1.5X increase
> in
> iteration count which is totally expected from the quality of
> coarsening
> ML delivers in parallel.
>
> I basically would like to find a way (if possible) to have the
> number of
> iterations in parallel stay with 1-2X of 1 cpu iteration count for the
> AMG w/ ICC case. Is there a way to achieve this?
> > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > 
ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > 
%%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega 
= 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > From darach at tchpc.tcd.ie Thu Jul 30 10:08:11 2009 From: darach at tchpc.tcd.ie (darach at tchpc.tcd.ie) Date: Thu, 30 Jul 2009 16:08:11 +0100 Subject: Compiling petsc-dev with c++/boost/sieve Message-ID: <20090730150811.GG23977@tchpc.tcd.ie> Hi, I'm trying to run examples from the $PETSC_DIR/src/dm/mesh/examples/tutorials directory. I'm interested in the mixedpoisson example, but I've tried compiling the ex[1-3] examples, with results that I give below. I've also compiled some files in the $PETSC_DIR/src/dm/mesh/examples/tests directory with mostly failures as well. Details of the petsc compilation are at the bottom of the email These compilations take place with PETSC_DIR=, and with PETSC_ARCH set, but I've had similar problems after a 'make install' Examples elsewhere in the petsc tree appear to compile and run correctly I don't want to waste anyones time wading through reams of output, so I'm really asking whether this output indicates obviously that I have failed to configure/compile petsc correctly with c++/boost/sieve, and therefore have no hope of successfully compiling the examples? Darach petsc-dev: HG revision: f9c1b044f127006244143f415f257bfbc93c7a6e HG Date: Tue Jul 28 16:59:48 2009 -0500 %> gcc --version gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44) ..... %> make ex1 mpicxx -o ex1.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g -I/home/user/Compile/petsc-dev/src/dm/mesh/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -I/home/user/Compile/petsc-dev/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc-dev/include -I/home/user/Compile/petsc-dev/include/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -D__SDIR__="src/dm/mesh/examples/tutorials/" ex1.c ex1.c: In function ???PetscErrorCode CreatePartition(_p_Mesh*, _p_SectionInt**)???: ex1.c:170: error: invalid initialization of reference of type ???ALE::Obj > > >, ALE::malloc_allocator > > > > >&??? from expression of type ???ALE::Obj >??? /home/user/Compile/petsc-dev/include/petscmesh.h:108: error: in passing argument 2 of ???PetscErrorCode MeshGetMesh(_p_Mesh*, ALE::Obj > > >, ALE::malloc_allocator > > > > >&)??? ex1.c:171: error: invalid conversion from ???int??? to ???const char*??? ex1.c:171: error: invalid conversion from ???_p_SectionInt**??? to ???PetscInt??? /home/user/Compile/petsc-dev/include/petscmesh.h:255: error: too few arguments to function ???PetscErrorCode MeshGetCellSectionInt(_p_Mesh*, const char*, PetscInt, _p_SectionInt**)??? ex1.c:171: error: at this point in file ex1.c:173: error: invalid initialization of reference of type ???ALE::Obj >, ALE::malloc_allocator > > >&??? from expression of type ???ALE::Obj, ALE::UniformSection > >, ALE::malloc_allocator, ALE::UniformSection > > > >??? .... 
%> make ex2 mpicxx -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g -I/home/user/Compile/petsc-dev/src/dm/mesh/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -I/home/user/Compile/petsc-dev/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc-dev/include -I/home/user/Compile/petsc-dev/include/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -D__SDIR__="src/dm/mesh/examples/tutorials/" ex2.c ex2.c:22:38: error: ../src/dm/mesh/meshpcice.h: No such file or directory ex2.c: In function ???PetscErrorCode CreateSquareBoundary(const ALE::Obj >&)???: ex2.c:148: error: ???class ALE::Mesh::topology_type??? has not been declared ex2.c:148: error: expected initializer before ???patch??? ex2.c:153: error: ???topology_type??? is not a member of ???ALE::Mesh??? ex2.c:153: error: ???topology_type??? is not a member of ???ALE::Mesh??? ex2.c:153: error: template argument 1 is invalid ex2.c:153: error: template argument 2 is invalid ex2.c:153: error: invalid type in declaration before ???=??? token ex2.c:153: error: expected type-specifier ex2.c:153: error: invalid conversion from ???int*??? to ???int??? ex2.c:153: error: expected ???,??? or ???;??? ex2.c:182: error: base operand of ???->??? is not a pointer ex2.c:182: error: ???patch??? was not declared in this scope ex2.c:183: error: base operand of ???->??? is not a pointer ex2.c:184: error: ???class ALE::Mesh??? has no member named ???setTopology??? ex2.c:185: error: ???SieveBuilder??? is not a member of ???ALE::New??? ex2.c:185: error: expected primary-expression before ???>??? token ex2.c:185: error: ???::buildCoordinates??? has not been declared ex2.c:187: error: ???topology_type??? is not a member of ???ALE::Mesh??? ex2.c:187: error: ???topology_type??? is not a member of ???ALE::Mesh??? ex2.c:187: error: template argument 1 is invalid .... %> make ex3 mpicxx -o ex3.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g -I/home/user/Compile/petsc-dev/src/dm/mesh/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -I/home/user/Compile/petsc-dev/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc-dev/include -I/home/user/Compile/petsc-dev/include/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -D__SDIR__="src/dm/mesh/examples/tutorials/" ex3.c ex3.c:26: error: ???Two??? is not a member of ???ALE??? ex3.c:26: error: ???Two??? is not a member of ???ALE??? ex3.c:26: error: template argument 1 is invalid ex3.c:26: error: template argument 2 is invalid ex3.c:27: error: ???Two??? is not a member of ???ALE??? ex3.c:27: error: ???Two??? is not a member of ???ALE??? .... 
Compilation Details: --------------------- [user at machine petsc-dev]$ ./config/configure.py --prefix=/home/user/install_home/petsc-dev-defaultboost-sieve-comp --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc-dev/externalpackages/boost.tar.gz --with-sieve=1 ================================================================================= Configuring PETSc to compile on your system ================================================================================= TESTING: alternateConfigureLibrary from PETSc.packages.mpi4py(config/PETSc/packages/mpi4py.py:54) Compilers: C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g Linkers: Static linker: /usr/bin/ar cr PETSc: PETSC_ARCH: linux-gnu-cxx-debug PETSC_DIR: /home/user/Compile/petsc-dev ** ** Now build the libraries with "make all" ** Clanguage: Cxx Scalar type:complex MPI: Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib X11: Includes: [''] Library: ['-lX11'] PETSc shared libraries: disabled PETSc dynamic libraries: disabled BLAS/LAPACK: -llapack -lblas Sieve: Includes: -I/home/user/Compile/petsc-dev/include/sieve Boost: Includes: -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib c2html: sowing: Using mpiexec: /misc/shared/apps/openmpi/gcc/64/1.2.8/bin/mpiexec ========================================== /bin/rm -f -f /home/user/Compile/petsc-dev/linux-gnu-cxx-debug/lib/libpetsc*.* /bin/rm -f -f /home/user/Compile/petsc-dev/linux-gnu-cxx-debug/include/petsc*.mod BEGINNING TO COMPILE LIBRARIES IN ALL DIRECTORIES ========================================= .... libfast in: /home/user/Compile/petsc-dev/src/snes/examples/tutorials libfast in: /home/user/Compile/petsc-dev/src/snes/examples/tutorials/ex10d libfast in: /home/user/Compile/petsc-dev/src/snes/utils libfast in: /home/user/Compile/petsc-dev/src/snes/utils/sieve /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1348: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1576: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? 
/home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? meshmgsnes.c:63: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshmgsnes.c:349: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor .... libfast in: /home/user/Compile/petsc-dev/src/dm/da/utils/f90-custom libfast in: /home/user/Compile/petsc-dev/src/dm/mesh /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1148: instantiated from ???ALE::IFSieve::IFSieve(ompi_communicator_t*, int) [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? mesh.c:1642: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? mesh.c:2846: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: meshpcice.c:387: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: meshpcice.c:388: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? 
/home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1348: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1576: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? meshpflotran.c:235: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshpflotran.c:903: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1332: instantiated from ???ALE::IBundle::IBundle(ompi_communicator_t*, int) [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1602: instantiated from ???ALE::IMesh::IMesh(ompi_communicator_t*, int, int) [with Label_ = ALE::LabelSifter > >]??? meshexodus.c:183: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? 
meshexodus.c:364: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor section.c: In function ???PetscErrorCode SectionRealCreateLocalVector(_p_SectionReal*, _p_Vec**)???: section.c:582: warning: unused variable ???ierr??? /home/user/Compile/petsc-dev/include/sieve/Field.hh: In member function ???void ALE::GeneralSection::zero() [with Point_ = int, Value_ = int, Alloc_ = ALE::malloc_allocator, Atlas_ = ALE::IUniformSection >, BCAtlas_ = ALE::ISection >]???: section.c:1354: instantiated from here /home/user/Compile/petsc-dev/include/sieve/Field.hh:1740: warning: passing ???double??? for argument 1 to ???void ALE::GeneralSection::set(Value_) [with Point_ = int, Value_ = int, Alloc_ = ALE::malloc_allocator, Atlas_ = ALE::IUniformSection >, BCAtlas_ = ALE::ISection >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: At global scope: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/petscmesh_viewers.hh:490: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? section.c:1532: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/sieve libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/impls libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/impls/cartesian /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? 
/home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1348: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1576: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? cartesian.c:263: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? cartesian.c:269: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/ftn-auto libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/ftn-custom libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/f90-custom libfast in: /home/user/Compile/petsc-dev/src/dm/adda libfast in: /home/user/Compile/petsc-dev/src/dm/adda/examples libfast in: /home/user/Compile/petsc-dev/src/dm/adda/examples/tests libfast in: /home/user/Compile/petsc-dev/src/dm/adda/ftn-auto libfast in: /home/user/Compile/petsc-dev/src/dm/f90-mod libfast in: /home/user/Compile/petsc-dev/src/dm/ftn-custom libfast in: /home/user/Compile/petsc-dev/src/contrib libfast in: /home/user/Compile/petsc-dev/src/contrib/fun3d libfast in: /home/user/Compile/petsc-dev/src/benchmarks libfast in: /home/user/Compile/petsc-dev/src/docs libfast in: /home/user/Compile/petsc-dev/include libfast in: /home/user/Compile/petsc-dev/include/finclude libfast in: /home/user/Compile/petsc-dev/include/finclude/ftn-auto libfast in: /home/user/Compile/petsc-dev/include/finclude/ftn-custom libfast in: /home/user/Compile/petsc-dev/include/private libfast in: /home/user/Compile/petsc-dev/include/sieve libfast in: /home/user/Compile/petsc-dev/include/ftn-auto petschf.c: In function ???PetscErrorCode petscmemcpy_(void*, void*, size_t*, int*)???: petschf.c:42: warning: no return statement in function returning non-void libfast in: /home/user/Compile/petsc-dev/tutorials libfast in: /home/user/Compile/petsc-dev/tutorials/multiphysics Completed building libraries From u.tabak at tudelft.nl Thu Jul 30 14:52:37 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Thu, 30 Jul 2009 21:52:37 +0200 Subject: About binary matrix formats Message-ID: <4A71FA05.3060206@tudelft.nl> Dear all, I was looking at the format on the MatLoad reference page, I would like to interface some matrices to petsc . 
Reading them in Matrix Market format can take too long if the matrices are large. The first integer is the binary file marker, MAT_FILE_COOKIE. When I read the file in Matlab with fread(fid, 1, 'int') I get the value 1211216, which shows that it is a binary file. But doing the same in C++ with the following code does not give the true value. I know that reinterpret_cast can be quite implementation dependent. K33.bin is a matrix saved in binary format by PETSc.

// fragment from inside main(); needs <fstream> and <iostream>, using namespace std
char c[4];
char inname[] = "K33.bin";
ifstream infile(inname, ios::binary);
if (!infile){
    cout << "Couldn't open file " << inname << " for reading." << endl;
    return 1;
}
infile.read(c, 4);            // read the first 4 bytes of the header
int val;
cout << (val = *(reinterpret_cast<int*>(c)));   // interpret them in the machine's native byte order

If I write some numbers in binary format with some simple code, say 0 1 2, I can convert them back to numerical values with this code, but I could not understand what happened in the case of the PETSc binary file. Should I do something byte by byte? Most probably this is due to my deficient programming knowledge, but clarification is appreciated.
Thanks and best regards,
Umut

From bsmith at mcs.anl.gov Thu Jul 30 18:45:19 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 30 Jul 2009 18:45:19 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID:
Harun,
I have played around with this matrix. It is a nasty matrix; I think it is really beyond the normal capacity of ML (and hypre's boomerAMG).
Even the "convergence" you were getting below is BOGUS. If you run with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual you'll see that the "true" residual norm is actually only creeping toward zero, and at the converged 43 iterations below the true residual norm has decreased by less than a factor of ten. (The preconditioned residual norm has decreased by 1.e5, so the iteration stops and you think it has converged. In really hard problems preconditioners sometimes scale things in a funky way, so a large decrease in the preconditioned residual norm does not mean a large decrease in the true residual norm.) In other words, the "answer" you got out of the runs below is garbage.
I suggest:
1) check carefully that the matrix being created actually matches the model's equations; if they seem right, then
2) see if you can change the model so it does not generate such hopeless matrices. If you MUST solve this nasty matrix,
3) bite the bullet and use a parallel direct solver from PETSc. Try both MUMPS and SuperLU_dist.
Good luck,
Barry
On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG > (ksp_type) > iteration counts (both columns use block jacobi): > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The > main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but > falls > apart in parallel. SOR on the other hand experiences a 1.5X increase > in > iteration count which is totally expected from the quality of > coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the > number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this?
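Coming back to the binary-format question above: PETSc writes its binary files in big-endian byte order, while most workstations (x86) are little-endian, so reading the first four bytes and reinterpreting them in native order prints a byte-swapped value instead of 1211216. Below is a minimal sketch that decodes the header integer explicitly from big-endian bytes; the file name K33.bin is taken from the question and the rest is illustrative, not code from PETSc.

// Read the first header integer of a PETSc binary file and decode it
// as a big-endian 32-bit value, independent of the host byte order.
#include <fstream>
#include <iostream>

int main()
{
    const char inname[] = "K33.bin";                 // file name from the question
    std::ifstream infile(inname, std::ios::binary);
    if (!infile) {
        std::cout << "Couldn't open file " << inname << " for reading." << std::endl;
        return 1;
    }
    unsigned char c[4];
    infile.read(reinterpret_cast<char*>(c), 4);      // first 4 bytes of the header
    // assemble the value most-significant byte first (big-endian)
    const int cookie = (c[0] << 24) | (c[1] << 16) | (c[2] << 8) | c[3];
    std::cout << "header value: " << cookie << std::endl;   // 1211216 for a Mat file
    return 0;
}

The integers and scalars that follow the header are stored the same way; MatLoad byte-swaps them internally on little-endian machines, so reading the file through PETSc (or swapping bytes as above) avoids the problem.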
> > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > 
ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > 
%%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega 
= 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > >
From Harun.BAYRAKTAR at 3ds.com Fri Jul 31 13:15:17 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Fri, 31 Jul 2009 14:15:17 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID:
Barry,
Thanks a lot for looking into this. One thing I want to clarify is that the 43 iterations on 1 cpu (should have been 46, sorry for the typo) seem like a real convergence to me. I do look at the unpreconditioned residual norm to determine convergence. For this I use:
ierr = KSPSetNormType(m_solver, KSP_NORM_UNPRECONDITIONED); CHKERRQ(ierr);
Then I check convergence through KSPSetConvergenceTest. As an experiment I commented out the line above where I tell KSP to use the unpreconditioned norm, and while the ||r|| values changed (naturally), it still converged in a slightly larger number of iterations (56).
I am familiar with the preconditioned norm going down 6 orders while the true relative norm is 0.1 or so (i.e., the problem is not solved at all). This usually happens to me in structural mechanics problems with ill-conditioned systems when I use a KSP method that does not allow the unpreconditioned residual to be monitored. However, this does not seem to be one of those cases; maybe I am missing something.
Out of curiosity, did you use ksp/ksp/examples/tutorials/ex10.c to solve this?
Thanks again,
Harun

-----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Thursday, July 30, 2009 7:45 PM To: PETSc users list Subject: Re: Smoother settings for AMG Harun, I have played around with this matrix. It is a nasty matrix; I think it is really beyond the normal capacity of ML (and hypre's boomerAMG). Even the "convergence" you were getting below is BOGUS. If you run with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual you'll see that the "true" residual norm is actually creeping to zero and at the converged 43 iterations below the true residual norm has decreased by like less than 1/10. (The preconditioned residual norm has decreased by 1.e 5 so the iteration stops and you think it has converged. In really hard problems preconditioners sometimes scales things in a funky way so a large decrease in preconditioned residual norm does not mean a large decrease in true residual norm). In other words the "answer" you got out of the runs below is garbage. I suggest, 1) check carefully that the matrix being created actually matches the model's equations, if they seem right then 2) see if you can change the model so it does not generate such hopeless matrices. If you MUST solve this nasty matrix 3) bite the bullet and use a parallel direct solver from PETSc. Try both MUMPS and SuperLU_dist Good luck, Barry On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution.
First here is what happens in terms of CG > (ksp_type) > iteration counts (both columns use block jacobi): > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The > main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but > falls > apart in parallel. SOR on the other hand experiences a 1.5X increase > in > iteration count which is totally expected from the quality of > coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the > number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? > > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node 
routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel 
shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > %%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP 
Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > From bsmith at mcs.anl.gov Fri Jul 31 13:25:13 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 31 Jul 2009 13:25:13 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: On Jul 31, 2009, at 1:15 PM, BAYRAKTAR Harun wrote: > Barry, > > Thanks a lot for looking in to this. One thing I want to clarify is > that the 43 (should have been 46 sorry for the typo) iterations on 1 > cpu seems like a real convergence to me. I do look at the > unpreconditioned residual norm to determine convergence. For this I > use: > > ierr = KSPSetNormType(m_solver, KSP_NORM_UNPRECONDITIONED); > CHKERRQ(ierr); > > Then I check convergence through KSPSetConvergenceTest. As an > experiment I commented out the line above where I tell KSP to use > the unpreconditioned norm and while the ||r|| values changed > (naturally), it still converged in slightly more number of > iterations (56). > > I am familiar with the preconditioned norm going down 6 orders while > the true relative norm is 0.1 or so (i.e., problem not solved at > all). This usually happens to me in structural mechanics problems > with ill conditioned systems and I use a KSP method that does not > allow for the unpreconditioned residual to be monitored. However, > this does not seem to be one of those cases though, maybe I am > missing something. Ok. I didn't see what you report (I saw it just iterating away for a long time with the unpreconditioned norm) but then you never sent the command line options for the solver you used so I may have run it differently. > > Out of curiosity did you use ksp/ksp/examples/tutorials/ex10.c to > solve this? Yes. > > Thanks again, > Harun > > > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov > ] On Behalf Of Barry Smith > Sent: Thursday, July 30, 2009 7:45 PM > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > Harun, > > I have played around with this matrix. It is a nasty matrix; I > think it is really beyond the normal capacity of ML (and hypre's > boomerAMG). > > Even the "convergence" you were getting below is BOGUS. 
If you run > with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual > you'll see that the "true" residual norm is actually creeping to zero > and at the converged 43 iterations below the true residual norm has > decreased by like less than 1/10. (The preconditioned residual norm > has decreased by 1.e 5 so the iteration stops and you think it has > converged. In really hard problems preconditioners sometimes scales > things in a funky way so a large decrease in preconditioned residual > norm does not mean a large decrease in true residual norm). In other > words the "answer" you got out of the runs below is garbage. > > I suggest, > 1) check carefully that the matrix being created actually matches the > model's equations, if they seem right then > 2) see if you can change the model so it does not generate such > hopeless matrices. If you MUST solve this nasty matrix > 3) bite the bullet and use a parallel direct solver from PETSc. Try > both MUMPS and SuperLU_dist > > Good luck, > > Barry > > > > > On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > >> Hi, >> >> I am trying to solve a system of equations and I am having difficulty >> picking the right smoothers for AMG (using ML as pc_type) in PETSc >> for >> parallel execution. First here is what happens in terms of CG >> (ksp_type) >> iteration counts (both columns use block jacobi): >> >> cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 >> ------------------------------------------------------ >> 1 | 43 | 243 >> 4 | 699 | 379 >> >> x1 or x4 means 1 or 4 iterations of smoother application at each AMG >> level (all details from ksp view for the 4 cpu run are below). The >> main >> observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but >> falls >> apart in parallel. SOR on the other hand experiences a 1.5X increase >> in >> iteration count which is totally expected from the quality of >> coarsening >> ML delivers in parallel. >> >> I basically would like to find a way (if possible) to have the >> number of >> iterations in parallel stay with 1-2X of 1 cpu iteration count for >> the >> AMG w/ ICC case. Is there a way to achieve this? 
>> >> Thanks, >> Harun >> >> %%%%%%%%%%%%%%%%%%%%%%%%% >> AMG w/ ICC(0) x1 ksp_view >> %%%%%%%%%%%%%%%%%%%%%%%%% >> KSP Object: >> type: cg >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: ml >> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, >> post-smooths=1 >> Coarse gride solver -- level 0 ------------------------------- >> KSP Object:(mg_coarse_) >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_coarse_) >> type: redundant >> Redundant preconditioner: First (color=0) of 4 PCs follows >> KSP Object:(mg_coarse_redundant_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_coarse_redundant_) >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: factor fill ratio needed 2.17227 >> Factored matrix follows >> Matrix Object: >> type=seqaij, rows=283, cols=283 >> total: nonzeros=21651, allocated nonzeros=21651 >> using I-node routines: found 186 nodes, limit used is >> 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=283, cols=283 >> total: nonzeros=9967, allocated nonzeros=14150 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=283, cols=283 >> total: nonzeros=9967, allocated nonzeros=9967 >> not using I-node (on process 0) routines >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object:(mg_levels_1_) >> type: richardson >> Richardson: damping factor=0.9 >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_) >> type: bjacobi >> block Jacobi: number of blocks = 4 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(mg_levels_1_sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_sub_) >> type: icc >> ICC: 0 levels of fill >> ICC: factor fill ratio allocated 1 >> ICC: using Manteuffel shift >> ICC: factor fill ratio needed 0.514899 >> Factored matrix follows >> Matrix Object: >> type=seqsbaij, rows=2813, cols=2813 >> total: nonzeros=48609, allocated nonzeros=48609 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=2813, cols=2813 >> total: nonzeros=94405, allocated nonzeros=94405 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=10654, cols=10654 >> total: nonzeros=376634, allocated nonzeros=376634 >> not using I-node (on process 0) routines >> Up solver (post-smoother) on level 1 ------------------------------- >> KSP Object:(mg_levels_1_) >> type: richardson >> Richardson: damping factor=0.9 >> maximum iterations=1 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_) >> type: bjacobi >> block Jacobi: number of blocks = 4 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(mg_levels_1_sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: 
relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_sub_) >> type: icc >> ICC: 0 levels of fill >> ICC: factor fill ratio allocated 1 >> ICC: using Manteuffel shift >> ICC: factor fill ratio needed 0.514899 >> Factored matrix follows >> Matrix Object: >> type=seqsbaij, rows=2813, cols=2813 >> total: nonzeros=48609, allocated nonzeros=48609 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=2813, cols=2813 >> total: nonzeros=94405, allocated nonzeros=94405 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=10654, cols=10654 >> total: nonzeros=376634, allocated nonzeros=376634 >> not using I-node (on process 0) routines >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object:(mg_levels_2_) >> type: richardson >> Richardson: damping factor=0.9 >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_) >> type: bjacobi >> block Jacobi: number of blocks = 4 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(mg_levels_2_sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_sub_) >> type: icc >> ICC: 0 levels of fill >> ICC: factor fill ratio allocated 1 >> ICC: using Manteuffel shift >> ICC: factor fill ratio needed 0.519045 >> Factored matrix follows >> Matrix Object: >> type=seqsbaij, rows=101164, cols=101164 >> total: nonzeros=1378558, allocated nonzeros=1378558 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=101164, cols=101164 >> total: nonzeros=2655952, allocated nonzeros=5159364 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> Up solver (post-smoother) on level 2 ------------------------------- >> KSP Object:(mg_levels_2_) >> type: richardson >> Richardson: damping factor=0.9 >> maximum iterations=1 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_) >> type: bjacobi >> block Jacobi: number of blocks = 4 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(mg_levels_2_sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_sub_) >> type: icc >> ICC: 0 levels of fill >> ICC: factor fill ratio allocated 1 >> ICC: using Manteuffel shift >> ICC: factor fill ratio needed 0.519045 >> Factored matrix follows >> Matrix Object: >> type=seqsbaij, rows=101164, cols=101164 >> total: nonzeros=1378558, allocated nonzeros=1378558 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=101164, cols=101164 >> total: nonzeros=2655952, allocated nonzeros=5159364 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Matrix 
Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> >> %%%%%%%%%%%%%%%%%%%%%% >> AMG w/ SOR x4 ksp_view >> %%%%%%%%%%%%%%%%%%%%%% >> >> KSP Object: >> type: cg >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: ml >> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, >> post-smooths=1 >> Coarse gride solver -- level 0 ------------------------------- >> KSP Object:(mg_coarse_) >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_coarse_) >> type: redundant >> Redundant preconditioner: First (color=0) of 4 PCs follows >> KSP Object:(mg_coarse_redundant_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_coarse_redundant_) >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: factor fill ratio needed 2.17227 >> Factored matrix follows >> Matrix Object: >> type=seqaij, rows=283, cols=283 >> total: nonzeros=21651, allocated nonzeros=21651 >> using I-node routines: found 186 nodes, limit used is >> 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=283, cols=283 >> total: nonzeros=9967, allocated nonzeros=14150 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=283, cols=283 >> total: nonzeros=9967, allocated nonzeros=9967 >> not using I-node (on process 0) routines >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object:(mg_levels_1_) >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=4, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_) >> type: sor >> SOR: type = local_symmetric, iterations = 1, omega = 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=10654, cols=10654 >> total: nonzeros=376634, allocated nonzeros=376634 >> not using I-node (on process 0) routines >> Up solver (post-smoother) on level 1 ------------------------------- >> KSP Object:(mg_levels_1_) >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=4 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_) >> type: sor >> SOR: type = local_symmetric, iterations = 1, omega = 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=10654, cols=10654 >> total: nonzeros=376634, allocated nonzeros=376634 >> not using I-node (on process 0) routines >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object:(mg_levels_2_) >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=4, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_) >> type: sor >> SOR: type = local_symmetric, iterations = 1, omega = 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> Up solver (post-smoother) on level 2 
------------------------------- >> KSP Object:(mg_levels_2_) >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=4 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_) >> type: sor >> SOR: type = local_symmetric, iterations = 1, omega = 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> >> > From Harun.BAYRAKTAR at 3ds.com Fri Jul 31 13:55:49 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Fri, 31 Jul 2009 14:55:49 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: Barry, On Monday I'll use ex10.c to reproduce and send you the full options. Thanks, Harun -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Friday, July 31, 2009 2:25 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Jul 31, 2009, at 1:15 PM, BAYRAKTAR Harun wrote: > Barry, > > Thanks a lot for looking in to this. One thing I want to clarify is > that the 43 (should have been 46 sorry for the typo) iterations on 1 > cpu seems like a real convergence to me. I do look at the > unpreconditioned residual norm to determine convergence. For this I > use: > > ierr = KSPSetNormType(m_solver, KSP_NORM_UNPRECONDITIONED); > CHKERRQ(ierr); > > Then I check convergence through KSPSetConvergenceTest. As an > experiment I commented out the line above where I tell KSP to use > the unpreconditioned norm and while the ||r|| values changed > (naturally), it still converged in slightly more number of > iterations (56). > > I am familiar with the preconditioned norm going down 6 orders while > the true relative norm is 0.1 or so (i.e., problem not solved at > all). This usually happens to me in structural mechanics problems > with ill conditioned systems and I use a KSP method that does not > allow for the unpreconditioned residual to be monitored. However, > this does not seem to be one of those cases though, maybe I am > missing something. Ok. I didn't see what you report (I saw it just iterating away for a long time with the unpreconditioned norm) but then you never sent the command line options for the solver you used so I may have run it differently. > > Out of curiosity did you use ksp/ksp/examples/tutorials/ex10.c to > solve this? Yes. > > Thanks again, > Harun > > > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov > ] On Behalf Of Barry Smith > Sent: Thursday, July 30, 2009 7:45 PM > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > Harun, > > I have played around with this matrix. It is a nasty matrix; I > think it is really beyond the normal capacity of ML (and hypre's > boomerAMG). > > Even the "convergence" you were getting below is BOGUS. If you run > with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual > you'll see that the "true" residual norm is actually creeping to zero > and at the converged 43 iterations below the true residual norm has > decreased by like less than 1/10. 
(The preconditioned residual norm > has decreased by 1.e 5 so the iteration stops and you think it has > converged. In really hard problems preconditioners sometimes scales > things in a funky way so a large decrease in preconditioned residual > norm does not mean a large decrease in true residual norm). In other > words the "answer" you got out of the runs below is garbage. > > I suggest, > 1) check carefully that the matrix being created actually matches the > model's equations, if they seem right then > 2) see if you can change the model so it does not generate such > hopeless matrices. If you MUST solve this nasty matrix > 3) bite the bullet and use a parallel direct solver from PETSc. Try > both MUMPS and SuperLU_dist > > Good luck, > > Barry > > > > > On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > >> Hi, >> >> I am trying to solve a system of equations and I am having difficulty >> picking the right smoothers for AMG (using ML as pc_type) in PETSc >> for >> parallel execution. First here is what happens in terms of CG >> (ksp_type) >> iteration counts (both columns use block jacobi): >> >> cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 >> ------------------------------------------------------ >> 1 | 43 | 243 >> 4 | 699 | 379 >> >> x1 or x4 means 1 or 4 iterations of smoother application at each AMG >> level (all details from ksp view for the 4 cpu run are below). The >> main >> observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but >> falls >> apart in parallel. SOR on the other hand experiences a 1.5X increase >> in >> iteration count which is totally expected from the quality of >> coarsening >> ML delivers in parallel. >> >> I basically would like to find a way (if possible) to have the >> number of >> iterations in parallel stay with 1-2X of 1 cpu iteration count for >> the >> AMG w/ ICC case. Is there a way to achieve this? 
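For reference, here is a minimal sketch of the convergence setup Harun describes earlier in the thread: the KSP bases its convergence test on the unpreconditioned residual norm, and the true-residual monitor Barry recommends is attached so the kind of false convergence discussed above shows up immediately. The helper routine and the KSP handle name are illustrative (not code from the thread), and the monitor routine is named as in petsc-3.0.

#include "petscksp.h"

/* illustrative helper, not part of PETSc or of the thread */
PetscErrorCode ConfigureSolverForTrueResidual(KSP solver)
{
  PetscErrorCode ierr;
  /* convergence test based on ||b - A*x|| rather than the preconditioned norm */
  ierr = KSPSetNormType(solver, KSP_NORM_UNPRECONDITIONED); CHKERRQ(ierr);
  /* same tolerances as shown in the ksp_view output above */
  ierr = KSPSetTolerances(solver, 1.e-5, 1.e-50, 1.e4, 10000); CHKERRQ(ierr);
  /* print the true residual every iteration, like -ksp_monitor_true_residual */
  ierr = KSPMonitorSet(solver, KSPMonitorTrueResidualNorm, PETSC_NULL, PETSC_NULL); CHKERRQ(ierr);
  return 0;
}

On the command line the equivalent for ex10.c is something like ./ex10 -f0 matrix.bin -ksp_type cg -pc_type ml -ksp_norm_type unpreconditioned -ksp_monitor_true_residual -ksp_rtol 1e-5, where matrix.bin stands in for the actual matrix file.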
>>
>> Thanks,
>> Harun
>>
>> %%%%%%%%%%%%%%%%%%%%%%%%%
>> AMG w/ ICC(0) x1 ksp_view
>> %%%%%%%%%%%%%%%%%%%%%%%%%
>> KSP Object:
>> type: cg
>> maximum iterations=10000
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:
>> type: ml
>> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1
>> Coarse gride solver -- level 0 -------------------------------
>> KSP Object:(mg_coarse_)
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_)
>> type: redundant
>> Redundant preconditioner: First (color=0) of 4 PCs follows
>> KSP Object:(mg_coarse_redundant_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_redundant_)
>> type: lu
>> LU: out-of-place factorization
>> matrix ordering: nd
>> LU: tolerance for zero pivot 1e-12
>> LU: factor fill ratio needed 2.17227
>> Factored matrix follows
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=21651, allocated nonzeros=21651
>> using I-node routines: found 186 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=14150
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=9967
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:(mg_levels_1_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.514899
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=2813, cols=2813
>> total: nonzeros=48609, allocated nonzeros=48609
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=2813, cols=2813
>> total: nonzeros=94405, allocated nonzeros=94405
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:(mg_levels_1_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.514899
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=2813, cols=2813
>> total: nonzeros=48609, allocated nonzeros=48609
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=2813, cols=2813
>> total: nonzeros=94405, allocated nonzeros=94405
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:(mg_levels_2_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.519045
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=101164, cols=101164
>> total: nonzeros=1378558, allocated nonzeros=1378558
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=101164, cols=101164
>> total: nonzeros=2655952, allocated nonzeros=5159364
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:(mg_levels_2_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.519045
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=101164, cols=101164
>> total: nonzeros=1378558, allocated nonzeros=1378558
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=101164, cols=101164
>> total: nonzeros=2655952, allocated nonzeros=5159364
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>>
>> %%%%%%%%%%%%%%%%%%%%%%
>> AMG w/ SOR x4 ksp_view
>> %%%%%%%%%%%%%%%%%%%%%%
>>
>> KSP Object:
>> type: cg
>> maximum iterations=10000
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:
>> type: ml
>> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1
>> Coarse gride solver -- level 0 -------------------------------
>> KSP Object:(mg_coarse_)
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_)
>> type: redundant
>> Redundant preconditioner: First (color=0) of 4 PCs follows
>> KSP Object:(mg_coarse_redundant_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_redundant_)
>> type: lu
>> LU: out-of-place factorization
>> matrix ordering: nd
>> LU: tolerance for zero pivot 1e-12
>> LU: factor fill ratio needed 2.17227
>> Factored matrix follows
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=21651, allocated nonzeros=21651
>> using I-node routines: found 186 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=14150
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=9967
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>>
>>
>
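A note on Barry's first point above: the cheapest sanity check is to compute the unpreconditioned residual norm ||b - A*x|| yourself after KSPSolve() and compare it with the norm the Krylov method reports (running with -ksp_monitor_true_residual, where your PETSc build provides it, gives the same information per iteration). The following is a minimal sketch, not code from this thread, written against the petsc-3.0-era C API and assuming A, b and x are the user's matrix, right-hand side and solution:

#include "petscksp.h"

/* Sketch only: report ||b - A*x|| after a solve so it can be compared with
   the preconditioned residual norm the KSP monitors print.  Uses the
   petsc-3.0-era calling sequence (VecDestroy() takes the Vec itself in that
   release; newer releases take its address). */
PetscErrorCode CheckTrueResidual(Mat A, Vec b, Vec x)
{
  Vec            r;
  PetscReal      rnorm, bnorm;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
  ierr = MatMult(A, x, r);CHKERRQ(ierr);      /* r = A*x     */
  ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);   /* r = b - A*x */
  ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
  ierr = VecNorm(b, NORM_2, &bnorm);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "true residual norm %g (relative %g)\n",
                     (double)rnorm, (double)(rnorm/bnorm));CHKERRQ(ierr);
  ierr = VecDestroy(r);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If that check confirms the solve is not really converging, Barry's third suggestion amounts to switching to an LU preconditioner backed by an external factorization package (MUMPS or SuperLU_dist) selected at run time; the exact option name for choosing the package differs between releases, so check the -help output for your version rather than trusting memory.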
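On Harun's question, it may help to see the two smoother setups from the iteration-count table written out explicitly. The sketch below is not from the thread: it fills the options database before KSPSetFromOptions() would be called, the mg_levels_1_/mg_levels_2_ prefixes are taken from the -ksp_view output above, and the option names themselves are assumptions to be verified against -help for the PETSc version in use (newer releases also accept an unnumbered -mg_levels_ form that applies to every level at once).

#include "petscksp.h"

/* Sketch only: the AMG w/ ICC(0) x1 column -- one Richardson sweep per
   level, preconditioned by block Jacobi with ICC(0) on each block. */
PetscErrorCode UseICCSmoothers(void)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscOptionsSetValue("-pc_type", "ml");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_ksp_type", "richardson");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_ksp_max_it", "1");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_pc_type", "bjacobi");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_sub_pc_type", "icc");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_ksp_type", "richardson");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_ksp_max_it", "1");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_pc_type", "bjacobi");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_sub_pc_type", "icc");CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Sketch only: the AMG w/ SOR x4 column -- four Richardson sweeps per
   level, each preconditioned by processor-local symmetric SOR. */
PetscErrorCode UseSORSmoothers(void)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscOptionsSetValue("-pc_type", "ml");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_ksp_type", "richardson");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_ksp_max_it", "4");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_pc_type", "sor");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_ksp_type", "richardson");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_ksp_max_it", "4");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_pc_type", "sor");CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Passing the same strings on the command line is equivalent; the programmatic form is shown only to make the two columns of the table concrete.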