From erlend.pedersen at holberger.com Fri Feb 1 05:54:24 2008 From: erlend.pedersen at holberger.com (Erlend Pedersen :.) Date: Fri, 01 Feb 2008 12:54:24 +0100 Subject: Overdetermined, non-linear Message-ID: <1201866864.6394.25.camel@erlend-ws.in.holberger.com> I am attempting to use the PETSc nonlinear solver on an overdetermined system of non-linear equations. Hence, the Jacobian is not square, and so far we have unfortunately not succeeded with any combination of snes, ksp and pc. Could you confirm that snes actually works for overdetermined systems, and if so, is there an application example we could look at in order to make sure there is nothing wrong with our test-setup? We have previously used the MINPACK routine LMDER very successfully, but for our current problem sizes we rely on the use of sparse matrix representations and parallel architectures. PETSc's abstractions and automatic MPI makes this system very attractive for us, and we have already used the PETSc LSQR solver with great success. Thank you very much. Regards, Erlend Pedersen :. From geenen at gmail.com Sat Feb 2 03:32:37 2008 From: geenen at gmail.com (Thomas Geenen) Date: Sat, 2 Feb 2008 10:32:37 +0100 Subject: assembly Message-ID: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> Dear Petsc users, I would like to understand what is slowing down the assembly phase of my matrix. I create a matrix with MatCreateMPIAIJ i make a rough guess of the number of off diagonal entries and then use a conservative value to make sure I do not need extra mallocs. (the number of diagonal entries is exact) next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. The first time i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd it takes about 170 seconds the second time 0.3 seconds. I run it on 6 cpu's and I do fill quit a number of row-entries on the "wrong" cpu. However thats also the case the second run. I checked that there are no additional mallocs MatGetInfo info.mallocs=0 both after MatSetValues and after MatAssemblyBegin, MatAssemblyEnd. cheers Thomas From jiaxun_hou at yahoo.com.cn Sat Feb 2 06:49:03 2008 From: jiaxun_hou at yahoo.com.cn (jiaxun hou) Date: Sat, 2 Feb 2008 20:49:03 +0800 (CST) Subject: About unpreconditioned residuals in Left Preconditioned GMRES Message-ID: <520833.35747.qm@web15802.mail.cnb.yahoo.com> Hi everyone, I want to use the Left Preconditioned GMRES to solve a linear system, and the stopping criterion must be based on the actual residuals (b-Ax). But the GMRES codes of PETSc seems to use the preconditioned residuals (B^-1(b-Ax)) only. In addition, when I set KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED), I receive the error message: "Currently can use GMRES with only preconditioned residual (right preconditioning not coded)". So, is there any way to set stopping criterion based on the actual residuals? Best regards, Jiaxun --------------------------------- ??????????????????? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dave.mayhem23 at gmail.com Sat Feb 2 07:29:33 2008 From: dave.mayhem23 at gmail.com (Dave May) Date: Sun, 3 Feb 2008 00:29:33 +1100 Subject: About unpreconditioned residuals in Left Preconditioned GMRES In-Reply-To: <520833.35747.qm@web15802.mail.cnb.yahoo.com> References: <520833.35747.qm@web15802.mail.cnb.yahoo.com> Message-ID: <956373f0802020529m5501b2b8t44549bebe9063e47@mail.gmail.com> Hi, You can use the function PetscErrorCode PETSCKSP_DLLEXPORT KSPSetConvergenceTest(KSP ksp,PetscErrorCode (*converge)(KSP,PetscInt,PetscReal,KSPConvergedReason*,void*),void *cctx) to define your own convergence test. Cheers, Dave. 2008/2/2 jiaxun hou : > Hi everyone, > > I want to use the Left Preconditioned GMRES to solve a linear system, and > the stopping criterion must be based on the actual residuals (b-Ax). But > the GMRES codes of PETSc seems to use the preconditioned residuals > (B^-1(b-Ax)) only. In addition, when I set > KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED), I receive the error message: > "Currently can use GMRES with only preconditioned residual (right > preconditioning not coded)". So, is there any way to set stopping criterion > based on the actual residuals? > > Best regards, > Jiaxun > > ------------------------------ > ??????????????????? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiaxun_hou at yahoo.com.cn Sat Feb 2 10:09:02 2008 From: jiaxun_hou at yahoo.com.cn (jiaxun hou) Date: Sun, 3 Feb 2008 00:09:02 +0800 (CST) Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20About=20unpreconditioned=20residual?= =?gb2312?q?s=20in=20Left=20Preconditioned=20GMRES?= In-Reply-To: <956373f0802020529m5501b2b8t44549bebe9063e47@mail.gmail.com> Message-ID: <491152.9114.qm@web15815.mail.cnb.yahoo.com> Thank you, Dave. But there is still a question: how can I get the residual vector in each iteration? It seems difficult to get it without modifying the GMRES codes. Best regards, Jiaxun Dave May ??? Hi, You can use the function PetscErrorCode PETSCKSP_DLLEXPORT KSPSetConvergenceTest(KSP ksp,PetscErrorCode (*converge)(KSP,PetscInt,PetscReal,KSPConvergedReason*,void*),void *cctx) to define your own convergence test. Cheers, Dave. 2008/2/2 jiaxun hou : Hi everyone, I want to use the Left Preconditioned GMRES to solve a linear system, and the stopping criterion must be based on the actual residuals (b-Ax). But the GMRES codes of PETSc seems to use the preconditioned residuals (B^-1(b-Ax)) only. In addition, when I set KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED), I receive the error message: "Currently can use GMRES with only preconditioned residual (right preconditioning not coded)". So, is there any way to set stopping criterion based on the actual residuals? Best regards, Jiaxun --------------------------------- ??????????????????? --------------------------------- ??????????????????? -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Sat Feb 2 11:33:51 2008 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Sat, 2 Feb 2008 11:33:51 -0600 (CST) Subject: assembly In-Reply-To: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> Message-ID: On Sat, 2 Feb 2008, Thomas Geenen wrote: > Dear Petsc users, > > I would like to understand what is slowing down the assembly phase of my matrix. 
> I create a matrix with MatCreateMPIAIJ i make a rough guess of the > number of off diagonal entries and then use a conservative value to > make sure I do not need extra mallocs. (the number of diagonal entries > is exact) > next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. > The first time i call MatSetValues and MatAssemblyBegin, > MatAssemblyEnd it takes about 170 seconds > the second time 0.3 seconds. > I run it on 6 cpu's and I do fill quit a number of row-entries on the > "wrong" cpu. However thats also the case the second run. I checked > that there are no additional mallocs > MatGetInfo info.mallocs=0 both after MatSetValues and after > MatAssemblyBegin, MatAssemblyEnd. Run your code with the option '-log_summary' and check which function call dominates the execution time. > I run it on 6 cpu's and I do fill quit a number of row-entries on the > "wrong" cpu. Likely, the communication that sending the entries to the corrected cpu consume the time. Can you fill the entries in the correct cpu? Hong > > cheers > Thomas > > From geenen at gmail.com Sat Feb 2 12:30:49 2008 From: geenen at gmail.com (Thomas Geenen) Date: Sat, 2 Feb 2008 19:30:49 +0100 Subject: assembly In-Reply-To: References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> Message-ID: <200802021930.49084.geenen@gmail.com> On Saturday 02 February 2008 18:33, Hong Zhang wrote: > On Sat, 2 Feb 2008, Thomas Geenen wrote: > > Dear Petsc users, > > > > I would like to understand what is slowing down the assembly phase of my > > matrix. I create a matrix with MatCreateMPIAIJ i make a rough guess of > > the number of off diagonal entries and then use a conservative value to > > make sure I do not need extra mallocs. (the number of diagonal entries is > > exact) > > next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. > > The first time i call MatSetValues and MatAssemblyBegin, > > MatAssemblyEnd it takes about 170 seconds > > the second time 0.3 seconds. > > I run it on 6 cpu's and I do fill quit a number of row-entries on the > > "wrong" cpu. However thats also the case the second run. I checked > > that there are no additional mallocs > > MatGetInfo info.mallocs=0 both after MatSetValues and after > > MatAssemblyBegin, MatAssemblyEnd. > > Run your code with the option '-log_summary' and check which function > call dominates the execution time. the time is spend in MatStashScatterGetMesg_Private > > > I run it on 6 cpu's and I do fill quit a number of row-entries on the > > "wrong" cpu. > > Likely, the communication that sending the entries to the > corrected cpu consume the time. Can you fill the entries in the > correct cpu? the second time the entries are filled on the wrong CPU as well. i am curious about the difference in time between run 1 and 2. > > Hong > > > cheers > > Thomas From bsmith at mcs.anl.gov Sat Feb 2 16:10:44 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 2 Feb 2008 16:10:44 -0600 Subject: =?GB2312?Q?Re:_=BB=D8=B8=B4=A3=BA_Re:_About_unpreconditioned_res?= =?GB2312?Q?iduals_in_Left_Preconditioned_GMRES?= In-Reply-To: <491152.9114.qm@web15815.mail.cnb.yahoo.com> References: <491152.9114.qm@web15815.mail.cnb.yahoo.com> Message-ID: <2BEBE7C4-DF7E-4B6D-9F47-510DFAC86153@mcs.anl.gov> To calculate the true residual norm at each iteration of left preconditioned GMRES requires actually forming b - A*x which means computing A*x which means computing x (which is not available without additional calculations at each iteration). 
This is why we do not support left preconditioning with true residual norm convergence test. You should use the KSP type of FGMRES, it is written using right preconditioning and for a standard PC is identical to regular GMRES. Barry On Feb 2, 2008, at 10:09 AM, jiaxun hou wrote: > Thank you, Dave. But there is still a question: how can I get the > residual vector in each iteration? It seems difficult to get it > without modifying the GMRES codes. > > Best regards, > Jiaxun > Dave May ??? > Hi, > You can use the function > PetscErrorCode PETSCKSP_DLLEXPORT KSPSetConvergenceTest(KSP > ksp,PetscErrorCode (*converge) > (KSP,PetscInt,PetscReal,KSPConvergedReason*,void*),void *cctx) > to define your own convergence test. > > Cheers, > Dave. > > > > > > 2008/2/2 jiaxun hou : > Hi everyone, > > I want to use the Left Preconditioned GMRES to solve a linear > system, and the stopping criterion must be based on the actual > residuals (b-Ax). But the GMRES codes of PETSc seems to use the > preconditioned residuals (B^-1(b-Ax)) only. In addition, when I set > KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED), I receive the error > message: "Currently can use GMRES with only preconditioned residual > (right preconditioning not coded)". So, is there any way to set > stopping criterion based on the actual residuals? > > Best regards, > Jiaxun > ??????????????????? > > > > ??????????????????? -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Feb 2 16:19:37 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 2 Feb 2008 16:19:37 -0600 Subject: assembly In-Reply-To: <200802021930.49084.geenen@gmail.com> References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> Message-ID: <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> The matstash has a concept of preallocation also. During the first setvalues it is allocating more and more memory for the stash. In the second setvalues the stash is large enough so does not require any addition allocation. You can use the option -matstash_initial_size to allocate enough space initially so that the first setvalues is also fast. It does not look like there is a way coded to get the that you should use. It should be set to the maximum nonzeros any process has that belongs to other processes. The stash handling code is in src/mat/utils/matstash.c, perhaps you can figure out how to printout with PetscInfo() the sizes needed? Barry On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote: > On Saturday 02 February 2008 18:33, Hong Zhang wrote: >> On Sat, 2 Feb 2008, Thomas Geenen wrote: >>> Dear Petsc users, >>> >>> I would like to understand what is slowing down the assembly phase >>> of my >>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough >>> guess of >>> the number of off diagonal entries and then use a conservative >>> value to >>> make sure I do not need extra mallocs. (the number of diagonal >>> entries is >>> exact) >>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. >>> The first time i call MatSetValues and MatAssemblyBegin, >>> MatAssemblyEnd it takes about 170 seconds >>> the second time 0.3 seconds. >>> I run it on 6 cpu's and I do fill quit a number of row-entries on >>> the >>> "wrong" cpu. However thats also the case the second run. I checked >>> that there are no additional mallocs >>> MatGetInfo info.mallocs=0 both after MatSetValues and after >>> MatAssemblyBegin, MatAssemblyEnd. 
>> >> Run your code with the option '-log_summary' and check which function >> call dominates the execution time. > > the time is spend in MatStashScatterGetMesg_Private > >> >>> I run it on 6 cpu's and I do fill quit a number of row-entries on >>> the >>> "wrong" cpu. >> >> Likely, the communication that sending the entries to the >> corrected cpu consume the time. Can you fill the entries in the >> correct cpu? > > the second time the entries are filled on the wrong CPU as well. > i am curious about the difference in time between run 1 and 2. > >> >> Hong >> >>> cheers >>> Thomas > From geenen at gmail.com Sun Feb 3 06:44:56 2008 From: geenen at gmail.com (Thomas Geenen) Date: Sun, 3 Feb 2008 13:44:56 +0100 Subject: assembly In-Reply-To: <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> Message-ID: <200802031344.56290.geenen@gmail.com> i call ierr = MatStashSetInitialSize(A[*seqsolve],stash_size, stash_size);CHKERRQ(ierr); with 100 000 000 for the stash size to make sure that's not the bottleneck the assemble time remains unchanged however. nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485 reallocs in MatAssemblyBegin_MPIAIJ = 0 cheers Thomas On Saturday 02 February 2008 23:19, Barry Smith wrote: > The matstash has a concept of preallocation also. During the first > setvalues > it is allocating more and more memory for the stash. In the second > setvalues > the stash is large enough so does not require any addition allocation. > > You can use the option -matstash_initial_size to allocate > enough space > initially so that the first setvalues is also fast. It does not look > like there is a way > coded to get the that you should use. It should be set to the > maximum nonzeros > any process has that belongs to other processes. The stash handling > code is > in src/mat/utils/matstash.c, perhaps you can figure out how to > printout with PetscInfo() > the sizes needed? > > > Barry > > On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote: > > On Saturday 02 February 2008 18:33, Hong Zhang wrote: > >> On Sat, 2 Feb 2008, Thomas Geenen wrote: > >>> Dear Petsc users, > >>> > >>> I would like to understand what is slowing down the assembly phase > >>> of my > >>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough > >>> guess of > >>> the number of off diagonal entries and then use a conservative > >>> value to > >>> make sure I do not need extra mallocs. (the number of diagonal > >>> entries is > >>> exact) > >>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. > >>> The first time i call MatSetValues and MatAssemblyBegin, > >>> MatAssemblyEnd it takes about 170 seconds > >>> the second time 0.3 seconds. > >>> I run it on 6 cpu's and I do fill quit a number of row-entries on > >>> the > >>> "wrong" cpu. However thats also the case the second run. I checked > >>> that there are no additional mallocs > >>> MatGetInfo info.mallocs=0 both after MatSetValues and after > >>> MatAssemblyBegin, MatAssemblyEnd. > >> > >> Run your code with the option '-log_summary' and check which function > >> call dominates the execution time. > > > > the time is spend in MatStashScatterGetMesg_Private > > > >>> I run it on 6 cpu's and I do fill quit a number of row-entries on > >>> the > >>> "wrong" cpu. > >> > >> Likely, the communication that sending the entries to the > >> corrected cpu consume the time. 
Can you fill the entries in the > >> correct cpu? > > > > the second time the entries are filled on the wrong CPU as well. > > i am curious about the difference in time between run 1 and 2. > > > >> Hong > >> > >>> cheers > >>> Thomas From bsmith at mcs.anl.gov Sun Feb 3 13:51:51 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 3 Feb 2008 13:51:51 -0600 Subject: assembly In-Reply-To: <200802031344.56290.geenen@gmail.com> References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> <200802031344.56290.geenen@gmail.com> Message-ID: <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov> Hmmm, are you saying the first round of setting values still takes much longer then the second round? Or is it the time in MatAssemblyBegin() much longer the first time? The MatAssembly process has one piece of code that's work is order n*size; where n is the stash size and size is the number of processes, all other work is only order n. Could you send the -log_summary output? Barry The a On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote: > i call > ierr = MatStashSetInitialSize(A[*seqsolve],stash_size, > stash_size);CHKERRQ(ierr); > with 100 000 000 for the stash size to make sure that's not the > bottleneck > > the assemble time remains unchanged however. > > nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485 > reallocs in MatAssemblyBegin_MPIAIJ = 0 > > cheers > Thomas > > On Saturday 02 February 2008 23:19, Barry Smith wrote: >> The matstash has a concept of preallocation also. During the first >> setvalues >> it is allocating more and more memory for the stash. In the second >> setvalues >> the stash is large enough so does not require any addition >> allocation. >> >> You can use the option -matstash_initial_size to allocate >> enough space >> initially so that the first setvalues is also fast. It does not look >> like there is a way >> coded to get the that you should use. It should be set to the >> maximum nonzeros >> any process has that belongs to other processes. The stash handling >> code is >> in src/mat/utils/matstash.c, perhaps you can figure out how to >> printout with PetscInfo() >> the sizes needed? >> >> >> Barry >> >> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote: >>> On Saturday 02 February 2008 18:33, Hong Zhang wrote: >>>> On Sat, 2 Feb 2008, Thomas Geenen wrote: >>>>> Dear Petsc users, >>>>> >>>>> I would like to understand what is slowing down the assembly phase >>>>> of my >>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough >>>>> guess of >>>>> the number of off diagonal entries and then use a conservative >>>>> value to >>>>> make sure I do not need extra mallocs. (the number of diagonal >>>>> entries is >>>>> exact) >>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. >>>>> The first time i call MatSetValues and MatAssemblyBegin, >>>>> MatAssemblyEnd it takes about 170 seconds >>>>> the second time 0.3 seconds. >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on >>>>> the >>>>> "wrong" cpu. However thats also the case the second run. I checked >>>>> that there are no additional mallocs >>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after >>>>> MatAssemblyBegin, MatAssemblyEnd. >>>> >>>> Run your code with the option '-log_summary' and check which >>>> function >>>> call dominates the execution time. 
>>> >>> the time is spend in MatStashScatterGetMesg_Private >>> >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on >>>>> the >>>>> "wrong" cpu. >>>> >>>> Likely, the communication that sending the entries to the >>>> corrected cpu consume the time. Can you fill the entries in the >>>> correct cpu? >>> >>> the second time the entries are filled on the wrong CPU as well. >>> i am curious about the difference in time between run 1 and 2. >>> >>>> Hong >>>> >>>>> cheers >>>>> Thomas > From grs2103 at columbia.edu Sun Feb 3 16:29:43 2008 From: grs2103 at columbia.edu (Gideon Simpson) Date: Sun, 3 Feb 2008 17:29:43 -0500 Subject: intel mkl on os x Message-ID: <9BC9DBA4-C8FC-4C53-8D40-748DEF0AF709@columbia.edu> If I wished to use the intel MKL instead of Apple's vecLib framework for my BLAS/LAPACK, what would be the appropriate flags to give petsc when it's configuring? -gideon From bsmith at mcs.anl.gov Sun Feb 3 18:13:13 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 3 Feb 2008 18:13:13 -0600 Subject: intel mkl on os x In-Reply-To: <9BC9DBA4-C8FC-4C53-8D40-748DEF0AF709@columbia.edu> References: <9BC9DBA4-C8FC-4C53-8D40-748DEF0AF709@columbia.edu> Message-ID: Locate the library libmkl_lapack.a then use --with-blas-lapack- dir=/the path to the libmkl_lapack.a Good luck, Barry On Feb 3, 2008, at 4:29 PM, Gideon Simpson wrote: > If I wished to use the intel MKL instead of Apple's vecLib framework > for my BLAS/LAPACK, what would be the appropriate flags to give > petsc when it's configuring? > > -gideon > From knepley at gmail.com Sun Feb 3 19:59:00 2008 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 3 Feb 2008 19:59:00 -0600 Subject: Overdetermined, non-linear In-Reply-To: <1201866864.6394.25.camel@erlend-ws.in.holberger.com> References: <1201866864.6394.25.camel@erlend-ws.in.holberger.com> Message-ID: On Feb 1, 2008 5:54 AM, Erlend Pedersen :. wrote: > I am attempting to use the PETSc nonlinear solver on an overdetermined > system of non-linear equations. Hence, the Jacobian is not square, and > so far we have unfortunately not succeeded with any combination of snes, > ksp and pc. > > Could you confirm that snes actually works for overdetermined systems, > and if so, is there an application example we could look at in order to > make sure there is nothing wrong with our test-setup? > > We have previously used the MINPACK routine LMDER very successfully, but > for our current problem sizes we rely on the use of sparse matrix > representations and parallel architectures. PETSc's abstractions and > automatic MPI makes this system very attractive for us, and we have > already used the PETSc LSQR solver with great success. So in the sense that SNES is really just an iteration with an embedded solve, yes it can solve non-square nonlinear systems. However, the user has to understand what is meant by the Function and Jacobian evaluation methods. I suggest implementing the simplest algorithm for non-square systems: http://en.wikipedia.org/wiki/Gauss-Newton_algorithm By implement, I mean your Function and Jacobian methods should return the correct terms. I believe the reason you have not seen convergence is that the result of the solve does not "mean" the correct thing for the iteration in your current setup. Matt > Thank you very much. > > > Regards, > Erlend Pedersen :. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From recrusader at gmail.com Mon Feb 4 00:37:23 2008 From: recrusader at gmail.com (Yujie) Date: Mon, 4 Feb 2008 14:37:23 +0800 Subject: how to inverse a sparse matrix in Petsc? Message-ID: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> Hi, Now, I want to inverse a sparse matrix. I have browsed the manual, however, I can't find some information. could you give me some advice? thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 4 00:46:29 2008 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 4 Feb 2008 00:46:29 -0600 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> Message-ID: On Feb 4, 2008 12:37 AM, Yujie wrote: > Hi, > Now, I want to inverse a sparse matrix. I have browsed the manual, however, > I can't find some information. could you give me some advice? This is generally a bad idea since the inverse is dense. However, you can use sparse direct factorization if you configure with 3rd party packages like MUMPS, SuperLU, DSCPACK, or Spooles. Matt > thanks a lot. > > Regards, > Yujie -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Mon Feb 4 07:10:15 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Feb 2008 07:10:15 -0600 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> Message-ID: <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> For sequential AIJ matrices you can fill the B matrix with the identity and then use MatMatSolve(). Note since the inverse of a sparse matrix is dense the B matrix is a SeqDense matrix. Barry On Feb 4, 2008, at 12:37 AM, Yujie wrote: > Hi, > Now, I want to inverse a sparse matrix. I have browsed the manual, > however, I can't find some information. could you give me some advice? > > thanks a lot. > > Regards, > Yujie > From li76pan at yahoo.com Mon Feb 4 07:49:36 2008 From: li76pan at yahoo.com (li pan) Date: Mon, 4 Feb 2008 05:49:36 -0800 (PST) Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> Message-ID: <351384.55725.qm@web36802.mail.mud.yahoo.com> hi, Does MatMatSolve() use Gauss elimination method? thanx pan --- Barry Smith wrote: > > For sequential AIJ matrices you can fill the B > matrix with the > identity and then use > MatMatSolve(). > > Note since the inverse of a sparse matrix is > dense the B matrix is > a SeqDense matrix. > > Barry > > On Feb 4, 2008, at 12:37 AM, Yujie wrote: > > > Hi, > > Now, I want to inverse a sparse matrix. I have > browsed the manual, > > however, I can't find some information. could you > give me some advice? > > > > thanks a lot. > > > > Regards, > > Yujie > > > > ____________________________________________________________________________________ Never miss a thing. Make Yahoo your home page. http://www.yahoo.com/r/hs From bsmith at mcs.anl.gov Mon Feb 4 07:58:27 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Feb 2008 07:58:27 -0600 Subject: how to inverse a sparse matrix in Petsc? 
In-Reply-To: <351384.55725.qm@web36802.mail.mud.yahoo.com> References: <351384.55725.qm@web36802.mail.mud.yahoo.com> Message-ID: Yes. It uses the LU factorization of the matrix computed with MatLUFactor(). Barry On Feb 4, 2008, at 7:49 AM, li pan wrote: > hi, > Does MatMatSolve() use Gauss elimination method? > > thanx > > pan > > > --- Barry Smith wrote: > >> >> For sequential AIJ matrices you can fill the B >> matrix with the >> identity and then use >> MatMatSolve(). >> >> Note since the inverse of a sparse matrix is >> dense the B matrix is >> a SeqDense matrix. >> >> Barry >> >> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >> >>> Hi, >>> Now, I want to inverse a sparse matrix. I have >> browsed the manual, >>> however, I can't find some information. could you >> give me some advice? >>> >>> thanks a lot. >>> >>> Regards, >>> Yujie >>> >> >> > > > > > ____________________________________________________________________________________ > Never miss a thing. Make Yahoo your home page. > http://www.yahoo.com/r/hs > From dave.mayhem23 at gmail.com Mon Feb 4 08:04:17 2008 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 5 Feb 2008 01:04:17 +1100 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> Message-ID: <956373f0802040604u7cfaa7e1t682e5b36f1791a4@mail.gmail.com> Hi, Does anyone know how much faster (approximately) using MatMatSolve is compared to using PCComputeExplicitOperator(), when the PC in the latter function is defined to be LU? Cheers, Dave. On Feb 5, 2008 12:10 AM, Barry Smith wrote: > > For sequential AIJ matrices you can fill the B matrix with the > identity and then use > MatMatSolve(). > > Note since the inverse of a sparse matrix is dense the B matrix is > a SeqDense matrix. > > Barry > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Feb 4 08:06:58 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Feb 2008 08:06:58 -0600 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <956373f0802040604u7cfaa7e1t682e5b36f1791a4@mail.gmail.com> References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> <956373f0802040604u7cfaa7e1t682e5b36f1791a4@mail.gmail.com> Message-ID: <74F47236-A9CB-4FB9-83F3-71F62DF07868@mcs.anl.gov> They should be pretty much the same. In both cases the huge bulk of the time is spent in the triangular solves. Barry On Feb 4, 2008, at 8:04 AM, Dave May wrote: > Hi, > Does anyone know how much faster (approximately) using > MatMatSolve is compared > to using PCComputeExplicitOperator(), when the PC in the latter > function is defined to be LU? > > Cheers, > Dave. > > > On Feb 5, 2008 12:10 AM, Barry Smith wrote: > > For sequential AIJ matrices you can fill the B matrix with the > identity and then use > MatMatSolve(). > > Note since the inverse of a sparse matrix is dense the B matrix is > a SeqDense matrix. > > Barry > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From geenen at gmail.com Mon Feb 4 10:41:11 2008 From: geenen at gmail.com (Thomas Geenen) Date: Mon, 4 Feb 2008 17:41:11 +0100 Subject: assembly In-Reply-To: <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov> References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> <200802031344.56290.geenen@gmail.com> <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov> Message-ID: <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com> On Feb 3, 2008 8:51 PM, Barry Smith wrote: > > Hmmm, are you saying the first round of setting values still > takes much longer then the second round? yes >Or is it the time > in MatAssemblyBegin() much longer the first time? > > The MatAssembly process has one piece of code that's > work is order n*size; where n is the stash size and size is the > number of processes, all other work is only order n. > > Could you send the -log_summary output? the timing is cumulative i guess? in between these two solves i solve a smaller system for which i do not include the timing. run 1 Max Max/Min Avg Total Time (sec): 2.154e+02 1.00001 2.154e+02 Objects: 2.200e+01 1.00000 2.200e+01 Flops: 0.000e+00 0.00000 0.000e+00 0.000e+00 Flops/sec: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Messages: 1.750e+01 1.25000 1.633e+01 9.800e+01 MPI Message Lengths: 3.460e+06 1.29903 1.855e+05 1.818e+07 MPI Reductions: 4.167e+00 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.1537e+02 100.0% 0.0000e+00 0.0% 9.800e+01 100.0% 1.855e+05 100.0% 2.500e+01 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. 
# ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatAssemblyBegin 1 1.0 2.9536e-01 4.0 0.00e+00 0.0 4.2e+01 4.2e+05 2.0e+00 0 0 43 98 8 0 0 43 98 8 0 MatAssemblyEnd 1 1.0 2.1410e+02 1.0 0.00e+00 0.0 2.8e+01 8.2e+03 7.0e+00 99 0 29 1 28 99 0 29 1 28 0 MatZeroEntries 1 1.0 9.3739e-02 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 4 1.0 3.9721e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage Matrix 3 0 0 0 Index Set 6 6 45500 0 Vec 6 1 196776 0 Vec Scatter 3 0 0 0 IS L to G Mapping 2 0 0 0 Krylov Solver 1 0 0 0 Preconditioner 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 1.71661e-06 Average time for MPI_Barrier(): 0.000159979 Average time for zero size MPI_Send(): 1.29938e-05 Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Fri Sep 28 23:34:20 2007 run2 Max Max/Min Avg Total Time (sec): 2.298e+02 1.00000 2.298e+02 Objects: 2.600e+02 1.00000 2.600e+02 Flops: 1.265e+09 1.17394 1.161e+09 6.969e+09 Flops/sec: 5.505e+06 1.17394 5.054e+06 3.032e+07 MPI Messages: 1.436e+03 1.20816 1.326e+03 7.956e+03 MPI Message Lengths: 2.120e+07 1.23141 1.457e+04 1.159e+08 MPI Reductions: 4.192e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.2943e+02 99.8% 6.9689e+09 100.0% 7.944e+03 99.8% 1.457e+04 100.0% 2.230e+02 8.9% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! 
# # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. # ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 135 1.0 2.0830e+00 1.4 2.30e+08 1.9 2.4e+03 1.3e+04 0.0e+00 1 26 30 26 0 1 26 30 26 0 862 MatMultAdd 40 1.0 3.2598e-01 4.5 2.68e+07 5.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 31 MatSolve 44 1.0 6.7841e-01 1.7 1.93e+08 1.7 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 672 MatRelax 80 1.0 4.2949e+00 1.6 8.77e+07 1.2 2.2e+03 1.3e+04 0.0e+00 1 23 28 26 0 1 23 28 26 0 374 MatLUFactorSym 1 1.0 7.6739e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 2.9370e+00 1.5 2.53e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 44 0 0 0 1 44 0 0 0 1037 MatILUFactorSym 1 1.0 6.7334e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 7 1.0 7.1652e-01 4.4 0.00e+00 0.0 1.3e+02 2.9e+05 8.0e+00 0 0 2 32 0 0 0 2 32 4 0 MatAssemblyEnd 7 1.0 2.1473e+02 1.0 0.00e+00 0.0 8.4e+01 3.2e+03 2.2e+01 93 0 1 0 1 94 0 1 0 10 0 MatGetRowIJ 2 1.0 1.8899e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 3.1915e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0 MatGetOrdering 2 1.0 1.2184e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 MatIncreaseOvrlp 1 1.0 7.6865e-02 1.0 0.00e+00 0.0 1.1e+03 2.4e+03 2.0e+01 0 0 13 2 1 0 0 13 2 9 0 MatZeroEntries 3 1.0 1.0429e-01 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MAT_GetRedundantMatrix 1 1.0 3.0144e-02 1.2 0.00e+00 0.0 9.0e+01 8.0e+04 2.0e+00 0 0 1 6 0 0 0 1 6 1 0 VecDot 39 1.0 9.0680e-01140.0 2.95e+08191.3 0.0e+00 0.0e+00 3.9e+01 0 0 0 0 2 0 0 0 0 17 11 VecMDot 8 1.0 2.8777e-03 2.8 4.16e+07 3.3 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 4 84 VecNorm 31 1.0 6.2301e-02 4.2 7.17e+07 5.8 0.0e+00 0.0e+00 3.1e+01 0 0 0 0 1 0 0 0 0 14 86 VecScale 85 1.0 1.7729e-03 1.4 4.94e+08 1.4 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2138 VecCopy 4 1.0 6.2108e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 139 1.0 3.5934e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 98 1.0 5.1496e-03 1.3 9.24e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4705 VecAYPX 40 1.0 1.0311e-02 1.4 1.90e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 969 VecWAXPY 75 1.0 2.6060e-02 1.4 1.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 618 VecMAXPY 9 1.0 2.1315e-04 1.6 3.01e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1355 VecAssemblyBegin 4 1.0 1.9898e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 5 0 VecAssemblyEnd 4 1.0 2.1219e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 267 1.0 1.0224e-01 1.6 0.00e+00 0.0 6.3e+03 1.1e+04 0.0e+00 0 0 79 59 0 0 0 79 59 0 0 VecScatterEnd 267 1.0 7.8653e-0111.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 4 1.0 3.7677e-03 1.9 4.78e+07 2.4 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 5 141 KSPSetup 6 1.0 1.0260e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 KSPSolve 2 1.0 7.8238e+00 1.0 1.16e+08 1.2 5.9e+03 1.1e+04 8.1e+01 3 70 75 54 3 
3 70 75 54 36 622 PCSetUp 3 1.0 5.1323e+00 1.2 1.29e+08 1.2 1.6e+03 9.1e+03 7.3e+01 2 47 20 13 3 2 47 20 13 33 636 PCSetUpOnBlocks 1 1.0 1.3325e+00 1.0 1.47e+08 1.0 0.0e+00 0.0e+00 3.0e+00 1 17 0 0 0 1 17 0 0 1 871 PCApply 44 1.0 5.7917e+00 1.1 9.60e+07 1.2 4.7e+03 1.0e+04 0.0e+00 2 41 59 42 0 2 41 59 42 0 497 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage Matrix 16 0 0 0 Index Set 36 29 256760 0 Vec 176 93 16582464 0 Vec Scatter 10 0 0 0 IS L to G Mapping 4 0 0 0 Krylov Solver 6 0 0 0 Preconditioner 6 0 0 0 Viewer 4 2 0 0 Container 2 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 8.10623e-07 Average time for MPI_Barrier(): 0.000178194 Average time for zero size MPI_Send(): 1.33117e-05 OptionTable: -mg_levels_ksp_type richardson OptionTable: -mg_levels_pc_sor_omega 1.05 OptionTable: -mg_levels_pc_type sor OptionTable: -pc_ml_PrintLevel 4 OptionTable: -pc_ml_maxNlevels 2 OptionTable: -pc_type ml Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Fri Sep 28 23:34:20 2007 > > Barry > > > The a > > On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote: > > > i call > > ierr = MatStashSetInitialSize(A[*seqsolve],stash_size, > > stash_size);CHKERRQ(ierr); > > with 100 000 000 for the stash size to make sure that's not the > > bottleneck > > > > the assemble time remains unchanged however. > > > > nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485 > > reallocs in MatAssemblyBegin_MPIAIJ = 0 > > > > cheers > > Thomas > > > > On Saturday 02 February 2008 23:19, Barry Smith wrote: > >> The matstash has a concept of preallocation also. During the first > >> setvalues > >> it is allocating more and more memory for the stash. In the second > >> setvalues > >> the stash is large enough so does not require any addition > >> allocation. > >> > >> You can use the option -matstash_initial_size to allocate > >> enough space > >> initially so that the first setvalues is also fast. It does not look > >> like there is a way > >> coded to get the that you should use. It should be set to the > >> maximum nonzeros > >> any process has that belongs to other processes. The stash handling > >> code is > >> in src/mat/utils/matstash.c, perhaps you can figure out how to > >> printout with PetscInfo() > >> the sizes needed? > >> > >> > >> Barry > >> > >> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote: > >>> On Saturday 02 February 2008 18:33, Hong Zhang wrote: > >>>> On Sat, 2 Feb 2008, Thomas Geenen wrote: > >>>>> Dear Petsc users, > >>>>> > >>>>> I would like to understand what is slowing down the assembly phase > >>>>> of my > >>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough > >>>>> guess of > >>>>> the number of off diagonal entries and then use a conservative > >>>>> value to > >>>>> make sure I do not need extra mallocs. (the number of diagonal > >>>>> entries is > >>>>> exact) > >>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. > >>>>> The first time i call MatSetValues and MatAssemblyBegin, > >>>>> MatAssemblyEnd it takes about 170 seconds > >>>>> the second time 0.3 seconds. 
> >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on > >>>>> the > >>>>> "wrong" cpu. However thats also the case the second run. I checked > >>>>> that there are no additional mallocs > >>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after > >>>>> MatAssemblyBegin, MatAssemblyEnd. > >>>> > >>>> Run your code with the option '-log_summary' and check which > >>>> function > >>>> call dominates the execution time. > >>> > >>> the time is spend in MatStashScatterGetMesg_Private > >>> > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on > >>>>> the > >>>>> "wrong" cpu. > >>>> > >>>> Likely, the communication that sending the entries to the > >>>> corrected cpu consume the time. Can you fill the entries in the > >>>> correct cpu? > >>> > >>> the second time the entries are filled on the wrong CPU as well. > >>> i am curious about the difference in time between run 1 and 2. > >>> > >>>> Hong > >>>> > >>>>> cheers > >>>>> Thomas > > > > From knepley at gmail.com Mon Feb 4 10:47:44 2008 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 4 Feb 2008 10:47:44 -0600 Subject: assembly In-Reply-To: <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com> References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> <200802031344.56290.geenen@gmail.com> <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov> <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com> Message-ID: On Feb 4, 2008 10:41 AM, Thomas Geenen wrote: > On Feb 3, 2008 8:51 PM, Barry Smith wrote: > > > > Hmmm, are you saying the first round of setting values still > > takes much longer then the second round? > > yes > > >Or is it the time > > in MatAssemblyBegin() much longer the first time? > > > > The MatAssembly process has one piece of code that's > > work is order n*size; where n is the stash size and size is the > > number of processes, all other work is only order n. > > > > Could you send the -log_summary output? > > the timing is cumulative i guess? > in between these two solves i solve a smaller system for which i do > not include the timing. I ma having a little trouble reading this. I think the easiest thing to do is wrap the two section of code in their own sections: PetscLogStageRegister(&stage1, "First assembly"); PetscLogStageRegister(&stage2, "Second assembly"); PetscLogStagePush(stage1); PetscLogStagePop(); PetscLogStagePush(stage2); PetscLogStagePop(); Then we can also get a look at how many messages are sent and how big they are. 
Thanks, Matt > run 1 > Max Max/Min Avg Total > Time (sec): 2.154e+02 1.00001 2.154e+02 > Objects: 2.200e+01 1.00000 2.200e+01 > Flops: 0.000e+00 0.00000 0.000e+00 0.000e+00 > Flops/sec: 0.000e+00 0.00000 0.000e+00 0.000e+00 > MPI Messages: 1.750e+01 1.25000 1.633e+01 9.800e+01 > MPI Message Lengths: 3.460e+06 1.29903 1.855e+05 1.818e+07 > MPI Reductions: 4.167e+00 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length > N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 2.1537e+02 100.0% 0.0000e+00 0.0% 9.800e+01 > 100.0% 1.855e+05 100.0% 2.500e+01 100.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops/sec: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message > lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was run without the PreLoadBegin() # > # macros. To get timing results we always recommend # > # preloading. otherwise timing numbers may be # > # meaningless. # > ########################################################## > > > Event Count Time (sec) Flops/sec > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatAssemblyBegin 1 1.0 2.9536e-01 4.0 0.00e+00 0.0 4.2e+01 > 4.2e+05 2.0e+00 0 0 43 98 8 0 0 43 98 8 0 > MatAssemblyEnd 1 1.0 2.1410e+02 1.0 0.00e+00 0.0 2.8e+01 > 8.2e+03 7.0e+00 99 0 29 1 28 99 0 29 1 28 0 > MatZeroEntries 1 1.0 9.3739e-02 5.9 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 4 1.0 3.9721e-04 2.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. 
> > --- Event Stage 0: Main Stage > > Matrix 3 0 0 0 > Index Set 6 6 45500 0 > Vec 6 1 196776 0 > Vec Scatter 3 0 0 0 > IS L to G Mapping 2 0 0 0 > Krylov Solver 1 0 0 0 > Preconditioner 1 0 0 0 > ======================================================================================================================== > Average time to get PetscTime(): 1.71661e-06 > Average time for MPI_Barrier(): 0.000159979 > Average time for zero size MPI_Send(): 1.29938e-05 > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > Configure run at: Fri Sep 28 23:34:20 2007 > > run2 > Max Max/Min Avg Total > Time (sec): 2.298e+02 1.00000 2.298e+02 > Objects: 2.600e+02 1.00000 2.600e+02 > Flops: 1.265e+09 1.17394 1.161e+09 6.969e+09 > Flops/sec: 5.505e+06 1.17394 5.054e+06 3.032e+07 > MPI Messages: 1.436e+03 1.20816 1.326e+03 7.956e+03 > MPI Message Lengths: 2.120e+07 1.23141 1.457e+04 1.159e+08 > MPI Reductions: 4.192e+02 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length > N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 2.2943e+02 99.8% 6.9689e+09 100.0% 7.944e+03 > 99.8% 1.457e+04 100.0% 2.230e+02 8.9% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops/sec: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message > lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was run without the PreLoadBegin() # > # macros. To get timing results we always recommend # > # preloading. otherwise timing numbers may be # > # meaningless. 
# > ########################################################## > > > Event Count Time (sec) Flops/sec > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 135 1.0 2.0830e+00 1.4 2.30e+08 1.9 2.4e+03 > 1.3e+04 0.0e+00 1 26 30 26 0 1 26 30 26 0 862 > MatMultAdd 40 1.0 3.2598e-01 4.5 2.68e+07 5.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 31 > MatSolve 44 1.0 6.7841e-01 1.7 1.93e+08 1.7 0.0e+00 > 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 672 > MatRelax 80 1.0 4.2949e+00 1.6 8.77e+07 1.2 2.2e+03 > 1.3e+04 0.0e+00 1 23 28 26 0 1 23 28 26 0 374 > MatLUFactorSym 1 1.0 7.6739e-02 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 2 1.0 2.9370e+00 1.5 2.53e+08 1.5 0.0e+00 > 0.0e+00 0.0e+00 1 44 0 0 0 1 44 0 0 0 1037 > MatILUFactorSym 1 1.0 6.7334e-01 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 7 1.0 7.1652e-01 4.4 0.00e+00 0.0 1.3e+02 > 2.9e+05 8.0e+00 0 0 2 32 0 0 0 2 32 4 0 > MatAssemblyEnd 7 1.0 2.1473e+02 1.0 0.00e+00 0.0 8.4e+01 > 3.2e+03 2.2e+01 93 0 1 0 1 94 0 1 0 10 0 > MatGetRowIJ 2 1.0 1.8899e-03 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrice 1 1.0 3.1915e-02 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0 > MatGetOrdering 2 1.0 1.2184e-02 1.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 > MatIncreaseOvrlp 1 1.0 7.6865e-02 1.0 0.00e+00 0.0 1.1e+03 > 2.4e+03 2.0e+01 0 0 13 2 1 0 0 13 2 9 0 > MatZeroEntries 3 1.0 1.0429e-01 4.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MAT_GetRedundantMatrix 1 1.0 3.0144e-02 1.2 0.00e+00 0.0 9.0e+01 > 8.0e+04 2.0e+00 0 0 1 6 0 0 0 1 6 1 0 > VecDot 39 1.0 9.0680e-01140.0 2.95e+08191.3 0.0e+00 > 0.0e+00 3.9e+01 0 0 0 0 2 0 0 0 0 17 11 > VecMDot 8 1.0 2.8777e-03 2.8 4.16e+07 3.3 0.0e+00 > 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 4 84 > VecNorm 31 1.0 6.2301e-02 4.2 7.17e+07 5.8 0.0e+00 > 0.0e+00 3.1e+01 0 0 0 0 1 0 0 0 0 14 86 > VecScale 85 1.0 1.7729e-03 1.4 4.94e+08 1.4 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2138 > VecCopy 4 1.0 6.2108e-04 1.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 139 1.0 3.5934e-03 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 98 1.0 5.1496e-03 1.3 9.24e+08 1.1 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4705 > VecAYPX 40 1.0 1.0311e-02 1.4 1.90e+08 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 969 > VecWAXPY 75 1.0 2.6060e-02 1.4 1.22e+08 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 618 > VecMAXPY 9 1.0 2.1315e-04 1.6 3.01e+08 1.2 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1355 > VecAssemblyBegin 4 1.0 1.9898e-03 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 5 0 > VecAssemblyEnd 4 1.0 2.1219e-05 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 267 1.0 1.0224e-01 1.6 0.00e+00 0.0 6.3e+03 > 1.1e+04 0.0e+00 0 0 79 59 0 0 0 79 59 0 0 > VecScatterEnd 267 1.0 7.8653e-0111.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 4 1.0 3.7677e-03 1.9 4.78e+07 2.4 0.0e+00 > 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 5 141 > KSPSetup 6 1.0 1.0260e-02 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 > KSPSolve 2 1.0 7.8238e+00 1.0 1.16e+08 1.2 5.9e+03 > 1.1e+04 8.1e+01 3 70 75 54 3 3 70 75 54 36 622 > 
PCSetUp 3 1.0 5.1323e+00 1.2 1.29e+08 1.2 1.6e+03 > 9.1e+03 7.3e+01 2 47 20 13 3 2 47 20 13 33 636 > PCSetUpOnBlocks 1 1.0 1.3325e+00 1.0 1.47e+08 1.0 0.0e+00 > 0.0e+00 3.0e+00 1 17 0 0 0 1 17 0 0 1 871 > PCApply 44 1.0 5.7917e+00 1.1 9.60e+07 1.2 4.7e+03 > 1.0e+04 0.0e+00 2 41 59 42 0 2 41 59 42 0 497 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > > --- Event Stage 0: Main Stage > > Matrix 16 0 0 0 > Index Set 36 29 256760 0 > Vec 176 93 16582464 0 > Vec Scatter 10 0 0 0 > IS L to G Mapping 4 0 0 0 > Krylov Solver 6 0 0 0 > Preconditioner 6 0 0 0 > Viewer 4 2 0 0 > Container 2 0 0 0 > ======================================================================================================================== > Average time to get PetscTime(): 8.10623e-07 > Average time for MPI_Barrier(): 0.000178194 > Average time for zero size MPI_Send(): 1.33117e-05 > OptionTable: -mg_levels_ksp_type richardson > OptionTable: -mg_levels_pc_sor_omega 1.05 > OptionTable: -mg_levels_pc_type sor > OptionTable: -pc_ml_PrintLevel 4 > OptionTable: -pc_ml_maxNlevels 2 > OptionTable: -pc_type ml > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > Configure run at: Fri Sep 28 23:34:20 2007 > > > > > > Barry > > > > > > The a > > > > On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote: > > > > > i call > > > ierr = MatStashSetInitialSize(A[*seqsolve],stash_size, > > > stash_size);CHKERRQ(ierr); > > > with 100 000 000 for the stash size to make sure that's not the > > > bottleneck > > > > > > the assemble time remains unchanged however. > > > > > > nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485 > > > reallocs in MatAssemblyBegin_MPIAIJ = 0 > > > > > > cheers > > > Thomas > > > > > > On Saturday 02 February 2008 23:19, Barry Smith wrote: > > >> The matstash has a concept of preallocation also. During the first > > >> setvalues > > >> it is allocating more and more memory for the stash. In the second > > >> setvalues > > >> the stash is large enough so does not require any addition > > >> allocation. > > >> > > >> You can use the option -matstash_initial_size to allocate > > >> enough space > > >> initially so that the first setvalues is also fast. It does not look > > >> like there is a way > > >> coded to get the that you should use. It should be set to the > > >> maximum nonzeros > > >> any process has that belongs to other processes. The stash handling > > >> code is > > >> in src/mat/utils/matstash.c, perhaps you can figure out how to > > >> printout with PetscInfo() > > >> the sizes needed? > > >> > > >> > > >> Barry > > >> > > >> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote: > > >>> On Saturday 02 February 2008 18:33, Hong Zhang wrote: > > >>>> On Sat, 2 Feb 2008, Thomas Geenen wrote: > > >>>>> Dear Petsc users, > > >>>>> > > >>>>> I would like to understand what is slowing down the assembly phase > > >>>>> of my > > >>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough > > >>>>> guess of > > >>>>> the number of off diagonal entries and then use a conservative > > >>>>> value to > > >>>>> make sure I do not need extra mallocs. (the number of diagonal > > >>>>> entries is > > >>>>> exact) > > >>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. 
> > >>>>> The first time i call MatSetValues and MatAssemblyBegin, > > >>>>> MatAssemblyEnd it takes about 170 seconds > > >>>>> the second time 0.3 seconds. > > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on > > >>>>> the > > >>>>> "wrong" cpu. However thats also the case the second run. I checked > > >>>>> that there are no additional mallocs > > >>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after > > >>>>> MatAssemblyBegin, MatAssemblyEnd. > > >>>> > > >>>> Run your code with the option '-log_summary' and check which > > >>>> function > > >>>> call dominates the execution time. > > >>> > > >>> the time is spend in MatStashScatterGetMesg_Private > > >>> > > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on > > >>>>> the > > >>>>> "wrong" cpu. > > >>>> > > >>>> Likely, the communication that sending the entries to the > > >>>> corrected cpu consume the time. Can you fill the entries in the > > >>>> correct cpu? > > >>> > > >>> the second time the entries are filled on the wrong CPU as well. > > >>> i am curious about the difference in time between run 1 and 2. > > >>> > > >>>> Hong > > >>>> > > >>>>> cheers > > >>>>> Thomas > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From geenen at gmail.com Mon Feb 4 11:34:29 2008 From: geenen at gmail.com (Thomas Geenen) Date: Mon, 4 Feb 2008 18:34:29 +0100 Subject: assembly In-Reply-To: References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> <200802031344.56290.geenen@gmail.com> <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov> <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com> Message-ID: <8aa042e10802040934p7308142fjff32c882308eceda@mail.gmail.com> hi matt, this is indeed much clearer i put the push and pop around MatAssemblyBegin/End 1: First_assembly: 1.4724e+02 90.8% 0.0000e+00 0.0% 7.000e+01 0.9% 2.266e+03 15.5% 9.000e+00 0.2% 2: Second_assembly: 2.3823e-01 0.1% 0.0000e+00 0.0% 7.000e+01 0.9% 1.276e+02 0.9% 9.000e+00 0.2% 3: Third_assembly: 5.0168e-01 0.3% 0.0000e+00 0.0% 4.200e+01 0.5% 2.237e+03 15.4% 3.000e+00 0.1% The second assembly is another system of equations (pressure correction in simpler) so 1 and 3 are 1 and 2 ...... 
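The staging pattern being described here is just a register/push/pop around each assembly. A minimal sketch, with illustrative stage and matrix names and error checking omitted (this follows the PETSc 2.3.x-era calling sequence used in Matt's suggestion quoted further down, where the stage handle is the first argument; newer PETSc releases reverse the argument order):

    PetscLogStage stage_first, stage_second;

    PetscLogStageRegister(&stage_first,  "First_assembly");
    PetscLogStageRegister(&stage_second, "Second_assembly");

    PetscLogStagePush(stage_first);
    /* ... MatSetValues() calls for the first system ... */
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
    PetscLogStagePop();

    PetscLogStagePush(stage_second);
    /* ... MatSetValues() calls for the second system ... */
    MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY);
    PetscLogStagePop();

Each pushed stage then gets its own section in the -log_summary output, which is where the First_assembly/Second_assembly/Third_assembly lines above come from.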
cheers Thomas ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- Unknown Name on a linux-gnu named etna.geo.uu.nl with 6 processors, by geenen Mon Feb 4 18:27:48 2008 Using Petsc Release Version 2.3.3, Patch 3, Fri Jun 15 16:51:25 CDT 2007 HG revision: f051789beadcd36f77fb6111d20225e26ed7cc0d Max Max/Min Avg Total Time (sec): 1.621e+02 1.00000 1.621e+02 Objects: 2.600e+02 1.00000 2.600e+02 Flops: 1.265e+09 1.17394 1.161e+09 6.969e+09 Flops/sec: 7.806e+06 1.17393 7.166e+06 4.300e+07 MPI Messages: 1.436e+03 1.20816 1.326e+03 7.956e+03 MPI Message Lengths: 2.120e+07 1.23141 1.457e+04 1.159e+08 MPI Reductions: 9.862e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3160e+01 8.1% 6.9689e+09 100.0% 7.762e+03 97.6% 9.941e+03 68.2% 2.020e+02 3.4% 1: First_assembly: 1.4724e+02 90.8% 0.0000e+00 0.0% 7.000e+01 0.9% 2.266e+03 15.5% 9.000e+00 0.2% 2: Second_assembly: 2.3823e-01 0.1% 0.0000e+00 0.0% 7.000e+01 0.9% 1.276e+02 0.9% 9.000e+00 0.2% 3: Third_assembly: 5.0168e-01 0.3% 0.0000e+00 0.0% 4.200e+01 0.5% 2.237e+03 15.4% 3.000e+00 0.1% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. 
# ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 135 1.0 1.5541e+00 1.3 3.03e+08 1.8 2.4e+03 1.3e+04 0.0e+00 1 26 30 26 0 11 26 30 38 0 1155 MatMultAdd 40 1.0 3.2611e-01 8.3 3.64e+07 7.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 31 MatSolve 44 1.0 6.7682e-01 1.7 1.94e+08 1.7 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 4 7 0 0 0 673 MatRelax 80 1.0 3.4453e+00 1.4 9.46e+07 1.1 2.2e+03 1.3e+04 0.0e+00 2 23 28 26 0 22 23 29 38 0 466 MatLUFactorSym 1 1.0 6.7567e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 2.8804e+00 1.4 2.53e+08 1.4 0.0e+00 0.0e+00 0.0e+00 1 44 0 0 0 17 44 0 0 0 1058 MatILUFactorSym 1 1.0 6.7676e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 5 0 0 0 0 0 MatAssemblyBegin 4 1.0 2.7711e-0237.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 1 0 MatAssemblyEnd 4 1.0 2.4401e-02 1.2 0.00e+00 0.0 2.8e+01 4.7e+02 7.0e+00 0 0 0 0 0 0 0 0 0 3 0 MatGetRowIJ 2 1.0 1.2948e-02 7.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 2.8603e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0 MatGetOrdering 2 1.0 2.3054e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 MatIncreaseOvrlp 1 1.0 8.1528e-02 1.0 0.00e+00 0.0 1.1e+03 2.4e+03 2.0e+01 0 0 13 2 0 1 0 14 3 10 0 MatZeroEntries 3 1.0 3.4422e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MAT_GetRedundantMatrix 1 1.0 3.5774e-02 1.6 0.00e+00 0.0 9.0e+01 8.0e+04 2.0e+00 0 0 1 6 0 0 0 1 9 1 0 VecDot 39 1.0 5.8092e-0131.0 1.02e+0842.3 0.0e+00 0.0e+00 3.9e+01 0 0 0 0 1 2 0 0 0 19 17 VecMDot 8 1.0 3.4735e-03 2.9 4.52e+07 5.5 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 4 69 VecNorm 31 1.0 3.8690e-02 4.1 1.11e+08 5.6 0.0e+00 0.0e+00 3.1e+01 0 0 0 0 1 0 0 0 0 15 139 VecScale 85 1.0 1.7631e-03 1.2 5.59e+08 1.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2150 VecCopy 4 1.0 5.5027e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 139 1.0 3.0956e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 98 1.0 5.0848e-03 1.3 9.35e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4765 VecAYPX 40 1.0 1.0264e-02 1.4 2.01e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 973 VecWAXPY 75 1.0 2.6191e-02 1.4 1.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 615 VecMAXPY 9 1.0 2.1935e-04 1.7 2.93e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1317 VecAssemblyBegin 4 1.0 1.9331e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 6 0 VecAssemblyEnd 4 1.0 2.3842e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 267 1.0 1.0370e-01 1.5 0.00e+00 0.0 6.3e+03 1.1e+04 0.0e+00 0 0 79 59 0 1 0 81 86 0 0 VecScatterEnd 267 1.0 4.1189e-01 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 KSPGMRESOrthog 4 1.0 4.5178e-03 2.0 5.22e+07 3.8 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 6 117 KSPSetup 6 1.0 7.9882e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 KSPSolve 2 1.0 6.6590e+00 1.0 1.36e+08 1.2 5.9e+03 1.1e+04 8.1e+01 4 70 75 54 1 51 70 76 80 40 731 PCSetUp 3 1.0 5.0877e+00 1.2 1.29e+08 1.2 1.6e+03 9.1e+03 7.3e+01 3 47 20 13 1 34 47 21 19 36 642 PCSetUpOnBlocks 1 1.0 1.3292e+00 1.0 1.46e+08 1.0 
0.0e+00 0.0e+00 3.0e+00 1 17 0 0 0 10 17 0 0 1 873 PCApply 44 1.0 4.7444e+00 1.1 1.16e+08 1.2 4.7e+03 1.0e+04 0.0e+00 3 41 59 42 0 34 41 60 61 0 607 --- Event Stage 1: First_assembly MatAssemblyBegin 1 1.0 3.0375e-01 3.5 0.00e+00 0.0 4.2e+01 4.2e+05 2.0e+00 0 0 1 15 0 0 0 60 99 22 0 MatAssemblyEnd 1 1.0 1.4709e+02 1.0 0.00e+00 0.0 2.8e+01 8.2e+03 7.0e+00 91 0 0 0 0 100 0 40 1 78 0 --- Event Stage 2: Second_assembly MatAssemblyBegin 1 1.0 1.7451e-02 5.0 0.00e+00 0.0 4.2e+01 2.4e+04 2.0e+00 0 0 1 1 0 5 0 60 98 22 0 MatAssemblyEnd 1 1.0 2.3056e-01 1.0 0.00e+00 0.0 2.8e+01 8.4e+02 7.0e+00 0 0 0 0 0 95 0 40 2 78 0 --- Event Stage 3: Third_assembly MatAssemblyBegin 1 1.0 3.3676e-01 3.8 0.00e+00 0.0 4.2e+01 4.2e+05 2.0e+00 0 0 1 15 0 45 0100100 67 0 MatAssemblyEnd 1 1.0 3.3125e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 55 0 0 0 33 0 --- Event Stage 4: Unknown --- Event Stage 5: Unknown --- Event Stage 6: Unknown --- Event Stage 7: Unknown --- Event Stage 8: Unknown --- Event Stage 9: Unknown ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage Matrix 16 0 0 0 Index Set 32 25 207888 0 Vec 172 91 16374224 0 Vec Scatter 8 0 0 0 IS L to G Mapping 4 0 0 0 Krylov Solver 6 0 0 0 Preconditioner 6 0 0 0 Viewer 4 2 0 0 Container 2 0 0 0 --- Event Stage 1: First_assembly Index Set 2 2 44140 0 Vec 2 1 196776 0 Vec Scatter 1 0 0 0 --- Event Stage 2: Second_assembly Index Set 2 2 4732 0 Vec 2 1 11464 0 Vec Scatter 1 0 0 0 --- Event Stage 3: Third_assembly --- Event Stage 4: Unknown --- Event Stage 5: Unknown --- Event Stage 6: Unknown --- Event Stage 7: Unknown --- Event Stage 8: Unknown --- Event Stage 9: Unknown ======================================================================================================================== Average time to get PetscTime(): 8.82149e-07 Average time for MPI_Barrier(): 0.000153208 Average time for zero size MPI_Send(): 1.86761e-05 OptionTable: -mg_levels_ksp_type richardson OptionTable: -mg_levels_pc_sor_omega 1.05 OptionTable: -mg_levels_pc_type sor OptionTable: -pc_ml_PrintLevel 4 OptionTable: -pc_ml_maxNlevels 2 OptionTable: -pc_type ml Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 Configure run at: Fri Sep 28 23:34:20 2007 On Feb 4, 2008 5:47 PM, Matthew Knepley wrote: > On Feb 4, 2008 10:41 AM, Thomas Geenen wrote: > > On Feb 3, 2008 8:51 PM, Barry Smith wrote: > > > > > > Hmmm, are you saying the first round of setting values still > > > takes much longer then the second round? > > > > yes > > > > >Or is it the time > > > in MatAssemblyBegin() much longer the first time? > > > > > > The MatAssembly process has one piece of code that's > > > work is order n*size; where n is the stash size and size is the > > > number of processes, all other work is only order n. > > > > > > Could you send the -log_summary output? > > > > the timing is cumulative i guess? > > in between these two solves i solve a smaller system for which i do > > not include the timing. > > I ma having a little trouble reading this. 
I think the easiest thing to do > is wrap the two section of code in their own sections: > > PetscLogStageRegister(&stage1, "First assembly"); > PetscLogStageRegister(&stage2, "Second assembly"); > > PetscLogStagePush(stage1); > > PetscLogStagePop(); > > PetscLogStagePush(stage2); > > PetscLogStagePop(); > > Then we can also get a look at how many messages are sent > and how big they are. > > Thanks, > > Matt > > > > run 1 > > Max Max/Min Avg Total > > Time (sec): 2.154e+02 1.00001 2.154e+02 > > Objects: 2.200e+01 1.00000 2.200e+01 > > Flops: 0.000e+00 0.00000 0.000e+00 0.000e+00 > > Flops/sec: 0.000e+00 0.00000 0.000e+00 0.000e+00 > > MPI Messages: 1.750e+01 1.25000 1.633e+01 9.800e+01 > > MPI Message Lengths: 3.460e+06 1.29903 1.855e+05 1.818e+07 > > MPI Reductions: 4.167e+00 1.00000 > > > > Flop counting convention: 1 flop = 1 real number operation of type > > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length > > N --> 2N flops > > and VecAXPY() for complex vectors of > > length N --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > > Messages --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > > %Total Avg %Total counts %Total > > 0: Main Stage: 2.1537e+02 100.0% 0.0000e+00 0.0% 9.800e+01 > > 100.0% 1.855e+05 100.0% 2.500e+01 100.0% > > > > ------------------------------------------------------------------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on > > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops/sec: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all processors > > Mess: number of messages sent > > Avg. len: average message length > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() > > and PetscLogStagePop(). > > %T - percent time in this phase %F - percent flops in this phase > > %M - percent messages in this phase %L - percent message > > lengths in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > over all processors) > > ------------------------------------------------------------------------------------------------------------------------ > > > > > > ########################################################## > > # # > > # WARNING!!! # > > # # > > # This code was run without the PreLoadBegin() # > > # macros. To get timing results we always recommend # > > # preloading. otherwise timing numbers may be # > > # meaningless. 
# > > ########################################################## > > > > > > Event Count Time (sec) Flops/sec > > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg > > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatAssemblyBegin 1 1.0 2.9536e-01 4.0 0.00e+00 0.0 4.2e+01 > > 4.2e+05 2.0e+00 0 0 43 98 8 0 0 43 98 8 0 > > MatAssemblyEnd 1 1.0 2.1410e+02 1.0 0.00e+00 0.0 2.8e+01 > > 8.2e+03 7.0e+00 99 0 29 1 28 99 0 29 1 28 0 > > MatZeroEntries 1 1.0 9.3739e-02 5.9 0.00e+00 0.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 4 1.0 3.9721e-04 2.2 0.00e+00 0.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > ------------------------------------------------------------------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' Mem. > > > > --- Event Stage 0: Main Stage > > > > Matrix 3 0 0 0 > > Index Set 6 6 45500 0 > > Vec 6 1 196776 0 > > Vec Scatter 3 0 0 0 > > IS L to G Mapping 2 0 0 0 > > Krylov Solver 1 0 0 0 > > Preconditioner 1 0 0 0 > > ======================================================================================================================== > > Average time to get PetscTime(): 1.71661e-06 > > Average time for MPI_Barrier(): 0.000159979 > > Average time for zero size MPI_Send(): 1.29938e-05 > > Compiled without FORTRAN kernels > > Compiled with full precision matrices (default) > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > > sizeof(PetscScalar) 8 > > Configure run at: Fri Sep 28 23:34:20 2007 > > > > run2 > > Max Max/Min Avg Total > > Time (sec): 2.298e+02 1.00000 2.298e+02 > > Objects: 2.600e+02 1.00000 2.600e+02 > > Flops: 1.265e+09 1.17394 1.161e+09 6.969e+09 > > Flops/sec: 5.505e+06 1.17394 5.054e+06 3.032e+07 > > MPI Messages: 1.436e+03 1.20816 1.326e+03 7.956e+03 > > MPI Message Lengths: 2.120e+07 1.23141 1.457e+04 1.159e+08 > > MPI Reductions: 4.192e+02 1.00000 > > > > Flop counting convention: 1 flop = 1 real number operation of type > > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length > > N --> 2N flops > > and VecAXPY() for complex vectors of > > length N --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > > Messages --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > > %Total Avg %Total counts %Total > > 0: Main Stage: 2.2943e+02 99.8% 6.9689e+09 100.0% 7.944e+03 > > 99.8% 1.457e+04 100.0% 2.230e+02 8.9% > > > > ------------------------------------------------------------------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on > > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops/sec: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all processors > > Mess: number of messages sent > > Avg. len: average message length > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() > > and PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this phase > > %M - percent messages in this phase %L - percent message > > lengths in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > over all processors) > > ------------------------------------------------------------------------------------------------------------------------ > > > > > > ########################################################## > > # # > > # WARNING!!! # > > # # > > # This code was run without the PreLoadBegin() # > > # macros. To get timing results we always recommend # > > # preloading. otherwise timing numbers may be # > > # meaningless. # > > ########################################################## > > > > > > Event Count Time (sec) Flops/sec > > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg > > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatMult 135 1.0 2.0830e+00 1.4 2.30e+08 1.9 2.4e+03 > > 1.3e+04 0.0e+00 1 26 30 26 0 1 26 30 26 0 862 > > MatMultAdd 40 1.0 3.2598e-01 4.5 2.68e+07 5.7 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 31 > > MatSolve 44 1.0 6.7841e-01 1.7 1.93e+08 1.7 0.0e+00 > > 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 672 > > MatRelax 80 1.0 4.2949e+00 1.6 8.77e+07 1.2 2.2e+03 > > 1.3e+04 0.0e+00 1 23 28 26 0 1 23 28 26 0 374 > > MatLUFactorSym 1 1.0 7.6739e-02 1.1 0.00e+00 0.0 0.0e+00 > > 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 2 1.0 2.9370e+00 1.5 2.53e+08 1.5 0.0e+00 > > 0.0e+00 0.0e+00 1 44 0 0 0 1 44 0 0 0 1037 > > MatILUFactorSym 1 1.0 6.7334e-01 1.0 0.00e+00 0.0 0.0e+00 > > 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatAssemblyBegin 7 1.0 7.1652e-01 4.4 0.00e+00 0.0 1.3e+02 > > 2.9e+05 8.0e+00 0 0 2 32 0 0 0 2 32 4 0 > > MatAssemblyEnd 7 1.0 2.1473e+02 1.0 0.00e+00 0.0 8.4e+01 > > 3.2e+03 2.2e+01 93 0 1 0 1 94 0 1 0 10 0 > > MatGetRowIJ 2 1.0 1.8899e-03 1.1 0.00e+00 0.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrice 1 1.0 3.1915e-02 1.0 0.00e+00 0.0 0.0e+00 > > 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0 > > MatGetOrdering 2 1.0 1.2184e-02 1.6 0.00e+00 0.0 0.0e+00 > > 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 > > MatIncreaseOvrlp 1 1.0 7.6865e-02 1.0 0.00e+00 0.0 1.1e+03 > > 2.4e+03 2.0e+01 0 0 13 2 1 0 0 13 2 9 0 > > MatZeroEntries 3 1.0 1.0429e-01 4.7 0.00e+00 0.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MAT_GetRedundantMatrix 1 1.0 3.0144e-02 1.2 0.00e+00 0.0 9.0e+01 > > 8.0e+04 2.0e+00 0 0 1 6 0 0 0 1 6 1 0 > > VecDot 39 1.0 9.0680e-01140.0 2.95e+08191.3 0.0e+00 > > 0.0e+00 3.9e+01 0 0 0 0 2 0 0 0 0 17 11 > > VecMDot 8 1.0 2.8777e-03 2.8 4.16e+07 3.3 0.0e+00 > > 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 4 84 > > VecNorm 31 1.0 6.2301e-02 4.2 7.17e+07 5.8 0.0e+00 > > 0.0e+00 3.1e+01 0 0 0 0 1 0 0 0 0 14 86 > > VecScale 85 1.0 1.7729e-03 1.4 4.94e+08 1.4 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2138 > > VecCopy 4 1.0 6.2108e-04 1.5 0.00e+00 0.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 139 1.0 3.5934e-03 1.4 0.00e+00 0.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 98 1.0 5.1496e-03 1.3 9.24e+08 1.1 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4705 > > VecAYPX 40 1.0 1.0311e-02 1.4 1.90e+08 1.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 969 > > VecWAXPY 75 1.0 2.6060e-02 1.4 1.22e+08 1.0 0.0e+00 > > 0.0e+00 0.0e+00 
0 0 0 0 0 0 0 0 0 0 618 > > VecMAXPY 9 1.0 2.1315e-04 1.6 3.01e+08 1.2 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1355 > > VecAssemblyBegin 4 1.0 1.9898e-03 1.2 0.00e+00 0.0 0.0e+00 > > 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 5 0 > > VecAssemblyEnd 4 1.0 2.1219e-05 1.2 0.00e+00 0.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 267 1.0 1.0224e-01 1.6 0.00e+00 0.0 6.3e+03 > > 1.1e+04 0.0e+00 0 0 79 59 0 0 0 79 59 0 0 > > VecScatterEnd 267 1.0 7.8653e-0111.3 0.00e+00 0.0 0.0e+00 > > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPGMRESOrthog 4 1.0 3.7677e-03 1.9 4.78e+07 2.4 0.0e+00 > > 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 5 141 > > KSPSetup 6 1.0 1.0260e-02 1.0 0.00e+00 0.0 0.0e+00 > > 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 > > KSPSolve 2 1.0 7.8238e+00 1.0 1.16e+08 1.2 5.9e+03 > > 1.1e+04 8.1e+01 3 70 75 54 3 3 70 75 54 36 622 > > PCSetUp 3 1.0 5.1323e+00 1.2 1.29e+08 1.2 1.6e+03 > > 9.1e+03 7.3e+01 2 47 20 13 3 2 47 20 13 33 636 > > PCSetUpOnBlocks 1 1.0 1.3325e+00 1.0 1.47e+08 1.0 0.0e+00 > > 0.0e+00 3.0e+00 1 17 0 0 0 1 17 0 0 1 871 > > PCApply 44 1.0 5.7917e+00 1.1 9.60e+07 1.2 4.7e+03 > > 1.0e+04 0.0e+00 2 41 59 42 0 2 41 59 42 0 497 > > ------------------------------------------------------------------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' Mem. > > > > --- Event Stage 0: Main Stage > > > > Matrix 16 0 0 0 > > Index Set 36 29 256760 0 > > Vec 176 93 16582464 0 > > Vec Scatter 10 0 0 0 > > IS L to G Mapping 4 0 0 0 > > Krylov Solver 6 0 0 0 > > Preconditioner 6 0 0 0 > > Viewer 4 2 0 0 > > Container 2 0 0 0 > > ======================================================================================================================== > > Average time to get PetscTime(): 8.10623e-07 > > Average time for MPI_Barrier(): 0.000178194 > > Average time for zero size MPI_Send(): 1.33117e-05 > > OptionTable: -mg_levels_ksp_type richardson > > OptionTable: -mg_levels_pc_sor_omega 1.05 > > OptionTable: -mg_levels_pc_type sor > > OptionTable: -pc_ml_PrintLevel 4 > > OptionTable: -pc_ml_maxNlevels 2 > > OptionTable: -pc_type ml > > Compiled without FORTRAN kernels > > Compiled with full precision matrices (default) > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > > sizeof(PetscScalar) 8 > > Configure run at: Fri Sep 28 23:34:20 2007 > > > > > > > > > > Barry > > > > > > > > > The a > > > > > > On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote: > > > > > > > i call > > > > ierr = MatStashSetInitialSize(A[*seqsolve],stash_size, > > > > stash_size);CHKERRQ(ierr); > > > > with 100 000 000 for the stash size to make sure that's not the > > > > bottleneck > > > > > > > > the assemble time remains unchanged however. > > > > > > > > nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485 > > > > reallocs in MatAssemblyBegin_MPIAIJ = 0 > > > > > > > > cheers > > > > Thomas > > > > > > > > On Saturday 02 February 2008 23:19, Barry Smith wrote: > > > >> The matstash has a concept of preallocation also. During the first > > > >> setvalues > > > >> it is allocating more and more memory for the stash. In the second > > > >> setvalues > > > >> the stash is large enough so does not require any addition > > > >> allocation. > > > >> > > > >> You can use the option -matstash_initial_size to allocate > > > >> enough space > > > >> initially so that the first setvalues is also fast. It does not look > > > >> like there is a way > > > >> coded to get the that you should use. 
It should be set to the > > > >> maximum nonzeros > > > >> any process has that belongs to other processes. The stash handling > > > >> code is > > > >> in src/mat/utils/matstash.c, perhaps you can figure out how to > > > >> printout with PetscInfo() > > > >> the sizes needed? > > > >> > > > >> > > > >> Barry > > > >> > > > >> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote: > > > >>> On Saturday 02 February 2008 18:33, Hong Zhang wrote: > > > >>>> On Sat, 2 Feb 2008, Thomas Geenen wrote: > > > >>>>> Dear Petsc users, > > > >>>>> > > > >>>>> I would like to understand what is slowing down the assembly phase > > > >>>>> of my > > > >>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough > > > >>>>> guess of > > > >>>>> the number of off diagonal entries and then use a conservative > > > >>>>> value to > > > >>>>> make sure I do not need extra mallocs. (the number of diagonal > > > >>>>> entries is > > > >>>>> exact) > > > >>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. > > > >>>>> The first time i call MatSetValues and MatAssemblyBegin, > > > >>>>> MatAssemblyEnd it takes about 170 seconds > > > >>>>> the second time 0.3 seconds. > > > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on > > > >>>>> the > > > >>>>> "wrong" cpu. However thats also the case the second run. I checked > > > >>>>> that there are no additional mallocs > > > >>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after > > > >>>>> MatAssemblyBegin, MatAssemblyEnd. > > > >>>> > > > >>>> Run your code with the option '-log_summary' and check which > > > >>>> function > > > >>>> call dominates the execution time. > > > >>> > > > >>> the time is spend in MatStashScatterGetMesg_Private > > > >>> > > > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on > > > >>>>> the > > > >>>>> "wrong" cpu. > > > >>>> > > > >>>> Likely, the communication that sending the entries to the > > > >>>> corrected cpu consume the time. Can you fill the entries in the > > > >>>> correct cpu? > > > >>> > > > >>> the second time the entries are filled on the wrong CPU as well. > > > >>> i am curious about the difference in time between run 1 and 2. > > > >>> > > > >>>> Hong > > > >>>> > > > >>>>> cheers > > > >>>>> Thomas > > > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > From recrusader at gmail.com Mon Feb 4 12:20:28 2008 From: recrusader at gmail.com (Yujie) Date: Mon, 4 Feb 2008 10:20:28 -0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> Message-ID: <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com> what is the difference between sequantial and parallel AIJ matrix? Assuming there is a matrix A, if I partitaion this matrix into A1, A2, Ai... An. A is a parallel AIJ matrix at the whole view, Ai is a sequential AIJ matrix? I want to operate Ai at each node. In addition, whether is it possible to get general inverse using MatMatSolve() if the matrix is not square? Thanks a lot. Regards, Yujie On 2/4/08, Barry Smith wrote: > > > For sequential AIJ matrices you can fill the B matrix with the > identity and then use > MatMatSolve(). 
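A minimal sketch of that recipe, assuming a square SeqAIJ matrix A of size n; the variable names, the ordering choice, and the factor-info handling are illustrative and the exact constant spellings differ slightly between PETSc releases:

    Mat           B, X;             /* dense identity RHS and dense result */
    IS            rowperm, colperm;
    MatFactorInfo info;
    PetscInt      i;

    MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, &B);
    for (i = 0; i < n; i++) MatSetValue(B, i, i, 1.0, INSERT_VALUES);
    MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);
    MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, &X);

    MatGetOrdering(A, MATORDERING_NATURAL, &rowperm, &colperm);
    MatFactorInfoInitialize(&info);
    MatLUFactor(A, rowperm, colperm, &info);   /* A is overwritten by its LU factors */
    MatMatSolve(A, B, X);                      /* each column of X is a column of inv(A) */

Since X is dense, this is only practical for fairly small matrices.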
> > Note since the inverse of a sparse matrix is dense the B matrix is > a SeqDense matrix. > > Barry > > On Feb 4, 2008, at 12:37 AM, Yujie wrote: > > > Hi, > > Now, I want to inverse a sparse matrix. I have browsed the manual, > > however, I can't find some information. could you give me some advice? > > > > thanks a lot. > > > > Regards, > > Yujie > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Feb 4 12:21:05 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Feb 2008 12:21:05 -0600 Subject: assembly In-Reply-To: <8aa042e10802040934p7308142fjff32c882308eceda@mail.gmail.com> References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> <200802031344.56290.geenen@gmail.com> <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov> <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com> <8aa042e10802040934p7308142fjff32c882308eceda@mail.gmail.com> Message-ID: <636792AA-B9FF-4788-82C5-7A3E008124BA@mcs.anl.gov> MatAssemblyEnd 1 1.0 1.4709e+02 In MatAssemblyEnd all the messages with off-process values are received and then MatSetValues is called with them onto the local matrices. The only way this can be taking this huge amount of time is if the preallocation was not done correctly. Can you please run with -info and send all the output Barry On Feb 4, 2008, at 11:34 AM, Thomas Geenen wrote: > hi matt, > > this is indeed much clearer > i put the push and pop around MatAssemblyBegin/End > > 1: First_assembly: 1.4724e+02 90.8% 0.0000e+00 0.0% 7.000e+01 > 0.9% 2.266e+03 15.5% 9.000e+00 0.2% > 2: Second_assembly: 2.3823e-01 0.1% 0.0000e+00 0.0% 7.000e+01 > 0.9% 1.276e+02 0.9% 9.000e+00 0.2% > 3: Third_assembly: 5.0168e-01 0.3% 0.0000e+00 0.0% 4.200e+01 > 0.5% 2.237e+03 15.4% 3.000e+00 0.1% > > The second assembly is another system of equations (pressure > correction in simpler) > so 1 and 3 are 1 and 2 ...... 
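Barry's diagnosis above is that a MatAssemblyEnd taking on the order of 147 seconds essentially always points at incorrect preallocation. For reference, a per-row preallocated MatCreateMPIAIJ() call looks roughly like the following sketch; the local sizes and the d_nnz/o_nnz counts are placeholders that have to come from the application's own connectivity:

    /* d_nnz[i] = nonzeros of local row i falling in the diagonal block
       (columns owned by this process); o_nnz[i] = nonzeros in the
       off-diagonal block. nlocal is a placeholder for the local size. */
    Mat      A;
    PetscInt m = nlocal, n = nlocal;
    PetscInt *d_nnz, *o_nnz;

    PetscMalloc(m*sizeof(PetscInt), &d_nnz);
    PetscMalloc(m*sizeof(PetscInt), &o_nnz);
    /* ... fill d_nnz[i] and o_nnz[i] exactly (or as upper bounds) for each local row i ... */

    MatCreateMPIAIJ(PETSC_COMM_WORLD, m, n, PETSC_DETERMINE, PETSC_DETERMINE,
                    0, d_nnz, 0, o_nnz, &A);

When the per-row arrays are supplied, the scalar d_nz/o_nz arguments (the 0s above) are ignored.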
> > cheers > Thomas > > > > ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > > Unknown Name on a linux-gnu named etna.geo.uu.nl with 6 processors, by > geenen Mon Feb 4 18:27:48 2008 > Using Petsc Release Version 2.3.3, Patch 3, Fri Jun 15 16:51:25 CDT > 2007 HG revision: f051789beadcd36f77fb6111d20225e26ed7cc0d > > Max Max/Min Avg Total > Time (sec): 1.621e+02 1.00000 1.621e+02 > Objects: 2.600e+02 1.00000 2.600e+02 > Flops: 1.265e+09 1.17394 1.161e+09 6.969e+09 > Flops/sec: 7.806e+06 1.17393 7.166e+06 4.300e+07 > MPI Messages: 1.436e+03 1.20816 1.326e+03 7.956e+03 > MPI Message Lengths: 2.120e+07 1.23141 1.457e+04 1.159e+08 > MPI Reductions: 9.862e+02 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length > N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 1.3160e+01 8.1% 6.9689e+09 100.0% 7.762e+03 > 97.6% 9.941e+03 68.2% 2.020e+02 3.4% > 1: First_assembly: 1.4724e+02 90.8% 0.0000e+00 0.0% 7.000e+01 > 0.9% 2.266e+03 15.5% 9.000e+00 0.2% > 2: Second_assembly: 2.3823e-01 0.1% 0.0000e+00 0.0% 7.000e+01 > 0.9% 1.276e+02 0.9% 9.000e+00 0.2% > 3: Third_assembly: 5.0168e-01 0.3% 0.0000e+00 0.0% 4.200e+01 > 0.5% 2.237e+03 15.4% 3.000e+00 0.1% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops/sec: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all > processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in > this phase > %M - percent messages in this phase %L - percent message > lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was run without the PreLoadBegin() # > # macros. To get timing results we always recommend # > # preloading. otherwise timing numbers may be # > # meaningless. 
# > ########################################################## > > > Event Count Time (sec) Flops/sec > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 135 1.0 1.5541e+00 1.3 3.03e+08 1.8 2.4e+03 > 1.3e+04 0.0e+00 1 26 30 26 0 11 26 30 38 0 1155 > MatMultAdd 40 1.0 3.2611e-01 8.3 3.64e+07 7.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 31 > MatSolve 44 1.0 6.7682e-01 1.7 1.94e+08 1.7 0.0e+00 > 0.0e+00 0.0e+00 0 7 0 0 0 4 7 0 0 0 673 > MatRelax 80 1.0 3.4453e+00 1.4 9.46e+07 1.1 2.2e+03 > 1.3e+04 0.0e+00 2 23 28 26 0 22 23 29 38 0 466 > MatLUFactorSym 1 1.0 6.7567e-02 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 2 1.0 2.8804e+00 1.4 2.53e+08 1.4 0.0e+00 > 0.0e+00 0.0e+00 1 44 0 0 0 17 44 0 0 0 1058 > MatILUFactorSym 1 1.0 6.7676e-01 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.0e+00 0 0 0 0 0 5 0 0 0 0 0 > MatAssemblyBegin 4 1.0 2.7711e-0237.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 1 0 > MatAssemblyEnd 4 1.0 2.4401e-02 1.2 0.00e+00 0.0 2.8e+01 > 4.7e+02 7.0e+00 0 0 0 0 0 0 0 0 0 3 0 > MatGetRowIJ 2 1.0 1.2948e-02 7.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrice 1 1.0 2.8603e-02 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0 > MatGetOrdering 2 1.0 2.3054e-02 2.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 > MatIncreaseOvrlp 1 1.0 8.1528e-02 1.0 0.00e+00 0.0 1.1e+03 > 2.4e+03 2.0e+01 0 0 13 2 0 1 0 14 3 10 0 > MatZeroEntries 3 1.0 3.4422e-02 1.8 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MAT_GetRedundantMatrix 1 1.0 3.5774e-02 1.6 0.00e+00 0.0 9.0e+01 > 8.0e+04 2.0e+00 0 0 1 6 0 0 0 1 9 1 0 > VecDot 39 1.0 5.8092e-0131.0 1.02e+0842.3 0.0e+00 > 0.0e+00 3.9e+01 0 0 0 0 1 2 0 0 0 19 17 > VecMDot 8 1.0 3.4735e-03 2.9 4.52e+07 5.5 0.0e+00 > 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 4 69 > VecNorm 31 1.0 3.8690e-02 4.1 1.11e+08 5.6 0.0e+00 > 0.0e+00 3.1e+01 0 0 0 0 1 0 0 0 0 15 139 > VecScale 85 1.0 1.7631e-03 1.2 5.59e+08 1.9 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2150 > VecCopy 4 1.0 5.5027e-04 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 139 1.0 3.0956e-03 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 98 1.0 5.0848e-03 1.3 9.35e+08 1.1 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4765 > VecAYPX 40 1.0 1.0264e-02 1.4 2.01e+08 1.1 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 973 > VecWAXPY 75 1.0 2.6191e-02 1.4 1.22e+08 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 615 > VecMAXPY 9 1.0 2.1935e-04 1.7 2.93e+08 1.1 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1317 > VecAssemblyBegin 4 1.0 1.9331e-03 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 6 0 > VecAssemblyEnd 4 1.0 2.3842e-05 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 267 1.0 1.0370e-01 1.5 0.00e+00 0.0 6.3e+03 > 1.1e+04 0.0e+00 0 0 79 59 0 1 0 81 86 0 0 > VecScatterEnd 267 1.0 4.1189e-01 3.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 > KSPGMRESOrthog 4 1.0 4.5178e-03 2.0 5.22e+07 3.8 0.0e+00 > 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 6 117 > KSPSetup 6 1.0 7.9882e-03 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 > KSPSolve 2 1.0 6.6590e+00 1.0 1.36e+08 1.2 5.9e+03 > 1.1e+04 8.1e+01 4 70 75 54 1 51 70 76 80 40 731 > 
PCSetUp 3 1.0 5.0877e+00 1.2 1.29e+08 1.2 1.6e+03 > 9.1e+03 7.3e+01 3 47 20 13 1 34 47 21 19 36 642 > PCSetUpOnBlocks 1 1.0 1.3292e+00 1.0 1.46e+08 1.0 0.0e+00 > 0.0e+00 3.0e+00 1 17 0 0 0 10 17 0 0 1 873 > PCApply 44 1.0 4.7444e+00 1.1 1.16e+08 1.2 4.7e+03 > 1.0e+04 0.0e+00 3 41 59 42 0 34 41 60 61 0 607 > > --- Event Stage 1: First_assembly > > MatAssemblyBegin 1 1.0 3.0375e-01 3.5 0.00e+00 0.0 4.2e+01 > 4.2e+05 2.0e+00 0 0 1 15 0 0 0 60 99 22 0 > MatAssemblyEnd 1 1.0 1.4709e+02 1.0 0.00e+00 0.0 2.8e+01 > 8.2e+03 7.0e+00 91 0 0 0 0 100 0 40 1 78 0 > > --- Event Stage 2: Second_assembly > > MatAssemblyBegin 1 1.0 1.7451e-02 5.0 0.00e+00 0.0 4.2e+01 > 2.4e+04 2.0e+00 0 0 1 1 0 5 0 60 98 22 0 > MatAssemblyEnd 1 1.0 2.3056e-01 1.0 0.00e+00 0.0 2.8e+01 > 8.4e+02 7.0e+00 0 0 0 0 0 95 0 40 2 78 0 > > --- Event Stage 3: Third_assembly > > MatAssemblyBegin 1 1.0 3.3676e-01 3.8 0.00e+00 0.0 4.2e+01 > 4.2e+05 2.0e+00 0 0 1 15 0 45 0100100 67 0 > MatAssemblyEnd 1 1.0 3.3125e-01 1.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.0e+00 0 0 0 0 0 55 0 0 0 33 0 > > --- Event Stage 4: Unknown > > > --- Event Stage 5: Unknown > > > --- Event Stage 6: Unknown > > > --- Event Stage 7: Unknown > > > --- Event Stage 8: Unknown > > > --- Event Stage 9: Unknown > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' > Mem. > > --- Event Stage 0: Main Stage > > Matrix 16 0 0 0 > Index Set 32 25 207888 0 > Vec 172 91 16374224 0 > Vec Scatter 8 0 0 0 > IS L to G Mapping 4 0 0 0 > Krylov Solver 6 0 0 0 > Preconditioner 6 0 0 0 > Viewer 4 2 0 0 > Container 2 0 0 0 > > --- Event Stage 1: First_assembly > > Index Set 2 2 44140 0 > Vec 2 1 196776 0 > Vec Scatter 1 0 0 0 > > --- Event Stage 2: Second_assembly > > Index Set 2 2 4732 0 > Vec 2 1 11464 0 > Vec Scatter 1 0 0 0 > > --- Event Stage 3: Third_assembly > > > --- Event Stage 4: Unknown > > > --- Event Stage 5: Unknown > > > --- Event Stage 6: Unknown > > > --- Event Stage 7: Unknown > > > --- Event Stage 8: Unknown > > > --- Event Stage 9: Unknown > > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > ====================================================================== > Average time to get PetscTime(): 8.82149e-07 > Average time for MPI_Barrier(): 0.000153208 > Average time for zero size MPI_Send(): 1.86761e-05 > OptionTable: -mg_levels_ksp_type richardson > OptionTable: -mg_levels_pc_sor_omega 1.05 > OptionTable: -mg_levels_pc_type sor > OptionTable: -pc_ml_PrintLevel 4 > OptionTable: -pc_ml_maxNlevels 2 > OptionTable: -pc_type ml > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 > Configure run at: Fri Sep 28 23:34:20 2007 > > On Feb 4, 2008 5:47 PM, Matthew Knepley wrote: >> On Feb 4, 2008 10:41 AM, Thomas Geenen wrote: >>> On Feb 3, 2008 8:51 PM, Barry Smith wrote: >>>> >>>> Hmmm, are you saying the first round of setting values still >>>> takes much longer then the second round? >>> >>> yes >>> >>>> Or is it the time >>>> in MatAssemblyBegin() much longer the first time? 
>>>> >>>> The MatAssembly process has one piece of code that's >>>> work is order n*size; where n is the stash size and size is the >>>> number of processes, all other work is only order n. >>>> >>>> Could you send the -log_summary output? >>> >>> the timing is cumulative i guess? >>> in between these two solves i solve a smaller system for which i do >>> not include the timing. >> >> I ma having a little trouble reading this. I think the easiest >> thing to do >> is wrap the two section of code in their own sections: >> >> PetscLogStageRegister(&stage1, "First assembly"); >> PetscLogStageRegister(&stage2, "Second assembly"); >> >> PetscLogStagePush(stage1); >> >> PetscLogStagePop(); >> >> PetscLogStagePush(stage2); >> >> PetscLogStagePop(); >> >> Then we can also get a look at how many messages are sent >> and how big they are. >> >> Thanks, >> >> Matt >> >> >>> run 1 >>> Max Max/Min Avg Total >>> Time (sec): 2.154e+02 1.00001 2.154e+02 >>> Objects: 2.200e+01 1.00000 2.200e+01 >>> Flops: 0.000e+00 0.00000 0.000e+00 0.000e+00 >>> Flops/sec: 0.000e+00 0.00000 0.000e+00 0.000e+00 >>> MPI Messages: 1.750e+01 1.25000 1.633e+01 9.800e+01 >>> MPI Message Lengths: 3.460e+06 1.29903 1.855e+05 1.818e+07 >>> MPI Reductions: 4.167e+00 1.00000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type >>> (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of >>> length >>> N --> 2N flops >>> and VecAXPY() for complex vectors of >>> length N --> 8N flops >>> >>> Summary of Stages: ----- Time ------ ----- Flops ----- --- >>> Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total counts >>> %Total Avg %Total counts %Total >>> 0: Main Stage: 2.1537e+02 100.0% 0.0000e+00 0.0% 9.800e+01 >>> 100.0% 1.855e+05 100.0% 2.500e+01 100.0% >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on >>> interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flops/sec: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all >>> processors >>> Mess: number of messages sent >>> Avg. len: average message length >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with >>> PetscLogStagePush() >>> and PetscLogStagePop(). >>> %T - percent time in this phase %F - percent flops in >>> this phase >>> %M - percent messages in this phase %L - percent message >>> lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >>> time >>> over all processors) >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> >>> ########################################################## >>> # # >>> # WARNING!!! # >>> # # >>> # This code was run without the PreLoadBegin() # >>> # macros. To get timing results we always recommend # >>> # preloading. otherwise timing numbers may be # >>> # meaningless. 
# >>> ########################################################## >>> >>> >>> Event Count Time (sec) Flops/sec >>> --- Global --- --- Stage --- Total >>> Max Ratio Max Ratio Max Ratio Mess Avg >>> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> --- Event Stage 0: Main Stage >>> >>> MatAssemblyBegin 1 1.0 2.9536e-01 4.0 0.00e+00 0.0 4.2e+01 >>> 4.2e+05 2.0e+00 0 0 43 98 8 0 0 43 98 8 0 >>> MatAssemblyEnd 1 1.0 2.1410e+02 1.0 0.00e+00 0.0 2.8e+01 >>> 8.2e+03 7.0e+00 99 0 29 1 28 99 0 29 1 28 0 >>> MatZeroEntries 1 1.0 9.3739e-02 5.9 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 4 1.0 3.9721e-04 2.2 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory >>> Descendants' Mem. >>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 3 0 0 0 >>> Index Set 6 6 45500 0 >>> Vec 6 1 196776 0 >>> Vec Scatter 3 0 0 0 >>> IS L to G Mapping 2 0 0 0 >>> Krylov Solver 1 0 0 0 >>> Preconditioner 1 0 0 0 >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> ==================================================================== >>> Average time to get PetscTime(): 1.71661e-06 >>> Average time for MPI_Barrier(): 0.000159979 >>> Average time for zero size MPI_Send(): 1.29938e-05 >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 >>> Configure run at: Fri Sep 28 23:34:20 2007 >>> >>> run2 >>> Max Max/Min Avg Total >>> Time (sec): 2.298e+02 1.00000 2.298e+02 >>> Objects: 2.600e+02 1.00000 2.600e+02 >>> Flops: 1.265e+09 1.17394 1.161e+09 6.969e+09 >>> Flops/sec: 5.505e+06 1.17394 5.054e+06 3.032e+07 >>> MPI Messages: 1.436e+03 1.20816 1.326e+03 7.956e+03 >>> MPI Message Lengths: 2.120e+07 1.23141 1.457e+04 1.159e+08 >>> MPI Reductions: 4.192e+02 1.00000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type >>> (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of >>> length >>> N --> 2N flops >>> and VecAXPY() for complex vectors of >>> length N --> 8N flops >>> >>> Summary of Stages: ----- Time ------ ----- Flops ----- --- >>> Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total counts >>> %Total Avg %Total counts %Total >>> 0: Main Stage: 2.2943e+02 99.8% 6.9689e+09 100.0% 7.944e+03 >>> 99.8% 1.457e+04 100.0% 2.230e+02 8.9% >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on >>> interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flops/sec: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all >>> processors >>> Mess: number of messages sent >>> Avg. 
len: average message length >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with >>> PetscLogStagePush() >>> and PetscLogStagePop(). >>> %T - percent time in this phase %F - percent flops in >>> this phase >>> %M - percent messages in this phase %L - percent message >>> lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >>> time >>> over all processors) >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> >>> ########################################################## >>> # # >>> # WARNING!!! # >>> # # >>> # This code was run without the PreLoadBegin() # >>> # macros. To get timing results we always recommend # >>> # preloading. otherwise timing numbers may be # >>> # meaningless. # >>> ########################################################## >>> >>> >>> Event Count Time (sec) Flops/sec >>> --- Global --- --- Stage --- Total >>> Max Ratio Max Ratio Max Ratio Mess Avg >>> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> --- Event Stage 0: Main Stage >>> >>> MatMult 135 1.0 2.0830e+00 1.4 2.30e+08 1.9 2.4e+03 >>> 1.3e+04 0.0e+00 1 26 30 26 0 1 26 30 26 0 862 >>> MatMultAdd 40 1.0 3.2598e-01 4.5 2.68e+07 5.7 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 31 >>> MatSolve 44 1.0 6.7841e-01 1.7 1.93e+08 1.7 0.0e+00 >>> 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 672 >>> MatRelax 80 1.0 4.2949e+00 1.6 8.77e+07 1.2 2.2e+03 >>> 1.3e+04 0.0e+00 1 23 28 26 0 1 23 28 26 0 374 >>> MatLUFactorSym 1 1.0 7.6739e-02 1.1 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatLUFactorNum 2 1.0 2.9370e+00 1.5 2.53e+08 1.5 0.0e+00 >>> 0.0e+00 0.0e+00 1 44 0 0 0 1 44 0 0 0 1037 >>> MatILUFactorSym 1 1.0 6.7334e-01 1.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatAssemblyBegin 7 1.0 7.1652e-01 4.4 0.00e+00 0.0 1.3e+02 >>> 2.9e+05 8.0e+00 0 0 2 32 0 0 0 2 32 4 0 >>> MatAssemblyEnd 7 1.0 2.1473e+02 1.0 0.00e+00 0.0 8.4e+01 >>> 3.2e+03 2.2e+01 93 0 1 0 1 94 0 1 0 10 0 >>> MatGetRowIJ 2 1.0 1.8899e-03 1.1 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetSubMatrice 1 1.0 3.1915e-02 1.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0 >>> MatGetOrdering 2 1.0 1.2184e-02 1.6 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 >>> MatIncreaseOvrlp 1 1.0 7.6865e-02 1.0 0.00e+00 0.0 1.1e+03 >>> 2.4e+03 2.0e+01 0 0 13 2 1 0 0 13 2 9 0 >>> MatZeroEntries 3 1.0 1.0429e-01 4.7 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MAT_GetRedundantMatrix 1 1.0 3.0144e-02 1.2 0.00e+00 0.0 9.0e >>> +01 >>> 8.0e+04 2.0e+00 0 0 1 6 0 0 0 1 6 1 0 >>> VecDot 39 1.0 9.0680e-01140.0 2.95e+08191.3 0.0e+00 >>> 0.0e+00 3.9e+01 0 0 0 0 2 0 0 0 0 17 11 >>> VecMDot 8 1.0 2.8777e-03 2.8 4.16e+07 3.3 0.0e+00 >>> 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 4 84 >>> VecNorm 31 1.0 6.2301e-02 4.2 7.17e+07 5.8 0.0e+00 >>> 0.0e+00 3.1e+01 0 0 0 0 1 0 0 0 0 14 86 >>> VecScale 85 1.0 1.7729e-03 1.4 4.94e+08 1.4 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2138 >>> VecCopy 4 1.0 6.2108e-04 1.5 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 139 1.0 3.5934e-03 1.4 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAXPY 98 1.0 5.1496e-03 1.3 9.24e+08 1.1 0.0e+00 >>> 
0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4705 >>> VecAYPX 40 1.0 1.0311e-02 1.4 1.90e+08 1.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 969 >>> VecWAXPY 75 1.0 2.6060e-02 1.4 1.22e+08 1.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 618 >>> VecMAXPY 9 1.0 2.1315e-04 1.6 3.01e+08 1.2 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1355 >>> VecAssemblyBegin 4 1.0 1.9898e-03 1.2 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 5 0 >>> VecAssemblyEnd 4 1.0 2.1219e-05 1.2 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecScatterBegin 267 1.0 1.0224e-01 1.6 0.00e+00 0.0 6.3e+03 >>> 1.1e+04 0.0e+00 0 0 79 59 0 0 0 79 59 0 0 >>> VecScatterEnd 267 1.0 7.8653e-0111.3 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPGMRESOrthog 4 1.0 3.7677e-03 1.9 4.78e+07 2.4 0.0e+00 >>> 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 5 141 >>> KSPSetup 6 1.0 1.0260e-02 1.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 2 0 >>> KSPSolve 2 1.0 7.8238e+00 1.0 1.16e+08 1.2 5.9e+03 >>> 1.1e+04 8.1e+01 3 70 75 54 3 3 70 75 54 36 622 >>> PCSetUp 3 1.0 5.1323e+00 1.2 1.29e+08 1.2 1.6e+03 >>> 9.1e+03 7.3e+01 2 47 20 13 3 2 47 20 13 33 636 >>> PCSetUpOnBlocks 1 1.0 1.3325e+00 1.0 1.47e+08 1.0 0.0e+00 >>> 0.0e+00 3.0e+00 1 17 0 0 0 1 17 0 0 1 871 >>> PCApply 44 1.0 5.7917e+00 1.1 9.60e+07 1.2 4.7e+03 >>> 1.0e+04 0.0e+00 2 41 59 42 0 2 41 59 42 0 497 >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory >>> Descendants' Mem. >>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 16 0 0 0 >>> Index Set 36 29 256760 0 >>> Vec 176 93 16582464 0 >>> Vec Scatter 10 0 0 0 >>> IS L to G Mapping 4 0 0 0 >>> Krylov Solver 6 0 0 0 >>> Preconditioner 6 0 0 0 >>> Viewer 4 2 0 0 >>> Container 2 0 0 0 >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> = >>> ==================================================================== >>> Average time to get PetscTime(): 8.10623e-07 >>> Average time for MPI_Barrier(): 0.000178194 >>> Average time for zero size MPI_Send(): 1.33117e-05 >>> OptionTable: -mg_levels_ksp_type richardson >>> OptionTable: -mg_levels_pc_sor_omega 1.05 >>> OptionTable: -mg_levels_pc_type sor >>> OptionTable: -pc_ml_PrintLevel 4 >>> OptionTable: -pc_ml_maxNlevels 2 >>> OptionTable: -pc_type ml >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 >>> Configure run at: Fri Sep 28 23:34:20 2007 >>> >>> >>>> >>>> Barry >>>> >>>> >>>> The a >>>> >>>> On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote: >>>> >>>>> i call >>>>> ierr = MatStashSetInitialSize(A[*seqsolve],stash_size, >>>>> stash_size);CHKERRQ(ierr); >>>>> with 100 000 000 for the stash size to make sure that's not the >>>>> bottleneck >>>>> >>>>> the assemble time remains unchanged however. >>>>> >>>>> nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485 >>>>> reallocs in MatAssemblyBegin_MPIAIJ = 0 >>>>> >>>>> cheers >>>>> Thomas >>>>> >>>>> On Saturday 02 February 2008 23:19, Barry Smith wrote: >>>>>> The matstash has a concept of preallocation also. 
During the >>>>>> first >>>>>> setvalues >>>>>> it is allocating more and more memory for the stash. In the >>>>>> second >>>>>> setvalues >>>>>> the stash is large enough so does not require any addition >>>>>> allocation. >>>>>> >>>>>> You can use the option -matstash_initial_size to >>>>>> allocate >>>>>> enough space >>>>>> initially so that the first setvalues is also fast. It does not >>>>>> look >>>>>> like there is a way >>>>>> coded to get the that you should use. It should be set >>>>>> to the >>>>>> maximum nonzeros >>>>>> any process has that belongs to other processes. The stash >>>>>> handling >>>>>> code is >>>>>> in src/mat/utils/matstash.c, perhaps you can figure out how to >>>>>> printout with PetscInfo() >>>>>> the sizes needed? >>>>>> >>>>>> >>>>>> Barry >>>>>> >>>>>> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote: >>>>>>> On Saturday 02 February 2008 18:33, Hong Zhang wrote: >>>>>>>> On Sat, 2 Feb 2008, Thomas Geenen wrote: >>>>>>>>> Dear Petsc users, >>>>>>>>> >>>>>>>>> I would like to understand what is slowing down the assembly >>>>>>>>> phase >>>>>>>>> of my >>>>>>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough >>>>>>>>> guess of >>>>>>>>> the number of off diagonal entries and then use a conservative >>>>>>>>> value to >>>>>>>>> make sure I do not need extra mallocs. (the number of diagonal >>>>>>>>> entries is >>>>>>>>> exact) >>>>>>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd. >>>>>>>>> The first time i call MatSetValues and MatAssemblyBegin, >>>>>>>>> MatAssemblyEnd it takes about 170 seconds >>>>>>>>> the second time 0.3 seconds. >>>>>>>>> I run it on 6 cpu's and I do fill quit a number of row- >>>>>>>>> entries on >>>>>>>>> the >>>>>>>>> "wrong" cpu. However thats also the case the second run. I >>>>>>>>> checked >>>>>>>>> that there are no additional mallocs >>>>>>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after >>>>>>>>> MatAssemblyBegin, MatAssemblyEnd. >>>>>>>> >>>>>>>> Run your code with the option '-log_summary' and check which >>>>>>>> function >>>>>>>> call dominates the execution time. >>>>>>> >>>>>>> the time is spend in MatStashScatterGetMesg_Private >>>>>>> >>>>>>>>> I run it on 6 cpu's and I do fill quit a number of row- >>>>>>>>> entries on >>>>>>>>> the >>>>>>>>> "wrong" cpu. >>>>>>>> >>>>>>>> Likely, the communication that sending the entries to the >>>>>>>> corrected cpu consume the time. Can you fill the entries in the >>>>>>>> correct cpu? >>>>>>> >>>>>>> the second time the entries are filled on the wrong CPU as well. >>>>>>> i am curious about the difference in time between run 1 and 2. >>>>>>> >>>>>>>> Hong >>>>>>>> >>>>>>>>> cheers >>>>>>>>> Thomas >>>>> >>>> >>>> >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> > From knepley at gmail.com Mon Feb 4 12:25:51 2008 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 4 Feb 2008 12:25:51 -0600 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com> References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com> Message-ID: On Feb 4, 2008 12:20 PM, Yujie wrote: > what is the difference between sequantial and parallel AIJ matrix? 
Assuming > there is a matrix A, if I partitaion this matrix into A1, A2, Ai... An. > A is a parallel AIJ matrix at the whole view, Ai is a sequential AIJ matrix? We mean parallel CSR format, which is described in the book Iterative Methods for Sparse Linear Systems by Yousef Saad as well as in the PETSc manual. > I want to operate Ai at each node. > In addition, whether is it possible to get general inverse using > MatMatSolve() if the matrix is not square? Thanks a lot. Rectangular matrices do not have inverses in that sense. You may want some sort of pseudo-inverse, but it must be motivated by the problem you are solving. Matt > Regards, > Yujie > > > On 2/4/08, Barry Smith wrote: > > > > For sequential AIJ matrices you can fill the B matrix with the > > identity and then use > > MatMatSolve(). > > > > Note since the inverse of a sparse matrix is dense the B matrix is > > a SeqDense matrix. > > > > Barry > > > > On Feb 4, 2008, at 12:37 AM, Yujie wrote: > > > > > Hi, > > > Now, I want to inverse a sparse matrix. I have browsed the manual, > > > however, I can't find some information. could you give me some advice? > > > > > > thanks a lot. > > > > > > Regards, > > > Yujie > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Mon Feb 4 12:26:12 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Feb 2008 12:26:12 -0600 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com> References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com> Message-ID: <30E53953-81B2-4EF0-B02D-D9810EB8FA0B@mcs.anl.gov> On Feb 4, 2008, at 12:20 PM, Yujie wrote: > what is the difference between sequantial and parallel AIJ matrix? > Assuming there is a matrix A, if I partitaion this matrix into A1, > A2, Ai... An. > A is a parallel AIJ matrix at the whole view, Ai is a sequential AIJ > matrix? It is not that simple. Ai is split into two parts 1) the "block diagonal" part and 2) the "off diagonal part" ; this is explained in the manual page for MatCreateMPIAIJ(). If you want to do operations on the pieces you will need to understand the code in src/mat/impls/aij/mpi. What do you want to do with Ai? > I want to operate Ai at each node. > In addition, whether is it possible to get general inverse using > MatMatSolve() if the matrix is not square? No, that requires much more complicated linear algebra technology then is in PETSc. Barry > Thanks a lot. > > Regards, > Yujie > > > On 2/4/08, Barry Smith wrote: > > For sequential AIJ matrices you can fill the B matrix with the > identity and then use > MatMatSolve(). > > Note since the inverse of a sparse matrix is dense the B matrix is > a SeqDense matrix. > > Barry > > On Feb 4, 2008, at 12:37 AM, Yujie wrote: > > > Hi, > > Now, I want to inverse a sparse matrix. I have browsed the manual, > > however, I can't find some information. could you give me some > advice? > > > > thanks a lot. > > > > Regards, > > Yujie > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From w_subber at yahoo.com Mon Feb 4 19:58:31 2008 From: w_subber at yahoo.com (Waad Subber) Date: Mon, 4 Feb 2008 17:58:31 -0800 (PST) Subject: how to inverse a sparse matrix in Petsc? 
In-Reply-To: <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com> Message-ID: <602426.95557.qm@web38210.mail.mud.yahoo.com> Hi There was a discussion between Tim Stitt and petsc developers about matrix inversion, and it was really helpful. That was in last Nov. You can check the emails archive http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html Waad Yujie wrote: what is the difference between sequantial and parallel AIJ matrix? Assuming there is a matrix A, if I partitaion this matrix into A1, A2, Ai... An. A is a parallel AIJ matrix at the whole view, Ai is a sequential AIJ matrix? I want to operate Ai at each node. In addition, whether is it possible to get general inverse using MatMatSolve() if the matrix is not square? Thanks a lot. Regards, Yujie On 2/4/08, Barry Smith wrote: For sequential AIJ matrices you can fill the B matrix with the identity and then use MatMatSolve(). Note since the inverse of a sparse matrix is dense the B matrix is a SeqDense matrix. Barry On Feb 4, 2008, at 12:37 AM, Yujie wrote: > Hi, > Now, I want to inverse a sparse matrix. I have browsed the manual, > however, I can't find some information. could you give me some advice? > > thanks a lot. > > Regards, > Yujie > --------------------------------- Looking for last minute shopping deals? Find them fast with Yahoo! Search. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiaxun_hou at yahoo.com.cn Mon Feb 4 20:34:16 2008 From: jiaxun_hou at yahoo.com.cn (jiaxun hou) Date: Tue, 5 Feb 2008 10:34:16 +0800 (CST) Subject: Compare the accuracy of PETSc's GMRES algorithm with Matlab's Message-ID: <82303.99306.qm@web15812.mail.cnb.yahoo.com> Hello everyone, I want to solve a linear system with no preconditioned GMRES in PETSc, but the result is divergent. White I solve the exactly same system under Matlab, I get the convergent result. Which result can I trust? And I print the residuals of them as the attachments. Thanks! Best regards, Jiaxun --------------------------------- ???????????????????????????????????? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: matlab.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: petsc.txt URL: From erlend.pedersen at holberger.com Tue Feb 5 03:26:56 2008 From: erlend.pedersen at holberger.com (Erlend Pedersen :.) Date: Tue, 05 Feb 2008 10:26:56 +0100 Subject: Overdetermined, non-linear In-Reply-To: References: <1201866864.6394.25.camel@erlend-ws.in.holberger.com> Message-ID: <1202203616.27733.50.camel@erlend-ws.in.holberger.com> On Sun, 2008-02-03 at 19:59 -0600, Matthew Knepley wrote: > On Feb 1, 2008 5:54 AM, Erlend Pedersen :. > wrote: > > I am attempting to use the PETSc nonlinear solver on an overdetermined > > system of non-linear equations. Hence, the Jacobian is not square, and > > so far we have unfortunately not succeeded with any combination of snes, > > ksp and pc. > > > > Could you confirm that snes actually works for overdetermined systems, > > and if so, is there an application example we could look at in order to > > make sure there is nothing wrong with our test-setup? > > > > We have previously used the MINPACK routine LMDER very successfully, but > > for our current problem sizes we rely on the use of sparse matrix > > representations and parallel architectures. 
PETSc's abstractions and > > automatic MPI makes this system very attractive for us, and we have > > already used the PETSc LSQR solver with great success. > > So in the sense that SNES is really just an iteration with an embedded solve, > yes it can solve non-square nonlinear systems. However, the user has to > understand what is meant by the Function and Jacobian evaluation methods. > I suggest implementing the simplest algorithm for non-square systems: > > http://en.wikipedia.org/wiki/Gauss-Newton_algorithm > > By implement, I mean your Function and Jacobian methods should return the > correct terms. I believe the reason you have not seen convergence is that > the result of the solve does not "mean" the correct thing for the iteration > in your current setup. > > Matt Thanks. Good to know that I should be able to get a working setup. Are there by any chance any code examples that I could use to clue myself in on how to transform my m equations of n unknonwns into a correct function for the Gauss-Newton algorithm? - Erlend :. From tstitt at cscs.ch Tue Feb 5 06:24:19 2008 From: tstitt at cscs.ch (Timothy Stitt) Date: Tue, 05 Feb 2008 13:24:19 +0100 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <602426.95557.qm@web38210.mail.mud.yahoo.com> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> Message-ID: <47A85573.4090607@cscs.ch> Yes Yujie, I was able to put together a parallel code to invert a large sparse matrix with the help of the PETSc developers. If you need any help or maybe a Fortran code template just let me know. Best, Tim. Waad Subber wrote: > Hi > There was a discussion between Tim Stitt and petsc developers about > matrix inversion, and it was really helpful. That was in last Nov. You > can check the emails archive > > http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html > > Waad > > */Yujie /* wrote: > > what is the difference between sequantial and parallel AIJ matrix? > Assuming there is a matrix A, if > I partitaion this matrix into A1, A2, Ai... An. > A is a parallel AIJ matrix at the whole view, Ai > is a sequential AIJ matrix? I want to operate Ai at each node. > In addition, whether is it possible to get general inverse using > MatMatSolve() if the matrix is not square? Thanks a lot. > > Regards, > Yujie > > > On 2/4/08, *Barry Smith* > wrote: > > > For sequential AIJ matrices you can fill the B matrix with the > identity and then use > MatMatSolve(). > > Note since the inverse of a sparse matrix is dense the B > matrix is > a SeqDense matrix. > > Barry > > On Feb 4, 2008, at 12:37 AM, Yujie wrote: > > > Hi, > > Now, I want to inverse a sparse matrix. I have browsed the > manual, > > however, I can't find some information. could you give me > some advice? > > > > thanks a lot. > > > > Regards, > > Yujie > > > > > > ------------------------------------------------------------------------ > Looking for last minute shopping deals? Find them fast with Yahoo! > Search. 
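For readers who only need the shape of the approach Barry sketches above (fill a dense B with the identity, factor A, then call MatMatSolve()), here is a minimal C sketch. It assumes a square, already-assembled sequential AIJ matrix, uses a natural ordering for the factorization, and follows the PETSc calls of this period (ISDestroy/MatDestroy later changed to take pointers); the helper name invert_seqaij is an illustrative choice, not Tim's actual parallel code.

#include "petscmat.h"

/* Sketch: invert a square, assembled SeqAIJ matrix A by solving A X = I.
   Since the inverse of a sparse matrix is dense, B and X are SeqDense. */
PetscErrorCode invert_seqaij(Mat A, Mat *X)
{
  Mat            B, F;
  IS             rowperm, colperm;
  MatFactorInfo  info;
  PetscInt       i, n;
  PetscScalar    one = 1.0;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatGetSize(A, &n, PETSC_NULL);CHKERRQ(ierr);

  /* Dense identity used as the block of right-hand sides */
  ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, &B);CHKERRQ(ierr);
  for (i = 0; i < n; i++) {
    ierr = MatSetValue(B, i, i, one, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Dense matrix that receives the n columns of A^{-1} */
  ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, X);CHKERRQ(ierr);

  /* LU-factor a copy of A in place, then solve for all columns at once */
  ierr = MatDuplicate(A, MAT_COPY_VALUES, &F);CHKERRQ(ierr);
  ierr = MatGetOrdering(F, MATORDERING_NATURAL, &rowperm, &colperm);CHKERRQ(ierr);
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatLUFactor(F, rowperm, colperm, &info);CHKERRQ(ierr);
  ierr = MatMatSolve(F, B, *X);CHKERRQ(ierr);

  ierr = ISDestroy(rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(colperm);CHKERRQ(ierr);
  ierr = MatDestroy(F);CHKERRQ(ierr);
  ierr = MatDestroy(B);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Note, as Barry points out, that the inverse of a sparse matrix is dense, so for large problems the X matrix itself quickly becomes the memory bottleneck; the factored matrix is usually what you want to keep and solve with.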
> -- Timothy Stitt HPC Applications Analyst Swiss National Supercomputing Centre (CSCS) Galleria 2 - Via Cantonale CH-6928 Manno, Switzerland +41 (0) 91 610 8233 stitt at cscs.ch From knepley at gmail.com Tue Feb 5 07:31:15 2008 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Feb 2008 07:31:15 -0600 Subject: Overdetermined, non-linear In-Reply-To: <1202203616.27733.50.camel@erlend-ws.in.holberger.com> References: <1201866864.6394.25.camel@erlend-ws.in.holberger.com> <1202203616.27733.50.camel@erlend-ws.in.holberger.com> Message-ID: On Feb 5, 2008 3:26 AM, Erlend Pedersen :. wrote: > > On Sun, 2008-02-03 at 19:59 -0600, Matthew Knepley wrote: > > On Feb 1, 2008 5:54 AM, Erlend Pedersen :. > > wrote: > > > I am attempting to use the PETSc nonlinear solver on an overdetermined > > > system of non-linear equations. Hence, the Jacobian is not square, and > > > so far we have unfortunately not succeeded with any combination of snes, > > > ksp and pc. > > > > > > Could you confirm that snes actually works for overdetermined systems, > > > and if so, is there an application example we could look at in order to > > > make sure there is nothing wrong with our test-setup? > > > > > > We have previously used the MINPACK routine LMDER very successfully, but > > > for our current problem sizes we rely on the use of sparse matrix > > > representations and parallel architectures. PETSc's abstractions and > > > automatic MPI makes this system very attractive for us, and we have > > > already used the PETSc LSQR solver with great success. > > > > So in the sense that SNES is really just an iteration with an embedded solve, > > yes it can solve non-square nonlinear systems. However, the user has to > > understand what is meant by the Function and Jacobian evaluation methods. > > I suggest implementing the simplest algorithm for non-square systems: > > > > http://en.wikipedia.org/wiki/Gauss-Newton_algorithm > > > > By implement, I mean your Function and Jacobian methods should return the > > correct terms. I believe the reason you have not seen convergence is that > > the result of the solve does not "mean" the correct thing for the iteration > > in your current setup. > > > > Matt > > Thanks. Good to know that I should be able to get a working setup. Are > there by any chance any code examples that I could use to clue myself in > on how to transform my m equations of n unknonwns into a correct > function for the Gauss-Newton algorithm? We do not have any nonlinear least-squares examples, unfortunately. At that point, most users have gone over to formulating their problem directly as an optimization problem (which allows more flexibility than least squares) and have moved to TAO (http://www-unix.mcs.anl.gov/tao/) which does have examples, I believe, for optimization of this kind. If you know that you only ever want to do least squares, and you want to solve the biggest, parallel problems, than stick with PETSc and build a nice Gauss-Newton (or Levenberg-Marquadt) solver. However, if you really want to solve a more general optimization problem, I recommend reformulating it now and moving to TAO. It is at least worth reading up on it. Thanks, Matt > - Erlend :. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From zonexo at gmail.com Tue Feb 5 07:57:51 2008 From: zonexo at gmail.com (Ben Tay) Date: Tue, 05 Feb 2008 21:57:51 +0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <47A85573.4090607@cscs.ch> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> Message-ID: <47A86B5F.5010503@gmail.com> Hi everyone, I was reading about the topic abt inversing a sparse matrix. I have to solve a poisson eqn for my CFD code. Usually, I form a system of linear eqns and solve Ax=b. The "A" is always the same and only the "b" changes every timestep. Does it mean that if I'm able to get the inverse matrix A^(-1), in order to get x at every timestep, I only need to do a simple matrix multiplication ie x=A^(-1)*b ? Hi Timothy, if the above is true, can you email me your Fortran code template? I'm also programming in fortran 90. Thank you very much Regards. Timothy Stitt wrote: > Yes Yujie, I was able to put together a parallel code to invert a > large sparse matrix with the help of the PETSc developers. If you need > any help or maybe a Fortran code template just let me know. > > Best, > > Tim. > > Waad Subber wrote: >> Hi >> There was a discussion between Tim Stitt and petsc developers about >> matrix inversion, and it was really helpful. That was in last Nov. >> You can check the emails archive >> >> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html >> >> >> Waad >> >> */Yujie /* wrote: >> >> what is the difference between sequantial and parallel AIJ matrix? >> Assuming there is a matrix A, if >> I partitaion this matrix into A1, A2, Ai... An. >> A is a parallel AIJ matrix at the whole view, Ai >> is a sequential AIJ matrix? I want to operate Ai at each node. >> In addition, whether is it possible to get general inverse using >> MatMatSolve() if the matrix is not square? Thanks a lot. >> >> Regards, >> Yujie >> >> >> On 2/4/08, *Barry Smith* > > wrote: >> >> >> For sequential AIJ matrices you can fill the B matrix >> with the >> identity and then use >> MatMatSolve(). >> >> Note since the inverse of a sparse matrix is dense the B >> matrix is >> a SeqDense matrix. >> >> Barry >> >> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >> >> > Hi, >> > Now, I want to inverse a sparse matrix. I have browsed the >> manual, >> > however, I can't find some information. could you give me >> some advice? >> > >> > thanks a lot. >> > >> > Regards, >> > Yujie >> > >> >> >> >> ------------------------------------------------------------------------ >> Looking for last minute shopping deals? Find them fast with Yahoo! >> Search. >> > > > > From dalcinl at gmail.com Tue Feb 5 08:04:54 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 5 Feb 2008 11:04:54 -0300 Subject: Compare the accuracy of PETSc's GMRES algorithm with Matlab's In-Reply-To: <82303.99306.qm@web15812.mail.cnb.yahoo.com> References: <82303.99306.qm@web15812.mail.cnb.yahoo.com> Message-ID: Could you try to run your PETSc code with the option below? -ksp_gmres_modifiedgramschmidt On 2/4/08, jiaxun hou wrote: > Hello everyone, > > I want to solve a linear system with no preconditioned GMRES in PETSc, but > the result is divergent. White I solve the exactly same system under Matlab, > I get the convergent result. Which result can I trust? > > And I print the residuals of them as the attachments. Thanks! > > Best regards, > Jiaxun > > > ________________________________ > ???????????????????????????????????? 
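As an aside for anyone reproducing this comparison: the option Lisandro suggests can also be selected from source. A minimal sketch follows, assuming a KSP that has already been created and had its operators set; the helper name gmres_with_mgs is illustrative, and the orthogonalization routine names should be checked against your PETSc release.

#include "petscksp.h"

/* Sketch: switch GMRES to modified Gram-Schmidt orthogonalization,
   the programmatic equivalent of -ksp_gmres_modifiedgramschmidt. */
PetscErrorCode gmres_with_mgs(KSP ksp)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = KSPSetType(ksp, KSPGMRES);CHKERRQ(ierr);
  ierr = KSPGMRESSetOrthogonalization(ksp,
           KSPGMRESModifiedGramSchmidtOrthogonalization);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The point of trying this is that classical and modified Gram-Schmidt can lose orthogonality of the Krylov basis at different rates, so two GMRES implementations may report noticeably different residual histories for the same system.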
> Matlab results > ------------------------------------------------------------ > 1 ss= 0.8767977459 GRS=8.3589003100 GRS_new=7.3290649496 > 2 ss= 0.9942062562 GRS=7.3290649496 GRS_new=7.2866022253 > 3 ss= 0.9579384974 GRS=7.2866022253 GRS_new=6.9801167867 > 4 ss= 0.9770486148 GRS=6.9801167867 GRS_new=6.8199134376 > 5 ss= 0.9848398847 GRS=6.8199134376 GRS_new=6.7165227635 > 6 ss= 0.9817587194 GRS=6.7165227635 GRS_new=6.5940047873 > 7 ss= 0.9828044725 GRS=6.5940047873 GRS_new=6.4806173965 > 8 ss= 0.9923435431 GRS=6.4806173965 GRS_new=6.4309988285 > 9 ss= 0.9847288648 GRS=6.4309988285 GRS_new=6.3327901760 > 10 ss= 0.9999247510 GRS=6.3327901760 GRS_new=6.3323136396 > 11 ss= 0.9591518787 GRS=6.3323136396 GRS_new=6.0736505238 > 12 ss= 0.9998567242 GRS=6.0736505238 GRS_new=6.0727803165 > 13 ss= 0.9609943819 GRS=6.0727803165 GRS_new=5.8359077665 > 14 ss= 0.9999728993 GRS=5.8359077665 GRS_new=5.8357496091 > 15 ss= 0.9905739831 GRS=5.8357496091 GRS_new=5.7807417346 > 16 ss= 0.9937523276 GRS=5.7807417346 GRS_new=5.7446255541 > 17 ss= 0.9841963377 GRS=5.7446255541 GRS_new=5.6538394318 > 18 ss= 0.9953976161 GRS=5.6538394318 GRS_new=5.6278182921 > 19 ss= 0.9971858162 GRS=5.6278182921 GRS_new=5.6119805771 > 20 ss= 0.9891811740 GRS=5.6119805771 GRS_new=5.5512655356 > 21 ss= 0.9985693802 GRS=5.5512655356 GRS_new=5.5433237853 > 22 ss= 0.9902721074 GRS=5.5433237853 GRS_new=5.4893989269 > 23 ss= 0.9995190416 GRS=5.4893989269 GRS_new=5.4867587542 > 24 ss= 0.9852930755 GRS=5.4867587542 GRS_new=5.4060654074 > 25 ss= 0.9993241249 GRS=5.4060654074 GRS_new=5.4024115826 > 26 ss= 0.9886557159 GRS=5.4024115826 GRS_new=5.3411250907 > 27 ss= 0.9999955172 GRS=5.3411250907 GRS_new=5.3411011474 > 28 ss= 0.9825588209 GRS=5.3411011474 GRS_new=5.2479460454 > 29 ss= 0.9988457962 GRS=5.2479460454 GRS_new=5.2418888463 > 30 ss= 0.9884104840 GRS=5.2418888463 GRS_new=5.1811378916 > 31 ss= 0.9992362771 GRS=5.1811378916 GRS_new=5.1771809379 > 32 ss= 0.9858180681 GRS=5.1771809379 GRS_new=5.1037585103 > 33 ss= 0.9986080784 GRS=5.1037585103 GRS_new=5.0966544787 > 34 ss= 0.9867985434 GRS=5.0966544787 GRS_new=5.0293712159 > 35 ss= 0.9987564203 GRS=5.0293712159 GRS_new=5.0231167919 > 36 ss= 0.9882373404 GRS=5.0231167919 GRS_new=4.9640315790 > 37 ss= 0.9990354114 GRS=4.9640315790 GRS_new=4.9592433307 > 38 ss= 0.9868343640 GRS=4.9592433307 GRS_new=4.8939517382 > 39 ss= 0.9993989316 GRS=4.8939517382 GRS_new=4.8910101386 > 40 ss= 0.9876496755 GRS=4.8910101386 GRS_new=4.8306045761 > 41 ss= 0.9992609634 GRS=4.8306045761 GRS_new=4.8270345827 > 42 ss= 0.9900904643 GRS=4.8270345827 GRS_new=4.7792009111 > 43 ss= 0.9998679060 GRS=4.7792009111 GRS_new=4.7785696072 > 44 ss= 0.9863344717 GRS=4.7785696072 GRS_new=4.7132679291 > 45 ss= 0.9999844285 GRS=4.7132679291 GRS_new=4.7131945365 > 46 ss= 0.9910405864 GRS=4.7131945365 GRS_new=4.6709670773 > 47 ss= 0.9999609816 GRS=4.6709670773 GRS_new=4.6707848237 > 48 ss= 0.9899729173 GRS=4.6707848237 GRS_new=4.6239504780 > 49 ss= 0.9979852800 GRS=4.6239504780 GRS_new=4.6146345124 > 50 ss= 0.9933060164 GRS=4.6146345124 GRS_new=4.5837442249 > 51 ss= 0.9977925155 GRS=4.5837442249 GRS_new=4.5736256804 > 52 ss= 0.9908313109 GRS=4.5736256804 GRS_new=4.5316915283 > 53 ss= 0.9974726054 GRS=4.5316915283 GRS_new=4.5202381555 > 54 ss= 0.9933649859 GRS=4.5202381555 GRS_new=4.4902463115 > 55 ss= 0.9971691729 GRS=4.4902463115 GRS_new=4.4775352004 > 56 ss= 0.9923528355 GRS=4.4775352004 GRS_new=4.4432947523 > 57 ss= 0.9958902809 GRS=4.4432947523 GRS_new=4.4250340590 > 58 ss= 0.9956119858 GRS=4.4250340590 GRS_new=4.4056169468 > 
59 ss= 0.9929633098 GRS=4.4056169468 GRS_new=4.3746159852 > 60 ss= 0.9965857536 GRS=4.3746159852 GRS_new=4.3596799681 > 61 ss= 0.9864800828 GRS=4.3596799681 GRS_new=4.3007374561 > 62 ss= 0.9993535743 GRS=4.3007374561 GRS_new=4.2979573489 > 63 ss= 0.9924403396 GRS=4.2979573489 GRS_new=4.2654662508 > 64 ss= 0.9999218865 GRS=4.2654662508 GRS_new=4.2651330604 > 65 ss= 0.9903604367 GRS=4.2651330604 GRS_new=4.2240190404 > 66 ss= 0.9999981102 GRS=4.2240190404 GRS_new=4.2240110578 > 67 ss= 0.9850734469 GRS=4.2240110578 GRS_new=4.1609611323 > 68 ss= 0.9998407715 GRS=4.1609611323 GRS_new=4.1602985889 > 69 ss= 0.9825309063 GRS=4.1602985889 GRS_new=4.0876219429 > 70 ss= 0.9999999323 GRS=4.0876219429 GRS_new=4.0876216663 > 71 ss= 0.9801304161 GRS=4.0876216663 GRS_new=4.0064023246 > 72 ss= 0.9998051423 GRS=4.0064023246 GRS_new=4.0056216462 > 73 ss= 0.9855617342 GRS=4.0056216462 GRS_new=3.9477874163 > 74 ss= 0.9995279352 GRS=3.9477874163 GRS_new=3.9459238048 > 75 ss= 0.9899078491 GRS=3.9459238048 GRS_new=3.9061009463 > 76 ss= 0.9997748883 GRS=3.9061009463 GRS_new=3.9052216374 > 77 ss= 0.9915984422 GRS=3.9052216374 GRS_new=3.8724116920 > 78 ss= 0.9999758238 GRS=3.8724116920 GRS_new=3.8723180717 > 79 ss= 0.9906834932 GRS=3.8723180717 GRS_new=3.8362415942 > 80 ss= 0.9999058873 GRS=3.8362415942 GRS_new=3.8358805550 > 81 ss= 0.9931303767 GRS=3.8358805550 GRS_new=3.8095295006 > 82 ss= 0.9994032497 GRS=3.8095295006 GRS_new=3.8072561629 > 83 ss= 0.9946211871 GRS=3.8072561629 GRS_new=3.7867776442 > 84 ss= 0.9975086306 GRS=3.7867776442 GRS_new=3.7773433822 > 85 ss= 0.9951749469 GRS=3.7773433822 GRS_new=3.7591175000 > 86 ss= 0.9965077732 GRS=3.7591175000 GRS_new=3.7459898092 > 87 ss= 0.9934497713 GRS=3.7459898092 GRS_new=3.7214527192 > 88 ss= 0.9934219923 GRS=3.7214527192 GRS_new=3.6969729745 > 89 ss= 0.9943188889 GRS=3.6969729745 GRS_new=3.6759700602 > 90 ss= 0.9911302936 GRS=3.6759700602 GRS_new=3.6433652850 > 91 ss= 0.9969167891 GRS=3.6433652850 GRS_new=3.6321320213 > 92 ss= 0.9850570261 GRS=3.6321320213 GRS_new=3.5778571672 > 93 ss= 0.9994276670 GRS=3.5778571672 GRS_new=3.5758094416 > 94 ss= 0.9825319957 GRS=3.5758094416 GRS_new=3.5133471868 > 95 ss= 0.9990651680 GRS=3.5133471868 GRS_new=3.5100627973 > 96 ss= 0.9873693338 GRS=3.5100627973 GRS_new=3.4657283658 > 97 ss= 0.9995260021 GRS=3.4657283658 GRS_new=3.4640856180 > 98 ss= 0.9852095868 GRS=3.4640856180 GRS_new=3.4128503602 > 99 ss= 0.9994646613 GRS=3.4128503602 GRS_new=3.4110233294 > 100 ss= 0.9838857947 GRS=3.4110233294 GRS_new=3.3560573991 > 101 ss= 0.9999092368 GRS=3.3560573991 GRS_new=3.3557527927 > 102 ss= 0.9887575579 GRS=3.3557527927 GRS_new=3.3180259361 > 103 ss= 0.9999098244 GRS=3.3180259361 GRS_new=3.3177267310 > 104 ss= 0.9859325023 GRS=3.3177267310 GRS_new=3.2710546178 > 105 ss= 0.9990376979 GRS=3.2710546178 GRS_new=3.2679068751 > 106 ss= 0.9880431974 GRS=3.2679068751 GRS_new=3.2288331576 > 107 ss= 0.9990743475 GRS=3.2288331576 GRS_new=3.2258443800 > 108 ss= 0.9890143770 GRS=3.2258443800 GRS_new=3.1904064698 > 109 ss= 0.9996039465 GRS=3.1904064698 GRS_new=3.1891428981 > 110 ss= 0.9911902290 GRS=3.1891428981 GRS_new=3.1610472793 > 111 ss= 0.9996627037 GRS=3.1610472793 GRS_new=3.1599810697 > 112 ss= 0.9873736362 GRS=3.1599810697 GRS_new=3.1200819991 > 113 ss= 0.9994522850 GRS=3.1200819991 GRS_new=3.1183730833 > 114 ss= 0.9892755061 GRS=3.1183730833 GRS_new=3.0849301101 > 115 ss= 0.9997734276 GRS=3.0849301101 GRS_new=3.0842311502 > 116 ss= 0.9883184336 GRS=3.0842311502 GRS_new=3.0482024992 > 117 ss= 0.9999009839 GRS=3.0482024992 
GRS_new=3.0479006782 > 118 ss= 0.9854371189 GRS=3.0479006782 GRS_new=3.0035144631 > 119 ss= 0.9999937513 GRS=3.0035144631 GRS_new=3.0034956949 > 120 ss= 0.9816559742 GRS=3.0034956949 GRS_new=2.9483994923 > 121 ss= 0.9998133192 GRS=2.9483994923 GRS_new=2.9478490828 > 122 ss= 0.9897842140 GRS=2.9478490828 GRS_new=2.9177344875 > 123 ss= 0.9937438967 GRS=2.9177344875 GRS_new=2.8994808391 > 124 ss= 0.9928844144 GRS=2.8994808391 GRS_new=2.8788493350 > 125 ss= 0.9940354540 GRS=2.8788493350 GRS_new=2.8616783057 > 126 ss= 0.9903795737 GRS=2.8616783057 GRS_new=2.8341477405 > 127 ss= 0.9930147037 GRS=2.8341477405 GRS_new=2.8143503787 > 128 ss= 0.9954513964 GRS=2.8143503787 GRS_new=2.8015490145 > 129 ss= 0.9921800208 GRS=2.8015490145 GRS_new=2.7796409594 > 130 ss= 0.9939987784 GRS=2.7796409594 GRS_new=2.7629597180 > 131 ss= 0.9908792550 GRS=2.7629597180 GRS_new=2.7377594669 > 132 ss= 0.9973083911 GRS=2.7377594669 GRS_new=2.7303904893 > 133 ss= 0.9909726835 GRS=2.7303904893 GRS_new=2.7057423901 > 134 ss= 0.9988907774 GRS=2.7057423901 GRS_new=2.7027411194 > 135 ss= 0.9878359656 GRS=2.7027411194 GRS_new=2.6698648834 > 136 ss= 0.9998424757 GRS=2.6698648834 GRS_new=2.6694443147 > 137 ss= 0.9860925172 GRS=2.6694443147 GRS_new=2.6323190638 > 138 ss= 0.9998970195 GRS=2.6323190638 GRS_new=2.6320479862 > 139 ss= 0.9834325176 GRS=2.6320479862 GRS_new=2.5884415775 > 140 ss= 0.9997099187 GRS=2.5884415775 GRS_new=2.5876907191 > 141 ss= 0.9873985061 GRS=2.5876907191 GRS_new=2.5550819503 > 142 ss= 0.9997385347 GRS=2.5550819503 GRS_new=2.5544138850 > 143 ss= 0.9878404410 GRS=2.5544138850 GRS_new=2.5233533387 > 144 ss= 0.9999702108 GRS=2.5233533387 GRS_new=2.5232781699 > 145 ss= 0.9834843437 GRS=2.5232781699 GRS_new=2.4816045750 > 146 ss= 0.9999555852 GRS=2.4816045750 GRS_new=2.4814943549 > 147 ss= 0.9833714542 GRS=2.4814943549 GRS_new=2.4402307123 > 148 ss= 0.9998565798 GRS=2.4402307123 GRS_new=2.4398807339 > 149 ss= 0.9793940038 GRS=2.4398807339 GRS_new=2.3896045609 > 150 ss= 0.9980094016 GRS=2.3896045609 GRS_new=2.3848478179 > 151 ss= 0.9839947785 GRS=2.3848478179 GRS_new=2.3466778003 > 152 ss= 0.9946845124 GRS=2.3466778003 GRS_new=2.3342040635 > 153 ss= 0.9815926232 GRS=2.3342040635 GRS_new=2.2912374898 > 154 ss= 0.9928244685 GRS=2.2912374898 GRS_new=2.2747966429 > 155 ss= 0.9915544496 GRS=2.2747966429 GRS_new=2.2555847332 > 156 ss= 0.9860201299 GRS=2.2555847332 GRS_new=2.2240519517 > 157 ss= 0.9920398136 GRS=2.2240519517 GRS_new=2.2063480837 > 158 ss= 0.9850997487 GRS=2.2063480837 GRS_new=2.1734729429 > 159 ss= 0.9846737498 GRS=2.1734729429 GRS_new=2.1401617527 > 160 ss= 0.9889998946 GRS=2.1401617527 GRS_new=2.1166197478 > 161 ss= 0.9886237198 GRS=2.1166197478 GRS_new=2.0925404885 > 162 ss= 0.9864441737 GRS=2.0925404885 GRS_new=2.0641743731 > 163 ss= 0.9889421019 GRS=2.0641743731 GRS_new=2.0413489432 > 164 ss= 0.9875293875 GRS=2.0413489432 GRS_new=2.0158920717 > 165 ss= 0.9923696106 GRS=2.0158920717 GRS_new=2.0005100301 > 166 ss= 0.9917055125 GRS=2.0005100301 GRS_new=1.9839168246 > 167 ss= 0.9883956382 GRS=1.9839168246 GRS_new=1.9608947359 > 168 ss= 0.9951669546 GRS=1.9608947359 GRS_new=1.9514176427 > 169 ss= 0.9846800082 GRS=1.9514176427 GRS_new=1.9215219403 > 170 ss= 0.9851222204 GRS=1.9215219403 GRS_new=1.8929339605 > 171 ss= 0.9913306028 GRS=1.8929339605 GRS_new=1.8765233641 > 172 ss= 0.9772904480 GRS=1.8765233641 GRS_new=1.8339083592 > 173 ss= 0.9958226278 GRS=1.8339083592 GRS_new=1.8262474413 > 174 ss= 0.9723712619 GRS=1.8262474413 GRS_new=1.7757905291 > 175 ss= 0.9978494633 GRS=1.7757905291 
GRS_new=1.7719716264 > 176 ss= 0.9699641657 GRS=1.7719716264 GRS_new=1.7187489803 > 177 ss= 0.9978279403 GRS=1.7187489803 GRS_new=1.7150157549 > 178 ss= 0.9651257580 GRS=1.7150157549 GRS_new=1.6552058804 > 179 ss= 0.9958703980 GRS=1.6552058804 GRS_new=1.6483705389 > 180 ss= 0.9583941862 GRS=1.6483705389 GRS_new=1.5797887411 > 181 ss= 0.9943257865 GRS=1.5797887411 GRS_new=1.5708246826 > 182 ss= 0.9689013349 GRS=1.5708246826 GRS_new=1.5219741318 > 183 ss= 0.9927423131 GRS=1.5219741318 GRS_new=1.5109281201 > 184 ss= 0.9760067066 GRS=1.5109281201 GRS_new=1.4746759785 > 185 ss= 0.9900954777 GRS=1.4746759785 GRS_new=1.4600700174 > 186 ss= 0.9696993568 GRS=1.4600700174 GRS_new=1.4158289568 > 187 ss= 0.9802346349 GRS=1.4158289568 GRS_new=1.3878445806 > 188 ss= 0.9724237348 GRS=1.3878445806 GRS_new=1.3495730103 > 189 ss= 0.9796598123 GRS=1.3495730103 GRS_new=1.3221224419 > 190 ss= 0.9595089723 GRS=1.3221224419 GRS_new=1.2685883455 > 191 ss= 0.9795574759 GRS=1.2685883455 GRS_new=1.2426551977 > 192 ss= 0.9844421304 GRS=1.2426551977 GRS_new=1.2233221302 > 193 ss= 0.9545711990 GRS=1.2233221302 GRS_new=1.1677480726 > 194 ss= 0.9856428311 GRS=1.1677480726 GRS_new=1.1509825164 > 195 ss= 0.9526429154 GRS=1.1509825164 GRS_new=1.0964753400 > 196 ss= 0.9696093276 GRS=1.0964753400 GRS_new=1.0631527171 > 197 ss= 0.9687936691 GRS=1.0631527171 GRS_new=1.0299756216 > 198 ss= 0.9749973851 GRS=1.0299756216 GRS_new=1.0042235378 > 199 ss= 0.9305657933 GRS=1.0042235378 GRS_new=0.9344960731 > 200 ss= 0.9663929841 GRS=0.9344960731 GRS_new=0.9030904487 > 201 ss= 0.9463180084 GRS=0.9030904487 GRS_new=0.8546107548 > 202 ss= 0.9440343320 GRS=0.8546107548 GRS_new=0.8067818930 > 203 ss= 0.9692366995 GRS=0.8067818930 GRS_new=0.7819626192 > 204 ss= 0.9647239092 GRS=0.7819626192 GRS_new=0.7543780349 > 205 ss= 0.9586533240 GRS=0.7543780349 GRS_new=0.7231870107 > 206 ss= 0.9512701628 GRS=0.7231870107 GRS_new=0.6879462254 > 207 ss= 0.9863931602 GRS=0.6879462254 GRS_new=0.6785854513 > 208 ss= 0.9436979088 GRS=0.6785854513 GRS_new=0.6403796714 > 209 ss= 0.9902642308 GRS=0.6403796714 GRS_new=0.6341450827 > 210 ss= 0.9265021328 GRS=0.6341450827 GRS_new=0.5875367716 > 211 ss= 0.9812524410 GRS=0.5875367716 GRS_new=0.5765218913 > 212 ss= 0.9309942287 GRS=0.5765218913 GRS_new=0.5367385535 > 213 ss= 0.9730496129 GRS=0.5367385535 GRS_new=0.5222732417 > 214 ss= 0.9656883272 GRS=0.5222732417 GRS_new=0.5043531731 > 215 ss= 0.9452863486 GRS=0.5043531731 GRS_new=0.4767581694 > 216 ss= 0.9438553901 GRS=0.4767581694 GRS_new=0.4499907680 > 217 ss= 0.9427151241 GRS=0.4499907680 GRS_new=0.4242131027 > 218 ss= 0.9141056503 GRS=0.4242131027 GRS_new=0.3877755941 > 219 ss= 0.9652966758 GRS=0.3877755941 GRS_new=0.3743184919 > 220 ss= 0.9380852210 GRS=0.3743184919 GRS_new=0.3511426452 > 221 ss= 0.9857748287 GRS=0.3511426452 GRS_new=0.3461475809 > 222 ss= 0.9011745983 GRS=0.3461475809 GRS_new=0.3119394072 > 223 ss= 0.9958789390 GRS=0.3119394072 GRS_new=0.3106538859 > 224 ss= 0.8942056291 GRS=0.3106538859 GRS_new=0.2777884535 > 225 ss= 0.9860175937 GRS=0.2777884535 GRS_new=0.2739043025 > 226 ss= 0.9569264287 GRS=0.2739043025 GRS_new=0.2621062660 > 227 ss= 0.9784558926 GRS=0.2621062660 GRS_new=0.2564594204 > 228 ss= 0.9702851128 GRS=0.2564594204 GRS_new=0.2488387576 > 229 ss= 0.9368992686 GRS=0.2488387576 GRS_new=0.2331368500 > 230 ss= 0.9534314432 GRS=0.2331368500 GRS_new=0.2222800034 > 231 ss= 0.9462475811 GRS=0.2222800034 GRS_new=0.2103319155 > 232 ss= 0.9506818789 GRS=0.2103319155 GRS_new=0.1999587407 > 233 ss= 0.9284295306 GRS=0.1999587407 
GRS_new=0.1856475997 > 234 ss= 0.9646994083 GRS=0.1856475997 GRS_new=0.1790941296 > 235 ss= 0.9564933801 GRS=0.1790941296 GRS_new=0.1713023494 > 236 ss= 0.8964059043 GRS=0.1713023494 GRS_new=0.1535564374 > 237 ss= 0.9362210991 GRS=0.1535564374 GRS_new=0.1437627766 > 238 ss= 0.8842624203 GRS=0.1437627766 GRS_new=0.1271240208 > 239 ss= 0.9572135619 GRS=0.1271240208 GRS_new=0.1216848367 > 240 ss= 0.8843474708 GRS=0.1216848367 GRS_new=0.1076116776 > 241 ss= 0.9344896729 GRS=0.1076116776 GRS_new=0.1005620014 > 242 ss= 0.9446141679 GRS=0.1005620014 GRS_new=0.0949922913 > 243 ss= 0.9420503449 GRS=0.0949922913 GRS_new=0.0894875208 > 244 ss= 0.8996858936 GRS=0.0894875208 GRS_new=0.0805106601 > 245 ss= 0.8843030849 GRS=0.0805106601 GRS_new=0.0711958251 > 246 ss= 0.8495068207 GRS=0.0711958251 GRS_new=0.0604813390 > 247 ss= 0.9823275657 GRS=0.0604813390 GRS_new=0.0594124865 > 248 ss= 0.8032699443 GRS=0.0594124865 GRS_new=0.0477242647 > 249 ss= 0.9278236882 GRS=0.0477242647 GRS_new=0.0442797033 > 250 ss= 0.9228876267 GRS=0.0442797033 GRS_new=0.0408651903 > 251 ss= 0.9244258318 GRS=0.0408651903 GRS_new=0.0377768376 > 252 ss= 0.9452470204 GRS=0.0377768376 GRS_new=0.0357084431 > 253 ss= 0.9004540041 GRS=0.0357084431 GRS_new=0.0321538106 > 254 ss= 0.9615617722 GRS=0.0321538106 GRS_new=0.0309178751 > 255 ss= 0.8226991366 GRS=0.0309178751 GRS_new=0.0254361092 > 256 ss= 0.9052555692 GRS=0.0254361092 GRS_new=0.0230261795 > 257 ss= 0.8810736262 GRS=0.0230261795 GRS_new=0.0202877594 > 258 ss= 0.8902808119 GRS=0.0202877594 GRS_new=0.0180618029 > 259 ss= 0.8622991452 GRS=0.0180618029 GRS_new=0.0155746772 > 260 ss= 0.9209308003 GRS=0.0155746772 GRS_new=0.0143432000 > 261 ss= 0.8712647973 GRS=0.0143432000 GRS_new=0.0124967252 > 262 ss= 0.9652566894 GRS=0.0124967252 GRS_new=0.0120625476 > 263 ss= 0.8755993907 GRS=0.0120625476 GRS_new=0.0105619593 > 264 ss= 0.9532509670 GRS=0.0105619593 GRS_new=0.0100681980 > 265 ss= 0.7939620184 GRS=0.0100681980 GRS_new=0.0079937668 > 266 ss= 0.9274761528 GRS=0.0079937668 GRS_new=0.0074140281 > 267 ss= 0.8665778971 GRS=0.0074140281 GRS_new=0.0064248328 > 268 ss= 0.8479362252 GRS=0.0064248328 GRS_new=0.0054478485 > 269 ss= 0.8984785974 GRS=0.0054478485 GRS_new=0.0048947753 > 270 ss= 0.8307538442 GRS=0.0048947753 GRS_new=0.0040663534 > 271 ss= 0.7605938618 GRS=0.0040663534 GRS_new=0.0030928434 > 272 ss= 0.9282658981 GRS=0.0030928434 GRS_new=0.0028709811 > 273 ss= 0.8253071221 GRS=0.0028709811 GRS_new=0.0023694411 > 274 ss= 0.9363207077 GRS=0.0023694411 GRS_new=0.0022185568 > 275 ss= -0.6388998849 GRS=0.0022185568 GRS_new=-0.0014174357 > 276 ss= 0.9846474808 GRS=-0.0014174357 GRS_new=-0.0013956745 > 277 ss= 0.6798895257 GRS=-0.0013956745 GRS_new=-0.0009489045 > 278 ss= 0.8665430398 GRS=-0.0009489045 GRS_new=-0.0008222666 > 279 ss= 0.9124149039 GRS=-0.0008222666 GRS_new=-0.0007502483 > 280 ss= 0.8807164067 GRS=-0.0007502483 GRS_new=-0.0006607559 > 281 ss= 0.8804510347 GRS=-0.0006607559 GRS_new=-0.0005817633 > 282 ss= 0.8206397927 GRS=-0.0005817633 GRS_new=-0.0004774181 > 283 ss= 0.8096503797 GRS=-0.0004774181 GRS_new=-0.0003865417 > 284 ss= 0.8500015980 GRS=-0.0003865417 GRS_new=-0.0003285611 > 285 ss= 0.8026252203 GRS=-0.0003285611 GRS_new=-0.0002637114 > 286 ss= -0.7069351672 GRS=-0.0002637114 GRS_new=0.0001864269 > 287 ss= 0.9999108471 GRS=0.0001864269 GRS_new=0.0001864103 > 288 ss= 0.6379050807 GRS=0.0001864103 GRS_new=0.0001189120 > 289 ss= 0.9121889821 GRS=0.0001189120 GRS_new=0.0001084703 > 290 ss= 0.7271537479 GRS=0.0001084703 GRS_new=0.0000788746 > 291 ss= 0.8520989054 
GRS=0.0000788746 GRS_new=0.0000672089 > 292 ss= 0.7999946682 GRS=0.0000672089 GRS_new=0.0000537668 > 293 ss= 0.8065995129 GRS=0.0000537668 GRS_new=0.0000433683 > 294 ss= 0.7179751982 GRS=0.0000433683 GRS_new=0.0000311373 > 295 ss= 0.9944276090 GRS=0.0000311373 GRS_new=0.0000309638 > 296 ss= 0.4960821305 GRS=0.0000309638 GRS_new=0.0000153606 > 297 ss= 0.9628692907 GRS=0.0000153606 GRS_new=0.0000147903 > 298 ss= 0.4440261274 GRS=0.0000147903 GRS_new=0.0000065673 > > PETSc results > ------------------------------------------------------------ > ss= 0.8767977459 GRS(0)=8.3589003100 GRS(1)=-7.3290649496 > ss= 0.9942062562 GRS(1)=-7.3290649496 GRS(2)=7.2866022253 > ss= 0.9579384974 GRS(2)=7.2866022253 GRS(3)=-6.9801167867 > ss= 0.9770486148 GRS(3)=-6.9801167867 GRS(4)=6.8199134376 > ss= 0.9848398847 GRS(4)=6.8199134376 GRS(5)=-6.7165227635 > ss= 0.9817587194 GRS(5)=-6.7165227635 GRS(6)=6.5940047873 > ss= 0.9828044725 GRS(6)=6.5940047873 GRS(7)=-6.4806173965 > ss= 0.9923435431 GRS(7)=-6.4806173965 GRS(8)=6.4309988285 > ss= 0.9847288648 GRS(8)=6.4309988285 GRS(9)=-6.3327901760 > ss= 0.9999247510 GRS(9)=-6.3327901760 GRS(10)=6.3323136396 > ss= 0.9591518787 GRS(10)=6.3323136396 GRS(11)=-6.0736505238 > ss= 0.9998567242 GRS(11)=-6.0736505238 GRS(12)=6.0727803165 > ss= 0.9609943819 GRS(12)=6.0727803165 GRS(13)=-5.8359077665 > ss= 0.9999728993 GRS(13)=-5.8359077665 GRS(14)=5.8357496091 > ss= 0.9905739831 GRS(14)=5.8357496091 GRS(15)=-5.7807417346 > ss= 0.9937523276 GRS(15)=-5.7807417346 GRS(16)=5.7446255541 > ss= 0.9841963377 GRS(16)=5.7446255541 GRS(17)=-5.6538394318 > ss= 0.9953976161 GRS(17)=-5.6538394318 GRS(18)=5.6278182921 > ss= 0.9971858162 GRS(18)=5.6278182921 GRS(19)=-5.6119805772 > ss= 0.9891811740 GRS(19)=-5.6119805772 GRS(20)=5.5512655357 > ss= 0.9985693802 GRS(20)=5.5512655357 GRS(21)=-5.5433237853 > ss= 0.9902721075 GRS(21)=-5.5433237853 GRS(22)=5.4893989272 > ss= 0.9995190415 GRS(22)=5.4893989272 GRS(23)=-5.4867587542 > ss= 0.9852930754 GRS(23)=-5.4867587542 GRS(24)=5.4060654070 > ss= 0.9993241250 GRS(24)=5.4060654070 GRS(25)=-5.4024115827 > ss= 0.9886557155 GRS(25)=-5.4024115827 GRS(26)=5.3411250889 > ss= 0.9999955172 GRS(26)=5.3411250889 GRS(27)=-5.3411011457 > ss= 0.9825588208 GRS(27)=-5.3411011457 GRS(28)=5.2479460434 > ss= 0.9988457956 GRS(28)=5.2479460434 GRS(29)=-5.2418888410 > ss= 0.9884104836 GRS(29)=-5.2418888410 GRS(30)=5.1811378842 > ss= 0.9992362779 GRS(30)=5.1811378842 GRS(31)=-5.1771809345 > ss= 0.9858180688 GRS(31)=-5.1771809345 GRS(32)=5.1037585107 > ss= 0.9986080812 GRS(32)=5.1037585107 GRS(33)=-5.0966544933 > ss= 0.9867985464 GRS(33)=-5.0966544933 GRS(34)=5.0293712457 > ss= 0.9987564222 GRS(34)=5.0293712457 GRS(35)=-5.0231168315 > ss= 0.9882373460 GRS(35)=-5.0231168315 GRS(36)=4.9640316464 > ss= 0.9990354060 GRS(36)=4.9640316464 GRS(37)=-4.9592433714 > ss= 0.9868343675 GRS(37)=-4.9592433714 GRS(38)=4.8939517956 > ss= 0.9993989288 GRS(38)=4.8939517956 GRS(39)=-4.8910101822 > ss= 0.9876496783 GRS(39)=-4.8910101822 GRS(40)=4.8306046332 > ss= 0.9992609725 GRS(40)=4.8306046332 GRS(41)=-4.8270346838 > ss= 0.9900904698 GRS(41)=-4.8270346838 GRS(42)=4.7792010378 > ss= 0.9998679144 GRS(42)=4.7792010378 GRS(43)=-4.7785697743 > ss= 0.9863344290 GRS(43)=-4.7785697743 GRS(44)=4.7132678899 > ss= 0.9999844323 GRS(44)=4.7132678899 GRS(45)=-4.7131945153 > ss= 0.9910404741 GRS(45)=-4.7131945153 GRS(46)=4.6709665272 > ss= 0.9999609765 GRS(46)=4.6709665272 GRS(47)=-4.6707842498 > ss= 0.9899724790 GRS(47)=-4.6707842498 GRS(48)=4.6239478627 > ss= 0.9979853488 GRS(48)=4.6239478627 
GRS(49)=-4.6146322207 > ss= 0.9933053898 GRS(49)=-4.6146322207 GRS(50)=4.5837390570 > ss= 0.9977925127 GRS(50)=4.5837390570 GRS(51)=-4.5736205111 > ss= 0.9908317992 GRS(51)=-4.5736205111 GRS(52)=4.5316886400 > ss= 0.9974703441 GRS(52)=4.5316886400 GRS(53)=-4.5202250272 > ss= 0.9933577593 GRS(53)=-4.5202250272 GRS(54)=4.4902006044 > ss= 0.9971745163 GRS(54)=4.4902006044 GRS(55)=-4.4775136156 > ss= 0.9923600818 GRS(55)=-4.4775136156 GRS(56)=4.4433057777 > ss= 0.9958745981 GRS(56)=4.4433057777 GRS(57)=-4.4249753554 > ss= 0.9955761873 GRS(57)=-4.4249753554 GRS(58)=4.4054000934 > ss= 0.9930040219 GRS(58)=4.4054000934 GRS(59)=-4.3745800110 > ss= 0.9966346650 GRS(59)=-4.3745800110 GRS(60)=4.3598580838 > ss= 0.9863487872 GRS(60)=4.3598580838 GRS(61)=-4.3003407333 > ss= 0.9993086675 GRS(61)=-4.3003407333 GRS(62)=4.2973677678 > ss= 0.9925217950 GRS(62)=4.2973677678 GRS(63)=-4.2652311706 > ss= 0.9999207600 GRS(63)=-4.2652311706 GRS(64)=4.2648931935 > ss= 0.9902311161 GRS(64)=4.2648931935 GRS(65)=-4.2232299472 > ss= 0.9999879339 GRS(65)=-4.2232299472 GRS(66)=4.2231789892 > ss= 0.9851418699 GRS(66)=4.2231789892 GRS(67)=-4.1604304461 > ss= 0.9998329649 GRS(67)=-4.1604304461 GRS(68)=4.1597355080 > ss= 0.9823773748 GRS(68)=4.1597355080 GRS(69)=-4.0864300483 > ss= 0.9999777038 GRS(69)=-4.0864300483 GRS(70)=4.0863389365 > ss= 0.9824555504 GRS(70)=4.0863389365 GRS(71)=-4.0146463691 > ss= 0.9999965224 GRS(71)=-4.0146463691 GRS(72)=4.0146324076 > ss= 0.9814858949 GRS(72)=4.0146324076 GRS(73)=-3.9403050811 > ss= 0.9999966647 GRS(73)=-3.9403050811 GRS(74)=3.9402919391 > ss= 0.9935776881 GRS(74)=3.9402919391 GRS(75)=-3.9149861555 > ss= 0.9996537787 GRS(75)=-3.9149861555 GRS(76)=3.9136307039 > ss= 0.9996908126 GRS(76)=3.9136307039 GRS(77)=-3.9124206585 > ss= 0.9940733429 GRS(77)=-3.9124206585 GRS(78)=3.8892330829 > ss= 0.9917645614 GRS(78)=3.8892330829 GRS(79)=-3.8572035427 > ss= 0.9979318848 GRS(79)=-3.8572035427 GRS(80)=3.8492264013 > ss= 0.9968843841 GRS(80)=3.8492264013 GRS(81)=-3.8372336902 > ss= 0.9999422648 GRS(81)=-3.8372336902 GRS(82)=3.8370121468 > ss= 0.9998611209 GRS(82)=3.8370121468 GRS(83)=-3.8364792660 > ss= 0.9969188227 GRS(83)=-3.8364792660 GRS(84)=3.8246583932 > ss= 0.9993374910 GRS(84)=3.8246583932 GRS(85)=-3.8221245227 > ss= 0.9960060274 GRS(85)=-3.8221245227 GRS(86)=3.8068590620 > ss= 0.9992693029 GRS(86)=3.8068590620 GRS(87)=-3.8040774011 > ss= 0.9986496902 GRS(87)=-3.8040774011 GRS(88)=3.7989407181 > ss= 0.9993521740 GRS(88)=3.7989407181 GRS(89)=-3.7964796656 > ss= 0.9995241475 GRS(89)=-3.7964796656 GRS(90)=3.7946731011 > ss= 0.9996635044 GRS(90)=3.7946731011 GRS(91)=-3.7933962102 > ss= 0.9990514354 GRS(91)=-3.7933962102 GRS(92)=3.7897979287 > ss= 0.9999406006 GRS(92)=3.7897979287 GRS(93)=-3.7895728170 > ss= 0.9989296448 GRS(93)=-3.7895728170 GRS(94)=3.7855166280 > ss= 0.9998479921 GRS(94)=3.7855166280 GRS(95)=-3.7849411995 > ss= 0.9990064433 GRS(95)=-3.7849411995 GRS(96)=3.7811806459 > ss= 0.9993703821 GRS(96)=3.7811806459 GRS(97)=-3.7787999470 > ss= 0.9999839199 GRS(97)=-3.7787999470 GRS(98)=3.7787391836 > ss= 0.9990449258 GRS(98)=3.7787391836 GRS(99)=-3.7751302072 > ss= 0.9962938985 GRS(99)=-3.7751302072 GRS(100)=3.7611391914 > ss= 0.9987838060 GRS(100)=3.7611391914 GRS(101)=-3.7565649166 > ss= 0.9975838636 GRS(101)=-3.7565649166 GRS(102)=3.7474885433 > ss= 0.9997294989 GRS(102)=3.7474885433 GRS(103)=-3.7464748436 > ss= 0.9989983184 GRS(103)=-3.7464748436 GRS(104)=3.7427220687 > ss= 0.9989125566 GRS(104)=3.7427220687 GRS(105)=-3.7386520702 > ss= 0.9957203280 GRS(105)=-3.7386520702 
GRS(106)=3.7226518658 > ss= 0.9991932085 GRS(106)=3.7226518658 GRS(107)=-3.7196484618 > ss= 0.9997653132 GRS(107)=-3.7196484618 GRS(108)=3.7187755093 > ss= 0.9992213997 GRS(108)=3.7187755093 GRS(109)=-3.7158800696 > ss= 0.9952900827 GRS(109)=-3.7158800696 GRS(110)=3.6983785818 > ss= 0.9985549601 GRS(110)=3.6983785818 GRS(111)=-3.6930342772 > ss= 0.9994399191 GRS(111)=-3.6930342772 GRS(112)=3.6909658793 > ss= 0.9983418456 GRS(112)=3.6909658793 GRS(113)=-3.6848456880 > ss= 0.9994158990 GRS(113)=-3.6848456880 GRS(114)=3.6826933658 > ss= 0.9982195750 GRS(114)=3.6826933658 GRS(115)=-3.6761366063 > ss= 0.9991707575 GRS(115)=-3.6761366063 GRS(116)=3.6730881974 > ss= 0.9987102098 GRS(116)=3.6730881974 GRS(117)=-3.6683506843 > ss= 0.9986401624 GRS(117)=-3.6683506843 GRS(118)=3.6633623233 > ss= 0.9987924884 GRS(118)=3.6633623233 GRS(119)=-3.6589387709 > ss= 0.9984336194 GRS(119)=-3.6589387709 GRS(120)=3.6532074803 > ss= 0.9993482297 GRS(120)=3.6532074803 GRS(121)=-3.6508264281 > ss= 0.9971928551 GRS(121)=-3.6508264281 GRS(122)=3.6405780295 > ss= 0.9994387455 GRS(122)=3.6405780295 GRS(123)=-3.6385347389 > ss= 0.9942118921 GRS(123)=-3.6385347389 GRS(124)=3.6174745074 > ss= 0.9986868853 GRS(124)=3.6174745074 GRS(125)=-3.6127243485 > ss= 0.9988696701 GRS(125)=-3.6127243485 GRS(126)=3.6086407783 > ss= 0.9996170184 GRS(126)=3.6086407783 GRS(127)=-3.6072587352 > ss= 0.9990788944 GRS(127)=-3.6072587352 GRS(128)=3.6039360689 > ss= 0.9993388154 GRS(128)=3.6039360689 GRS(129)=-3.6015532017 > ss= 0.9990718332 GRS(129)=-3.6015532017 GRS(130)=3.5982103596 > ss= 0.9976971136 GRS(130)=3.5982103596 GRS(131)=-3.5899240897 > ss= 0.9992135898 GRS(131)=-3.5899240897 GRS(132)=3.5871009367 > ss= 0.9954909155 GRS(132)=3.5871009367 GRS(133)=-3.5709263955 > ss= 0.9991680969 GRS(133)=-3.5709263955 GRS(134)=3.5679557307 > ss= 0.9983786001 GRS(134)=3.5679557307 GRS(135)=-3.5621706476 > ss= 0.9996053240 GRS(135)=-3.5621706476 GRS(136)=3.5607647445 > ss= 0.9993398136 GRS(136)=3.5607647445 GRS(137)=-3.5584139761 > ss= 0.9991764202 GRS(137)=-3.5584139761 GRS(138)=3.5554833384 > ss= 0.9998724792 GRS(138)=3.5554833384 GRS(139)=-3.5550299405 > ss= 0.9993082128 GRS(139)=-3.5550299405 GRS(140)=3.5525706164 > ss= 0.9999423075 GRS(140)=3.5525706164 GRS(141)=-3.5523656597 > ss= 0.9941184498 GRS(141)=-3.5523656597 GRS(142)=3.5314722428 > ss= 0.9991938940 GRS(142)=3.5314722428 GRS(143)=-3.5286255020 > ss= 0.9992458884 GRS(143)=-3.5286255020 GRS(144)=3.5259645247 > ss= 0.9997983606 GRS(144)=3.5259645247 GRS(145)=-3.5252535512 > ss= 0.9994699733 GRS(145)=-3.5252535512 GRS(146)=3.5233850726 > ss= 0.9951672976 GRS(146)=3.5233850726 GRS(147)=-3.5063576013 > ss= 0.9993936705 GRS(147)=-3.5063576013 GRS(148)=3.5042315933 > ss= 0.9974660116 GRS(148)=3.5042315933 GRS(149)=-3.4953519110 > ss= 0.9991638024 GRS(149)=-3.4953519110 GRS(150)=3.4924291060 > ss= 0.9997542087 GRS(150)=3.4924291060 GRS(151)=-3.4915706973 > ss= 0.9988871428 GRS(151)=-3.4915706973 GRS(152)=3.4876850776 > ss= 0.9995866915 GRS(152)=3.4876850776 GRS(153)=-3.4862435878 > ss= 0.9988659428 GRS(153)=-3.4862435878 GRS(154)=3.4822899881 > ss= 0.9995188309 GRS(154)=3.4822899881 GRS(155)=-3.4806144176 > ss= 0.9989542490 GRS(155)=-3.4806144176 GRS(156)=3.4769745615 > ss= 0.9993750477 GRS(156)=3.4769745615 GRS(157)=-3.4748016183 > ss= 0.9990318039 GRS(157)=-3.4748016183 GRS(158)=3.4714373291 > ss= 0.9991632994 GRS(158)=3.4714373291 GRS(159)=-3.4685327753 > ss= 0.9989657278 GRS(159)=-3.4685327753 GRS(160)=3.4649453684 > ss= 0.9989351160 GRS(160)=3.4649453684 GRS(161)=-3.4612556035 > ss= 
0.9986648427 GRS(161)=-3.4612556035 GRS(162)=3.4566342830 > ss= 0.9986802695 GRS(162)=3.4566342830 GRS(163)=-3.4520724574 > ss= 0.9999842765 GRS(163)=-3.4520724574 GRS(164)=3.4520181787 > ss= 0.9962707501 GRS(164)=3.4520181787 GRS(165)=-3.4391447404 > ss= 0.9992908775 GRS(165)=-3.4391447404 GRS(166)=3.4367059655 > ss= 0.9991528773 GRS(166)=3.4367059655 GRS(167)=-3.4337946537 > ss= 0.9988349701 GRS(167)=-3.4337946537 GRS(168)=3.4297941805 > ss= 0.9994063998 GRS(168)=3.4297941805 GRS(169)=-3.4277582539 > ss= 0.9988363765 GRS(169)=-3.4277582539 GRS(170)=3.4237696339 > ss= 0.9994013434 GRS(170)=3.4237696339 GRS(171)=-3.4217199716 > ss= 0.9986836331 GRS(171)=-3.4217199716 GRS(172)=3.4172157327 > ss= 0.9992205093 GRS(172)=3.4172157327 GRS(173)=-3.4145520449 > ss= 0.9962578169 GRS(173)=-3.4145520449 GRS(174)=3.4017741661 > ss= 0.9990349573 GRS(174)=3.4017741661 GRS(175)=-3.3984913088 > ss= 0.9995742569 GRS(175)=-3.3984913088 GRS(176)=3.3970444246 > ss= 0.9988914825 GRS(176)=3.3970444246 GRS(177)=-3.3932787413 > ss= 0.9993890209 GRS(177)=-3.3932787413 GRS(178)=3.3912055190 > ss= 0.9985401554 GRS(178)=3.3912055190 GRS(179)=-3.3862548861 > ss= 0.9992489307 GRS(179)=-3.3862548861 GRS(180)=3.3837115741 > ss= 0.9998779270 GRS(180)=3.3837115741 GRS(181)=-3.3832985143 > ss= 0.9991176810 GRS(181)=-3.3832985143 GRS(182)=3.3803133656 > ss= 0.9993999107 GRS(182)=3.3803133656 GRS(183)=-3.3782848756 > ss= 0.9990005881 GRS(183)=-3.3782848756 GRS(184)=3.3749085774 > ss= 0.9993085669 GRS(184)=3.3749085774 GRS(185)=-3.3725750540 > ss= 0.9988404869 GRS(185)=-3.3725750540 GRS(186)=3.3686645091 > ss= 0.9992545135 GRS(186)=3.3686645091 GRS(187)=-3.3661532151 > ss= 0.9981115878 GRS(187)=-3.3661532151 GRS(188)=3.3597965304 > ss= 0.9991828961 GRS(188)=3.3597965304 GRS(189)=-3.3570512275 > ss= 0.9997520639 GRS(189)=-3.3570512275 GRS(190)=3.3562188934 > ss= 0.9991014452 GRS(190)=3.3562188934 GRS(191)=-3.3532031469 > ss= 0.9993888343 GRS(191)=-3.3532031469 GRS(192)=3.3511537842 > ss= 0.9990006788 GRS(192)=3.3511537842 GRS(193)=-3.3478049052 > ss= 0.9992859047 GRS(193)=-3.3478049052 GRS(194)=3.3454142535 > ss= 0.9988116463 GRS(194)=3.3454142535 GRS(195)=-3.3414387180 > ss= 0.9992254629 GRS(195)=-3.3414387180 GRS(196)=3.3388506498 > ss= 0.9973993121 GRS(196)=3.3388506498 GRS(197)=-3.3301673412 > ss= 0.9991480513 GRS(197)=-3.3301673412 GRS(198)=3.3273302095 > ss= 0.9995365296 GRS(198)=3.3273302095 GRS(199)=-3.3257880903 > ss= 0.9990384042 GRS(199)=-3.3257880903 GRS(200)=3.3225900263 > ss= 0.9992891727 GRS(200)=3.3225900263 GRS(201)=-3.3202282388 > ss= 0.9987917588 GRS(201)=-3.3202282388 GRS(202)=3.3162166022 > ss= 0.9991905749 GRS(202)=3.3162166022 GRS(203)=-3.3135323732 > ss= 0.9945222759 GRS(203)=-3.3135323732 GRS(204)=3.2953817569 > ss= 0.9990617879 GRS(204)=3.2953817569 GRS(205)=-3.2922899900 > ss= 0.9992534719 GRS(205)=-3.2922899900 GRS(206)=3.2898322031 > ss= 0.9961647199 GRS(206)=3.2898322031 GRS(207)=-3.2772147750 > ss= 0.9990875122 GRS(207)=-3.2772147750 GRS(208)=3.2742243567 > ss= 0.9991667318 GRS(208)=3.2742243567 GRS(209)=-3.2714960497 > ss= 0.9994417361 GRS(209)=-3.2714960497 GRS(210)=3.2696696916 > ss= 0.9987831388 GRS(210)=3.2696696916 GRS(211)=-3.2656909574 > ss= 0.9991318601 GRS(211)=-3.2656909574 GRS(212)=3.2628558808 > ss= 0.9992048809 GRS(212)=3.2628558808 GRS(213)=-3.2602615216 > ss= 0.9999977715 GRS(213)=-3.2602615216 GRS(214)=3.2602542561 > ss= 0.9990704168 GRS(214)=3.2602542561 GRS(215)=-3.2572235785 > ss= 0.9990871785 GRS(215)=-3.2572235785 GRS(216)=3.2542503146 > ss= 0.9992241431 GRS(216)=3.2542503146 
GRS(217)=-3.2517254820 > ss= 0.9992341296 GRS(217)=-3.2517254820 GRS(218)=3.2492350815 > ss= 0.9991332641 GRS(218)=3.2492350815 GRS(219)=-3.2464188528 > ss= 0.9989246373 GRS(219)=-3.2464188528 GRS(220)=3.2429277749 > ss= 0.9999780993 GRS(220)=3.2429277749 GRS(221)=-3.2428567525 > ss= 0.9992237669 GRS(221)=-3.2428567525 GRS(222)=3.2403395397 > ss= 0.9992159985 GRS(222)=3.2403395397 GRS(223)=-3.2377991086 > ss= 0.9991226513 GRS(223)=-3.2377991086 GRS(224)=3.2349584299 > ss= 0.9992580461 GRS(224)=3.2349584299 GRS(225)=-3.2325582398 > ss= 0.9993831678 GRS(225)=-3.2325582398 GRS(226)=3.2305642938 > ss= 0.9992804217 GRS(226)=3.2305642938 GRS(227)=-3.2282396499 > ss= 0.9992571243 GRS(227)=-3.2282396499 GRS(228)=3.2258414692 > ss= 0.9995229352 GRS(228)=3.2258414692 GRS(229)=-3.2243025337 > ss= 0.9992479966 GRS(229)=-3.2243025337 GRS(230)=3.2218778472 > ss= 0.9993507731 GRS(230)=3.2218778472 GRS(231)=-3.2197861174 > ss= 0.9994051829 GRS(231)=-3.2197861174 GRS(232)=3.2178709337 > ss= 0.9993009431 GRS(232)=3.2178709337 GRS(233)=-3.2156214587 > ss= 0.9993810026 GRS(233)=-3.2156214587 GRS(234)=3.2136309973 > ss= 0.9991509825 GRS(234)=3.2136309973 GRS(235)=-3.2109025682 > ss= 0.9993726158 GRS(235)=-3.2109025682 GRS(236)=3.2088880988 > ss= 0.9997222850 GRS(236)=3.2088880988 GRS(237)=-3.2079969423 > ss= 0.9991037372 GRS(237)=-3.2079969423 GRS(238)=3.2051217341 > ss= 0.9999158291 GRS(238)=3.2051217341 GRS(239)=-3.2048519561 > ss= 0.9972093241 GRS(239)=-3.2048519561 GRS(240)=3.1959082530 > ss= 0.9993238488 GRS(240)=3.1959082530 GRS(241)=-3.1937473358 > ss= 0.9991330096 GRS(241)=-3.1937473358 GRS(242)=3.1909783876 > ss= 0.9994080319 GRS(242)=3.1909783876 GRS(243)=-3.1890894303 > ss= 0.9993385501 GRS(243)=-3.1890894303 GRS(244)=3.1869800073 > ss= 0.9999972375 GRS(244)=3.1869800073 GRS(245)=-3.1869712032 > ss= 0.9993571022 GRS(245)=-3.1869712032 GRS(246)=3.1849223065 > ss= 0.9993387979 GRS(246)=3.1849223065 GRS(247)=-3.1828164292 > ss= 0.9993523547 GRS(247)=-3.1828164292 GRS(248)=3.1807550930 > ss= 0.9993420252 GRS(248)=3.1807550930 GRS(249)=-3.1786622361 > ss= 0.9996629413 GRS(249)=-3.1786622361 GRS(250)=3.1775908404 > ss= 0.9993227720 GRS(250)=3.1775908404 GRS(251)=-3.1754388868 > ss= 0.9995068178 GRS(251)=-3.1754388868 GRS(252)=3.1738728167 > ss= 0.9993412723 GRS(252)=3.1738728167 GRS(253)=-3.1717820987 > ss= 0.9994261036 GRS(253)=-3.1717820987 GRS(254)=3.1699618242 > ss= 0.9993883498 GRS(254)=3.1699618242 GRS(255)=-3.1680229166 > ss= 0.9993428349 GRS(255)=-3.1680229166 GRS(256)=3.1659410024 > ss= 0.9994184780 GRS(256)=3.1659410024 GRS(257)=-3.1640999380 > ss= 0.9993866187 GRS(257)=-3.1640999380 GRS(258)=3.1621591383 > ss= 0.9994709376 GRS(258)=3.1621591383 GRS(259)=-3.1604861587 > ss= 0.9994159200 GRS(259)=-3.1604861587 GRS(260)=3.1586401820 > ss= 0.9994211964 GRS(260)=3.1586401820 GRS(261)=-3.1568119498 > ss= 0.9994309409 GRS(261)=-3.1568119498 GRS(262)=3.1550155371 > ss= 0.9994528128 GRS(262)=3.1550155371 GRS(263)=-3.1532891530 > ss= 0.9993306065 GRS(263)=-3.1532891530 GRS(264)=3.1511783617 > ss= 0.9994349559 GRS(264)=3.1511783617 GRS(265)=-3.1493978068 > ss= 0.9999220100 GRS(265)=-3.1493978068 GRS(266)=3.1491521854 > ss= 0.9993566767 GRS(266)=3.1491521854 GRS(267)=-3.1471262622 > ss= 0.9994992300 GRS(267)=-3.1471262622 GRS(268)=3.1455502758 > ss= 0.9992832754 GRS(268)=3.1455502758 GRS(269)=-3.1432957826 > ss= 0.9993621556 GRS(269)=-3.1432957826 GRS(270)=3.1412908491 > ss= 0.9992263941 GRS(270)=3.1412908491 GRS(271)=-3.1388607279 > ss= 0.9993389644 GRS(271)=-3.1388607279 GRS(272)=3.1367858291 > ss= 
> [several hundred lines of iteration output trimmed: the quoted values GRS(272) through GRS(1000) alternate in sign and decay smoothly in magnitude from about 3.14 to about 2.22, with the reported ratio ss staying between roughly 0.997 and 1.000 throughout]
>
> --
> Lisandro Dalcín
> ---------------
> Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594

From dalcinl at gmail.com  Tue Feb  5 08:43:32 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 5 Feb 2008 11:43:32 -0300
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A86B5F.5010503@gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com>
Message-ID: 

Ben, some time ago I was doing some testing with PETSc for solving
incompressible NS eqs with a fractional step method. I found that in
our software and hardware setup, the best way to solve the pressure
problem was by using HYPRE BoomerAMG. This preconditioner usually has
some heavy setup, but if your Poisson matrix does not change, then the
successive solves at each time step are really fast.

If you still want to use a direct method, you should use the
combination '-ksp_type preonly -pc_type lu' (by default this will
only work in sequential mode, unless you build PETSc with an external
package like MUMPS). This way, PETSc computes the LU factorization
only once, and at each time step the call to KSPSolve ends up only
doing the triangular solves.

The nice thing about PETSc is that, if you later realize the
factorization takes a long time (as it usually does in big problems),
you can switch to BoomerAMG just by passing on the command line
'-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
all, you do not need to change your code. And more: depending on your
problem you can choose direct solvers or algebraic multigrid as
you want, by simply passing the appropriate combination of options on
the command line (or in an options file, using the -options_file option).

Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
to know about your experience.

Regards,

On 2/5/08, Ben Tay wrote:
> Hi everyone,
>
> I was reading about the topic abt inversing a sparse matrix. I have to
> solve a poisson eqn for my CFD code. Usually, I form a system of linear
> eqns and solve Ax=b. The "A" is always the same and only the "b" changes
> every timestep. Does it mean that if I'm able to get the inverse matrix
> A^(-1), in order to get x at every timestep, I only need to do a simple
> matrix multiplication ie x=A^(-1)*b ?
>
> Hi Timothy, if the above is true, can you email me your Fortran code
> template? I'm also programming in fortran 90. Thank you very much
>
> Regards.
>
> Timothy Stitt wrote:
> > Yes Yujie, I was able to put together a parallel code to invert a
> > large sparse matrix with the help of the PETSc developers. If you need
> > any help or maybe a Fortran code template just let me know.
> >
> > Best,
> >
> > Tim.
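[To make the runtime-switching workflow Lisandro describes concrete: the time-stepping driver can be written once against the KSP interface and the solver picked entirely from the options database. The sketch below is illustrative only -- hypothetical variable names (A, b, x, ComputeRHS), PETSc 2.3-era C API, error checking omitted -- and is not code from this thread.]

    Mat      A;                 /* constant Poisson matrix, assembled once elsewhere */
    Vec      b, x;              /* right-hand side (changes each step) and solution  */
    KSP      ksp;
    PetscInt step, nsteps = 100;        /* nsteps is a made-up example value */

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);
    KSPSetFromOptions(ksp);     /* reads -ksp_type/-pc_type etc. at run time */
    for (step = 0; step < nsteps; step++) {
      /* only b changes; the factorization (or BoomerAMG setup) done on the
         first KSPSolve is reused by every later one */
      ComputeRHS(step, b);      /* hypothetical application routine */
      KSPSolve(ksp, b, x);
    }
    KSPDestroy(ksp);            /* 2.3-era call; newer PETSc versions take &ksp */

[Run the same executable with '-ksp_type preonly -pc_type lu' for the direct approach, or with '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg' for algebraic multigrid; the loop itself does not change.]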
> > > > Waad Subber wrote: > >> Hi > >> There was a discussion between Tim Stitt and petsc developers about > >> matrix inversion, and it was really helpful. That was in last Nov. > >> You can check the emails archive > >> > >> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html > >> > >> > >> Waad > >> > >> */Yujie /* wrote: > >> > >> what is the difference between sequantial and parallel AIJ matrix? > >> Assuming there is a matrix A, if > >> I partitaion this matrix into A1, A2, Ai... An. > >> A is a parallel AIJ matrix at the whole view, Ai > >> is a sequential AIJ matrix? I want to operate Ai at each node. > >> In addition, whether is it possible to get general inverse using > >> MatMatSolve() if the matrix is not square? Thanks a lot. > >> > >> Regards, > >> Yujie > >> > >> > >> On 2/4/08, *Barry Smith* >> > wrote: > >> > >> > >> For sequential AIJ matrices you can fill the B matrix > >> with the > >> identity and then use > >> MatMatSolve(). > >> > >> Note since the inverse of a sparse matrix is dense the B > >> matrix is > >> a SeqDense matrix. > >> > >> Barry > >> > >> On Feb 4, 2008, at 12:37 AM, Yujie wrote: > >> > >> > Hi, > >> > Now, I want to inverse a sparse matrix. I have browsed the > >> manual, > >> > however, I can't find some information. could you give me > >> some advice? > >> > > >> > thanks a lot. > >> > > >> > Regards, > >> > Yujie > >> > > >> > >> > >> > >> ------------------------------------------------------------------------ > >> Looking for last minute shopping deals? Find them fast with Yahoo! > >> Search. > >> > > > > > > > > > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From recrusader at gmail.com Tue Feb 5 11:26:35 2008 From: recrusader at gmail.com (Yujie) Date: Tue, 5 Feb 2008 09:26:35 -0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <47A85573.4090607@cscs.ch> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> Message-ID: <7ff0ee010802050926q1f13235dgf4ddea8393586493@mail.gmail.com> Hi, Tim Thank you for your help. I am really glad to get your help. According to what you said, if the matrix A has been divided into several nodes in the cluster, you may use your parallel code to inverse A? My problem is that what is the distribution of the results? thanks a lot. Regards, Yujie On 2/5/08, Timothy Stitt wrote: > > Yes Yujie, I was able to put together a parallel code to invert a large > sparse matrix with the help of the PETSc developers. If you need any > help or maybe a Fortran code template just let me know. > > Best, > > Tim. > > Waad Subber wrote: > > Hi > > There was a discussion between Tim Stitt and petsc developers about > > matrix inversion, and it was really helpful. That was in last Nov. You > > can check the emails archive > > > > > http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html > > > > Waad > > > > */Yujie /* wrote: > > > > what is the difference between sequantial and parallel AIJ matrix? > > Assuming there is a matrix A, if > > I partitaion this matrix into A1, A2, Ai... An. > > A is a parallel AIJ matrix at the whole view, Ai > > is a sequential AIJ matrix? I want to operate Ai at each node. 
> > In addition, whether is it possible to get general inverse using > > MatMatSolve() if the matrix is not square? Thanks a lot. > > > > Regards, > > Yujie > > > > > > On 2/4/08, *Barry Smith* > > wrote: > > > > > > For sequential AIJ matrices you can fill the B matrix with > the > > identity and then use > > MatMatSolve(). > > > > Note since the inverse of a sparse matrix is dense the B > > matrix is > > a SeqDense matrix. > > > > Barry > > > > On Feb 4, 2008, at 12:37 AM, Yujie wrote: > > > > > Hi, > > > Now, I want to inverse a sparse matrix. I have browsed the > > manual, > > > however, I can't find some information. could you give me > > some advice? > > > > > > thanks a lot. > > > > > > Regards, > > > Yujie > > > > > > > > > > > ------------------------------------------------------------------------ > > Looking for last minute shopping deals? Find them fast with Yahoo! > > Search. > > < > http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping > > > > > > -- > Timothy Stitt > HPC Applications Analyst > > Swiss National Supercomputing Centre (CSCS) > Galleria 2 - Via Cantonale > CH-6928 Manno, Switzerland > > +41 (0) 91 610 8233 > stitt at cscs.ch > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Tue Feb 5 11:32:52 2008 From: recrusader at gmail.com (Yujie) Date: Tue, 5 Feb 2008 09:32:52 -0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> Message-ID: <7ff0ee010802050932q5661bcek51be3259a6332550@mail.gmail.com> Hi, Lisandro I have tried to use BoomerAMG for my problem. My problem is a set of elliptic-type linear PDEs. They are strong coupled. The convergence was bad. I tried to adjust some parameters, the convergence had some improvements and was always bad. I have little knowledge about your problem. I have discussed my problem with Hypre developers, they told me that if the PDEs are strong coupled, it is difficult to use BoomerAMG. Regards, Yujie On 2/5/08, Lisandro Dalcin wrote: > > Ben, some time ago I was doing some testing with PETSc for solving > incompressible NS eqs with fractional step method. I've found that in > our software and hardware setup, the best way to solve the pressure > problem was by using HYPRE BoomerAMG. This preconditioner usually have > some heavy setup, but if your Poison matrix does not change, then the > sucessive solves at each time step are really fast. > > If you still want to use a direct method, you should use the > combination '-ksp_type preonly -pc_type lu' (by default, this will > only work on sequential mode, unless you build PETSc with an external > package like MUMPS). This way, PETSc computes the LU factorization > only once, and at each time step, the call to KSPSolve end-up only > doing the triangular solvers. > > The nice thing about PETSc is that, if you next realize the > factorization take a long time (as it usually take in big problems), > you can switch BoomerAMG by only passing in the command line > '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's > all, you do not need to change your code. And more, depending on your > problem you can choose the direct solvers or algebraic multigrid as > you want, by simply pass the appropriate combination options in the > command line (or a options file, using the -options_file option). 
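[A side note on the -options_file mechanism mentioned just above: the file is plain text with one option per line, so the solver choice can live outside both the source code and the shell command. A hypothetical example file, not taken from this thread:]

    -ksp_type preonly
    -pc_type lu

[Replacing those two lines with '-ksp_type cg', '-pc_type hypre' and '-pc_hypre_type boomeramg' switches the run to BoomerAMG without recompiling, exactly as described in the quoted message.]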
> > Please, if you ever try HYPRE BoomerAMG preconditioners, I would like > to know about your experience. > > Regards, > > On 2/5/08, Ben Tay wrote: > > Hi everyone, > > > > I was reading about the topic abt inversing a sparse matrix. I have to > > solve a poisson eqn for my CFD code. Usually, I form a system of linear > > eqns and solve Ax=b. The "A" is always the same and only the "b" changes > > every timestep. Does it mean that if I'm able to get the inverse matrix > > A^(-1), in order to get x at every timestep, I only need to do a simple > > matrix multiplication ie x=A^(-1)*b ? > > > > Hi Timothy, if the above is true, can you email me your Fortran code > > template? I'm also programming in fortran 90. Thank you very much > > > > Regards. > > > > Timothy Stitt wrote: > > > Yes Yujie, I was able to put together a parallel code to invert a > > > large sparse matrix with the help of the PETSc developers. If you need > > > any help or maybe a Fortran code template just let me know. > > > > > > Best, > > > > > > Tim. > > > > > > Waad Subber wrote: > > >> Hi > > >> There was a discussion between Tim Stitt and petsc developers about > > >> matrix inversion, and it was really helpful. That was in last Nov. > > >> You can check the emails archive > > >> > > >> > http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html > > >> > > >> > > >> Waad > > >> > > >> */Yujie /* wrote: > > >> > > >> what is the difference between sequantial and parallel AIJ > matrix? > > >> Assuming there is a matrix A, if > > >> I partitaion this matrix into A1, A2, Ai... An. > > >> A is a parallel AIJ matrix at the whole view, Ai > > >> is a sequential AIJ matrix? I want to operate Ai at each node. > > >> In addition, whether is it possible to get general inverse using > > >> MatMatSolve() if the matrix is not square? Thanks a lot. > > >> > > >> Regards, > > >> Yujie > > >> > > >> > > >> On 2/4/08, *Barry Smith* > >> > wrote: > > >> > > >> > > >> For sequential AIJ matrices you can fill the B matrix > > >> with the > > >> identity and then use > > >> MatMatSolve(). > > >> > > >> Note since the inverse of a sparse matrix is dense the B > > >> matrix is > > >> a SeqDense matrix. > > >> > > >> Barry > > >> > > >> On Feb 4, 2008, at 12:37 AM, Yujie wrote: > > >> > > >> > Hi, > > >> > Now, I want to inverse a sparse matrix. I have browsed the > > >> manual, > > >> > however, I can't find some information. could you give me > > >> some advice? > > >> > > > >> > thanks a lot. > > >> > > > >> > Regards, > > >> > Yujie > > >> > > > >> > > >> > > >> > > >> > ------------------------------------------------------------------------ > > >> Looking for last minute shopping deals? Find them fast with Yahoo! > > >> Search. > > >> < > http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping > > > > > > > > > > > > > > > > > > > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Tue Feb 5 12:43:06 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 5 Feb 2008 15:43:06 -0300 Subject: how to inverse a sparse matrix in Petsc? 
In-Reply-To: <7ff0ee010802050932q5661bcek51be3259a6332550@mail.gmail.com> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <7ff0ee010802050932q5661bcek51be3259a6332550@mail.gmail.com> Message-ID: Yujie, My recommendation was mainly directed to Ben Tay, as he (like me) only needs to solve a simple scalar elliptic PDE. In your case, as you said, BoomerAMG is hard to use. Any way, to put things completelly clear, I believe you should NEVER try to build and explicit inverse matrix (unless you problem is really small and perhaps only for the shake of debug something). Instead, you have to use LU factorization through the combination of options '-ksp_type preonly -pc_type lu'. On 2/5/08, Yujie wrote: > Hi, Lisandro > > I have tried to use BoomerAMG for my problem. My problem is a set of > elliptic-type linear PDEs. They are strong coupled. The convergence > was bad. I tried to adjust some parameters, the convergence > had some improvements and was always bad. I have little knowledge > about your problem. I have discussed my > problem with Hypre developers, > they told me that if the PDEs are strong coupled, it is difficult to use > BoomerAMG. > > Regards, > Yujie > > > On 2/5/08, Lisandro Dalcin wrote: > > Ben, some time ago I was doing some testing with PETSc for solving > > incompressible NS eqs with fractional step method. I've found that in > > our software and hardware setup, the best way to solve the pressure > > problem was by using HYPRE BoomerAMG. This preconditioner usually have > > some heavy setup, but if your Poison matrix does not change, then the > > sucessive solves at each time step are really fast. > > > > If you still want to use a direct method, you should use the > > combination '-ksp_type preonly -pc_type lu' (by default, this will > > only work on sequential mode, unless you build PETSc with an external > > package like MUMPS). This way, PETSc computes the LU factorization > > only once, and at each time step, the call to KSPSolve end-up only > > doing the triangular solvers. > > > > The nice thing about PETSc is that, if you next realize the > > factorization take a long time (as it usually take in big problems), > > you can switch BoomerAMG by only passing in the command line > > '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's > > all, you do not need to change your code. And more, depending on your > > problem you can choose the direct solvers or algebraic multigrid as > > you want, by simply pass the appropriate combination options in the > > command line (or a options file, using the -options_file option). > > > > Please, if you ever try HYPRE BoomerAMG preconditioners, I would like > > to know about your experience. > > > > Regards, > > > > On 2/5/08, Ben Tay wrote: > > > Hi everyone, > > > > > > I was reading about the topic abt inversing a sparse matrix. I have to > > > solve a poisson eqn for my CFD code. Usually, I form a system of linear > > > eqns and solve Ax=b. The "A" is always the same and only the "b" changes > > > every timestep. Does it mean that if I'm able to get the inverse matrix > > > A^(-1), in order to get x at every timestep, I only need to do a simple > > > matrix multiplication ie x=A^(-1)*b ? > > > > > > Hi Timothy, if the above is true, can you email me your Fortran code > > > template? I'm also programming in fortran 90. Thank you very much > > > > > > Regards. 
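[For completeness, the MatMatSolve()-based construction Barry Smith describes deeper in this thread -- fill a dense B with the identity and solve against a factored A -- looks roughly like the sketch below. It is sequential-only, uses hypothetical variable names, assumes a PETSc 2.3-era C API with error checking omitted, and, as Lisandro stresses above, is rarely a good idea for large problems because the inverse of a sparse matrix is dense.]

    Mat           A;            /* n x n SeqAIJ matrix, assembled elsewhere        */
    Mat           B, X;         /* dense identity and dense result (the inverse)   */
    IS            rperm, cperm;
    MatFactorInfo info;
    PetscInt      i, n;         /* n = matrix dimension, assumed known             */

    MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, &B);
    for (i = 0; i < n; i++) MatSetValue(B, i, i, 1.0, INSERT_VALUES);
    MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);
    MatCreateSeqDense(PETSC_COMM_SELF, n, n, PETSC_NULL, &X);

    MatGetOrdering(A, "natural", &rperm, &cperm);
    MatFactorInfoInitialize(&info);
    MatLUFactor(A, rperm, cperm, &info);   /* A is overwritten by its LU factors   */
    MatMatSolve(A, B, X);                  /* columns of X are the columns of A^-1 */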
> > > > > > Timothy Stitt wrote: > > > > Yes Yujie, I was able to put together a parallel code to invert a > > > > large sparse matrix with the help of the PETSc developers. If you need > > > > any help or maybe a Fortran code template just let me know. > > > > > > > > Best, > > > > > > > > Tim. > > > > > > > > Waad Subber wrote: > > > >> Hi > > > >> There was a discussion between Tim Stitt and petsc developers about > > > >> matrix inversion, and it was really helpful. That was in last Nov. > > > >> You can check the emails archive > > > >> > > > >> > http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html > > > >> > > > >> > > > >> Waad > > > >> > > > >> */Yujie /* wrote: > > > >> > > > >> what is the difference between sequantial and parallel AIJ > matrix? > > > >> Assuming there is a matrix A, if > > > >> I partitaion this matrix into A1, A2, Ai... An. > > > >> A is a parallel AIJ matrix at the whole view, Ai > > > >> is a sequential AIJ matrix? I want to operate Ai at each node. > > > >> In addition, whether is it possible to get general inverse using > > > >> MatMatSolve() if the matrix is not square? Thanks a lot. > > > >> > > > >> Regards, > > > >> Yujie > > > >> > > > >> > > > >> On 2/4/08, *Barry Smith* > > >> > wrote: > > > >> > > > >> > > > >> For sequential AIJ matrices you can fill the B matrix > > > >> with the > > > >> identity and then use > > > >> MatMatSolve(). > > > >> > > > >> Note since the inverse of a sparse matrix is dense the B > > > >> matrix is > > > >> a SeqDense matrix. > > > >> > > > >> Barry > > > >> > > > >> On Feb 4, 2008, at 12:37 AM, Yujie wrote: > > > >> > > > >> > Hi, > > > >> > Now, I want to inverse a sparse matrix. I have browsed the > > > >> manual, > > > >> > however, I can't find some information. could you give me > > > >> some advice? > > > >> > > > > >> > thanks a lot. > > > >> > > > > >> > Regards, > > > >> > Yujie > > > >> > > > > >> > > > >> > > > >> > > > >> > ------------------------------------------------------------------------ > > > >> Looking for last minute shopping deals? Find them fast with Yahoo! > > > >> Search. > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > Lisandro Dalc?n > > --------------- > > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > > Tel/Fax: +54-(0)342-451.1594 > > > > > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From pflath at ices.utexas.edu Tue Feb 5 13:50:51 2008 From: pflath at ices.utexas.edu (Pearl Flath) Date: Tue, 5 Feb 2008 13:50:51 -0600 Subject: Trouble with DA, multiple degrees of freedom Message-ID: Dear All, I have a code where the velocity (three components) and pressure are all stored in a distributed array with 4 degrees of freedom per node. I'd like to take one component of the velocity and multiply it by -1, but I am having trouble figuring out how to access that. I believe it must involve DAVecGetArrayDOF or DAVecGetArray, but I haven't managed to get either to work. I've attached a code fragment where it loads the velocity. 
Could someone suggest how to do this or point me to where I can find additional discussion of this? I've read the users manual on DA already. Sincerely, Pearl Flath ICES, UT Austin --------------------------------- DACreate3d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_BOX,m,n,p, PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE, 4,1,PETSC_NULL,PETSC_NULL,PETSC_NULL,&daV); DACreateGlobalVector(daV, &vel); // Set the velocity file to read from PetscTruth flg ; PetscViewer view_u; char velocityfile[1024] ; PetscOptionsGetString(0,"-velocityfile",velocityfile,1023,&flg); PetscViewerBinaryOpen(PETSC_COMM_WORLD, velocityfile, FILE_MODE_READ, &view_u); VecLoadIntoVector(view_u, vel); PetscViewerDestroy(view_u); -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 5 13:58:58 2008 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Feb 2008 13:58:58 -0600 Subject: Trouble with DA, multiple degrees of freedom In-Reply-To: References: Message-ID: The easiest thing to do in C is to declare a struct: typedef struct { PetscScalar v[3]; PetscScalar p; } Space; and then cast pointers Space ***array; DAVecGetArray(da, u, (void *) &array); array[k][j][i].v *= -1.0; Thanks, Matt On Feb 5, 2008 1:50 PM, Pearl Flath wrote: > Dear All, > I have a code where the velocity (three components) and pressure are all > stored in a distributed array with 4 degrees of freedom per node. I'd like > to take one component of the velocity and multiply it by -1, but I am having > trouble figuring out how to access that. I believe it must involve > DAVecGetArrayDOF or DAVecGetArray, but I haven't managed to get either to > work. I've attached a code fragment where it loads the velocity. Could > someone suggest how to do this or point me to where I can find additional > discussion of this? I've read the users manual on DA already. > Sincerely, > Pearl Flath > ICES, UT Austin > --------------------------------- > DACreate3d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_BOX,m,n,p, > PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE, > 4,1,PETSC_NULL,PETSC_NULL,PETSC_NULL,&daV); > > DACreateGlobalVector(daV, &vel); > > // Set the velocity file to read from > PetscTruth flg ; > PetscViewer view_u; > char velocityfile[1024] ; > PetscOptionsGetString(0,"-velocityfile",velocityfile,1023,&flg); > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, velocityfile, > FILE_MODE_READ, &view_u); > VecLoadIntoVector(view_u, vel); > PetscViewerDestroy(view_u); > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Tue Feb 5 15:39:02 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Feb 2008 15:39:02 -0600 Subject: Trouble with DA, multiple degrees of freedom In-Reply-To: References: Message-ID: <3D23BA07-59A6-4B04-90D5-21C081A8B9BE@mcs.anl.gov> VecStrideScale() is the easiest way to do this. Barry On Feb 5, 2008, at 1:50 PM, Pearl Flath wrote: > Dear All, > I have a code where the velocity (three components) and pressure > are all stored in a distributed array with 4 degrees of freedom per > node. I'd like to take one component of the velocity and multiply it > by -1, but I am having trouble figuring out how to access that. I > believe it must involve DAVecGetArrayDOF or DAVecGetArray, but I > haven't managed to get either to work. I've attached a code fragment > where it loads the velocity. 
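For the 4-DOF layout in question (three velocity components plus pressure per node), the VecStrideScale() suggestion above reduces the whole operation to one call on the global vector; the middle argument selects the component within each node's block. A minimal sketch, assuming the VecStrideScale(Vec,PetscInt,PetscScalar) calling sequence, that vel is the global vector created from daV, and that component 0 is the velocity component to be negated:

  /* Flip the sign of the first velocity component at every node.
     Components are numbered 0..3 within each block (0,1,2 = velocity,
     3 = pressure in the assumed layout). */
  VecStrideScale(vel, 0, -1.0);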
Could someone suggest how to do this or > point me to where I can find additional discussion of this? I've > read the users manual on DA already. > Sincerely, > Pearl Flath > ICES, UT Austin > --------------------------------- > DACreate3d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_BOX,m,n,p, > PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE, > 4,1,PETSC_NULL,PETSC_NULL,PETSC_NULL,&daV); > > DACreateGlobalVector(daV, &vel); > > // Set the velocity file to read from > PetscTruth flg ; > PetscViewer view_u; > char velocityfile[1024] ; > PetscOptionsGetString(0,"-velocityfile",velocityfile,1023,&flg); > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, velocityfile, > FILE_MODE_READ, &view_u); > VecLoadIntoVector(view_u, vel); > PetscViewerDestroy(view_u); > > From zonexo at gmail.com Tue Feb 5 20:04:27 2008 From: zonexo at gmail.com (Ben Tay) Date: Wed, 06 Feb 2008 10:04:27 +0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> Message-ID: <47A915AB.9010006@gmail.com> Hi Lisandro, I'm using the fractional step mtd to solve the NS eqns as well. I've tried the direct mtd and also boomerAMG in solving the poisson eqn. Experience shows that for smaller matrix, direct mtd is slightly faster but if the matrix increases in size, boomerAMG is faster. Btw, if I'm not wrong, the default solver will be GMRES. I've also tried using the "Struct" interface solely under Hypre. It's even faster for big matrix, although the improvement doesn't seem to be a lot. I need to do more tests to confirm though. I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a while to solve the eqns. I'm wondering if it'll be faster if I get the inverse and then do matrix multiplication. Or just calling KSPSolve is actually doing something similar and there'll not be any speed difference. Hope someone can enlighten... Thanks! Lisandro Dalcin wrote: > Ben, some time ago I was doing some testing with PETSc for solving > incompressible NS eqs with fractional step method. I've found that in > our software and hardware setup, the best way to solve the pressure > problem was by using HYPRE BoomerAMG. This preconditioner usually have > some heavy setup, but if your Poison matrix does not change, then the > sucessive solves at each time step are really fast. > > If you still want to use a direct method, you should use the > combination '-ksp_type preonly -pc_type lu' (by default, this will > only work on sequential mode, unless you build PETSc with an external > package like MUMPS). This way, PETSc computes the LU factorization > only once, and at each time step, the call to KSPSolve end-up only > doing the triangular solvers. > > The nice thing about PETSc is that, if you next realize the > factorization take a long time (as it usually take in big problems), > you can switch BoomerAMG by only passing in the command line > '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's > all, you do not need to change your code. And more, depending on your > problem you can choose the direct solvers or algebraic multigrid as > you want, by simply pass the appropriate combination options in the > command line (or a options file, using the -options_file option). > > Please, if you ever try HYPRE BoomerAMG preconditioners, I would like > to know about your experience. > > Regards, > > On 2/5/08, Ben Tay wrote: > >> Hi everyone, >> >> I was reading about the topic abt inversing a sparse matrix. 
I have to >> solve a poisson eqn for my CFD code. Usually, I form a system of linear >> eqns and solve Ax=b. The "A" is always the same and only the "b" changes >> every timestep. Does it mean that if I'm able to get the inverse matrix >> A^(-1), in order to get x at every timestep, I only need to do a simple >> matrix multiplication ie x=A^(-1)*b ? >> >> Hi Timothy, if the above is true, can you email me your Fortran code >> template? I'm also programming in fortran 90. Thank you very much >> >> Regards. >> >> Timothy Stitt wrote: >> >>> Yes Yujie, I was able to put together a parallel code to invert a >>> large sparse matrix with the help of the PETSc developers. If you need >>> any help or maybe a Fortran code template just let me know. >>> >>> Best, >>> >>> Tim. >>> >>> Waad Subber wrote: >>> >>>> Hi >>>> There was a discussion between Tim Stitt and petsc developers about >>>> matrix inversion, and it was really helpful. That was in last Nov. >>>> You can check the emails archive >>>> >>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html >>>> >>>> >>>> Waad >>>> >>>> */Yujie /* wrote: >>>> >>>> what is the difference between sequantial and parallel AIJ matrix? >>>> Assuming there is a matrix A, if >>>> I partitaion this matrix into A1, A2, Ai... An. >>>> A is a parallel AIJ matrix at the whole view, Ai >>>> is a sequential AIJ matrix? I want to operate Ai at each node. >>>> In addition, whether is it possible to get general inverse using >>>> MatMatSolve() if the matrix is not square? Thanks a lot. >>>> >>>> Regards, >>>> Yujie >>>> >>>> >>>> On 2/4/08, *Barry Smith* >>> > wrote: >>>> >>>> >>>> For sequential AIJ matrices you can fill the B matrix >>>> with the >>>> identity and then use >>>> MatMatSolve(). >>>> >>>> Note since the inverse of a sparse matrix is dense the B >>>> matrix is >>>> a SeqDense matrix. >>>> >>>> Barry >>>> >>>> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >>>> >>>> > Hi, >>>> > Now, I want to inverse a sparse matrix. I have browsed the >>>> manual, >>>> > however, I can't find some information. could you give me >>>> some advice? >>>> > >>>> > thanks a lot. >>>> > >>>> > Regards, >>>> > Yujie >>>> > >>>> >>>> >>>> >>>> ------------------------------------------------------------------------ >>>> Looking for last minute shopping deals? Find them fast with Yahoo! >>>> Search. >>>> >>>> >>> >>> >> > > > From bsmith at mcs.anl.gov Tue Feb 5 20:16:18 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Feb 2008 20:16:18 -0600 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <47A915AB.9010006@gmail.com> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> Message-ID: <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> On Feb 5, 2008, at 8:04 PM, Ben Tay wrote: > Hi Lisandro, > > I'm using the fractional step mtd to solve the NS eqns as well. I've > tried the direct mtd and also boomerAMG in solving the poisson eqn. > Experience shows that for smaller matrix, direct mtd is slightly > faster but if the matrix increases in size, boomerAMG is faster. > Btw, if I'm not wrong, the default solver will be GMRES. I've also > tried using the "Struct" interface solely under Hypre. It's even > faster for big matrix, although the improvement doesn't seem to be a > lot. I need to do more tests to confirm though. > > I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a > while to solve the eqns. 
I'm wondering if it'll be faster if I get > the inverse and then do matrix multiplication. Or just calling > KSPSolve is actually doing something similar and there'll not be any > speed difference. Hope someone can enlighten... > > Thanks! > Ben, Forming the inverse explicitly will be a complete failure. Because it is dense it will have (1400x2000)^2 values and each multiply will take 2*(1400x2000)^2 floating point operations, while boomerAMG should take only O(1400x2000). BTW: if this is a constant coefficient Poisson operator with Neumann or Dirchelet boundary conditions then likely a parallel FFT based algorithm would be fastest. Alas we do not yet have this in PETSc. It looks like FFTW finally has an updated MPI version so we need to do the PETSc interface for that. Barry > Lisandro Dalcin wrote: >> Ben, some time ago I was doing some testing with PETSc for solving >> incompressible NS eqs with fractional step method. I've found that in >> our software and hardware setup, the best way to solve the pressure >> problem was by using HYPRE BoomerAMG. This preconditioner usually >> have >> some heavy setup, but if your Poison matrix does not change, then the >> sucessive solves at each time step are really fast. >> >> If you still want to use a direct method, you should use the >> combination '-ksp_type preonly -pc_type lu' (by default, this will >> only work on sequential mode, unless you build PETSc with an external >> package like MUMPS). This way, PETSc computes the LU factorization >> only once, and at each time step, the call to KSPSolve end-up only >> doing the triangular solvers. >> >> The nice thing about PETSc is that, if you next realize the >> factorization take a long time (as it usually take in big problems), >> you can switch BoomerAMG by only passing in the command line >> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's >> all, you do not need to change your code. And more, depending on your >> problem you can choose the direct solvers or algebraic multigrid as >> you want, by simply pass the appropriate combination options in the >> command line (or a options file, using the -options_file option). >> >> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like >> to know about your experience. >> >> Regards, >> >> On 2/5/08, Ben Tay wrote: >> >>> Hi everyone, >>> >>> I was reading about the topic abt inversing a sparse matrix. I >>> have to >>> solve a poisson eqn for my CFD code. Usually, I form a system of >>> linear >>> eqns and solve Ax=b. The "A" is always the same and only the "b" >>> changes >>> every timestep. Does it mean that if I'm able to get the inverse >>> matrix >>> A^(-1), in order to get x at every timestep, I only need to do a >>> simple >>> matrix multiplication ie x=A^(-1)*b ? >>> >>> Hi Timothy, if the above is true, can you email me your Fortran code >>> template? I'm also programming in fortran 90. Thank you very much >>> >>> Regards. >>> >>> Timothy Stitt wrote: >>> >>>> Yes Yujie, I was able to put together a parallel code to invert a >>>> large sparse matrix with the help of the PETSc developers. If you >>>> need >>>> any help or maybe a Fortran code template just let me know. >>>> >>>> Best, >>>> >>>> Tim. >>>> >>>> Waad Subber wrote: >>>> >>>>> Hi >>>>> There was a discussion between Tim Stitt and petsc developers >>>>> about >>>>> matrix inversion, and it was really helpful. That was in last Nov. 
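To put the estimate above in concrete numbers (taking the 1400x2000 grid literally, so n = 1400*2000 = 2.8e6 unknowns): a dense inverse would hold n^2 = 7.84e12 entries, roughly 63 terabytes in double precision, and every x = A^(-1)*b product would cost about 2*n^2 = 1.6e13 floating point operations. A CG/BoomerAMG solve only touches the few nonzeros per row of the sparse operator on each iteration, so the explicit inverse can never be competitive at this size.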
>>>>> You can check the emails archive >>>>> >>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html >>>>> >>>>> >>>>> Waad >>>>> >>>>> */Yujie /* wrote: >>>>> >>>>> what is the difference between sequantial and parallel AIJ >>>>> matrix? >>>>> Assuming there is a matrix A, if >>>>> I partitaion this matrix into A1, A2, Ai... An. >>>>> A is a parallel AIJ matrix at the whole view, Ai >>>>> is a sequential AIJ matrix? I want to operate Ai at each node. >>>>> In addition, whether is it possible to get general inverse >>>>> using >>>>> MatMatSolve() if the matrix is not square? Thanks a lot. >>>>> >>>>> Regards, >>>>> Yujie >>>>> >>>>> >>>>> On 2/4/08, *Barry Smith* >>>> > wrote: >>>>> >>>>> >>>>> For sequential AIJ matrices you can fill the B matrix >>>>> with the >>>>> identity and then use >>>>> MatMatSolve(). >>>>> >>>>> Note since the inverse of a sparse matrix is dense >>>>> the B >>>>> matrix is >>>>> a SeqDense matrix. >>>>> >>>>> Barry >>>>> >>>>> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >>>>> >>>>> > Hi, >>>>> > Now, I want to inverse a sparse matrix. I have browsed >>>>> the >>>>> manual, >>>>> > however, I can't find some information. could you give me >>>>> some advice? >>>>> > >>>>> > thanks a lot. >>>>> > >>>>> > Regards, >>>>> > Yujie >>>>> > >>>>> >>>>> >>>>> >>>>> ------------------------------------------------------------------------ >>>>> Looking for last minute shopping deals? Find them fast with Yahoo! >>>>> Search. >>>>> >>>> > >>>>> >>>> >>>> >>> >> >> >> > From zonexo at gmail.com Tue Feb 5 20:48:38 2008 From: zonexo at gmail.com (Ben Tay) Date: Wed, 06 Feb 2008 10:48:38 +0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> Message-ID: <47A92006.20108@gmail.com> Thank you Barry for your enlightenment. I'll just continue to use BoomerAMG for the poisson eqn. I'll also check up on FFTW. Last time, I recalled that there seemed to be some restrictions for FFT on solving poisson eqn. It seems that the grids must be constant in at least 1 dimension. I wonder if that is true? If that's the case, then it's not possible for me to use it, although it's a constant coefficient Poisson operator with Neumann or Dirchelet boundary conditions. thank you. Barry Smith wrote: > > On Feb 5, 2008, at 8:04 PM, Ben Tay wrote: > >> Hi Lisandro, >> >> I'm using the fractional step mtd to solve the NS eqns as well. I've >> tried the direct mtd and also boomerAMG in solving the poisson eqn. >> Experience shows that for smaller matrix, direct mtd is slightly >> faster but if the matrix increases in size, boomerAMG is faster. >> Btw, if I'm not wrong, the default solver will be GMRES. I've also >> tried using the "Struct" interface solely under Hypre. It's even >> faster for big matrix, although the improvement doesn't seem to be a >> lot. I need to do more tests to confirm though. >> >> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a >> while to solve the eqns. I'm wondering if it'll be faster if I get >> the inverse and then do matrix multiplication. Or just calling >> KSPSolve is actually doing something similar and there'll not be any >> speed difference. Hope someone can enlighten... >> >> Thanks! >> > Ben, > > Forming the inverse explicitly will be a complete failure. 
> Because it is dense it will have (1400x2000)^2 values and > each multiply will take 2*(1400x2000)^2 floating point operations, > while boomerAMG should take only O(1400x2000). > > BTW: if this is a constant coefficient Poisson operator with > Neumann or Dirchelet boundary conditions then > likely a parallel FFT based algorithm would be fastest. Alas we do not > yet have this in PETSc. It looks like FFTW finally > has an updated MPI version so we need to do the PETSc interface for that. > > > Barry > > >> Lisandro Dalcin wrote: >>> Ben, some time ago I was doing some testing with PETSc for solving >>> incompressible NS eqs with fractional step method. I've found that in >>> our software and hardware setup, the best way to solve the pressure >>> problem was by using HYPRE BoomerAMG. This preconditioner usually have >>> some heavy setup, but if your Poison matrix does not change, then the >>> sucessive solves at each time step are really fast. >>> >>> If you still want to use a direct method, you should use the >>> combination '-ksp_type preonly -pc_type lu' (by default, this will >>> only work on sequential mode, unless you build PETSc with an external >>> package like MUMPS). This way, PETSc computes the LU factorization >>> only once, and at each time step, the call to KSPSolve end-up only >>> doing the triangular solvers. >>> >>> The nice thing about PETSc is that, if you next realize the >>> factorization take a long time (as it usually take in big problems), >>> you can switch BoomerAMG by only passing in the command line >>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's >>> all, you do not need to change your code. And more, depending on your >>> problem you can choose the direct solvers or algebraic multigrid as >>> you want, by simply pass the appropriate combination options in the >>> command line (or a options file, using the -options_file option). >>> >>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like >>> to know about your experience. >>> >>> Regards, >>> >>> On 2/5/08, Ben Tay wrote: >>> >>>> Hi everyone, >>>> >>>> I was reading about the topic abt inversing a sparse matrix. I have to >>>> solve a poisson eqn for my CFD code. Usually, I form a system of >>>> linear >>>> eqns and solve Ax=b. The "A" is always the same and only the "b" >>>> changes >>>> every timestep. Does it mean that if I'm able to get the inverse >>>> matrix >>>> A^(-1), in order to get x at every timestep, I only need to do a >>>> simple >>>> matrix multiplication ie x=A^(-1)*b ? >>>> >>>> Hi Timothy, if the above is true, can you email me your Fortran code >>>> template? I'm also programming in fortran 90. Thank you very much >>>> >>>> Regards. >>>> >>>> Timothy Stitt wrote: >>>> >>>>> Yes Yujie, I was able to put together a parallel code to invert a >>>>> large sparse matrix with the help of the PETSc developers. If you >>>>> need >>>>> any help or maybe a Fortran code template just let me know. >>>>> >>>>> Best, >>>>> >>>>> Tim. >>>>> >>>>> Waad Subber wrote: >>>>> >>>>>> Hi >>>>>> There was a discussion between Tim Stitt and petsc developers about >>>>>> matrix inversion, and it was really helpful. That was in last Nov. >>>>>> You can check the emails archive >>>>>> >>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html >>>>>> >>>>>> >>>>>> >>>>>> Waad >>>>>> >>>>>> */Yujie /* wrote: >>>>>> >>>>>> what is the difference between sequantial and parallel AIJ >>>>>> matrix? 
>>>>>> Assuming there is a matrix A, if >>>>>> I partitaion this matrix into A1, A2, Ai... An. >>>>>> A is a parallel AIJ matrix at the whole view, Ai >>>>>> is a sequential AIJ matrix? I want to operate Ai at each node. >>>>>> In addition, whether is it possible to get general inverse using >>>>>> MatMatSolve() if the matrix is not square? Thanks a lot. >>>>>> >>>>>> Regards, >>>>>> Yujie >>>>>> >>>>>> >>>>>> On 2/4/08, *Barry Smith* >>>>> > wrote: >>>>>> >>>>>> >>>>>> For sequential AIJ matrices you can fill the B matrix >>>>>> with the >>>>>> identity and then use >>>>>> MatMatSolve(). >>>>>> >>>>>> Note since the inverse of a sparse matrix is dense the B >>>>>> matrix is >>>>>> a SeqDense matrix. >>>>>> >>>>>> Barry >>>>>> >>>>>> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >>>>>> >>>>>> > Hi, >>>>>> > Now, I want to inverse a sparse matrix. I have browsed the >>>>>> manual, >>>>>> > however, I can't find some information. could you give me >>>>>> some advice? >>>>>> > >>>>>> > thanks a lot. >>>>>> > >>>>>> > Regards, >>>>>> > Yujie >>>>>> > >>>>>> >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------------ >>>>>> >>>>>> Looking for last minute shopping deals? Find them fast with Yahoo! >>>>>> Search. >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>> >>> >>> >> > > From bsmith at mcs.anl.gov Tue Feb 5 21:21:46 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Feb 2008 21:21:46 -0600 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <47A92006.20108@gmail.com> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> <47A92006.20108@gmail.com> Message-ID: <79724F06-DB17-4794-9539-E07A37DC469F@mcs.anl.gov> On Feb 5, 2008, at 8:48 PM, Ben Tay wrote: > Thank you Barry for your enlightenment. I'll just continue to use > BoomerAMG for the poisson eqn. I'll also check up on FFTW. Last > time, I recalled that there seemed to be some restrictions for FFT > on solving poisson eqn. It seems that the grids must be constant in > at least 1 dimension. Yes. Then it decouples into a bunch of tridiagonal solves; Basically if you can do separation of variables you can use FFTs. Barry > I wonder if that is true? If that's the case, then it's not possible > for me to use it, although it's a constant coefficient Poisson > operator with Neumann or Dirchelet boundary conditions. > > thank you. > > Barry Smith wrote: >> >> On Feb 5, 2008, at 8:04 PM, Ben Tay wrote: >> >>> Hi Lisandro, >>> >>> I'm using the fractional step mtd to solve the NS eqns as well. >>> I've tried the direct mtd and also boomerAMG in solving the >>> poisson eqn. Experience shows that for smaller matrix, direct mtd >>> is slightly faster but if the matrix increases in size, boomerAMG >>> is faster. Btw, if I'm not wrong, the default solver will be >>> GMRES. I've also tried using the "Struct" interface solely under >>> Hypre. It's even faster for big matrix, although the improvement >>> doesn't seem to be a lot. I need to do more tests to confirm though. >>> >>> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite >>> a while to solve the eqns. I'm wondering if it'll be faster if I >>> get the inverse and then do matrix multiplication. Or just calling >>> KSPSolve is actually doing something similar and there'll not be >>> any speed difference. Hope someone can enlighten... >>> >>> Thanks! 
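A sketch of why the problem decouples, assuming uniform spacing h in each direction and homogeneous Dirichlet conditions (the Neumann case works the same way with cosine modes). The 1D second-difference operator on N interior points has the eigenpairs

  v^(j)_i  = sin( i*j*pi/(N+1) ),
  lambda_j = -(4/h^2) * sin^2( j*pi/(2(N+1)) ),     j = 1..N.

For the five-point Laplacian on an Nx x Ny uniform grid, solving Delta_h u = f therefore reduces to a discrete sine transform of f in x and y, a pointwise division of each coefficient by (lambda_j + mu_k), and two inverse transforms, all O(n log n). If only one direction is uniformly spaced, transform in that direction alone; each mode j then leaves an independent tridiagonal system in the other direction, which is the "bunch of tridiagonal solves" referred to above.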
>>> >> Ben, >> >> Forming the inverse explicitly will be a complete failure. >> Because it is dense it will have (1400x2000)^2 values and >> each multiply will take 2*(1400x2000)^2 floating point operations, >> while boomerAMG should take only O(1400x2000). >> >> BTW: if this is a constant coefficient Poisson operator with >> Neumann or Dirchelet boundary conditions then >> likely a parallel FFT based algorithm would be fastest. Alas we do >> not yet have this in PETSc. It looks like FFTW finally >> has an updated MPI version so we need to do the PETSc interface for >> that. >> >> >> Barry >> >> >>> Lisandro Dalcin wrote: >>>> Ben, some time ago I was doing some testing with PETSc for solving >>>> incompressible NS eqs with fractional step method. I've found >>>> that in >>>> our software and hardware setup, the best way to solve the pressure >>>> problem was by using HYPRE BoomerAMG. This preconditioner usually >>>> have >>>> some heavy setup, but if your Poison matrix does not change, then >>>> the >>>> sucessive solves at each time step are really fast. >>>> >>>> If you still want to use a direct method, you should use the >>>> combination '-ksp_type preonly -pc_type lu' (by default, this will >>>> only work on sequential mode, unless you build PETSc with an >>>> external >>>> package like MUMPS). This way, PETSc computes the LU factorization >>>> only once, and at each time step, the call to KSPSolve end-up only >>>> doing the triangular solvers. >>>> >>>> The nice thing about PETSc is that, if you next realize the >>>> factorization take a long time (as it usually take in big >>>> problems), >>>> you can switch BoomerAMG by only passing in the command line >>>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's >>>> all, you do not need to change your code. And more, depending on >>>> your >>>> problem you can choose the direct solvers or algebraic multigrid as >>>> you want, by simply pass the appropriate combination options in the >>>> command line (or a options file, using the -options_file option). >>>> >>>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would >>>> like >>>> to know about your experience. >>>> >>>> Regards, >>>> >>>> On 2/5/08, Ben Tay wrote: >>>> >>>>> Hi everyone, >>>>> >>>>> I was reading about the topic abt inversing a sparse matrix. I >>>>> have to >>>>> solve a poisson eqn for my CFD code. Usually, I form a system of >>>>> linear >>>>> eqns and solve Ax=b. The "A" is always the same and only the "b" >>>>> changes >>>>> every timestep. Does it mean that if I'm able to get the inverse >>>>> matrix >>>>> A^(-1), in order to get x at every timestep, I only need to do a >>>>> simple >>>>> matrix multiplication ie x=A^(-1)*b ? >>>>> >>>>> Hi Timothy, if the above is true, can you email me your Fortran >>>>> code >>>>> template? I'm also programming in fortran 90. Thank you very much >>>>> >>>>> Regards. >>>>> >>>>> Timothy Stitt wrote: >>>>> >>>>>> Yes Yujie, I was able to put together a parallel code to invert a >>>>>> large sparse matrix with the help of the PETSc developers. If >>>>>> you need >>>>>> any help or maybe a Fortran code template just let me know. >>>>>> >>>>>> Best, >>>>>> >>>>>> Tim. >>>>>> >>>>>> Waad Subber wrote: >>>>>> >>>>>>> Hi >>>>>>> There was a discussion between Tim Stitt and petsc developers >>>>>>> about >>>>>>> matrix inversion, and it was really helpful. That was in last >>>>>>> Nov. 
>>>>>>> You can check the emails archive >>>>>>> >>>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html >>>>>>> >>>>>>> >>>>>>> Waad >>>>>>> >>>>>>> */Yujie /* wrote: >>>>>>> >>>>>>> what is the difference between sequantial and parallel AIJ >>>>>>> matrix? >>>>>>> Assuming there is a matrix A, if >>>>>>> I partitaion this matrix into A1, A2, Ai... An. >>>>>>> A is a parallel AIJ matrix at the whole view, Ai >>>>>>> is a sequential AIJ matrix? I want to operate Ai at each node. >>>>>>> In addition, whether is it possible to get general inverse >>>>>>> using >>>>>>> MatMatSolve() if the matrix is not square? Thanks a lot. >>>>>>> >>>>>>> Regards, >>>>>>> Yujie >>>>>>> >>>>>>> >>>>>>> On 2/4/08, *Barry Smith* >>>>>> > wrote: >>>>>>> >>>>>>> >>>>>>> For sequential AIJ matrices you can fill the B matrix >>>>>>> with the >>>>>>> identity and then use >>>>>>> MatMatSolve(). >>>>>>> >>>>>>> Note since the inverse of a sparse matrix is dense >>>>>>> the B >>>>>>> matrix is >>>>>>> a SeqDense matrix. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >>>>>>> >>>>>>> > Hi, >>>>>>> > Now, I want to inverse a sparse matrix. I have browsed >>>>>>> the >>>>>>> manual, >>>>>>> > however, I can't find some information. could you give >>>>>>> me >>>>>>> some advice? >>>>>>> > >>>>>>> > thanks a lot. >>>>>>> > >>>>>>> > Regards, >>>>>>> > Yujie >>>>>>> > >>>>>>> >>>>>>> >>>>>>> >>>>>>> ------------------------------------------------------------------------ >>>>>>> Looking for last minute shopping deals? Find them fast with >>>>>>> Yahoo! >>>>>>> Search. >>>>>>> >>>>>> > >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> >>>> >>> >> >> > From zonexo at gmail.com Tue Feb 5 21:39:41 2008 From: zonexo at gmail.com (Ben Tay) Date: Wed, 06 Feb 2008 11:39:41 +0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <79724F06-DB17-4794-9539-E07A37DC469F@mcs.anl.gov> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> <47A92006.20108@gmail.com> <79724F06-DB17-4794-9539-E07A37DC469F@mcs.anl.gov> Message-ID: <47A92BFD.1040205@gmail.com> Sorry Barry, I just would like to confirm that as long as it's a constant constant coefficient Poisson eqn with Neumann or Dirchelet boundary conditions, I can use FFT. It doesn't matter if the grids are uniform or not. Is that correct? Thanks. Barry Smith wrote: > > On Feb 5, 2008, at 8:48 PM, Ben Tay wrote: > >> Thank you Barry for your enlightenment. I'll just continue to use >> BoomerAMG for the poisson eqn. I'll also check up on FFTW. Last time, >> I recalled that there seemed to be some restrictions for FFT on >> solving poisson eqn. It seems that the grids must be constant in at >> least 1 dimension. > > Yes. Then it decouples into a bunch of tridiagonal solves; > Basically if you can do separation of variables you can > use FFTs. > > Barry > >> I wonder if that is true? If that's the case, then it's not possible >> for me to use it, although it's a constant coefficient Poisson >> operator with Neumann or Dirchelet boundary conditions. >> >> thank you. >> >> Barry Smith wrote: >>> >>> On Feb 5, 2008, at 8:04 PM, Ben Tay wrote: >>> >>>> Hi Lisandro, >>>> >>>> I'm using the fractional step mtd to solve the NS eqns as well. >>>> I've tried the direct mtd and also boomerAMG in solving the poisson >>>> eqn. 
Experience shows that for smaller matrix, direct mtd is >>>> slightly faster but if the matrix increases in size, boomerAMG is >>>> faster. Btw, if I'm not wrong, the default solver will be GMRES. >>>> I've also tried using the "Struct" interface solely under Hypre. >>>> It's even faster for big matrix, although the improvement doesn't >>>> seem to be a lot. I need to do more tests to confirm though. >>>> >>>> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite >>>> a while to solve the eqns. I'm wondering if it'll be faster if I >>>> get the inverse and then do matrix multiplication. Or just calling >>>> KSPSolve is actually doing something similar and there'll not be >>>> any speed difference. Hope someone can enlighten... >>>> >>>> Thanks! >>>> >>> Ben, >>> >>> Forming the inverse explicitly will be a complete failure. >>> Because it is dense it will have (1400x2000)^2 values and >>> each multiply will take 2*(1400x2000)^2 floating point operations, >>> while boomerAMG should take only O(1400x2000). >>> >>> BTW: if this is a constant coefficient Poisson operator with >>> Neumann or Dirchelet boundary conditions then >>> likely a parallel FFT based algorithm would be fastest. Alas we do >>> not yet have this in PETSc. It looks like FFTW finally >>> has an updated MPI version so we need to do the PETSc interface for >>> that. >>> >>> >>> Barry >>> >>> >>>> Lisandro Dalcin wrote: >>>>> Ben, some time ago I was doing some testing with PETSc for solving >>>>> incompressible NS eqs with fractional step method. I've found that in >>>>> our software and hardware setup, the best way to solve the pressure >>>>> problem was by using HYPRE BoomerAMG. This preconditioner usually >>>>> have >>>>> some heavy setup, but if your Poison matrix does not change, then the >>>>> sucessive solves at each time step are really fast. >>>>> >>>>> If you still want to use a direct method, you should use the >>>>> combination '-ksp_type preonly -pc_type lu' (by default, this will >>>>> only work on sequential mode, unless you build PETSc with an external >>>>> package like MUMPS). This way, PETSc computes the LU factorization >>>>> only once, and at each time step, the call to KSPSolve end-up only >>>>> doing the triangular solvers. >>>>> >>>>> The nice thing about PETSc is that, if you next realize the >>>>> factorization take a long time (as it usually take in big problems), >>>>> you can switch BoomerAMG by only passing in the command line >>>>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's >>>>> all, you do not need to change your code. And more, depending on your >>>>> problem you can choose the direct solvers or algebraic multigrid as >>>>> you want, by simply pass the appropriate combination options in the >>>>> command line (or a options file, using the -options_file option). >>>>> >>>>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like >>>>> to know about your experience. >>>>> >>>>> Regards, >>>>> >>>>> On 2/5/08, Ben Tay wrote: >>>>> >>>>>> Hi everyone, >>>>>> >>>>>> I was reading about the topic abt inversing a sparse matrix. I >>>>>> have to >>>>>> solve a poisson eqn for my CFD code. Usually, I form a system of >>>>>> linear >>>>>> eqns and solve Ax=b. The "A" is always the same and only the "b" >>>>>> changes >>>>>> every timestep. Does it mean that if I'm able to get the inverse >>>>>> matrix >>>>>> A^(-1), in order to get x at every timestep, I only need to do a >>>>>> simple >>>>>> matrix multiplication ie x=A^(-1)*b ? 
>>>>>> >>>>>> Hi Timothy, if the above is true, can you email me your Fortran code >>>>>> template? I'm also programming in fortran 90. Thank you very much >>>>>> >>>>>> Regards. >>>>>> >>>>>> Timothy Stitt wrote: >>>>>> >>>>>>> Yes Yujie, I was able to put together a parallel code to invert a >>>>>>> large sparse matrix with the help of the PETSc developers. If >>>>>>> you need >>>>>>> any help or maybe a Fortran code template just let me know. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Tim. >>>>>>> >>>>>>> Waad Subber wrote: >>>>>>> >>>>>>>> Hi >>>>>>>> There was a discussion between Tim Stitt and petsc developers >>>>>>>> about >>>>>>>> matrix inversion, and it was really helpful. That was in last Nov. >>>>>>>> You can check the emails archive >>>>>>>> >>>>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Waad >>>>>>>> >>>>>>>> */Yujie /* wrote: >>>>>>>> >>>>>>>> what is the difference between sequantial and parallel AIJ >>>>>>>> matrix? >>>>>>>> Assuming there is a matrix A, if >>>>>>>> I partitaion this matrix into A1, A2, Ai... An. >>>>>>>> A is a parallel AIJ matrix at the whole view, Ai >>>>>>>> is a sequential AIJ matrix? I want to operate Ai at each node. >>>>>>>> In addition, whether is it possible to get general inverse using >>>>>>>> MatMatSolve() if the matrix is not square? Thanks a lot. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Yujie >>>>>>>> >>>>>>>> >>>>>>>> On 2/4/08, *Barry Smith* >>>>>>> > wrote: >>>>>>>> >>>>>>>> >>>>>>>> For sequential AIJ matrices you can fill the B matrix >>>>>>>> with the >>>>>>>> identity and then use >>>>>>>> MatMatSolve(). >>>>>>>> >>>>>>>> Note since the inverse of a sparse matrix is dense the B >>>>>>>> matrix is >>>>>>>> a SeqDense matrix. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >>>>>>>> >>>>>>>> > Hi, >>>>>>>> > Now, I want to inverse a sparse matrix. I have browsed the >>>>>>>> manual, >>>>>>>> > however, I can't find some information. could you give me >>>>>>>> some advice? >>>>>>>> > >>>>>>>> > thanks a lot. >>>>>>>> > >>>>>>>> > Regards, >>>>>>>> > Yujie >>>>>>>> > >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> ------------------------------------------------------------------------ >>>>>>>> >>>>>>>> Looking for last minute shopping deals? Find them fast with Yahoo! >>>>>>>> Search. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>> >>> >>> >> > > From bsmith at mcs.anl.gov Wed Feb 6 07:52:39 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Feb 2008 07:52:39 -0600 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <47A92BFD.1040205@gmail.com> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> <47A92006.20108@gmail.com> <79724F06-DB17-4794-9539-E07A37DC469F@mcs.anl.gov> <47A92BFD.1040205@gmail.com> Message-ID: Whoops, actually the grids in each direction would need to be uniform. Barry On Feb 5, 2008, at 9:39 PM, Ben Tay wrote: > Sorry Barry, I just would like to confirm that as long as it's a > constant constant coefficient Poisson eqn with Neumann or Dirchelet > boundary conditions, I can use FFT. It doesn't matter if the grids > are uniform or not. Is that correct? Thanks. > > Barry Smith wrote: >> >> On Feb 5, 2008, at 8:48 PM, Ben Tay wrote: >> >>> Thank you Barry for your enlightenment. I'll just continue to use >>> BoomerAMG for the poisson eqn. 
I'll also check up on FFTW. Last >>> time, I recalled that there seemed to be some restrictions for FFT >>> on solving poisson eqn. It seems that the grids must be constant >>> in at least 1 dimension. >> >> Yes. Then it decouples into a bunch of tridiagonal solves; >> Basically if you can do separation of variables you can >> use FFTs. >> >> Barry >> >>> I wonder if that is true? If that's the case, then it's not >>> possible for me to use it, although it's a constant coefficient >>> Poisson operator with Neumann or Dirchelet boundary conditions. >>> >>> thank you. >>> >>> Barry Smith wrote: >>>> >>>> On Feb 5, 2008, at 8:04 PM, Ben Tay wrote: >>>> >>>>> Hi Lisandro, >>>>> >>>>> I'm using the fractional step mtd to solve the NS eqns as well. >>>>> I've tried the direct mtd and also boomerAMG in solving the >>>>> poisson eqn. Experience shows that for smaller matrix, direct >>>>> mtd is slightly faster but if the matrix increases in size, >>>>> boomerAMG is faster. Btw, if I'm not wrong, the default solver >>>>> will be GMRES. I've also tried using the "Struct" interface >>>>> solely under Hypre. It's even faster for big matrix, although >>>>> the improvement doesn't seem to be a lot. I need to do more >>>>> tests to confirm though. >>>>> >>>>> I'm now doing 2D simulation with 1400x2000 grids. It's takes >>>>> quite a while to solve the eqns. I'm wondering if it'll be >>>>> faster if I get the inverse and then do matrix multiplication. >>>>> Or just calling KSPSolve is actually doing something similar and >>>>> there'll not be any speed difference. Hope someone can >>>>> enlighten... >>>>> >>>>> Thanks! >>>>> >>>> Ben, >>>> >>>> Forming the inverse explicitly will be a complete failure. >>>> Because it is dense it will have (1400x2000)^2 values and >>>> each multiply will take 2*(1400x2000)^2 floating point >>>> operations, while boomerAMG should take only O(1400x2000). >>>> >>>> BTW: if this is a constant coefficient Poisson operator with >>>> Neumann or Dirchelet boundary conditions then >>>> likely a parallel FFT based algorithm would be fastest. Alas we >>>> do not yet have this in PETSc. It looks like FFTW finally >>>> has an updated MPI version so we need to do the PETSc interface >>>> for that. >>>> >>>> >>>> Barry >>>> >>>> >>>>> Lisandro Dalcin wrote: >>>>>> Ben, some time ago I was doing some testing with PETSc for >>>>>> solving >>>>>> incompressible NS eqs with fractional step method. I've found >>>>>> that in >>>>>> our software and hardware setup, the best way to solve the >>>>>> pressure >>>>>> problem was by using HYPRE BoomerAMG. This preconditioner >>>>>> usually have >>>>>> some heavy setup, but if your Poison matrix does not change, >>>>>> then the >>>>>> sucessive solves at each time step are really fast. >>>>>> >>>>>> If you still want to use a direct method, you should use the >>>>>> combination '-ksp_type preonly -pc_type lu' (by default, this >>>>>> will >>>>>> only work on sequential mode, unless you build PETSc with an >>>>>> external >>>>>> package like MUMPS). This way, PETSc computes the LU >>>>>> factorization >>>>>> only once, and at each time step, the call to KSPSolve end-up >>>>>> only >>>>>> doing the triangular solvers. >>>>>> >>>>>> The nice thing about PETSc is that, if you next realize the >>>>>> factorization take a long time (as it usually take in big >>>>>> problems), >>>>>> you can switch BoomerAMG by only passing in the command line >>>>>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. 
And >>>>>> that's >>>>>> all, you do not need to change your code. And more, depending >>>>>> on your >>>>>> problem you can choose the direct solvers or algebraic >>>>>> multigrid as >>>>>> you want, by simply pass the appropriate combination options in >>>>>> the >>>>>> command line (or a options file, using the -options_file option). >>>>>> >>>>>> Please, if you ever try HYPRE BoomerAMG preconditioners, I >>>>>> would like >>>>>> to know about your experience. >>>>>> >>>>>> Regards, >>>>>> >>>>>> On 2/5/08, Ben Tay wrote: >>>>>> >>>>>>> Hi everyone, >>>>>>> >>>>>>> I was reading about the topic abt inversing a sparse matrix. I >>>>>>> have to >>>>>>> solve a poisson eqn for my CFD code. Usually, I form a system >>>>>>> of linear >>>>>>> eqns and solve Ax=b. The "A" is always the same and only the >>>>>>> "b" changes >>>>>>> every timestep. Does it mean that if I'm able to get the >>>>>>> inverse matrix >>>>>>> A^(-1), in order to get x at every timestep, I only need to do >>>>>>> a simple >>>>>>> matrix multiplication ie x=A^(-1)*b ? >>>>>>> >>>>>>> Hi Timothy, if the above is true, can you email me your >>>>>>> Fortran code >>>>>>> template? I'm also programming in fortran 90. Thank you very >>>>>>> much >>>>>>> >>>>>>> Regards. >>>>>>> >>>>>>> Timothy Stitt wrote: >>>>>>> >>>>>>>> Yes Yujie, I was able to put together a parallel code to >>>>>>>> invert a >>>>>>>> large sparse matrix with the help of the PETSc developers. If >>>>>>>> you need >>>>>>>> any help or maybe a Fortran code template just let me know. >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> Tim. >>>>>>>> >>>>>>>> Waad Subber wrote: >>>>>>>> >>>>>>>>> Hi >>>>>>>>> There was a discussion between Tim Stitt and petsc >>>>>>>>> developers about >>>>>>>>> matrix inversion, and it was really helpful. That was in >>>>>>>>> last Nov. >>>>>>>>> You can check the emails archive >>>>>>>>> >>>>>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html >>>>>>>>> >>>>>>>>> >>>>>>>>> Waad >>>>>>>>> >>>>>>>>> */Yujie /* wrote: >>>>>>>>> >>>>>>>>> what is the difference between sequantial and parallel AIJ >>>>>>>>> matrix? >>>>>>>>> Assuming there is a matrix A, if >>>>>>>>> I partitaion this matrix into A1, A2, Ai... An. >>>>>>>>> A is a parallel AIJ matrix at the whole view, Ai >>>>>>>>> is a sequential AIJ matrix? I want to operate Ai at each >>>>>>>>> node. >>>>>>>>> In addition, whether is it possible to get general inverse >>>>>>>>> using >>>>>>>>> MatMatSolve() if the matrix is not square? Thanks a lot. >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Yujie >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2/4/08, *Barry Smith* >>>>>>>> > wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> For sequential AIJ matrices you can fill the B matrix >>>>>>>>> with the >>>>>>>>> identity and then use >>>>>>>>> MatMatSolve(). >>>>>>>>> >>>>>>>>> Note since the inverse of a sparse matrix is dense >>>>>>>>> the B >>>>>>>>> matrix is >>>>>>>>> a SeqDense matrix. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >>>>>>>>> >>>>>>>>> > Hi, >>>>>>>>> > Now, I want to inverse a sparse matrix. I have >>>>>>>>> browsed the >>>>>>>>> manual, >>>>>>>>> > however, I can't find some information. could you >>>>>>>>> give me >>>>>>>>> some advice? >>>>>>>>> > >>>>>>>>> > thanks a lot. >>>>>>>>> > >>>>>>>>> > Regards, >>>>>>>>> > Yujie >>>>>>>>> > >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> ------------------------------------------------------------------------ >>>>>>>>> Looking for last minute shopping deals? 
Find them fast with >>>>>>>>> Yahoo! >>>>>>>>> Search. >>>>>>>>> >>>>>>>> > >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> >>> >> >> > From erlend.pedersen at holberger.com Wed Feb 6 09:49:32 2008 From: erlend.pedersen at holberger.com (Erlend Pedersen :.) Date: Wed, 06 Feb 2008 16:49:32 +0100 Subject: Overdetermined, non-linear In-Reply-To: References: <1201866864.6394.25.camel@erlend-ws.in.holberger.com> <1202203616.27733.50.camel@erlend-ws.in.holberger.com> Message-ID: <1202312972.9921.15.camel@erlend-ws.in.holberger.com> On Tue, 2008-02-05 at 07:31 -0600, Matthew Knepley wrote: > On Feb 5, 2008 3:26 AM, Erlend Pedersen :. > wrote: > > On Sun, 2008-02-03 at 19:59 -0600, Matthew Knepley wrote: > > > On Feb 1, 2008 5:54 AM, Erlend Pedersen :. > > > wrote: > > > > I am attempting to use the PETSc nonlinear solver on an overdetermined > > > > system of non-linear equations. Hence, the Jacobian is not square, and > > > > so far we have unfortunately not succeeded with any combination of snes, > > > > ksp and pc. > > > > > > > > Could you confirm that snes actually works for overdetermined systems, > > > > and if so, is there an application example we could look at in order to > > > > make sure there is nothing wrong with our test-setup? > > > > > > > > We have previously used the MINPACK routine LMDER very successfully, but > > > > for our current problem sizes we rely on the use of sparse matrix > > > > representations and parallel architectures. PETSc's abstractions and > > > > automatic MPI makes this system very attractive for us, and we have > > > > already used the PETSc LSQR solver with great success. > > > > > > So in the sense that SNES is really just an iteration with an embedded solve, > > > yes it can solve non-square nonlinear systems. However, the user has to > > > understand what is meant by the Function and Jacobian evaluation methods. > > > I suggest implementing the simplest algorithm for non-square systems: > > > > > > http://en.wikipedia.org/wiki/Gauss-Newton_algorithm > > > > > > By implement, I mean your Function and Jacobian methods should return the > > > correct terms. I believe the reason you have not seen convergence is that > > > the result of the solve does not "mean" the correct thing for the iteration > > > in your current setup. > > > > > > Matt > > > > Thanks. Good to know that I should be able to get a working setup. Are > > there by any chance any code examples that I could use to clue myself in > > on how to transform my m equations of n unknonwns into a correct > > function for the Gauss-Newton algorithm? > > We do not have any nonlinear least-squares examples, unfortunately. At that > point, most users have gone over to formulating their problem directly as > an optimization problem (which allows more flexibility than least squares) and > have moved to TAO (http://www-unix.mcs.anl.gov/tao/) which does have > examples, I believe, for optimization of this kind. > > If you know that you only ever want to do least squares, and you want to solve > the biggest, parallel problems, than stick with PETSc and build a nice > Gauss-Newton > (or Levenberg-Marquadt) solver. However, if you really want to solve a more > general optimization problem, I recommend reformulating it now and moving > to TAO. It is at least worth reading up on it. Reformulating as an optimization problem does seem like the easier route for now. I kept away from TAO in order to Keep It Simple, but now I see that the opposite might be the case. 
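For anyone who stays with the PETSc-only route, a minimal sketch of the Gauss-Newton outer loop suggested above is given here, using KSPLSQR (which accepts rectangular operators) for the linearized subproblem min_dx ||J*dx - r||_2. FormResidual() and FormJacobian() are hypothetical user callbacks, the caller is assumed to supply J (m x n), r (length m), x and dx (length n), PCNONE is used because the default preconditioner cannot factor a rectangular matrix, and line search, proper convergence tests and error checking are all omitted:

#include "petscksp.h"

/* Hypothetical user routines for the m-equation, n-unknown system F(x) = 0 */
extern PetscErrorCode FormResidual(Vec x, Vec r);   /* r = F(x), length m */
extern PetscErrorCode FormJacobian(Vec x, Mat J);   /* J = F'(x), m x n   */

PetscErrorCode GaussNewton(Mat J, Vec x, Vec r, Vec dx, PetscInt maxit)
{
  KSP       ksp;
  PC        pc;
  PetscInt  it;
  PetscReal rnorm;

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetType(ksp, KSPLSQR);               /* least-squares Krylov solver */
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCNONE);                  /* rectangular J: no factorization */
  KSPSetFromOptions(ksp);

  for (it = 0; it < maxit; it++) {
    FormResidual(x, r);
    VecNorm(r, NORM_2, &rnorm);
    if (rnorm < 1.e-8) break;             /* crude absolute test */

    FormJacobian(x, J);
    KSPSetOperators(ksp, J, J, SAME_NONZERO_PATTERN);
    KSPSolve(ksp, r, dx);                 /* dx ~ argmin || J*dx - r ||_2 */
    VecAXPY(x, -1.0, dx);                 /* x <- x - dx (full step) */
  }
  KSPDestroy(ksp);
  return 0;
}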
I should be able to provide it with a gradient, if not necessarily a Hessian. Thanks again :) - Erlend :. > > Thanks, > > Matt > > > - Erlend :. From dalcinl at gmail.com Wed Feb 6 09:53:48 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 6 Feb 2008 12:53:48 -0300 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <47A915AB.9010006@gmail.com> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> Message-ID: Well, after taking into accout Barry's comments, you have have the following choices. * You can use a direct method based on LU factorization using '-ksp_type preonly -pc_type lu' . This way, PETSc will compute the LU factors the fist time they are needed; after that, every call to KSPSolve will reuse those factors. This will work only in sequential with a default PETSc build, but you could also build PETSc with MUMPS, and it will let you do the parallel factorization. For MUMPS to actually work in your matrix, I believe you have to add the following line: MatConvert(A, MATAIJMUMPS, MAT_REUSE_MATRIX, &A); after assembling (ie. MatAssembleBegin/End calls) your Poisson matrix. * You can use CG with '-ksp_type cg' (I assume your matrix is SPD, as it is in a standard fractional step method), and a preconditioner. And then, I believe the best choice for your application will bee BoomerAMG. It has a rather high setup cost, but solves are fast. Or your could use ML, it has less setup costs, but the solvers are a bit slower. So if you make many timesteps, I would say that BoomerAMG will pay. Finally, if you use the last option, perhaps you can try Paul Fischer tricks. I tried to add this to KSP's some time ago, but I stoped for many reasons (the main one, lack of time). You can take a look at this: http://citeseer.ist.psu.edu/492082.html A similar (equivalent?) approach is this other one (perhaps a bit easier to implement, depending on your taste) doi.wiley.com/10.1002/cnm.743 On 2/5/08, Ben Tay wrote: > Hi Lisandro, > > I'm using the fractional step mtd to solve the NS eqns as well. I've > tried the direct mtd and also boomerAMG in solving the poisson eqn. > Experience shows that for smaller matrix, direct mtd is slightly faster > but if the matrix increases in size, boomerAMG is faster. Btw, if I'm > not wrong, the default solver will be GMRES. I've also tried using the > "Struct" interface solely under Hypre. It's even faster for big matrix, > although the improvement doesn't seem to be a lot. I need to do more > tests to confirm though. > > I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a > while to solve the eqns. I'm wondering if it'll be faster if I get the > inverse and then do matrix multiplication. Or just calling KSPSolve is > actually doing something similar and there'll not be any speed > difference. Hope someone can enlighten... > > Thanks! > > Lisandro Dalcin wrote: > > Ben, some time ago I was doing some testing with PETSc for solving > > incompressible NS eqs with fractional step method. I've found that in > > our software and hardware setup, the best way to solve the pressure > > problem was by using HYPRE BoomerAMG. This preconditioner usually have > > some heavy setup, but if your Poison matrix does not change, then the > > sucessive solves at each time step are really fast. 
> > > > If you still want to use a direct method, you should use the > > combination '-ksp_type preonly -pc_type lu' (by default, this will > > only work on sequential mode, unless you build PETSc with an external > > package like MUMPS). This way, PETSc computes the LU factorization > > only once, and at each time step, the call to KSPSolve end-up only > > doing the triangular solvers. > > > > The nice thing about PETSc is that, if you next realize the > > factorization take a long time (as it usually take in big problems), > > you can switch BoomerAMG by only passing in the command line > > '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's > > all, you do not need to change your code. And more, depending on your > > problem you can choose the direct solvers or algebraic multigrid as > > you want, by simply pass the appropriate combination options in the > > command line (or a options file, using the -options_file option). > > > > Please, if you ever try HYPRE BoomerAMG preconditioners, I would like > > to know about your experience. > > > > Regards, > > > > On 2/5/08, Ben Tay wrote: > > > >> Hi everyone, > >> > >> I was reading about the topic abt inversing a sparse matrix. I have to > >> solve a poisson eqn for my CFD code. Usually, I form a system of linear > >> eqns and solve Ax=b. The "A" is always the same and only the "b" changes > >> every timestep. Does it mean that if I'm able to get the inverse matrix > >> A^(-1), in order to get x at every timestep, I only need to do a simple > >> matrix multiplication ie x=A^(-1)*b ? > >> > >> Hi Timothy, if the above is true, can you email me your Fortran code > >> template? I'm also programming in fortran 90. Thank you very much > >> > >> Regards. > >> > >> Timothy Stitt wrote: > >> > >>> Yes Yujie, I was able to put together a parallel code to invert a > >>> large sparse matrix with the help of the PETSc developers. If you need > >>> any help or maybe a Fortran code template just let me know. > >>> > >>> Best, > >>> > >>> Tim. > >>> > >>> Waad Subber wrote: > >>> > >>>> Hi > >>>> There was a discussion between Tim Stitt and petsc developers about > >>>> matrix inversion, and it was really helpful. That was in last Nov. > >>>> You can check the emails archive > >>>> > >>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html > >>>> > >>>> > >>>> Waad > >>>> > >>>> */Yujie /* wrote: > >>>> > >>>> what is the difference between sequantial and parallel AIJ matrix? > >>>> Assuming there is a matrix A, if > >>>> I partitaion this matrix into A1, A2, Ai... An. > >>>> A is a parallel AIJ matrix at the whole view, Ai > >>>> is a sequential AIJ matrix? I want to operate Ai at each node. > >>>> In addition, whether is it possible to get general inverse using > >>>> MatMatSolve() if the matrix is not square? Thanks a lot. > >>>> > >>>> Regards, > >>>> Yujie > >>>> > >>>> > >>>> On 2/4/08, *Barry Smith* >>>> > wrote: > >>>> > >>>> > >>>> For sequential AIJ matrices you can fill the B matrix > >>>> with the > >>>> identity and then use > >>>> MatMatSolve(). > >>>> > >>>> Note since the inverse of a sparse matrix is dense the B > >>>> matrix is > >>>> a SeqDense matrix. > >>>> > >>>> Barry > >>>> > >>>> On Feb 4, 2008, at 12:37 AM, Yujie wrote: > >>>> > >>>> > Hi, > >>>> > Now, I want to inverse a sparse matrix. I have browsed the > >>>> manual, > >>>> > however, I can't find some information. could you give me > >>>> some advice? > >>>> > > >>>> > thanks a lot. 
> >>>> > > >>>> > Regards, > >>>> > Yujie > >>>> > > >>>> > >>>> > >>>> > >>>> ------------------------------------------------------------------------ > >>>> Looking for last minute shopping deals? Find them fast with Yahoo! > >>>> Search. > >>>> > >>>> > >>> > >>> > >> > > > > > > > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From zonexo at gmail.com Wed Feb 6 10:24:20 2008 From: zonexo at gmail.com (Ben Tay) Date: Thu, 07 Feb 2008 00:24:20 +0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> Message-ID: <47A9DF34.6030903@gmail.com> Hi Lisandro, Thanks for your recommendation. Btw, does the poisson eqn arising from fractional step gives a matrix which is SPD? Because my grid's are non-uniform in both x,y directions. Shouldn't that result in a non-symmetric matrix? But I think it's still PD, positive definite. Correct me if I'm wrong. Thanks Lisandro Dalcin wrote: > Well, after taking into accout Barry's comments, you have have the > following choices. > > * You can use a direct method based on LU factorization using > '-ksp_type preonly -pc_type lu' . This way, PETSc will compute the LU > factors the fist time they are needed; after that, every call to > KSPSolve will reuse those factors. This will work only in sequential > with a default PETSc build, but you could also build PETSc with MUMPS, > and it will let you do the parallel factorization. For MUMPS to > actually work in your matrix, I believe you have to add the following > line: > > MatConvert(A, MATAIJMUMPS, MAT_REUSE_MATRIX, &A); > > after assembling (ie. MatAssembleBegin/End calls) your Poisson matrix. > > > * You can use CG with '-ksp_type cg' (I assume your matrix is SPD, as > it is in a standard fractional step method), and a preconditioner. And > then, I believe the best choice for your application will bee > BoomerAMG. It has a rather high setup cost, but solves are fast. Or > your could use ML, it has less setup costs, but the solvers are a bit > slower. So if you make many timesteps, I would say that BoomerAMG will > pay. > > Finally, if you use the last option, perhaps you can try Paul Fischer > tricks. I tried to add this to KSP's some time ago, but I stoped for > many reasons (the main one, lack of time). You can take a look at > this: > > http://citeseer.ist.psu.edu/492082.html > > A similar (equivalent?) approach is this other one (perhaps a bit > easier to implement, depending on your taste) > doi.wiley.com/10.1002/cnm.743 > > > On 2/5/08, Ben Tay wrote: > >> Hi Lisandro, >> >> I'm using the fractional step mtd to solve the NS eqns as well. I've >> tried the direct mtd and also boomerAMG in solving the poisson eqn. >> Experience shows that for smaller matrix, direct mtd is slightly faster >> but if the matrix increases in size, boomerAMG is faster. Btw, if I'm >> not wrong, the default solver will be GMRES. I've also tried using the >> "Struct" interface solely under Hypre. It's even faster for big matrix, >> although the improvement doesn't seem to be a lot. I need to do more >> tests to confirm though. >> >> I'm now doing 2D simulation with 1400x2000 grids. 
It's takes quite a >> while to solve the eqns. I'm wondering if it'll be faster if I get the >> inverse and then do matrix multiplication. Or just calling KSPSolve is >> actually doing something similar and there'll not be any speed >> difference. Hope someone can enlighten... >> >> Thanks! >> >> Lisandro Dalcin wrote: >> >>> Ben, some time ago I was doing some testing with PETSc for solving >>> incompressible NS eqs with fractional step method. I've found that in >>> our software and hardware setup, the best way to solve the pressure >>> problem was by using HYPRE BoomerAMG. This preconditioner usually have >>> some heavy setup, but if your Poison matrix does not change, then the >>> sucessive solves at each time step are really fast. >>> >>> If you still want to use a direct method, you should use the >>> combination '-ksp_type preonly -pc_type lu' (by default, this will >>> only work on sequential mode, unless you build PETSc with an external >>> package like MUMPS). This way, PETSc computes the LU factorization >>> only once, and at each time step, the call to KSPSolve end-up only >>> doing the triangular solvers. >>> >>> The nice thing about PETSc is that, if you next realize the >>> factorization take a long time (as it usually take in big problems), >>> you can switch BoomerAMG by only passing in the command line >>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's >>> all, you do not need to change your code. And more, depending on your >>> problem you can choose the direct solvers or algebraic multigrid as >>> you want, by simply pass the appropriate combination options in the >>> command line (or a options file, using the -options_file option). >>> >>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like >>> to know about your experience. >>> >>> Regards, >>> >>> On 2/5/08, Ben Tay wrote: >>> >>> >>>> Hi everyone, >>>> >>>> I was reading about the topic abt inversing a sparse matrix. I have to >>>> solve a poisson eqn for my CFD code. Usually, I form a system of linear >>>> eqns and solve Ax=b. The "A" is always the same and only the "b" changes >>>> every timestep. Does it mean that if I'm able to get the inverse matrix >>>> A^(-1), in order to get x at every timestep, I only need to do a simple >>>> matrix multiplication ie x=A^(-1)*b ? >>>> >>>> Hi Timothy, if the above is true, can you email me your Fortran code >>>> template? I'm also programming in fortran 90. Thank you very much >>>> >>>> Regards. >>>> >>>> Timothy Stitt wrote: >>>> >>>> >>>>> Yes Yujie, I was able to put together a parallel code to invert a >>>>> large sparse matrix with the help of the PETSc developers. If you need >>>>> any help or maybe a Fortran code template just let me know. >>>>> >>>>> Best, >>>>> >>>>> Tim. >>>>> >>>>> Waad Subber wrote: >>>>> >>>>> >>>>>> Hi >>>>>> There was a discussion between Tim Stitt and petsc developers about >>>>>> matrix inversion, and it was really helpful. That was in last Nov. >>>>>> You can check the emails archive >>>>>> >>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html >>>>>> >>>>>> >>>>>> Waad >>>>>> >>>>>> */Yujie /* wrote: >>>>>> >>>>>> what is the difference between sequantial and parallel AIJ matrix? >>>>>> Assuming there is a matrix A, if >>>>>> I partitaion this matrix into A1, A2, Ai... An. >>>>>> A is a parallel AIJ matrix at the whole view, Ai >>>>>> is a sequential AIJ matrix? I want to operate Ai at each node. 
>>>>>> In addition, whether is it possible to get general inverse using >>>>>> MatMatSolve() if the matrix is not square? Thanks a lot. >>>>>> >>>>>> Regards, >>>>>> Yujie >>>>>> >>>>>> >>>>>> On 2/4/08, *Barry Smith* >>>>> > wrote: >>>>>> >>>>>> >>>>>> For sequential AIJ matrices you can fill the B matrix >>>>>> with the >>>>>> identity and then use >>>>>> MatMatSolve(). >>>>>> >>>>>> Note since the inverse of a sparse matrix is dense the B >>>>>> matrix is >>>>>> a SeqDense matrix. >>>>>> >>>>>> Barry >>>>>> >>>>>> On Feb 4, 2008, at 12:37 AM, Yujie wrote: >>>>>> >>>>>> > Hi, >>>>>> > Now, I want to inverse a sparse matrix. I have browsed the >>>>>> manual, >>>>>> > however, I can't find some information. could you give me >>>>>> some advice? >>>>>> > >>>>>> > thanks a lot. >>>>>> > >>>>>> > Regards, >>>>>> > Yujie >>>>>> > >>>>>> >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------------ >>>>>> Looking for last minute shopping deals? Find them fast with Yahoo! >>>>>> Search. >>>>>> >>>>>> >>>>>> >>>>> >>> >>> >> > > > From dalcinl at gmail.com Wed Feb 6 11:10:01 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 6 Feb 2008 14:10:01 -0300 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: <47A9DF34.6030903@gmail.com> References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> <47A9DF34.6030903@gmail.com> Message-ID: On 2/6/08, Ben Tay wrote: Because my grid's are > non-uniform in both x,y directions. Shouldn't that result in a > non-symmetric matrix? But I think it's still PD, positive definite. > Correct me if I'm wrong. I believe you are wrong, unless you are using a non-standart spatial discretization method. Is your Poisson equation using some additional terms than the usual Laplace operator? For standard finite elements and finite diferences, your matrix should be symmetric. Of course, symmetry can be lost if you use the common trick of zeroing-out rows for boundary conditions (using MatZeroRows and related). But even in that case, I believe you can still use CG. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From zonexo at gmail.com Thu Feb 7 00:18:51 2008 From: zonexo at gmail.com (Ben Tay) Date: Thu, 07 Feb 2008 14:18:51 +0800 Subject: how to inverse a sparse matrix in Petsc? In-Reply-To: References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <47A915AB.9010006@gmail.com> <47A9DF34.6030903@gmail.com> Message-ID: <47AAA2CB.6050705@gmail.com> Ya, Lisandro, it's my mistake. It is indeed SPD. Thank you for your reminder! Lisandro Dalcin wrote: > On 2/6/08, Ben Tay wrote: > Because my grid's are > >> non-uniform in both x,y directions. Shouldn't that result in a >> non-symmetric matrix? But I think it's still PD, positive definite. >> Correct me if I'm wrong. >> > > I believe you are wrong, unless you are using a non-standart spatial > discretization method. Is your Poisson equation using some additional > terms than the usual Laplace operator? For standard finite elements > and finite diferences, your matrix should be symmetric. 
Of course, > symmetry can be lost if you use the common trick of zeroing-out rows > for boundary conditions (using MatZeroRows and related). But even in > that case, I believe you can still use CG. > > From tstitt at cscs.ch Thu Feb 7 09:06:39 2008 From: tstitt at cscs.ch (Timothy Stitt) Date: Thu, 07 Feb 2008 16:06:39 +0100 Subject: Legendre Transform Message-ID: <47AB1E7F.1070907@cscs.ch> Hi all, I am not sure if this query is directly related to the PETSc library per se, but I was wondering if anyone knows of an efficient implementation of the Legendre Transform. A parallel implementation would be even more ideal. I don't know that much about the implementation of integral transforms...could it be done in PETSc for instance, if no suitable library exists? A web search doesn't throw up to many relevant hits. Thanks in advance for any guidance given, Best, Tim. -- Timothy Stitt HPC Applications Analyst Swiss National Supercomputing Centre (CSCS) Galleria 2 - Via Cantonale CH-6928 Manno, Switzerland +41 (0) 91 610 8233 stitt at cscs.ch From sanjay at ce.berkeley.edu Thu Feb 7 09:22:01 2008 From: sanjay at ce.berkeley.edu (Sanjay Govindjee) Date: Thu, 07 Feb 2008 16:22:01 +0100 Subject: Legendre Transform In-Reply-To: <47AB1E7F.1070907@cscs.ch> References: <47AB1E7F.1070907@cscs.ch> Message-ID: <47AB2219.6020401@ce.berkeley.edu> Can you define what you mean by Legendre transform? The usual definition in physics is L{f}(p) = max_x (p.x - f(x)) where f(x) is convex. When you say parallel, then I presume that x lies in a very high dimensional space. In which case you are simply looking at a root finding problem in high dimensions and you could certainly use PETSc to solve the intermediate linear solves of a Newton scheme (or just invoke SNES). -sg Timothy Stitt wrote: > Hi all, > > I am not sure if this query is directly related to the PETSc library > per se, but I was wondering if anyone knows of an efficient > implementation of the Legendre Transform. A parallel implementation > would be even more ideal. > I don't know that much about the implementation of integral > transforms...could it be done in PETSc for instance, if no suitable > library exists? A web search doesn't throw up to many relevant hits. > > Thanks in advance for any guidance given, > > Best, > > Tim. > From tstitt at cscs.ch Thu Feb 7 11:03:06 2008 From: tstitt at cscs.ch (Timothy Stitt) Date: Thu, 07 Feb 2008 18:03:06 +0100 Subject: Legendre Transform In-Reply-To: <47AB2219.6020401@ce.berkeley.edu> References: <47AB1E7F.1070907@cscs.ch> <47AB2219.6020401@ce.berkeley.edu> Message-ID: <47AB39CA.9050908@cscs.ch> Thanks Sanjay for the reply...I apologise for my ambiguous definition but it is more to do with my unfamiliarity with the topic. I have been asked to help a group optimise their quasi-spectral geophysical MHD code. From what I gather the significant computation occurs when it flips back and forth between spectral and real space. The Fourier and Chebyshev transforms are quite efficient but not the Legendre transform. The full forward and inverse Legendre transforms are calculated on each process (using direct summation and Gauss-Legendre quadrature on the inverse) but performance suffers as the resolution becomes high. Ideally I would like to be make sure I am implementing the most efficient Legendre transform algorithm available and more importantly if it can be performed in parallel, hence my call for existing libraries and maybe PETSc library support. 
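To fix notation for the transform under discussion, here is a small self-contained sketch of the direct-summation scheme on [-1,1]. It is not from the thread: it assumes the Gauss-Legendre nodes x[j] and weights w[j] are supplied by some quadrature routine, and it covers only the scalar 1-D case, not the associated-Legendre/spherical-harmonic transform a spherical MHD code actually needs.

  /* P_l(x) via the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1} */
  static double legendre(int l, double x)
  {
    double pkm1 = 1.0, pk = x, pkp1;
    int    k;
    if (l == 0) return 1.0;
    for (k = 1; k < l; k++) {
      pkp1 = ((2.0*k + 1.0)*x*pk - k*pkm1)/(k + 1.0);
      pkm1 = pk; pk = pkp1;
    }
    return pk;
  }

  /* analysis: a_l ~ (2l+1)/2 * sum_j w_j f(x_j) P_l(x_j),  l = 0..L */
  void legendre_analysis(int L, int N, const double *x, const double *w,
                         const double *f, double *a)
  {
    int l, j;
    for (l = 0; l <= L; l++) {
      double s = 0.0;
      for (j = 0; j < N; j++) s += w[j]*f[j]*legendre(l, x[j]);
      a[l] = 0.5*(2.0*l + 1.0)*s;
    }
  }

  /* synthesis (inverse): f(x_j) = sum_{l=0..L} a_l P_l(x_j) */
  void legendre_synthesis(int L, int N, const double *x,
                          const double *a, double *f)
  {
    int l, j;
    for (j = 0; j < N; j++) {
      double s = 0.0;
      for (l = 0; l <= L; l++) s += a[l]*legendre(l, x[j]);
      f[j] = s;
    }
  }

Both directions perform O(N*L) polynomial evaluations (O(N*L*L) flops as written), which is exactly the quadratic scaling that hurts at high resolution; the question in this thread is whether any library offers something faster, or at least a parallel version of the above.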
I apologise if the application area is not directly related to your own field but I hope you were able to follow the general idea. Again I would appreciate your comments. Thanks, Tim. Sanjay Govindjee wrote: > Can you define what you mean by Legendre transform? The usual > definition in physics > is L{f}(p) = max_x (p.x - f(x)) where f(x) is convex. When you say > parallel, then I presume that > x lies in a very high dimensional space. In which case you are simply > looking at a root > finding problem in high dimensions and you could certainly use PETSc > to solve the intermediate > linear solves of a Newton scheme (or just invoke SNES). > > -sg > > Timothy Stitt wrote: >> Hi all, >> >> I am not sure if this query is directly related to the PETSc library >> per se, but I was wondering if anyone knows of an efficient >> implementation of the Legendre Transform. A parallel implementation >> would be even more ideal. >> I don't know that much about the implementation of integral >> transforms...could it be done in PETSc for instance, if no suitable >> library exists? A web search doesn't throw up to many relevant hits. >> >> Thanks in advance for any guidance given, >> >> Best, >> >> Tim. >> > > -- Timothy Stitt HPC Applications Analyst Swiss National Supercomputing Centre (CSCS) Galleria 2 - Via Cantonale CH-6928 Manno, Switzerland +41 (0) 91 610 8233 stitt at cscs.ch From vijay.m at gmail.com Fri Feb 8 00:11:17 2008 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Fri, 8 Feb 2008 00:11:17 -0600 Subject: PCGetType question Message-ID: <00aa01c86a19$6cbcac50$b63010ac@neutron> Hi all, I?ve been trying to figure out how exactly to find out the PCType for a given PC context. Here?s the sample code I?ve been trying to execute but to no avail. Method 1: PetscTruth isshell ; PetscTypeCompare((PetscObject)pc, PCSHELL, &isshell); PetscPrintf(PETSC_COMM_SELF, " PETSC_IS_SHELL = %D", isshell) ; Result: The isshell variable is always false even when I set ?pc_type shell option. Why ? Method 2: PCType currpcType ; ierr = PCGetType(pc, &currpcType) ; if(currpcType == PCSHELL) isshell = PETSC_TRUE ; PetscPrintf(PETSC_COMM_SELF, " PETSC_IS_SHELL = %D", isshell) ; Result: Again, the isshell variable is always false even when I set ?pc_type shell option. Also, my currpcType string is a null string. Any ideas on what I am doing wrong on either one of these cases. I just spent a while trying to figure out if there was a bug in some other part of my code while the isshell variable is never set in the first place. Any help would be appreciated. Thanks, Vijay No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.5.516 / Virus Database: 269.19.21/1265 - Release Date: 2/7/2008 11:17 AM -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Fri Feb 8 07:52:46 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 8 Feb 2008 10:52:46 -0300 Subject: PCGetType question In-Reply-To: <00aa01c86a19$6cbcac50$b63010ac@neutron> References: <00aa01c86a19$6cbcac50$b63010ac@neutron> Message-ID: On 2/8/08, Vijay S. Mahadevan wrote: > PetscTruth isshell ; > PetscTypeCompare((PetscObject)pc, PCSHELL, &isshell); > PetscPrintf(PETSC_COMM_SELF, " PETSC_IS_SHELL = %D", isshell) ; > Result: The isshell variable is always false even when I set ?pc_type shell > option. Why ? Did you call PCSetFromOptions (or KSPSetFromOptions) before calling PetscTypeCompare()? 
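A minimal sketch of the ordering being described here (an invented example, using the 2.3-era calling sequences from this thread): the options database has to be applied before the type is queried, and a type name is a string, so it is compared with PetscTypeCompare() or PetscStrcmp(), never with ==.

  /* Sketch: run with  -pc_type shell  */
  #include "petscksp.h"

  PetscErrorCode CheckForShellPC(KSP ksp)
  {
    PetscErrorCode ierr;
    PC             pc;
    PCType         type;
    PetscTruth     isshell,same;

    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* -pc_type shell is applied here */
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);

    ierr = PetscTypeCompare((PetscObject)pc,PCSHELL,&isshell);CHKERRQ(ierr);

    /* PCType is a const char*, so compare the strings, not the pointers */
    ierr = PCGetType(pc,&type);CHKERRQ(ierr);
    ierr = PetscStrcmp(type,PCSHELL,&same);CHKERRQ(ierr);

    ierr = PetscPrintf(PETSC_COMM_SELF," PETSC_IS_SHELL = %d (%d)\n",
                       (int)isshell,(int)same);CHKERRQ(ierr);
    return 0;
  }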
If not, the option '-pc_type' will not be used > Method 2: > PCType currpcType ; > ierr = PCGetType(pc, &currpcType) ; > if(currpcType == PCSHELL) > isshell = PETSC_TRUE ; > PetscPrintf(PETSC_COMM_SELF, " PETSC_IS_SHELL = %D", isshell) ; > Result: Again, the isshell variable is always false even when I set ?pc_type > shell option. Also, my currpcType string is a null string. PCType is a 'const char*', so you should never compare that with '==' opertor!!. Again, if you got a null pointer from PCGetType(), my guess is that you forgot to call XXXSetFromOptions, where XXX is PC, or a KSP object containing it, or a SNES cointaing the KSP in turn containing the PC. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From Andrew.Barker at Colorado.EDU Fri Feb 8 11:46:09 2008 From: Andrew.Barker at Colorado.EDU (Andrew T Barker) Date: Fri, 8 Feb 2008 10:46:09 -0700 (MST) Subject: sub_pc lu zero pivot Message-ID: <20080208104609.AAW99320@batman.int.colorado.edu> For my code on a single processor, using the options -ksp_type gmres -pc_type lu works fine, but -ksp_type gmres -pc_type asm -sub_pc_type lu produces a zero pivot. On one processor I would expect these two to be identical, and in fact when I print out the asm submatrices (there is only one) in the second case and the original matrix in the first case, they are identical. Even stranger is that -ksp_type gmres -pc_type lu -sub_pc_type lu also produces a zero pivot. I would expect it to ignore the sub_pc in this case. Thanks for any help, Andrew --- Andrew T. Barker andrew.barker at colorado.edu Applied Math Department University of Colorado, Boulder 526 UCB, Boulder, CO 80309-0526 From knepley at gmail.com Fri Feb 8 12:02:45 2008 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Feb 2008 12:02:45 -0600 Subject: sub_pc lu zero pivot In-Reply-To: <20080208104609.AAW99320@batman.int.colorado.edu> References: <20080208104609.AAW99320@batman.int.colorado.edu> Message-ID: We would like to see: 1) The complete error message 2) The output of -snes_view so we can see exactly what solver setup you have Thanks, Matt On Feb 8, 2008 11:46 AM, Andrew T Barker wrote: > > For my code on a single processor, using the options > > -ksp_type gmres -pc_type lu > > works fine, but > > -ksp_type gmres -pc_type asm -sub_pc_type lu > > produces a zero pivot. On one processor I would expect these two to be identical, and in fact when I print out the asm submatrices (there is only one) in the second case and the original matrix in the first case, they are identical. Even stranger is that > > -ksp_type gmres -pc_type lu -sub_pc_type lu > > also produces a zero pivot. I would expect it to ignore the sub_pc in this case. > > Thanks for any help, > > Andrew > > --- > Andrew T. Barker > andrew.barker at colorado.edu > Applied Math Department > University of Colorado, Boulder > 526 UCB, Boulder, CO 80309-0526 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From Andrew.Barker at Colorado.EDU Fri Feb 8 12:34:27 2008 From: Andrew.Barker at Colorado.EDU (Andrew T Barker) Date: Fri, 8 Feb 2008 11:34:27 -0700 (MST) Subject: sub_pc lu zero pivot Message-ID: <20080208113427.AAX01992@batman.int.colorado.edu> I apparently was explicitly changing the PC in my code after PCSetFromOptions, so the options weren't getting set properly (which I found out by using -snes_view). Sorry to bother. Andrew ---- Original message ---- >Date: Fri, 8 Feb 2008 12:02:45 -0600 >From: "Matthew Knepley" >Subject: Re: sub_pc lu zero pivot >To: petsc-users at mcs.anl.gov > >We would like to see: > > 1) The complete error message > > 2) The output of -snes_view so we can see exactly what solver setup you have > > Thanks, > > Matt > >On Feb 8, 2008 11:46 AM, Andrew T Barker wrote: >> >> For my code on a single processor, using the options >> >> -ksp_type gmres -pc_type lu >> >> works fine, but >> >> -ksp_type gmres -pc_type asm -sub_pc_type lu >> >> produces a zero pivot. On one processor I would expect these two to be identical, and in fact when I print out the asm submatrices (there is only one) in the second case and the original matrix in the first case, they are identical. Even stranger is that >> >> -ksp_type gmres -pc_type lu -sub_pc_type lu >> >> also produces a zero pivot. I would expect it to ignore the sub_pc in this case. >> >> Thanks for any help, >> >> Andrew >> >> --- >> Andrew T. Barker >> andrew.barker at colorado.edu >> Applied Math Department >> University of Colorado, Boulder >> 526 UCB, Boulder, CO 80309-0526 >> >> > > > >-- >What most experimenters take for granted before they begin their >experiments is infinitely more interesting than any results to which >their experiments lead. >-- Norbert Wiener > From recrusader at gmail.com Tue Feb 12 13:06:51 2008 From: recrusader at gmail.com (Yujie) Date: Tue, 12 Feb 2008 11:06:51 -0800 Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()? In-Reply-To: References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com> <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com> <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com> Message-ID: <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com> hi, Matt If I output matrix with binary format, should the file format obtained from sequential output is the same with that from parallel output? I mean that I don't need to consider whether MatView_***_Binary() is parallel or sequential when I use the matrix file. thanks a lot. Regards, Yujie On 1/23/08, Matthew Knepley wrote: > > On Jan 23, 2008 2:18 PM, Yujie wrote: > > Thank you for your further explanation. I just want to use this data in > > other packages. I think that ASCII file is likely better. Because I > don't > > know the format of the binary file? how to find it? > > Look at MatView_SeqAIJ_Binary() in src/mat/impls/aij/seq/aij.c. The > format is pretty simple. > > > In addition, do you have any better methods to save the sparsity > structure > > picture of the matrix? Now, I use "-mat_view_draw" to do this. However, > the > > speed is very slow and the picture is small. I want to get a big picture > and > > directly save it to the disk? > > could you give me some advice? thanks a lot. > > We do not have a better way to make the sparsity picture. I assume you > could > write something that decides how many pixels to use, calculates an average > occupancy per pixel, and writes a BMP or something. 
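As one possible realization of the do-it-yourself sparsity picture suggested in the quoted reply, the sketch below bins the nonzero pattern of a sequential AIJ matrix into a px-by-py grid of pixels and writes an ASCII PGM image. The routine name, arguments and file format are invented for illustration; a pixel is simply black if any nonzero falls in it, whereas a fuller implementation might grey-scale by occupancy as suggested above.

  #include "petscmat.h"
  #include <stdio.h>

  /* Dump an approximate sparsity picture of a sequential matrix A to an
   * ASCII PGM image of px x py pixels. */
  PetscErrorCode MatSparsityToPGM(Mat A, const char *fname, PetscInt px, PetscInt py)
  {
    PetscErrorCode ierr;
    PetscInt       m,n,i,j,ncols,*count;
    const PetscInt *cols;
    FILE           *fp;

    ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
    ierr = PetscMalloc(px*py*sizeof(PetscInt),&count);CHKERRQ(ierr);
    ierr = PetscMemzero(count,px*py*sizeof(PetscInt));CHKERRQ(ierr);

    for (i=0; i<m; i++) {
      PetscInt pi = (i*py)/m;                      /* pixel row for matrix row i */
      ierr = MatGetRow(A,i,&ncols,&cols,PETSC_NULL);CHKERRQ(ierr);
      for (j=0; j<ncols; j++) count[pi*px + (cols[j]*px)/n]++;
      ierr = MatRestoreRow(A,i,&ncols,&cols,PETSC_NULL);CHKERRQ(ierr);
    }

    fp = fopen(fname,"w");
    fprintf(fp,"P2\n%d %d\n255\n",(int)px,(int)py);
    for (i=0; i<px*py; i++) fprintf(fp,"%d\n",count[i] ? 0 : 255);
    fclose(fp);
    ierr = PetscFree(count);CHKERRQ(ierr);
    return 0;
  }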
> > Matt > > > Regards, > > Yujie > > > > On 1/23/08, Matthew Knepley wrote: > > > On Jan 22, 2008 11:01 PM, Yujie wrote: > > > > Dear Matt: > > > > > > > > thank you for your reply. Do you have any method to generate an > ascii > > file > > > > of the huge sparse matrix? thanks > > > > > > I think you miss my point. The PETSc function is not a bad way to > generate > > > ASCII matrices. ASCII matrices make "no sense" for large operators. > > > > > > Matt > > > > > > > Regards, > > > > Yujie > > > > > > > > > > > > > > > > On 1/23/08, Matthew Knepley wrote: > > > > > On Jan 22, 2008 8:50 PM, Yujie < recrusader at gmail.com> wrote: > > > > > > Hi everyone: > > > > > > > > > > > > #include "petsc.h" > > > > > > PetscErrorCode PetscViewerASCIIOpen(MPI_Comm comm,const char > > > > > > name[],PetscViewer *lab) > > > > > > > > > > > > #include "petsc.h" > > > > > > PetscErrorCode PetscViewerBinaryOpen(MPI_Comm comm,const char > > > > > > name[],PetscFileMode type,PetscViewer *binv) > > > > > > > > > > > > if the difference between them is that one for ASCII output and > the > > > > other > > > > > > for Binary output, why are there different parameters? > > > > > > > > > > It is historical. If you want to be generic, you should use > > > > > > > > > > PetscViewerCreate() > > > > > PetscViewerSetType() > > > > > PetscViewerFileSetMode() > > > > > PetscViewerFileSetName() > > > > > > > > > > which can create both. > > > > > > > > > > > The speed to output matrix is very fast when I use > > > > PetscViewerBinaryOpen. > > > > > > However, when I use PetscViewerASCIIOpen, I can't get the matrix > > output. > > > > the > > > > > > code always is running and it has taken about one day! what's > the > > > > problem? > > > > > > thank you. > > > > > > > > > > ASCII files do not make sense for large matrices. You should use > > binary > > > > files. > > > > > > > > > > Matt > > > > > > > > > > > Regards, > > > > > > Yujie > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before they begin their > > > > > experiments is infinitely more interesting than any results to > which > > > > > their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Feb 12 14:15:00 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Feb 2008 14:15:00 -0600 Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()? In-Reply-To: <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com> References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com> <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com> <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com> <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com> Message-ID: <3E498C1D-EE34-4744-8D0E-C0EE9119741C@mcs.anl.gov> The files should be the same. 
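As a concrete version of the binary round trip being discussed, here is a minimal sketch (invented file name and routine; 2.3-era calling sequences, where MatLoad() takes the viewer plus a matrix type and returns a new matrix, and the destroy routines take the object rather than its address):

  #include "petscmat.h"

  PetscErrorCode DumpAndReload(Mat A)
  {
    PetscErrorCode ierr;
    PetscViewer    viewer;
    Mat            B;

    /* write A in PETSc binary format; the file layout is the same whether
       the writer ran on one process or many */
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"A.bin",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
    ierr = MatView(A,viewer);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);

    /* read it back, possibly in a different run or on a different
       number of processes */
    ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"A.bin",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
    ierr = MatLoad(viewer,MATAIJ,&B);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);

    ierr = MatDestroy(B);CHKERRQ(ierr);
    return 0;
  }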
Barry 1) If the matrix is generated differently in parallel then sequential then rounding of the floating point operations will result in very slightly different numerical values in the matrix. 2) If the ordering of the the unknowns is different in the parallel and sequential then the matrices will, of course, be permutations of each other. On Feb 12, 2008, at 1:06 PM, Yujie wrote: > hi, Matt > > If I output matrix with binary format, should the file format > obtained from sequential output is the same with that from parallel > output? I mean that I don't need to consider whether > MatView_***_Binary() is parallel or sequential when I use the matrix > file. > > thanks a lot. > > Regards, > Yujie > > On 1/23/08, Matthew Knepley wrote: > On Jan 23, 2008 2:18 PM, Yujie wrote: > > Thank you for your further explanation. I just want to use this > data in > > other packages. I think that ASCII file is likely better. Because > I don't > > know the format of the binary file? how to find it? > > Look at MatView_SeqAIJ_Binary() in src/mat/impls/aij/seq/aij.c. The > format is pretty simple. > > > In addition, do you have any better methods to save the sparsity > structure > > picture of the matrix? Now, I use "-mat_view_draw" to do this. > However, the > > speed is very slow and the picture is small. I want to get a big > picture and > > directly save it to the disk? > > could you give me some advice? thanks a lot. > > We do not have a better way to make the sparsity picture. I assume > you could > write something that decides how many pixels to use, calculates an > average > occupancy per pixel, and writes a BMP or something. > > Matt > > > Regards, > > Yujie > > > > On 1/23/08, Matthew Knepley wrote: > > > On Jan 22, 2008 11:01 PM, Yujie wrote: > > > > Dear Matt: > > > > > > > > thank you for your reply. Do you have any method to generate > an ascii > > file > > > > of the huge sparse matrix? thanks > > > > > > I think you miss my point. The PETSc function is not a bad way > to generate > > > ASCII matrices. ASCII matrices make "no sense" for large > operators. > > > > > > Matt > > > > > > > Regards, > > > > Yujie > > > > > > > > > > > > > > > > On 1/23/08, Matthew Knepley wrote: > > > > > On Jan 22, 2008 8:50 PM, Yujie < recrusader at gmail.com> wrote: > > > > > > Hi everyone: > > > > > > > > > > > > #include "petsc.h" > > > > > > PetscErrorCode PetscViewerASCIIOpen(MPI_Comm comm,const > char > > > > > > name[],PetscViewer *lab) > > > > > > > > > > > > #include "petsc.h" > > > > > > PetscErrorCode PetscViewerBinaryOpen(MPI_Comm comm,const > char > > > > > > name[],PetscFileMode type,PetscViewer *binv) > > > > > > > > > > > > if the difference between them is that one for ASCII > output and the > > > > other > > > > > > for Binary output, why are there different parameters? > > > > > > > > > > It is historical. If you want to be generic, you should use > > > > > > > > > > PetscViewerCreate() > > > > > PetscViewerSetType() > > > > > PetscViewerFileSetMode() > > > > > PetscViewerFileSetName() > > > > > > > > > > which can create both. > > > > > > > > > > > The speed to output matrix is very fast when I use > > > > PetscViewerBinaryOpen. > > > > > > However, when I use PetscViewerASCIIOpen, I can't get the > matrix > > output. > > > > the > > > > > > code always is running and it has taken about one day! > what's the > > > > problem? > > > > > > thank you. > > > > > > > > > > ASCII files do not make sense for large matrices. You should > use > > binary > > > > files. 
> > > > > > > > > > Matt > > > > > > > > > > > Regards, > > > > > > Yujie > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before they begin > their > > > > > experiments is infinitely more interesting than any results > to which > > > > > their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Feb 12 14:23:38 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 12 Feb 2008 14:23:38 -0600 (CST) Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()? In-Reply-To: <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com> References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com> <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com> <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com> <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com> Message-ID: On Tue, 12 Feb 2008, Yujie wrote: > On 1/23/08, Matthew Knepley wrote: > > > In addition, do you have any better methods to save the sparsity > > > structure picture of the matrix? Now, I use "-mat_view_draw" to > > > do this. However, the speed is very slow and the picture is > > > small. I want to get a big picture and directly save it to the > > > disk? could you give me some advice? thanks a lot. > > We do not have a better way to make the sparsity picture. I assume > >you could write something that decides how many pixels to use, > >calculates an average occupancy per pixel, and writes a BMP or > >something. Couple of notes on this. - -mat_view_draw can be slow for parallel runs [because all the data is moved to proc-0, from where its displayed]. If you wish to speed up, you can either: * run it sequentially [depending upon your code, the matrix generated could be different - so its not suitable] * do a binary dump [with MatView() on a binary viewer] - and then reload this matrix with a sequential code and then do mat_view [check mat/examples/tests/ex33.c,ex43.c] - you can use the option '-draw_pause -1' to make the window not disappear. Now you can zoom-in & zoom-out [with mouse-left or mouse-right click] - Take the snapshot of this window with xv or gnome-screenshot or other screen-dump tool [ like 'xwd | xpr -device ps > dump.ps'] - Alternatively you can dump the matrix is matlab format - and use Matlab visualization tools. Satish From bsmith at mcs.anl.gov Tue Feb 12 14:29:49 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Feb 2008 14:29:49 -0600 Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()? 
In-Reply-To: References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com> <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com> <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com> <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com> Message-ID: <0AB82F4A-4114-4763-AC47-69EDE8E4E2DA@mcs.anl.gov> It is important to remember what the PETSc users manual says "PETSc graphics library is not intended to compete with high-quality graphics packages. Instead, it is intended to be easy to use interactively with PETSc programs. We urge users to generate their publication-quality graphics using a professional graphics package." We are not graphics experts, nor do we want to be, or could be. Barry On Feb 12, 2008, at 2:23 PM, Satish Balay wrote: > On Tue, 12 Feb 2008, Yujie wrote: > >> On 1/23/08, Matthew Knepley wrote: > >>>> In addition, do you have any better methods to save the sparsity >>>> structure picture of the matrix? Now, I use "-mat_view_draw" to >>>> do this. However, the speed is very slow and the picture is >>>> small. I want to get a big picture and directly save it to the >>>> disk? could you give me some advice? thanks a lot. > >>> We do not have a better way to make the sparsity picture. I assume >>> you could write something that decides how many pixels to use, >>> calculates an average occupancy per pixel, and writes a BMP or >>> something. > > Couple of notes on this. > > - -mat_view_draw can be slow for parallel runs [because all the data > is moved to proc-0, from where its displayed]. If you wish to speed > up, you can either: > * run it sequentially [depending upon your code, the matrix generated > could be different - so its not suitable] > * do a binary dump [with MatView() on a binary viewer] - and > then reload this matrix with a sequential code and then do mat_view > [check mat/examples/tests/ex33.c,ex43.c] > > - you can use the option '-draw_pause -1' to make the window not > disappear. Now you can zoom-in & zoom-out [with mouse-left or > mouse-right click] > > - Take the snapshot of this window with xv or gnome-screenshot or > other screen-dump tool [ like 'xwd | xpr -device ps > dump.ps'] > > - Alternatively you can dump the matrix is matlab format - and use > Matlab visualization tools. > > Satish > > From recrusader at gmail.com Tue Feb 12 16:08:49 2008 From: recrusader at gmail.com (Yujie) Date: Tue, 12 Feb 2008 14:08:49 -0800 Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()? In-Reply-To: <0AB82F4A-4114-4763-AC47-69EDE8E4E2DA@mcs.anl.gov> References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com> <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com> <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com> <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com> <0AB82F4A-4114-4763-AC47-69EDE8E4E2DA@mcs.anl.gov> Message-ID: <7ff0ee010802121408s6d7ec5eau6a7ea8582ae59a26@mail.gmail.com> thanks a lot, everyone :). On 2/12/08, Barry Smith wrote: > > > It is important to remember what the PETSc users manual > says > > "PETSc graphics library is not intended to compete with > high-quality graphics packages. Instead, it is intended to be > easy to use interactively with PETSc programs. We urge users > to generate their publication-quality graphics using a > professional graphics package." > > We are not graphics experts, nor do we want to be, or could be. 
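A minimal sketch of the MATLAB route mentioned above (invented file name; only sensible for matrices small enough that an ASCII dump is tolerable):

  #include "petscmat.h"

  PetscErrorCode DumpForMatlab(Mat A)
  {
    PetscErrorCode ierr;
    PetscViewer    viewer;

    ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,"A.m",&viewer);CHKERRQ(ierr);
    ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_MATLAB);CHKERRQ(ierr);
    ierr = MatView(A,viewer);CHKERRQ(ierr);
    ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);
    return 0;
  }

Running the generated A.m inside MATLAB defines the matrix as a variable, after which spy() draws the sparsity pattern at whatever size and resolution MATLAB allows.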
> > Barry > > > > > On Feb 12, 2008, at 2:23 PM, Satish Balay wrote: > > > On Tue, 12 Feb 2008, Yujie wrote: > > > >> On 1/23/08, Matthew Knepley wrote: > > > >>>> In addition, do you have any better methods to save the sparsity > >>>> structure picture of the matrix? Now, I use "-mat_view_draw" to > >>>> do this. However, the speed is very slow and the picture is > >>>> small. I want to get a big picture and directly save it to the > >>>> disk? could you give me some advice? thanks a lot. > > > >>> We do not have a better way to make the sparsity picture. I assume > >>> you could write something that decides how many pixels to use, > >>> calculates an average occupancy per pixel, and writes a BMP or > >>> something. > > > > Couple of notes on this. > > > > - -mat_view_draw can be slow for parallel runs [because all the data > > is moved to proc-0, from where its displayed]. If you wish to speed > > up, you can either: > > * run it sequentially [depending upon your code, the matrix generated > > could be different - so its not suitable] > > * do a binary dump [with MatView() on a binary viewer] - and > > then reload this matrix with a sequential code and then do mat_view > > [check mat/examples/tests/ex33.c,ex43.c] > > > > - you can use the option '-draw_pause -1' to make the window not > > disappear. Now you can zoom-in & zoom-out [with mouse-left or > > mouse-right click] > > > > - Take the snapshot of this window with xv or gnome-screenshot or > > other screen-dump tool [ like 'xwd | xpr -device ps > dump.ps'] > > > > - Alternatively you can dump the matrix is matlab format - and use > > Matlab visualization tools. > > > > Satish > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Stephen.R.Ball at awe.co.uk Wed Feb 13 06:12:03 2008 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Wed, 13 Feb 2008 12:12:03 -0000 Subject: References for preconditioners and solver methods. Message-ID: <82DCAO020874@awe.co.uk> Hi I am writing a paper that references PETSc and the preconditioners and linear solvers that it uses. I would like to include references for these. I have searched and found references for quite a few but am struggling to find references for the following solver methods: BICG CGNE CHEBYCHEV CR (Conjugate Residuals) QCG RICHARDSON TCQMR Could you send me suitable references for these methods? I'm not sure if they exist, but could you also send me suitable references for the following preconditioners: ASM BJACOBI ILU ICC Much appreciated Stephen -- _______________________________________________________________________________ The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited. Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer. While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected. AWE Plc Registered in England and Wales Registration No 02763902 AWE, Aldermaston, Reading, RG7 4PR From bsmith at mcs.anl.gov Wed Feb 13 14:40:30 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Feb 2008 14:40:30 -0600 Subject: References for preconditioners and solver methods. 
In-Reply-To: <82DCAO020874@awe.co.uk> References: <82DCAO020874@awe.co.uk> Message-ID: I've started adding them to the manual pages. Here are the ones I have so far On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote: > > Hi > > I am writing a paper that references PETSc and the preconditioners and > linear solvers that it uses. I would like to include references for > these. I have searched and found references for quite a few but am > struggling to find references for the following solver methods: > > BICG > > CGNE This is just CG applied to the normal equations; it is not an idea worthing of a publication. > > CHEBYCHEV > > CR (Conjugate Residuals) Methods of Conjugate Gradients for Solving Linear Systems, Magnus R. Hestenes and Eduard Stiefel, Journal of Research of the National Bureau of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp. 409--436. > > QCG The Conjugate Gradient Method and Trust Regions in Large Scale Optimization, Trond Steihaug SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983), pp. 626-637 > > RICHARDSON > > TCQMR Transpose-free formulations of Lanczos-type methods for nonsymmetric linear systems, Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical Algorithms, Volume 17, Numbers 1-2 / May, 1998 pp. 51-66. > > > Could you send me suitable references for these methods? > > I'm not sure if they exist, but could you also send me suitable > references for the following preconditioners: > > ASM An additive variant of the Schwarz alternating method for the case of many subregions M Dryja, OB Widlund - Courant Institute, New York University Technical report Domain Decompositions: Parallel Multilevel Methods for Elliptic Partial Differential Equations, Barry Smith, Petter Bjorstad, and William Gropp, Cambridge University Press, ISBN 0-521-49589-X. > > BJACOBI Any iterative solver book, this is just Jacobi's method > > ILU > ICC > Both ICC and ILU the review article APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A. VAN DER VORST http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf chapter in Parallel Numerical Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan, ICASE/LaRC Interdisciplinary Series in Science and Engineering, Kluwer, pp. 167--202. It is difficult to determine the publications where the FIRST use of ILU/ICC appeared since the did not call them that originally. If anyone has references to the original Chebychev and Bi-CG algorithms please let us know. Barry > Much appreciated > > Stephen > -- > _______________________________________________________________________________ > > The information in this email and in any attachment(s) is commercial > in confidence. If you are not the named addressee(s) or if you > receive this email in error then any distribution, copying or use of > this communication or the information in it is strictly prohibited. > Please notify us immediately by email at > admin.internet(at)awe.co.uk, and then delete this message from your > computer. While attachments are virus checked, AWE plc does not > accept any liability in respect of any virus which is not detected. > > AWE Plc > Registered in England and Wales > Registration No 02763902 > AWE, Aldermaston, Reading, RG7 4PR > From Stephen.R.Ball at awe.co.uk Thu Feb 14 10:30:45 2008 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Thu, 14 Feb 2008 16:30:45 -0000 Subject: References for preconditioners and solver methods. Message-ID: <82EGUi224985@awe.co.uk> Hi Thanks for your suggestions. 
You have given a reference for CR (Conjugate Residuals) as: Methods of Conjugate Gradients for Solving Linear Systems, Magnus R. Hestenes and Eduard Stiefel, Journal of Research of the National Bureau of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp. 409--436. However the PETSc user manual says this is the reference for CG (Conjugate Gradient). Can you clarify which is the case? If it is not for CR do you know of a reference for CR? If anyone can provide references for the Bi-CG, Chebychev, CR (Conjugate Residuals), QCG (Quadratic CG) and Richardson solvers that would be very much appreciated. Regards Stephen -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: 13 February 2008 20:41 To: petsc-users at mcs.anl.gov Subject: EXTERNAL: Re: References for preconditioners and solver methods. I've started adding them to the manual pages. Here are the ones I have so far On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote: > > Hi > > I am writing a paper that references PETSc and the preconditioners and > linear solvers that it uses. I would like to include references for > these. I have searched and found references for quite a few but am > struggling to find references for the following solver methods: > > BICG > > CGNE This is just CG applied to the normal equations; it is not an idea worthing of a publication. > > CHEBYCHEV > > CR (Conjugate Residuals) Methods of Conjugate Gradients for Solving Linear Systems, Magnus R. Hestenes and Eduard Stiefel, Journal of Research of the National Bureau of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp. 409--436. > > QCG The Conjugate Gradient Method and Trust Regions in Large Scale Optimization, Trond Steihaug SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983), pp. 626-637 > > RICHARDSON > > TCQMR Transpose-free formulations of Lanczos-type methods for nonsymmetric linear systems, Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical Algorithms, Volume 17, Numbers 1-2 / May, 1998 pp. 51-66. > > > Could you send me suitable references for these methods? > > I'm not sure if they exist, but could you also send me suitable > references for the following preconditioners: > > ASM An additive variant of the Schwarz alternating method for the case of many subregions M Dryja, OB Widlund - Courant Institute, New York University Technical report Domain Decompositions: Parallel Multilevel Methods for Elliptic Partial Differential Equations, Barry Smith, Petter Bjorstad, and William Gropp, Cambridge University Press, ISBN 0-521-49589-X. > > BJACOBI Any iterative solver book, this is just Jacobi's method > > ILU > ICC > Both ICC and ILU the review article APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A. VAN DER VORST http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf chapter in Parallel Numerical Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan, ICASE/LaRC Interdisciplinary Series in Science and Engineering, Kluwer, pp. 167--202. It is difficult to determine the publications where the FIRST use of ILU/ICC appeared since the did not call them that originally. If anyone has references to the original Chebychev and Bi-CG algorithms please let us know. Barry > Much appreciated > > Stephen > -- > ________________________________________________________________________ _______ > > The information in this email and in any attachment(s) is commercial > in confidence. 
If you are not the named addressee(s) or if you > receive this email in error then any distribution, copying or use of > this communication or the information in it is strictly prohibited. > Please notify us immediately by email at > admin.internet(at)awe.co.uk, and then delete this message from your > computer. While attachments are virus checked, AWE plc does not > accept any liability in respect of any virus which is not detected. > > AWE Plc > Registered in England and Wales > Registration No 02763902 > AWE, Aldermaston, Reading, RG7 4PR > -- _______________________________________________________________________________ The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited. Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer. While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected. AWE Plc Registered in England and Wales Registration No 02763902 AWE, Aldermaston, Reading, RG7 4PR From knepley at gmail.com Thu Feb 14 12:56:30 2008 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Feb 2008 12:56:30 -0600 Subject: References for preconditioners and solver methods. In-Reply-To: <82EGUi224985@awe.co.uk> References: <82EGUi224985@awe.co.uk> Message-ID: On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball wrote: > > > Hi > > Thanks for your suggestions. You have given a reference for CR > (Conjugate Residuals) as: > > Methods of Conjugate Gradients for Solving Linear Systems, Magnus R. > Hestenes and Eduard Stiefel, Journal of Research of the National Bureau > of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp. > 409--436. I get this: The Conjugate Residual Method for Constrained Minimization Problems David G. Luenberger SIAM Journal on Numerical Analysis, Vol. 7, No. 3 (Sep., 1970), pp. 390-398 Barry, do you agree? Matt > However the PETSc user manual says this is the reference for CG > (Conjugate Gradient). Can you clarify which is the case? If it is not > for CR do you know of a reference for CR? > > If anyone can provide references for the Bi-CG, Chebychev, CR (Conjugate > Residuals), QCG (Quadratic CG) and Richardson solvers that would be very > much appreciated. > > Regards > > Stephen > > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith > Sent: 13 February 2008 20:41 > To: petsc-users at mcs.anl.gov > Subject: EXTERNAL: Re: References for preconditioners and solver > methods. > > > I've started adding them to the manual pages. Here are the ones I > have so far > > On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote: > > > > > Hi > > > > I am writing a paper that references PETSc and the preconditioners and > > linear solvers that it uses. I would like to include references for > > these. I have searched and found references for quite a few but am > > struggling to find references for the following solver methods: > > > > BICG > > > > > > CGNE > > This is just CG applied to the normal equations; it is not an idea > worthing of a > publication. > > > > > CHEBYCHEV > > > > > > > CR (Conjugate Residuals) > > Methods of Conjugate Gradients for Solving Linear Systems, Magnus > R. 
Hestenes and Eduard Stiefel, > Journal of Research of the National Bureau of Standards Vol. 49, > No. 6, December 1952 Research Paper 2379 > pp. 409--436. > > > > > QCG > > The Conjugate Gradient Method and Trust Regions in Large Scale > Optimization, Trond Steihaug > SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983), > pp. 626-637 > > > > > RICHARDSON > > > > > > TCQMR > > Transpose-free formulations of Lanczos-type methods for > nonsymmetric linear systems, > Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical > Algorithms, > Volume 17, Numbers 1-2 / May, 1998 pp. 51-66. > > > > > > Could you send me suitable references for these methods? > > > > I'm not sure if they exist, but could you also send me suitable > > references for the following preconditioners: > > > > ASM > An additive variant of the Schwarz alternating method for the > case of many subregions > M Dryja, OB Widlund - Courant Institute, New York University > Technical report > > Domain Decompositions: Parallel Multilevel Methods for Elliptic > Partial Differential Equations, > Barry Smith, Petter Bjorstad, and William Gropp, Cambridge > University Press, ISBN 0-521-49589-X. > > > > > BJACOBI > > Any iterative solver book, this is just Jacobi's method > > > > ILU > > ICC > > > > Both ICC and ILU the review article > > APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A. > VAN DER VORST > > http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf > chapter in Parallel Numerical > Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan, > ICASE/LaRC Interdisciplinary Series in > Science and Engineering, Kluwer, pp. 167--202. > > It is difficult to determine the publications where the FIRST use of > ILU/ICC appeared since the did not > call them that originally. > > If anyone has references to the original Chebychev and Bi-CG > algorithms please let us know. > > Barry > > > Much appreciated > > > > Stephen > > -- > > > ________________________________________________________________________ > _______ > > > > The information in this email and in any attachment(s) is commercial > > in confidence. If you are not the named addressee(s) or if you > > receive this email in error then any distribution, copying or use of > > this communication or the information in it is strictly prohibited. > > Please notify us immediately by email at > > admin.internet(at)awe.co.uk, and then delete this message from your > > computer. While attachments are virus checked, AWE plc does not > > accept any liability in respect of any virus which is not detected. > > > > AWE Plc > > Registered in England and Wales > > Registration No 02763902 > > AWE, Aldermaston, Reading, RG7 4PR > > > -- > _______________________________________________________________________________ > > The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited. Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer. While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected. 
> > AWE Plc > Registered in England and Wales > Registration No 02763902 > AWE, Aldermaston, Reading, RG7 4PR > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Thu Feb 14 13:02:53 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Feb 2008 13:02:53 -0600 Subject: References for preconditioners and solver methods. In-Reply-To: References: <82EGUi224985@awe.co.uk> Message-ID: On Feb 14, 2008, at 12:56 PM, Matthew Knepley wrote: > On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball > wrote: >> >> >> Hi >> >> Thanks for your suggestions. You have given a reference for CR >> (Conjugate Residuals) as: >> >> Methods of Conjugate Gradients for Solving Linear Systems, Magnus R. >> Hestenes and Eduard Stiefel, Journal of Research of the National >> Bureau >> of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp. >> 409--436. > > I get this: > > The Conjugate Residual Method for Constrained Minimization Problems > David G. Luenberger > SIAM Journal on Numerical Analysis, Vol. 7, No. 3 (Sep., 1970), pp. > 390-398 > > Barry, do you agree? I took at a look at Hestenes and Stiefel, though they don't use the term "conjugate residuals" I would argue that the algorithm is essentially there and so we should not give credit to someone else. Barry > > > Matt > >> However the PETSc user manual says this is the reference for CG >> (Conjugate Gradient). Can you clarify which is the case? If it is not >> for CR do you know of a reference for CR? >> >> If anyone can provide references for the Bi-CG, Chebychev, CR >> (Conjugate >> Residuals), QCG (Quadratic CG) and Richardson solvers that would be >> very >> much appreciated. >> >> Regards >> >> Stephen >> >> >> >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov >> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: 13 February 2008 20:41 >> To: petsc-users at mcs.anl.gov >> Subject: EXTERNAL: Re: References for preconditioners and solver >> methods. >> >> >> I've started adding them to the manual pages. Here are the ones I >> have so far >> >> On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote: >> >>> >>> Hi >>> >>> I am writing a paper that references PETSc and the preconditioners >>> and >>> linear solvers that it uses. I would like to include references for >>> these. I have searched and found references for quite a few but am >>> struggling to find references for the following solver methods: >>> >>> BICG >> >> >>> >>> CGNE >> >> This is just CG applied to the normal equations; it is not an idea >> worthing of a >> publication. >> >>> >>> CHEBYCHEV >> >> >> >>> >>> CR (Conjugate Residuals) >> >> Methods of Conjugate Gradients for Solving Linear Systems, Magnus >> R. Hestenes and Eduard Stiefel, >> Journal of Research of the National Bureau of Standards Vol. 49, >> No. 6, December 1952 Research Paper 2379 >> pp. 409--436. >> >>> >>> QCG >> >> The Conjugate Gradient Method and Trust Regions in Large Scale >> Optimization, Trond Steihaug >> SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983), >> pp. 626-637 >> >>> >>> RICHARDSON >> >> >>> >>> TCQMR >> >> Transpose-free formulations of Lanczos-type methods for >> nonsymmetric linear systems, >> Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical >> Algorithms, >> Volume 17, Numbers 1-2 / May, 1998 pp. 51-66. 
>>> >>> >>> Could you send me suitable references for these methods? >>> >>> I'm not sure if they exist, but could you also send me suitable >>> references for the following preconditioners: >>> >>> ASM >> An additive variant of the Schwarz alternating method for the >> case of many subregions >> M Dryja, OB Widlund - Courant Institute, New York University >> Technical report >> >> Domain Decompositions: Parallel Multilevel Methods for Elliptic >> Partial Differential Equations, >> Barry Smith, Petter Bjorstad, and William Gropp, Cambridge >> University Press, ISBN 0-521-49589-X. >> >>> >>> BJACOBI >> >> Any iterative solver book, this is just Jacobi's method >>> >>> ILU >>> ICC >>> >> >> Both ICC and ILU the review article >> >> APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A. >> VAN DER VORST >> >> http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf >> chapter in Parallel Numerical >> Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan, >> ICASE/LaRC Interdisciplinary Series in >> Science and Engineering, Kluwer, pp. 167--202. >> >> It is difficult to determine the publications where the FIRST use of >> ILU/ICC appeared since the did not >> call them that originally. >> >> If anyone has references to the original Chebychev and Bi-CG >> algorithms please let us know. >> >> Barry >> >>> Much appreciated >>> >>> Stephen >>> -- >>> >> ________________________________________________________________________ >> _______ >>> >>> The information in this email and in any attachment(s) is commercial >>> in confidence. If you are not the named addressee(s) or if you >>> receive this email in error then any distribution, copying or use of >>> this communication or the information in it is strictly prohibited. >>> Please notify us immediately by email at >>> admin.internet(at)awe.co.uk, and then delete this message from your >>> computer. While attachments are virus checked, AWE plc does not >>> accept any liability in respect of any virus which is not detected. >>> >>> AWE Plc >>> Registered in England and Wales >>> Registration No 02763902 >>> AWE, Aldermaston, Reading, RG7 4PR >>> >> -- >> _______________________________________________________________________________ >> >> The information in this email and in any attachment(s) is >> commercial in confidence. If you are not the named addressee(s) or >> if you receive this email in error then any distribution, copying >> or use of this communication or the information in it is strictly >> prohibited. Please notify us immediately by email at >> admin.internet(at)awe.co.uk, and then delete this message from your >> computer. While attachments are virus checked, AWE plc does not >> accept any liability in respect of any virus which is not detected. >> >> AWE Plc >> Registered in England and Wales >> Registration No 02763902 >> AWE, Aldermaston, Reading, RG7 4PR >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > From knepley at gmail.com Thu Feb 14 13:15:42 2008 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Feb 2008 13:15:42 -0600 Subject: References for preconditioners and solver methods. 
In-Reply-To: <82EGUi224985@awe.co.uk> References: <82EGUi224985@awe.co.uk> Message-ID: On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball wrote: > If anyone can provide references for the Bi-CG, Chebychev, CR (Conjugate > Residuals), QCG (Quadratic CG) and Richardson solvers that would be very > much appreciated. > > QCG > > The Conjugate Gradient Method and Trust Regions in Large Scale > Optimization, Trond Steihaug > SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983), > pp. 626-637 and I put Richardson in the source yesterday, so it should be up on the developer documentation online today. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From Stephen.R.Ball at awe.co.uk Fri Feb 15 04:44:11 2008 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Fri, 15 Feb 2008 10:44:11 -0000 Subject: References for preconditioners and solver methods. Message-ID: <82FAjs014703@awe.co.uk> Hi Can you tell me where I can get hold of the developer documentation? Regards Stephen -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: 14 February 2008 19:16 To: petsc-users at mcs.anl.gov Subject: EXTERNAL: Re: References for preconditioners and solver methods. On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball wrote: > If anyone can provide references for the Bi-CG, Chebychev, CR (Conjugate > Residuals), QCG (Quadratic CG) and Richardson solvers that would be very > much appreciated. > > QCG > > The Conjugate Gradient Method and Trust Regions in Large Scale > Optimization, Trond Steihaug > SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983), > pp. 626-637 and I put Richardson in the source yesterday, so it should be up on the developer documentation online today. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- _______________________________________________________________________________ The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited. Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer. While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected. AWE Plc Registered in England and Wales Registration No 02763902 AWE, Aldermaston, Reading, RG7 4PR From Stephen.R.Ball at awe.co.uk Fri Feb 15 04:42:33 2008 From: Stephen.R.Ball at awe.co.uk (Stephen R Ball) Date: Fri, 15 Feb 2008 10:42:33 -0000 Subject: References for preconditioners and solver methods. Message-ID: <82FAe7014420@awe.co.uk> Hi So to clarify then I should use reference: Methods of Conjugate Gradients for Solving Linear Systems, Magnus R. Hestenes and Eduard Stiefel, Journal of Research of the National Bureau of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp. 409--436. For both CG and CR? 
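(An aside on the CG-versus-CR distinction being debated in this thread: the two methods differ only in the quantity they minimize over the Krylov space. The compact statement below is standard textbook material for a symmetric positive definite A with r_0 = b - A x_0, not something taken verbatim from Hestenes-Stiefel.)

\[
\mathcal{K}_k = \operatorname{span}\{r_0, Ar_0, \dots, A^{k-1}r_0\}, \qquad
x_k^{\mathrm{CG}} = \arg\min_{x \in x_0+\mathcal{K}_k} \|x - x_\ast\|_A, \qquad
x_k^{\mathrm{CR}} = \arg\min_{x \in x_0+\mathcal{K}_k} \|b - Ax\|_2 .
\]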
Regards Stephen -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: 14 February 2008 19:03 To: petsc-users at mcs.anl.gov Subject: EXTERNAL: Re: References for preconditioners and solver methods. On Feb 14, 2008, at 12:56 PM, Matthew Knepley wrote: > On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball > wrote: >> >> >> Hi >> >> Thanks for your suggestions. You have given a reference for CR >> (Conjugate Residuals) as: >> >> Methods of Conjugate Gradients for Solving Linear Systems, Magnus R. >> Hestenes and Eduard Stiefel, Journal of Research of the National >> Bureau >> of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp. >> 409--436. > > I get this: > > The Conjugate Residual Method for Constrained Minimization Problems > David G. Luenberger > SIAM Journal on Numerical Analysis, Vol. 7, No. 3 (Sep., 1970), pp. > 390-398 > > Barry, do you agree? I took at a look at Hestenes and Stiefel, though they don't use the term "conjugate residuals" I would argue that the algorithm is essentially there and so we should not give credit to someone else. Barry > > > Matt > >> However the PETSc user manual says this is the reference for CG >> (Conjugate Gradient). Can you clarify which is the case? If it is not >> for CR do you know of a reference for CR? >> >> If anyone can provide references for the Bi-CG, Chebychev, CR >> (Conjugate >> Residuals), QCG (Quadratic CG) and Richardson solvers that would be >> very >> much appreciated. >> >> Regards >> >> Stephen >> >> >> >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov >> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: 13 February 2008 20:41 >> To: petsc-users at mcs.anl.gov >> Subject: EXTERNAL: Re: References for preconditioners and solver >> methods. >> >> >> I've started adding them to the manual pages. Here are the ones I >> have so far >> >> On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote: >> >>> >>> Hi >>> >>> I am writing a paper that references PETSc and the preconditioners >>> and >>> linear solvers that it uses. I would like to include references for >>> these. I have searched and found references for quite a few but am >>> struggling to find references for the following solver methods: >>> >>> BICG >> >> >>> >>> CGNE >> >> This is just CG applied to the normal equations; it is not an idea >> worthing of a >> publication. >> >>> >>> CHEBYCHEV >> >> >> >>> >>> CR (Conjugate Residuals) >> >> Methods of Conjugate Gradients for Solving Linear Systems, Magnus >> R. Hestenes and Eduard Stiefel, >> Journal of Research of the National Bureau of Standards Vol. 49, >> No. 6, December 1952 Research Paper 2379 >> pp. 409--436. >> >>> >>> QCG >> >> The Conjugate Gradient Method and Trust Regions in Large Scale >> Optimization, Trond Steihaug >> SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983), >> pp. 626-637 >> >>> >>> RICHARDSON >> >> >>> >>> TCQMR >> >> Transpose-free formulations of Lanczos-type methods for >> nonsymmetric linear systems, >> Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical >> Algorithms, >> Volume 17, Numbers 1-2 / May, 1998 pp. 51-66. >>> >>> >>> Could you send me suitable references for these methods? 
>>> >>> I'm not sure if they exist, but could you also send me suitable >>> references for the following preconditioners: >>> >>> ASM >> An additive variant of the Schwarz alternating method for the >> case of many subregions >> M Dryja, OB Widlund - Courant Institute, New York University >> Technical report >> >> Domain Decompositions: Parallel Multilevel Methods for Elliptic >> Partial Differential Equations, >> Barry Smith, Petter Bjorstad, and William Gropp, Cambridge >> University Press, ISBN 0-521-49589-X. >> >>> >>> BJACOBI >> >> Any iterative solver book, this is just Jacobi's method >>> >>> ILU >>> ICC >>> >> >> Both ICC and ILU the review article >> >> APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A. >> VAN DER VORST >> >> http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf >> chapter in Parallel Numerical >> Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan, >> ICASE/LaRC Interdisciplinary Series in >> Science and Engineering, Kluwer, pp. 167--202. >> >> It is difficult to determine the publications where the FIRST use of >> ILU/ICC appeared since the did not >> call them that originally. >> >> If anyone has references to the original Chebychev and Bi-CG >> algorithms please let us know. >> >> Barry >> >>> Much appreciated >>> >>> Stephen >>> -- >>> >> ________________________________________________________________________ >> _______ >>> >>> The information in this email and in any attachment(s) is commercial >>> in confidence. If you are not the named addressee(s) or if you >>> receive this email in error then any distribution, copying or use of >>> this communication or the information in it is strictly prohibited. >>> Please notify us immediately by email at >>> admin.internet(at)awe.co.uk, and then delete this message from your >>> computer. While attachments are virus checked, AWE plc does not >>> accept any liability in respect of any virus which is not detected. >>> >>> AWE Plc >>> Registered in England and Wales >>> Registration No 02763902 >>> AWE, Aldermaston, Reading, RG7 4PR >>> >> -- >> ________________________________________________________________________ _______ >> >> The information in this email and in any attachment(s) is >> commercial in confidence. If you are not the named addressee(s) or >> if you receive this email in error then any distribution, copying >> or use of this communication or the information in it is strictly >> prohibited. Please notify us immediately by email at >> admin.internet(at)awe.co.uk, and then delete this message from your >> computer. While attachments are virus checked, AWE plc does not >> accept any liability in respect of any virus which is not detected. >> >> AWE Plc >> Registered in England and Wales >> Registration No 02763902 >> AWE, Aldermaston, Reading, RG7 4PR >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > -- _______________________________________________________________________________ The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited. Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer. 
While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected. AWE Plc Registered in England and Wales Registration No 02763902 AWE, Aldermaston, Reading, RG7 4PR From knutert at stud.ntnu.no Fri Feb 15 07:00:09 2008 From: knutert at stud.ntnu.no (knutert at stud.ntnu.no) Date: Fri, 15 Feb 2008 14:00:09 +0100 Subject: Poor performance with BoomerAMG? In-Reply-To: <82FAjs014703@awe.co.uk> References: <82FAjs014703@awe.co.uk> Message-ID: <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> Hello, I am trying to use the hypre multigrid solver to solve a Poisson equation. However, on a test case with grid size 257x257 it takes 40 seconds to converge on one processor when I run with ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg Using the DMMG framework, the same problem takes less than a second, and the default gmres solver uses only four seconds. Am I somehow using the solver the wrong way, or is this performance expected? Regards Knut Erik Teigen From zonexo at gmail.com Fri Feb 15 07:47:49 2008 From: zonexo at gmail.com (Ben Tay) Date: Fri, 15 Feb 2008 21:47:49 +0800 Subject: Poor performance with BoomerAMG? In-Reply-To: <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> Message-ID: <47B59805.1070306@gmail.com> Hi Knut, I'm currently using boomeramg to solve my poisson eqn too. I'm using it on my structured C-grid. I found it to be faster than LU, especially as the grid size increases. However I use it as a preconditioner with GMRES as the solver. Have you tried this option? Although it's faster, the speed increase is usually less than double. It seems to be worse if there is a lot of stretching in the grid. Btw, your mention using the DMMG framework and it takes less than a sec. What solver or preconditioner did you use? It's 4 times faster than GMRES... thanks! knutert at stud.ntnu.no wrote: > Hello, > > I am trying to use the hypre multigrid solver to solve a Poisson > equation. > However, on a test case with grid size 257x257 it takes 40 seconds to > converge > on one processor when I run with > ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg > > Using the DMMG framework, the same problem takes less than a second, > and the default gmres solver uses only four seconds. > > Am I somehow using the solver the wrong way, or is this performance > expected? > > Regards > Knut Erik Teigen > > From knepley at gmail.com Fri Feb 15 08:35:46 2008 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Feb 2008 08:35:46 -0600 Subject: References for preconditioners and solver methods. In-Reply-To: <82FAjs014703@awe.co.uk> References: <82FAjs014703@awe.co.uk> Message-ID: On Fri, Feb 15, 2008 at 4:44 AM, Stephen R Ball wrote: > > Hi > > Can you tell me where I can get hold of the developer documentation? http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/index.html Matt > Regards > > Stephen > > > > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: 14 February 2008 19:16 > To: petsc-users at mcs.anl.gov > Subject: EXTERNAL: Re: References for preconditioners and solver > methods. 
> > On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball > wrote: > > If anyone can provide references for the Bi-CG, Chebychev, CR > (Conjugate > > Residuals), QCG (Quadratic CG) and Richardson solvers that would be > very > > much appreciated. > > > QCG > > > > The Conjugate Gradient Method and Trust Regions in Large Scale > > Optimization, Trond Steihaug > > SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983), > > pp. 626-637 > > and I put Richardson in the source yesterday, so it should be up on > the developer > documentation online today. > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > -- > _______________________________________________________________________________ > > The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited. Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer. While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected. > > AWE Plc > Registered in England and Wales > Registration No 02763902 > AWE, Aldermaston, Reading, RG7 4PR > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From knutert at stud.ntnu.no Fri Feb 15 08:36:35 2008 From: knutert at stud.ntnu.no (knutert at stud.ntnu.no) Date: Fri, 15 Feb 2008 15:36:35 +0100 Subject: Poor performance with BoomerAMG? In-Reply-To: <47B59805.1070306@gmail.com> References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> Message-ID: <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> Hi Ben, Thank you for answering. With gmres and boomeramg I get a run time of 2s, so that is much better. However, if I increase the grid size to 513x513, I get a run time of one minute. With richardson, it fails to converge. LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for the 513x513 problem. When using the DMMG framework, I just used the default solvers. I use the Galerkin process to generate the coarse matrices for the multigrid cycle. Best, Knut Siterer Ben Tay : > Hi Knut, > > I'm currently using boomeramg to solve my poisson eqn too. I'm using it > on my structured C-grid. I found it to be faster than LU, especially as > the grid size increases. However I use it as a preconditioner with > GMRES as the solver. Have you tried this option? Although it's faster, > the speed increase is usually less than double. It seems to be worse if > there is a lot of stretching in the grid. > > Btw, your mention using the DMMG framework and it takes less than a > sec. What solver or preconditioner did you use? It's 4 times faster > than GMRES... > > thanks! > > knutert at stud.ntnu.no wrote: >> Hello, >> >> I am trying to use the hypre multigrid solver to solve a Poisson equation. 
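As a point of reference for the setup Ben recommends above (BoomerAMG used as a preconditioner wrapped by a Krylov method rather than driven by Richardson alone), a minimal sketch through the C interface is given below. It assumes a PETSc 2.3.x build configured with hypre and an already-assembled matrix A with matching vectors b and x; error checking is omitted, and the option spellings (the command-line equivalent should be -ksp_type gmres -pc_type hypre -pc_hypre_type boomeramg) are worth double-checking against the local PETSc version.

  #include "petscksp.h"

  /* Sketch: solve A x = b with GMRES preconditioned by hypre/BoomerAMG. */
  PetscErrorCode SolveWithBoomerAMG(Mat A, Vec b, Vec x)
  {
    KSP ksp;
    PC  pc;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);
    KSPSetType(ksp, KSPGMRES);       /* Krylov accelerator around the AMG cycle   */
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCHYPRE);          /* hand preconditioning to hypre             */
    PCHYPRESetType(pc, "boomeramg"); /* pick BoomerAMG among hypre's options      */
    KSPSetFromOptions(ksp);          /* still honour -ksp_* / -pc_hypre_* options */
    KSPSolve(ksp, b, x);
    KSPDestroy(ksp);
    return 0;
  }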
>> However, on a test case with grid size 257x257 it takes 40 seconds >> to converge >> on one processor when I run with >> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >> >> Using the DMMG framework, the same problem takes less than a second, >> and the default gmres solver uses only four seconds. >> >> Am I somehow using the solver the wrong way, or is this performance >> expected? >> >> Regards >> Knut Erik Teigen >> >> From bsmith at mcs.anl.gov Fri Feb 15 11:13:28 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Feb 2008 11:13:28 -0600 Subject: Poor performance with BoomerAMG? In-Reply-To: <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> Message-ID: <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov> Run with the DMMG solver with the option -pc_type hypre What happens? Then run again with the additional option -ksp_type richardson Is hypre taking many, many iterations which is causing the slow speed? I expect there is something wrong with your code that does not use DMMG. Be careful how you handle boundary conditions; you need to make sure they have the same scaling as the other equations. Barry On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: > Hi Ben, > > Thank you for answering. With gmres and boomeramg I get a run time of > 2s, so that is much better. However, if I increase the grid size to > 513x513, I get a run time of one minute. With richardson, it fails > to converge. > LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for > the 513x513 problem. > > When using the DMMG framework, I just used the default solvers. > I use the Galerkin process to generate the coarse matrices for > the multigrid cycle. > > Best, > Knut > > Siterer Ben Tay : > >> Hi Knut, >> >> I'm currently using boomeramg to solve my poisson eqn too. I'm >> using it >> on my structured C-grid. I found it to be faster than LU, >> especially as >> the grid size increases. However I use it as a preconditioner with >> GMRES as the solver. Have you tried this option? Although it's >> faster, >> the speed increase is usually less than double. It seems to be >> worse if >> there is a lot of stretching in the grid. >> >> Btw, your mention using the DMMG framework and it takes less than a >> sec. What solver or preconditioner did you use? It's 4 times faster >> than GMRES... >> >> thanks! >> >> knutert at stud.ntnu.no wrote: >>> Hello, >>> >>> I am trying to use the hypre multigrid solver to solve a Poisson >>> equation. >>> However, on a test case with grid size 257x257 it takes 40 >>> seconds to converge >>> on one processor when I run with >>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >>> >>> Using the DMMG framework, the same problem takes less than a second, >>> and the default gmres solver uses only four seconds. >>> >>> Am I somehow using the solver the wrong way, or is this >>> performance expected? >>> >>> Regards >>> Knut Erik Teigen >>> >>> > > > From a.albiniana.crespo at gmail.com Fri Feb 15 14:23:36 2008 From: a.albiniana.crespo at gmail.com (=?ISO-8859-1?Q?antonio_albi=F1ana_crespo?=) Date: Fri, 15 Feb 2008 21:23:36 +0100 Subject: Installing Petsc Message-ID: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com> Hi! I'm new in this world, and I'm trying to install Petsc in order to develop my final studies project. 
I have received a few error messages and I have tried to study the configure.log file: Popping language C ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- Downloaded mpi could not be used. Please check install in /new/proyecto/PETSc/petsc-2.3.3-p8/externalpackages/mpich2-1.0.5p4 /linux-gnu-c-debug ********************************************************************************* File "./config/configure.py", line 190, in petsc_configure framework.configure(out = sys.stdout) File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/framework.py", line 878, in configure child.configure() File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py", line 380, in configure self.executeTest(self.configureLibrary) File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/base.py", line 93, in executeTest return apply(test, args,kargs) File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/packages/MPI.py", line 548, in configureLibrary config.package.Package.configureLibrary(self) File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py", line 331, in configureLibrary for location, directory, lib, incl in self.generateGuesses(): File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py", line 160, in generateGuesses raise RuntimeError('Downloaded '+self.package+' could not be used. Please check install in '+d+'\n') I have seen many messages with: Possible ERROR while running linker: gcc: opci?n '-PIC' no reconocida gcc: opci?n '-PIC' no reconocida What am I doing wrongly? I thank you beforehand for your attention. -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Feb 15 16:14:00 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 15 Feb 2008 16:14:00 -0600 (CST) Subject: Installing Petsc In-Reply-To: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com> References: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com> Message-ID: Can you send such installation issues to petsc-maint at mcs.anl.gov with the complete configure.log file as attachment? >> > Possible ERROR while running linker: gcc: opci?n '-PIC' no reconocida > gcc: opci?n '-PIC' no reconocida << Perhaps configure misbehaves with non-english output from some of the tools it invokes.. Satish On Fri, 15 Feb 2008, antonio albi?ana crespo wrote: > Hi! > > I'm new in this world, and I'm trying to install Petsc in order to develop > my final studies project. I have received a few error messages and I have > tried to study the configure.log file: > > Popping language C > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > --------------------------------------------------------------------------------------- > Downloaded mpi could not be used. 
Please check install in > /new/proyecto/PETSc/petsc-2.3.3-p8/externalpackages/mpich2-1.0.5p4 > /linux-gnu-c-debug > ********************************************************************************* > File "./config/configure.py", line 190, in petsc_configure > framework.configure(out = sys.stdout) > File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/framework.py", > line 878, in configure > child.configure() > File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py", > line 380, in configure > self.executeTest(self.configureLibrary) > File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/base.py", > line 93, in executeTest > return apply(test, args,kargs) > File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/packages/MPI.py", > line 548, in configureLibrary > config.package.Package.configureLibrary(self) > File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py", > line 331, in configureLibrary > for location, directory, lib, incl in self.generateGuesses(): > File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py", > line 160, in generateGuesses > raise RuntimeError('Downloaded '+self.package+' could not be used. > Please check install in '+d+'\n') > > I have seen many messages with: > > Possible ERROR while running linker: gcc: opci?n '-PIC' no reconocida > gcc: opci?n '-PIC' no reconocida > > What am I doing wrongly? > > I thank you beforehand for your attention. > From tyoung at ippt.gov.pl Fri Feb 15 17:00:58 2008 From: tyoung at ippt.gov.pl (Toby D. Young) Date: Sat, 16 Feb 2008 00:00:58 +0100 (CET) Subject: Installing Petsc In-Reply-To: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com> References: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com> Message-ID: > Possible ERROR while running linker: gcc: opci?n '-PIC' no reconocida > gcc: opci?n '-PIC' no reconocida Try compiling with option `-fPIC' (not `-PIC'). Best, Toby ----- Toby D. Young - Adiunkt (Assistant Professor) Department of Computational Science Institute of Fundamental Technological Research Polish Academy of Sciences Room 206, ul. Swietokrzyska 21 00-049 Warszawa, POLAND From rlmackie862 at gmail.com Fri Feb 15 17:23:40 2008 From: rlmackie862 at gmail.com (Randall Mackie) Date: Fri, 15 Feb 2008 15:23:40 -0800 Subject: Question on the ordering for a 3D Distributed Array Vector with 3 degrees of freedom Message-ID: <47B61EFC.7000506@gmail.com> I am using a 3D distributed array with 3 degrees of freedom, where each degree of freedom refers to, for example, the model value in the x, y, and z directions (the model properties are diagonally anisotropic). If I scatter the DA vector to a natural vector on the zero processor, and then use VecGetArray to access it: Call VecGetArray(vseq,xx_v,xx_i,ierr) do i=1,3*mx*my*mz v=xx_a(i) end do call VecRestoreArray(vseq,xx_v,xx_i,ierr) Is the natural ordering with the vector v then v(mx,my,mz,dof)? So then if I want to get the model values in the x direction and write them to a file, then it would be the first mx*my*mz values, and so forth? Randy From Andrew.Barker at Colorado.EDU Fri Feb 15 17:36:00 2008 From: Andrew.Barker at Colorado.EDU (Andrew T Barker) Date: Fri, 15 Feb 2008 16:36:00 -0700 (MST) Subject: Poor performance with BoomerAMG? 
Message-ID: <20080215163600.ABA57782@batman.int.colorado.edu> >Be careful how you handle boundary conditions; you need to make sure >they have the same scaling as the other equations. Could you clarify what you mean? Is boomerAMG sensitive to scaling of matrix rows in a way that other solvers/preconditioners are not? Andrew > >On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: > >> Hi Ben, >> >> Thank you for answering. With gmres and boomeramg I get a run time of >> 2s, so that is much better. However, if I increase the grid size to >> 513x513, I get a run time of one minute. With richardson, it fails >> to converge. >> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for >> the 513x513 problem. >> >> When using the DMMG framework, I just used the default solvers. >> I use the Galerkin process to generate the coarse matrices for >> the multigrid cycle. >> >> Best, >> Knut >> >> Siterer Ben Tay : >> >>> Hi Knut, >>> >>> I'm currently using boomeramg to solve my poisson eqn too. I'm >>> using it >>> on my structured C-grid. I found it to be faster than LU, >>> especially as >>> the grid size increases. However I use it as a preconditioner with >>> GMRES as the solver. Have you tried this option? Although it's >>> faster, >>> the speed increase is usually less than double. It seems to be >>> worse if >>> there is a lot of stretching in the grid. >>> >>> Btw, your mention using the DMMG framework and it takes less than a >>> sec. What solver or preconditioner did you use? It's 4 times faster >>> than GMRES... >>> >>> thanks! >>> >>> knutert at stud.ntnu.no wrote: >>>> Hello, >>>> >>>> I am trying to use the hypre multigrid solver to solve a Poisson >>>> equation. >>>> However, on a test case with grid size 257x257 it takes 40 >>>> seconds to converge >>>> on one processor when I run with >>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >>>> >>>> Using the DMMG framework, the same problem takes less than a second, >>>> and the default gmres solver uses only four seconds. >>>> >>>> Am I somehow using the solver the wrong way, or is this >>>> performance expected? >>>> >>>> Regards >>>> Knut Erik Teigen >>>> >>>> >> >> >> > From bsmith at mcs.anl.gov Sat Feb 16 11:39:13 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 16 Feb 2008 11:39:13 -0600 Subject: Question on the ordering for a 3D Distributed Array Vector with 3 degrees of freedom In-Reply-To: <47B61EFC.7000506@gmail.com> References: <47B61EFC.7000506@gmail.com> Message-ID: In Fortran array indexing it is v(dof,mx,my,mz) Barry On Feb 15, 2008, at 5:23 PM, Randall Mackie wrote: > I am using a 3D distributed array with 3 degrees of freedom, where > each > degree of freedom refers to, for example, the model value in the x, > y, and > z directions (the model properties are diagonally anisotropic). > > If I scatter the DA vector to a natural vector on the zero processor, > and then use VecGetArray to access it: > > Call VecGetArray(vseq,xx_v,xx_i,ierr) > > do i=1,3*mx*my*mz > v=xx_a(i) > end do > > call VecRestoreArray(vseq,xx_v,xx_i,ierr) > > > Is the natural ordering with the vector v then v(mx,my,mz,dof)? > > So then if I want to get the model values in the x direction and write > them to a file, then it would be the first mx*my*mz values, and so > forth? > > > Randy > From bsmith at mcs.anl.gov Sat Feb 16 11:49:04 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 16 Feb 2008 11:49:04 -0600 Subject: Poor performance with BoomerAMG? 
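To make the v(dof,mx,my,mz) answer concrete: in the natural ordering the degree-of-freedom index varies fastest, so for dof=3 the x, y and z model values of a given cell sit next to each other, rather than in three contiguous blocks of length mx*my*mz. A small C sketch of pulling one component out of the gathered sequential vector (the names follow Randall's post, dof=3 is assumed, and error checking is omitted):

  PetscScalar *a;
  PetscInt     i, n = mx*my*mz;    /* number of cells, known on process 0 */

  VecGetArray(vseq, &a);
  for (i = 0; i < n; i++) {
    PetscScalar vx = a[3*i + 0];   /* x-direction value of cell i */
    PetscScalar vy = a[3*i + 1];   /* y-direction value           */
    PetscScalar vz = a[3*i + 2];   /* z-direction value           */
    /* ... write whichever component is needed to the file ... */
  }
  VecRestoreArray(vseq, &a);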
In-Reply-To: <20080215163600.ABA57782@batman.int.colorado.edu> References: <20080215163600.ABA57782@batman.int.colorado.edu> Message-ID: <30BAACA0-5EAE-4A7C-8BD7-2E34071CAE57@mcs.anl.gov> All multigrid solvers depend on proper scaling of the variables. For example for a Laplacian operator the matrix entries are \integral \grad \phi_i dot \grad \phi_j now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the volume is O(h^3) meaning the matrix entries are O(h). Now say you impose a Dirichlet boundary conditions by just saying u_k = g_k. In 2d this is ok but in 3d you need to use h*u_k = h*g_k otherwise when you restrict to the coarser grid the resulting matrix entries for the boundary are "out of whack" with the matrix entries for the interior of the domain. Actually most preconditioners and Krylov methods behavior does depend on the row scaling; multigrid is just particularly sensitive. Barry On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote: > > >> Be careful how you handle boundary conditions; you need to make sure >> they have the same scaling as the other equations. > > Could you clarify what you mean? Is boomerAMG sensitive to scaling > of matrix rows in a way that other solvers/preconditioners are not? > > Andrew > >> >> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: >> >>> Hi Ben, >>> >>> Thank you for answering. With gmres and boomeramg I get a run time >>> of >>> 2s, so that is much better. However, if I increase the grid size to >>> 513x513, I get a run time of one minute. With richardson, it fails >>> to converge. >>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for >>> the 513x513 problem. >>> >>> When using the DMMG framework, I just used the default solvers. >>> I use the Galerkin process to generate the coarse matrices for >>> the multigrid cycle. >>> >>> Best, >>> Knut >>> >>> Siterer Ben Tay : >>> >>>> Hi Knut, >>>> >>>> I'm currently using boomeramg to solve my poisson eqn too. I'm >>>> using it >>>> on my structured C-grid. I found it to be faster than LU, >>>> especially as >>>> the grid size increases. However I use it as a preconditioner with >>>> GMRES as the solver. Have you tried this option? Although it's >>>> faster, >>>> the speed increase is usually less than double. It seems to be >>>> worse if >>>> there is a lot of stretching in the grid. >>>> >>>> Btw, your mention using the DMMG framework and it takes less than a >>>> sec. What solver or preconditioner did you use? It's 4 times faster >>>> than GMRES... >>>> >>>> thanks! >>>> >>>> knutert at stud.ntnu.no wrote: >>>>> Hello, >>>>> >>>>> I am trying to use the hypre multigrid solver to solve a Poisson >>>>> equation. >>>>> However, on a test case with grid size 257x257 it takes 40 >>>>> seconds to converge >>>>> on one processor when I run with >>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >>>>> >>>>> Using the DMMG framework, the same problem takes less than a >>>>> second, >>>>> and the default gmres solver uses only four seconds. >>>>> >>>>> Am I somehow using the solver the wrong way, or is this >>>>> performance expected? 
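Barry's dimensional argument, written out for piecewise-linear elements on a quasi-uniform mesh of spacing h (a standard estimate, not quoted from anywhere in this thread):

\[
a_{ij} = \int_\Omega \nabla\phi_i \cdot \nabla\phi_j \, dx
       = O(h^{-2})\,O(h^{d}) = O(h^{d-2}),
\]

so interior rows are O(1) in 2D but O(h) in 3D. A Dirichlet row imposed as u_k = g_k keeps an O(1) coefficient in either dimension; rescaling it to h u_k = h g_k in 3D puts the boundary rows on the same footing as the interior rows, which matters once restriction and coarsening start mixing boundary and interior entries.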
>>>>> >>>>> Regards >>>>> Knut Erik Teigen >>>>> >>>>> >>> >>> >>> >> > From rlmackie862 at gmail.com Sat Feb 16 13:28:21 2008 From: rlmackie862 at gmail.com (Randall Mackie) Date: Sat, 16 Feb 2008 11:28:21 -0800 Subject: Question on DA's and VecScatters Message-ID: <47B73955.4070507@gmail.com> In my 3D distributed array, I created global vectors, local vectors, and one natural vector (because I want to access and output to a file some of these values). If you create a scatter context, using VecScatterCreateToZero, does it matter whether or not I specify a global vector or the natural vector to create the context? In other words, does it matter in this vecscattercreate call whether or not I use vnat (natural vector) or vsol (global vector) call VecScatterCreateToZero(vnat,vToZero,vseq,ierr) if later I make these calls: call DaGlobalToNaturalBegin(da,vsol,INSERT_VALUES,vnat,ierr) call DaGlobalToNaturalEnd(da,vsol,INSERT_VALUES,vnat,ierr) call VecScatterBegin(vToZero,vnat,vseq....) call VecScatterEnd(vToZero,vnat,vseq....) call VecGetArray(vseq....) Or does it keep track because it knows what type of vectors are being dealt with? Thanks, Randy From bsmith at mcs.anl.gov Sat Feb 16 16:10:52 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 16 Feb 2008 16:10:52 -0600 Subject: Question on DA's and VecScatters In-Reply-To: <47B73955.4070507@gmail.com> References: <47B73955.4070507@gmail.com> Message-ID: It most definitely matters. The VecScatterCreateToZero() simply concatenates the values from all the processes together so they need to be in the natural order before collecting on zero. Barry BTW: The VecView() to a binary file for DA global vectors does the mapping automatically, so the file is always in the natural ordering. On Feb 16, 2008, at 1:28 PM, Randall Mackie wrote: > In my 3D distributed array, I created global vectors, local vectors, > and one natural vector (because I want to access and output to a file > some of these values). > > If you create a scatter context, using VecScatterCreateToZero, does > it matter whether or not I specify a global vector or the natural > vector > to create the context? > > In other words, does it matter in this vecscattercreate call whether > or not > I use vnat (natural vector) or vsol (global vector) > > call VecScatterCreateToZero(vnat,vToZero,vseq,ierr) > > > if later I make these calls: > > > call DaGlobalToNaturalBegin(da,vsol,INSERT_VALUES,vnat,ierr) > call DaGlobalToNaturalEnd(da,vsol,INSERT_VALUES,vnat,ierr) > > call VecScatterBegin(vToZero,vnat,vseq....) > call VecScatterEnd(vToZero,vnat,vseq....) > > > call VecGetArray(vseq....) > > Or does it keep track because it knows what type of vectors are > being dealt with? > > > Thanks, Randy > > From knutert at stud.ntnu.no Mon Feb 18 01:57:34 2008 From: knutert at stud.ntnu.no (knutert at stud.ntnu.no) Date: Mon, 18 Feb 2008 08:57:34 +0100 Subject: Poor performance with BoomerAMG? In-Reply-To: <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov> References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov> Message-ID: <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no> Thank you for the reply, Barry. The same thing happens if I use hypre with the DMMG solver. 
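A compact C sketch of the gather sequence confirmed above, i.e. map to the natural ordering first and only then scatter to process 0 (object names follow Randall's post; the DA and the global vector vsol are assumed to exist already, and error checking is omitted):

  Vec        vnat, vseq;
  VecScatter toZero;

  DACreateNaturalVector(da, &vnat);
  DAGlobalToNaturalBegin(da, vsol, INSERT_VALUES, vnat);
  DAGlobalToNaturalEnd(da, vsol, INSERT_VALUES, vnat);

  /* build the scatter from the natural vector, since that is what gets gathered */
  VecScatterCreateToZero(vnat, &toZero, &vseq);
  VecScatterBegin(toZero, vnat, vseq, INSERT_VALUES, SCATTER_FORWARD);
  VecScatterEnd(toZero, vnat, vseq, INSERT_VALUES, SCATTER_FORWARD);
  /* process 0 can now VecGetArray(vseq, ...) and sees naturally ordered values */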
As you say, with hypre, the convergence is extremely slow, requiring a lot of iterations, 1413 iterations (1820 if I use richardson) for a 257x257 problem, while the default only needs 5. I use the same way of handling boundary conditions in the two codes. I've also compared the coeff matrix and rhs, and they are equal. -Knut Erik- Siterer Barry Smith : > > Run with the DMMG solver with the option -pc_type hypre > What happens? Then run again with the additional option -ksp_type richardson > > Is hypre taking many, many iterations which is causing the slow speed? > > I expect there is something wrong with your code that does not use DMMG. > Be careful how you handle boundary conditions; you need to make sure > they have the same scaling as the other equations. > > Barry > > > > On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: > >> Hi Ben, >> >> Thank you for answering. With gmres and boomeramg I get a run time of >> 2s, so that is much better. However, if I increase the grid size to >> 513x513, I get a run time of one minute. With richardson, it fails >> to converge. >> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for >> the 513x513 problem. >> >> When using the DMMG framework, I just used the default solvers. >> I use the Galerkin process to generate the coarse matrices for >> the multigrid cycle. >> >> Best, >> Knut >> >> Siterer Ben Tay : >> >>> Hi Knut, >>> >>> I'm currently using boomeramg to solve my poisson eqn too. I'm using it >>> on my structured C-grid. I found it to be faster than LU, especially as >>> the grid size increases. However I use it as a preconditioner with >>> GMRES as the solver. Have you tried this option? Although it's faster, >>> the speed increase is usually less than double. It seems to be worse if >>> there is a lot of stretching in the grid. >>> >>> Btw, your mention using the DMMG framework and it takes less than a >>> sec. What solver or preconditioner did you use? It's 4 times faster >>> than GMRES... >>> >>> thanks! >>> >>> knutert at stud.ntnu.no wrote: >>>> Hello, >>>> >>>> I am trying to use the hypre multigrid solver to solve a Poisson equation. >>>> However, on a test case with grid size 257x257 it takes 40 >>>> seconds to converge >>>> on one processor when I run with >>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >>>> >>>> Using the DMMG framework, the same problem takes less than a second, >>>> and the default gmres solver uses only four seconds. >>>> >>>> Am I somehow using the solver the wrong way, or is this >>>> performance expected? >>>> >>>> Regards >>>> Knut Erik Teigen >>>> >>>> >> >> >> From bsmith at mcs.anl.gov Mon Feb 18 12:10:04 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 18 Feb 2008 12:10:04 -0600 Subject: Poor performance with BoomerAMG? In-Reply-To: <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no> References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov> <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no> Message-ID: <0D9E98F1-82D0-4453-8581-DF454A2C10FC@mcs.anl.gov> Please send the code to petsc-maint at mcs.anl.gov Something is not right. Barry On Feb 18, 2008, at 1:57 AM, knutert at stud.ntnu.no wrote: > Thank you for the reply, Barry. > > The same thing happens if I use hypre with the DMMG solver. 
> As you say, with hypre, the convergence is extremely slow, requiring > a lot of iterations, 1413 iterations (1820 if I use richardson) for > a 257x257 > problem, while the default only needs 5. > > I use the same way of handling boundary conditions in the two codes. > I've also compared the coeff matrix and rhs, and they are equal. > > -Knut Erik- > > Siterer Barry Smith : > >> >> Run with the DMMG solver with the option -pc_type hypre >> What happens? Then run again with the additional option -ksp_type >> richardson >> >> Is hypre taking many, many iterations which is causing the slow >> speed? >> >> I expect there is something wrong with your code that does not use >> DMMG. >> Be careful how you handle boundary conditions; you need to make sure >> they have the same scaling as the other equations. >> >> Barry >> >> >> >> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: >> >>> Hi Ben, >>> >>> Thank you for answering. With gmres and boomeramg I get a run time >>> of >>> 2s, so that is much better. However, if I increase the grid size to >>> 513x513, I get a run time of one minute. With richardson, it >>> fails to converge. >>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s >>> for the 513x513 problem. >>> >>> When using the DMMG framework, I just used the default solvers. >>> I use the Galerkin process to generate the coarse matrices for >>> the multigrid cycle. >>> >>> Best, >>> Knut >>> >>> Siterer Ben Tay : >>> >>>> Hi Knut, >>>> >>>> I'm currently using boomeramg to solve my poisson eqn too. I'm >>>> using it >>>> on my structured C-grid. I found it to be faster than LU, >>>> especially as >>>> the grid size increases. However I use it as a preconditioner with >>>> GMRES as the solver. Have you tried this option? Although it's >>>> faster, >>>> the speed increase is usually less than double. It seems to be >>>> worse if >>>> there is a lot of stretching in the grid. >>>> >>>> Btw, your mention using the DMMG framework and it takes less than a >>>> sec. What solver or preconditioner did you use? It's 4 times faster >>>> than GMRES... >>>> >>>> thanks! >>>> >>>> knutert at stud.ntnu.no wrote: >>>>> Hello, >>>>> >>>>> I am trying to use the hypre multigrid solver to solve a Poisson >>>>> equation. >>>>> However, on a test case with grid size 257x257 it takes 40 >>>>> seconds to converge >>>>> on one processor when I run with >>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >>>>> >>>>> Using the DMMG framework, the same problem takes less than a >>>>> second, >>>>> and the default gmres solver uses only four seconds. >>>>> >>>>> Am I somehow using the solver the wrong way, or is this >>>>> performance expected? >>>>> >>>>> Regards >>>>> Knut Erik Teigen >>>>> >>>>> >>> >>> >>> > > > From keita at cray.com Mon Feb 18 12:27:10 2008 From: keita at cray.com (Keita Teranishi) Date: Mon, 18 Feb 2008 12:27:10 -0600 Subject: Support for Howell-Rutherford or Howell-Boeing Sparse matrix format Message-ID: <925346A443D4E340BEB20248BAFCDBDF04245686@CFEVS1-IP.americas.cray.com> Hi, I am wondering if there is any PETSc routine that loads Howell-Rutherford or Howell-Boeing format, and converts it to AIJ format automatically. Since there is a huge collection of sparse matrices at University of Florida, such routine is very useful for benchmarking KSP and PC. Thanks, ================================ Keita Teranishi Math Software Group Cray, Inc. 
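(The file formats being asked about here are conventionally called Harwell-Boeing and Rutherford-Boeing.) One low-effort route for benchmarking against the University of Florida collection is to use the Matrix Market copies of the same matrices, whose ASCII layout is much easier to parse than the fixed-width Harwell-Boeing records. The loader below is only an illustrative sketch, not a PETSc-provided routine: it handles real, general matrices in coordinate format, sequentially, with a rough preallocation guess and no error handling; symmetric files would additionally need the mirrored entries inserted.

  #include <stdio.h>
  #include "petscmat.h"

  PetscErrorCode LoadMatrixMarketSeqAIJ(const char *path, Mat *A)
  {
    FILE  *f = fopen(path, "r");
    char   line[512];
    int    m, n, nz, k, i, j;
    double v;

    if (!f) SETERRQ1(PETSC_ERR_FILE_OPEN, "Cannot open %s", path);
    do {                                    /* skip '%%MatrixMarket' banner and '%' comments */
      if (!fgets(line, sizeof(line), f)) SETERRQ(PETSC_ERR_FILE_READ, "Premature end of file");
    } while (line[0] == '%');
    sscanf(line, "%d %d %d", &m, &n, &nz);  /* size line: rows cols nonzeros */

    MatCreateSeqAIJ(PETSC_COMM_SELF, m, n, nz/m + 1, PETSC_NULL, A); /* crude per-row guess */
    for (k = 0; k < nz; k++) {
      fscanf(f, "%d %d %lf", &i, &j, &v);   /* indices in the file are 1-based */
      MatSetValue(*A, i - 1, j - 1, (PetscScalar)v, INSERT_VALUES);
    }
    fclose(f);
    MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY);
    return 0;
  }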
keita at cray.com ================================ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Feb 18 16:00:39 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 18 Feb 2008 16:00:39 -0600 Subject: Support for Howell-Rutherford or Howell-Boeing Sparse matrix format In-Reply-To: <925346A443D4E340BEB20248BAFCDBDF04245686@CFEVS1-IP.americas.cray.com> References: <925346A443D4E340BEB20248BAFCDBDF04245686@CFEVS1-IP.americas.cray.com> Message-ID: <39BFB7E1-6579-4FA0-9C91-FD385AD182CC@mcs.anl.gov> Keita, We have not provided supported library routines for this since we've found that actual ASCII file formats often require slightly different readers. And thus we could not provide robust readers. Instead we have some example routines in src/mat/examples/tutorials/, specifically ex78.c that the user may modify for their exact needs. I've added this information to our FAQ. Barry On Feb 18, 2008, at 12:27 PM, Keita Teranishi wrote: > Hi, > > I am wondering if there is any PETSc routine that loads Howell- > Rutherford or Howell-Boeing format, and converts it to AIJ format > automatically. Since there is a huge collection of sparse matrices > at University of Florida, such routine is very useful for > benchmarking KSP and PC. > > Thanks, > ================================ > Keita Teranishi > Math Software Group > Cray, Inc. > keita at cray.com > ================================ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bernhard.kubicek at arsenal.ac.at Tue Feb 19 04:12:45 2008 From: bernhard.kubicek at arsenal.ac.at (Bernhard Kubicek) Date: Tue, 19 Feb 2008 11:12:45 +0100 Subject: Parallel matrix assembly - SetValues Problem? Message-ID: <1203415965.7200.108.camel@node99> Dear List, sorry to bother you but I just finished reading the whole archive and couldn't find a solution to a problem of mine that keeps on bothering me now for 7+ days. The problem is that my code produces different matrices if run in parallel or single cpu. I do a manual partitioning of the mesh by using metis by hand. Thereafter, there is a list of finite-volume elements that I want to be stored on the individual cpu and a renumbering that is manged somehow. I create my matrix with MatCreateMPIAIJ(PETSC_COMM_WORLD,mycount,mycount, PETSC_DETERMINE,PETSC_DETERMINE,50,PETSC_NULL,50,PETSC_NULL,&A), where mycount is different on each cpu, and is the mentioned number of elements I wish to have there locally. for each local row/element at the same time let a user calculate the matrix elements and column positions, and the right hand side values for this row. I output those for debugging. Within the loop for the local rows, I call MatSetValues(A,1,&i,nrEntries,entries,v,INSERT_VALUES) VecSetValue(rhs,i,rhsval,INSERT_VALUES) in this order When I run on one cpu, everything works nicely. A 3d mesh of a 10-element long bar with each element having volume 1, creates the following matrix: -2. 1. 0. 0. 0. 0. 0. 0. 0. 1. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. -3. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 1. 0. 0. 0. 0. 0. 0. 0. 1. -2. 
rhs 0 0 0 0 0 -2 0 0 0 0 The CPU sets the matrix and rhs like this ( global matrix row: column/value ...column/value | rhs-value ) Row 0 cols:0/-2 9/1 1/1 | 0 Row 1 cols:1/-2 0/1 2/1 | 0 Row 2 cols:2/-2 1/1 3/1 | 0 Row 3 cols:3/-2 2/1 4/1 | 0 Row 4 cols:4/-3 3/1 | 0 Row 5 cols:5/-3 6/1 | -2 Row 6 cols:6/-2 5/1 7/1 | 0 Row 7 cols:7/-2 6/1 8/1 | 0 Row 8 cols:8/-2 7/1 9/1 | 0 Row 9 cols:9/-2 0/1 8/1 | 0 because of the meshing the central rows in the matrix are the most exterior elements, on which wall boundary condition 0 and 1 are set (laplace equation). one 2 cpus, the matrix looses is different, although the global-local element renumbering is defacto nonexisting (cpu 0: rows 0-4, cpu 1: rows 5-9): 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. -3. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. 1. 0. 0. 0. 0. 0. 0. 0. 1. -2. Process [0] 0 0 0 0 0 Process [1] -2 0 0 0 0 here cpu 0 sets: Row 0 cols:0/-2 9/1 1/1 |0 Row 1 cols:1/-2 0/1 2/1 |0 Row 2 cols:2/-2 1/1 3/1 |0 Row 3 cols:3/-2 2/1 4/1 |0 Row 4 cols:4/-3 3/1 |0 and cpu 1: Row 5 cols:5/-3 6/1|-2 Row 6 cols:6/-2 5/1 7/1 |0 Row 7 cols:7/-2 6/1 8/1 |0 Row 8 cols:8/-2 7/1 9/1 |0 Row 9 cols:9/-2 0/1 8/1 |0 I triple verified that ***SetValues is called with the exactly same values as on one cpu, and that nothing is set twice, and that every cpu sets it's correct columns. Also for more sophisticated renumberings Attached are the outputs when run with with -info. My current guess is that I create the matrix falsely, or that I cannot mix the setting of Vec and Mat values before their respective ???AssemblyBegin/Ends. If anyone has any idea where the problem is, is would be extremely nice to help me here. Thank you very much, even for the slightest help Bernhard Kubicek ------ Physics Doctorate Student Techn. University of Vienna, Austria Freelancer arsenal research, Vienna Austria -------------- next part -------------- [1] PetscInitialize(): PETSc successfully started: number of processors = 2 [1] PetscGetHostName(): Rejecting domainname, likely is NIS node99.(none) [1] PetscInitialize(): Running on machine: node99 Trying to read Gmsh .msh file "stab.gmsh" Gmsh2 file format recognised [0] PetscInitialize(): PETSc successfully started: number of processors = 2 [0] PetscGetHostName(): Rejecting domainname, likely is NIS node99.(none) [0] PetscInitialize(): Running on machine: node99 Read 44 nodes. alltogether number of elements including faces, ignored:61 Trying to read Gmsh .msh file "stab.gmsh" Gmsh2 file format recognised Read 44 nodes. 
alltogether number of elements including faces, ignored:61 [1] PetscCommDuplicate(): Duplicating a communicator 91 141 max tags = 1073741823 [1] PetscCommDuplicate(): returning tag 1073741823 [1] PetscCommDuplicate(): Duplicating a communicator 92 143 max tags = 1073741823 [1] PetscCommDuplicate(): returning tag 1073741823 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741822 [1] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [1] PetscCommDuplicate(): returning tag 1073741820 [1] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741819 [0] PetscCommDuplicate(): returning tag 1073741814 [1] PetscCommDuplicate(): returning tag 1073741809 [1] MatStashScatterBegin_Private(): No of messages: 0 [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatStashScatterBegin_Private(): No of messages: 0 [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 5 X 5; storage space: 237 unneeded,13 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3 [1] Mat_CheckInode(): Found 5 nodes out of 5 rows. Not using Inode routines [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741821 [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter [1] PetscCommDuplicate(): returning tag 1073741802 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741820 [1] PetscCommDuplicate(): returning tag 1073741801 [1] PetscCommDuplicate(): returning tag 1073741796 [1] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter [0] VecScatterCreate(): General case: MPI to Seq [0] MatSetOption_Inode(): Not using Inode routines due to MatSetOption(MAT_DO_NOT_USE_INODES [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5 X 1; storage space: 249 unneeded,1 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1 [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 4)/(num_localrows 5) > 0.6. Use CompressedRow routines. 
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5 X 5; storage space: 0 unneeded,13 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741819 [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter [1] PetscCommDuplicate(): returning tag 1073741793 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741818 [1] PetscCommDuplicate(): returning tag 1073741792 [1] PetscCommDuplicate(): returning tag 1073741787 [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter [0] VecScatterLocalOptimizeCopy_Private(): Local scatter is a copy, optimizing for it [0] VecScatterCreate(): General case: MPI to Seq [1] MatSetOption_Inode(): Not using Inode routines due to MatSetOption(MAT_DO_NOT_USE_INODES [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 5 X 1; storage space: 0 unneeded,1 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1 [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] Mat_CheckCompressedRow(): Skip check. m: 5, n: 1,M: 5, N: 1,nrows: 1, ii: 0x89153a0, type: seqaij [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1 [0] Mat_CheckCompressedRow(): Skip check. m: 5, n: 1,M: 5, N: 1,nrows: 1, ii: 0x8917588, type: seqaij [1] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [1] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [1] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [1] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [1] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741784 [1] PetscCommDuplicate(): returning tag 1073741783 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741817 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741816 [1] MatStashScatterBegin_Private(): No of messages: 1 [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 14 entries, uses 0 mallocs. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 133 unneeded,27 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 10 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3 [0] Mat_CheckInode(): Found 10 nodes out of 10 rows. Not using Inode routines [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 0 X 0; storage space: 0 unneeded,0 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 [1] Mat_CheckInode(): Found 0 nodes of 0. Limit used: 5. 
Using Inode routines [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741815 [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter [0] PetscCommDuplicate(): returning tag 1073741779 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741814 [1] PetscCommDuplicate(): returning tag 1073741814 [0] PetscCommDuplicate(): returning tag 1073741778 [1] PetscCommDuplicate(): returning tag 1073741773 [1] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter [0] VecScatterCreate(): General case: MPI to Seq [1] MatSetOption_Inode(): Not using Inode routines due to MatSetOption(MAT_DO_NOT_USE_INODES [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 0 X 0; storage space: 0 unneeded,0 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 [0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 10)/(num_localrows 10) > 0.6. Use CompressedRow routines. [1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 0) > 0.6. Use CompressedRow routines. [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741813 [1] PetscCommDuplicate(): returning tag 1073741813 1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -3.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -3.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 [1] PetscCommDuplicate(): returning tag 1073741770 Process [0] 0 0 0 0 0 Process [1] -2 0 0 0 0 [0] PetscCommDuplicate(): returning tag 1073741769 [1] PetscCommDuplicate(): returning tag 1073741769 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741768 [1] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741767 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [1] PetscCommDuplicate(): returning tag 1073741766 [1] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [1] PetscCommDuplicate(): returning tag 1073741765 [0] PetscCommDuplicate(): returning tag 1073741764 [1] PetscCommDuplicate(): returning tag 1073741763 [1] PetscCommDuplicate(): returning tag 1073741762 [0] PetscCommDuplicate(): returning 
tag 1073741761 [1] PetscCommDuplicate(): returning tag 1073741756 [0] PetscCommDuplicate(): returning tag 1073741751 [0] PetscCommDuplicate(): returning tag 1073741746 [0] PetscCommDuplicate(): returning tag 1073741741 [1] PetscCommDuplicate(): returning tag 1073741736 [0] PCSetUp(): Setting up new PC [1] PetscCommDuplicate(): returning tag 1073741731 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741812 [1] MatIncreaseOverlap_MPIAIJ_Receive(): Allocated 1 bytes, required 3 bytes, no of mallocs = 0 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741811 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741810 [1] PetscCommDuplicate(): returning tag 1073741809 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741808 [1] PetscCommDuplicate(): returning tag 1073741722 [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter [0] VecScatterLocalOptimizeCopy_Private(): Local scatter is a copy, optimizing for it [1] VecScatterLocalOptimizeCopy_Private(): Local scatter is a copy, optimizing for it [0] VecScatterCreate(): General case: MPI to Seq [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741807 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741806 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741805 [1] PetscCommDuplicate(): returning tag 1073741805 [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 6 X 6; storage space: 0 unneeded,16 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3 [1] Mat_CheckInode(): Found 6 nodes out of 6 rows. Not using Inode routines [0] PCSetUp(): Setting up new PC [0] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741804 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741803 [1] PetscCommDuplicate(): returning tag 1073741803 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [0] PetscCommDuplicate(): returning tag 1073741802 [1] PetscCommDuplicate(): returning tag 1073741802 [1] MatILUFactorSymbolic_SeqAIJ(): Reallocs 0 Fill ratio:given 50 needed 0.9375 [1] MatILUFactorSymbolic_SeqAIJ(): Run with -[sub_]pc_factor_fill 0.9375 or use [0] MatILUFactorSymbolic_SeqAIJ(): PCFactorSetFill([sub]pc,0.928571); [0] MatILUFactorSymbolic_SeqAIJ(): for best performance. [1] MatILUFactorSymbolic_SeqAIJ(): for best performance. [0] PetscCommDuplicate(): returning tag 1073741801 [1] PetscCommDuplicate(): returning tag 1073741801 [1] Mat_CheckInode(): Found 6 nodes out of 6 rows. Not using Inode routines [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: Detected zero pivot in LU factorization see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! [1]PETSC ERROR: Zero pivot row 5 value 0 tolerance 0 * rowsum 0! 
[1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b [1]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [1]PETSC ERROR: See docs/index.html for manual pages. [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: /media/sdb1/bernhard/svnl/bogen/Cpp/newmesh/bin/./main on a omp_deb_m named node99 by bkubicek Tue Feb 19 11:04:15 2008 [1]PETSC ERROR: Libraries linked from /home/bkubicek/750/Software/petsc-2.3.3-p8/lib/omp_deb_mpi_cxx [1]PETSC ERROR: Configure run at Thu Jan 31 10:02:09 2008 [1]PETSC ERROR: Configure options --with-clanguage=c++ --with-x=0 --with-debugging=1 --with-shared=0 --with-default-arch=0 --with-mpi=1 COPTFLAGS=' -O2 -march=pentium4 -mtune=pentium4 ' FOPTFLAGS='-I -O2 -march=pentium4 -mtune=pentium4 ' [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 529 in src/mat/impls/aij/seq/aijfact.c [1]PETSC ERROR: MatLUFactorNumeric() line 2227 in src/mat/interface/matrix.c [1]PETSC ERROR: PCSetUp_ILU() line 564 in src/ksp/pc/impls/factor/ilu/ilu.c [1]PETSC ERROR: PCSetUp() line 787 in src/ksp/pc/interface/precon.c [1]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: PCSetUpOnBlocks_ASM() line 224 in src/ksp/pc/impls/asm/asm.c [1]PETSC ERROR: PCSetUpOnBlocks() line 820 in src/ksp/pc/interface/precon.c [1]PETSC ERROR: KSPSetUpOnBlocks() line 158 in src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: KSPSolve() line 348 in src/ksp/ksp/interface/itfunc.c [1] PCSetUp(): Setting up new PC [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741800 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741799 [1] PetscCommDuplicate(): Using internal PETSc communicator 92 143 [1] PetscCommDuplicate(): returning tag 1073741798 [1] MatILUFactorSymbolic_SeqAIJ(): Reallocs 0 Fill ratio:given 50 needed 0.9375 [1] MatILUFactorSymbolic_SeqAIJ(): Run with -[sub_]pc_factor_fill 0.9375 or use [1] MatILUFactorSymbolic_SeqAIJ(): PCFactorSetFill([sub]pc,0.9375); [1] MatILUFactorSymbolic_SeqAIJ(): for best performance. [1] PetscCommDuplicate(): returning tag 1073741797 [1] Mat_CheckInode(): Found 6 nodes out of 6 rows. Not using Inode routines [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: Detected zero pivot in LU factorization see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! [1]PETSC ERROR: Zero pivot row 5 value 0 tolerance 0 * rowsum 0! [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b [1]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [1]PETSC ERROR: See docs/index.html for manual pages. 
[1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: /media/sdb1/bernhard/svnl/bogen/Cpp/newmesh/bin/./main on a omp_deb_m named node99 by bkubicek Tue Feb 19 11:04:15 2008 [1]PETSC ERROR: Libraries linked from /home/bkubicek/750/Software/petsc-2.3.3-p8/lib/omp_deb_mpi_cxx [1]PETSC ERROR: Configure run at Thu Jan 31 10:02:09 2008 [1]PETSC ERROR: Configure options --with-clanguage=c++ --with-x=0 --with-debugging=1 --with-shared=0 --with-default-arch=0 --with-mpi=1 COPTFLAGS=' -O2 -march=pentium4 -mtune=pentium4 ' FOPTFLAGS='-I -O2 -march=pentium4 -mtune=pentium4 ' [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 529 in src/mat/impls/aij/seq/aijfact.c [1]PETSC ERROR: MatLUFactorNumeric() line 2227 in src/mat/interface/matrix.c [1]PETSC ERROR: PCSetUp_ILU() line 564 in src/ksp/pc/impls/factor/ilu/ilu.c [1]PETSC ERROR: PCSetUp() line 787 in src/ksp/pc/interface/precon.c [1]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: PCSetUpOnBlocks_ASM() line 224 in src/ksp/pc/impls/asm/asm.c [1]PETSC ERROR: PCSetUpOnBlocks() line 820 in src/ksp/pc/interface/precon.c [1]PETSC ERROR: KSPSetUpOnBlocks() line 158 in src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: KSPSolve() line 348 in src/ksp/ksp/interface/itfunc.c PETSC_ERROR: Line 250 File: matrix.cpp Child process exited unexpectedly 0 Aborted (core dumped) -------------- next part -------------- [0] PetscInitialize(): PETSc successfully started: number of processors = 1 [0] PetscGetHostName(): Rejecting domainname, likely is NIS node99.(none) [0] PetscInitialize(): Running on machine: node99 Trying to read Gmsh .msh file "stab.gmsh" Gmsh2 file format recognised Read 44 nodes. alltogether number of elements including faces, ignored:61 [0] PetscCommDuplicate(): Duplicating a communicator 91 141 max tags = 1073741823 [0] PetscCommDuplicate(): returning tag 1073741823 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741822 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741821 [0] PetscCommDuplicate(): returning tag 1073741820 [0] PetscCommDuplicate(): returning tag 1073741819 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 472 unneeded,28 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3 [0] Mat_CheckInode(): Found 10 nodes out of 10 rows. 
Not using Inode routines [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0 unneeded,28 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741818 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -3.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 -3.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 1.00000e+00 1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00 -2.00000e+00 [0] PetscCommDuplicate(): returning tag 1073741817 0 0 0 0 0 -2 0 0 0 0 [0] PetscCommDuplicate(): returning tag 1073741816 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741815 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741814 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741813 [0] PetscCommDuplicate(): Using internal PETSc communicator 91 141 [0] PetscCommDuplicate(): returning tag 1073741812 [0] PetscCommDuplicate(): returning tag 1073741811 [0] PetscCommDuplicate(): returning tag 1073741810 [0] PetscCommDuplicate(): returning tag 1073741809 [0] PetscCommDuplicate(): returning tag 1073741808 [0] PetscCommDuplicate(): returning tag 1073741807 [0] PetscCommDuplicate(): returning tag 1073741806 [0] PetscCommDuplicate(): returning tag 1073741805 [0] PetscCommDuplicate(): returning tag 1073741804 [0] PetscCommDuplicate(): returning tag 1073741803 [0] PCSetUp(): Setting up new PC [0] PetscCommDuplicate(): returning tag 1073741802 [0] PetscCommDuplicate(): Duplicating a communicator 92 149 max tags = 1073741823 [0] PetscCommDuplicate(): returning tag 1073741823 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 92 [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 149 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 149 [0] Petsc_DelTag(): Deleting tag data in an MPI_Comm 149 [0] PetscCommDuplicate(): Duplicating a communicator 92 149 max tags = 1073741823 [0] PetscCommDuplicate(): returning tag 1073741823 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 149 [0] PetscCommDuplicate(): returning tag 1073741822 [0] PetscCommDuplicate(): returning tag 1073741821 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 149 [0] PetscCommDuplicate(): returning tag 1073741820 [0] 
PetscCommDuplicate(): returning tag 1073741801 [0] VecScatterCreate(): Special case: sequential vector general to stride [0] PetscCommDuplicate(): Using internal PETSc communicator 92 149 [0] PetscCommDuplicate(): returning tag 1073741819 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 149 [0] PetscCommDuplicate(): returning tag 1073741818 [0] PetscCommDuplicate(): returning tag 1073741800 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0 unneeded,28 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3 [0] Mat_CheckInode(): Found 10 nodes out of 10 rows. Not using Inode routines [0] PCSetUp(): Setting up new PC [0] PetscCommDuplicate(): Using internal PETSc communicator 92 149 [0] PetscCommDuplicate(): returning tag 1073741817 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 149 [0] PetscCommDuplicate(): returning tag 1073741816 [0] PetscCommDuplicate(): Using internal PETSc communicator 92 149 [0] PetscCommDuplicate(): returning tag 1073741815 [0] MatILUFactorSymbolic_SeqAIJ(): Reallocs 0 Fill ratio:given 50 needed 1 [0] MatILUFactorSymbolic_SeqAIJ(): Run with -[sub_]pc_factor_fill 1 or use [0] MatILUFactorSymbolic_SeqAIJ(): PCFactorSetFill([sub]pc,1); [0] MatILUFactorSymbolic_SeqAIJ(): for best performance. [0] PetscCommDuplicate(): returning tag 1073741799 [0] Mat_CheckInode(): Found 10 nodes out of 10 rows. Not using Inode routines [0] PetscCommDuplicate(): returning tag 1073741798 [0] KSPDefaultConverged(): user has provided nonzero initial guess, computing 2-norm of preconditioned RHS 0 KSP Residual norm 3.146479381785e+02 1 KSP Residual norm 5.527758587503e-14 2 KSP Residual norm 2.957120339776e-14 [0] PetscCommDuplicate(): returning tag 1073741797 [0] PetscCommDuplicate(): returning tag 1073741796 [0] PetscCommDuplicate(): returning tag 1073741795 [0] PetscCommDuplicate(): returning tag 1073741794 [0] PetscCommDuplicate(): returning tag 1073741793 [0] PetscCommDuplicate(): returning tag 1073741792 [0] PetscCommDuplicate(): returning tag 1073741791 [0] PetscCommDuplicate(): returning tag 1073741790 [0] PetscCommDuplicate(): returning tag 1073741789 [0] PetscCommDuplicate(): returning tag 1073741788 3 KSP Residual norm 2.311953842658e-14 4 KSP Residual norm 1.949598161624e-14 5 KSP Residual norm 1.718379006549e-14 6 KSP Residual norm 1.553657802016e-14 7 KSP Residual norm 1.428735337435e-14 8 KSP Residual norm 1.329791896231e-14 9 KSP Residual norm 1.248915022855e-14 10 KSP Residual norm 1.181200860607e-14 11 KSP Residual norm 1.123427218264e-14 12 KSP Residual norm 1.073378041180e-14 [0] PetscCommDuplicate(): returning tag 1073741787 [0] PetscCommDuplicate(): returning tag 1073741786 [0] PetscCommDuplicate(): returning tag 1073741785 [0] PetscCommDuplicate(): returning tag 1073741784 [0] PetscCommDuplicate(): returning tag 1073741783 [0] PetscCommDuplicate(): returning tag 1073741782 [0] PetscCommDuplicate(): returning tag 1073741781 [0] PetscCommDuplicate(): returning tag 1073741780 [0] PetscCommDuplicate(): returning tag 1073741779 [0] PetscCommDuplicate(): returning tag 1073741778 13 KSP Residual norm 1.029472464285e-14 14 KSP Residual norm 9.905483966479e-15 15 KSP Residual norm 9.557299004528e-15 16 KSP Residual norm 9.243425362697e-15 17 KSP Residual norm 8.958574339974e-15 18 KSP Residual norm 8.698532377314e-15 19 KSP Residual norm 8.459895437148e-15 20 KSP Residual norm 8.239879423284e-15 21 KSP Residual norm 8.036182185186e-15 22 KSP 
Residual norm 7.846881299150e-15 [0] PetscCommDuplicate(): returning tag 1073741777 [0] PetscCommDuplicate(): returning tag 1073741776 [0] PetscCommDuplicate(): returning tag 1073741775 [0] PetscCommDuplicate(): returning tag 1073741774 [0] PetscCommDuplicate(): returning tag 1073741773 [0] PetscCommDuplicate(): returning tag 1073741772 [0] PetscCommDuplicate(): returning tag 1073741771 [0] PetscCommDuplicate(): returning tag 1073741770 [0] PetscCommDuplicate(): returning tag 1073741769 [0] PetscCommDuplicate(): returning tag 1073741768 23 KSP Residual norm 7.670357156941e-15 24 KSP Residual norm 7.505234275465e-15 25 KSP Residual norm 7.350335936204e-15 26 KSP Residual norm 7.204648718167e-15 27 KSP Residual norm 7.067294471298e-15 28 KSP Residual norm 6.937507953310e-15 29 KSP Residual norm 6.814618825309e-15 30 KSP Residual norm 6.698037036492e-15 31 KSP Residual norm 6.587240868887e-15 32 KSP Residual norm 6.481767088291e-15 [0] PetscCommDuplicate(): returning tag 1073741767 [0] PetscCommDuplicate(): returning tag 1073741766 [0] PetscCommDuplicate(): returning tag 1073741765 [0] PetscCommDuplicate(): returning tag 1073741764 33 KSP Residual norm 6.381202776463e-15 34 KSP Residual norm 6.285178515610e-15 35 KSP Residual norm 8.829047469026e-14 [0] KSPDefaultConverged(): Linear solver has converged. Residual norm 2.7398e-30 is less than absolute tolerance 1e-15 at iteration 36 36 KSP Residual norm 2.739801485312e-30 KSP Object: type: gmres GMRES: restart=35, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1000 tolerances: relative=1e-15, absolute=1e-15, divergence=10000 left preconditioning PC Object: type: asm Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(sub_) type: ilu ILU: 15 levels of fill ILU: factor fill ratio allocated 50 ILU: tolerance for zero pivot 1e-12 out-of-place factorization matrix ordering: rcm ILU: factor fill ratio needed 1 Factored matrix follows Matrix Object: type=seqaij, rows=10, cols=10 total: nonzeros=28, allocated nonzeros=28 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=10, cols=10 total: nonzeros=28, allocated nonzeros=28 not using I-node routines [0] PetscCommDuplicate(): returning tag 1073741763 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=10, cols=10 total: nonzeros=28, allocated nonzeros=500 not using I-node routines [0] PetscCommDuplicate(): returning tag 1073741762 [0] PetscCommDuplicate(): returning tag 1073741761 [0] KSPDefaultConverged(): user has provided nonzero initial guess, computing 2-norm of preconditioned RHS [0] KSPDefaultConverged(): Linear solver has converged. 
Residual norm 5.27447e-16 is less than absolute tolerance 1e-15 at iteration 0 0 KSP Residual norm 5.274472300304e-16 KSP Object: type: gmres GMRES: restart=35, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1000 tolerances: relative=1e-15, absolute=1e-15, divergence=10000 left preconditioning PC Object: type: asm Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(sub_) type: ilu ILU: 15 levels of fill ILU: factor fill ratio allocated 50 ILU: tolerance for zero pivot 1e-12 out-of-place factorization matrix ordering: rcm ILU: factor fill ratio needed 1 Factored matrix follows Matrix Object: type=seqaij, rows=10, cols=10 total: nonzeros=28, allocated nonzeros=28 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=10, cols=10 total: nonzeros=28, allocated nonzeros=28 not using I-node routines [0] PetscCommDuplicate(): returning tag 1073741760 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=10, cols=10 total: nonzeros=28, allocated nonzeros=500 not using I-node routines [0] PetscCommDuplicate(): returning tag 1073741759 Bnds: 2 NBnds:58 [0] PetscCommDuplicate(): returning tag 1073741758 [0] Petsc_DelViewer(): Deleting viewer data in an MPI_Comm 141 [0] PetscCommDuplicate(): returning tag 1073741757 OptionTable: -info OptionTable: -ksp_atol 1.e-15 OptionTable: -ksp_gmres_restart 35 OptionTable: -ksp_max_it 1000 OptionTable: -ksp_monitor OptionTable: -ksp_rtol 1.e-15 OptionTable: -ksp_view OptionTable: -options_left OptionTable: -pc_type asm OptionTable: -sub_pc_factor_fill 50 OptionTable: -sub_pc_factor_levels 15 OptionTable: -sub_pc_factor_mat_ordering_type rcm OptionTable: -sub_pc_type ilu There are no unused options. From bsmith at mcs.anl.gov Tue Feb 19 07:19:38 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Feb 2008 07:19:38 -0600 Subject: Parallel matrix assembly - SetValues Problem? In-Reply-To: <1203415965.7200.108.camel@node99> References: <1203415965.7200.108.camel@node99> Message-ID: <60948BCF-4D67-439C-9DB2-171BF6158571@mcs.anl.gov> Send the code to petsc-maint at mcs.anl.gov and we'll take a look at it. Barry On Feb 19, 2008, at 4:12 AM, Bernhard Kubicek wrote: > Dear List, > > sorry to bother you but I just finished reading the whole archive and > couldn't find a solution to a problem of mine that keeps on > bothering me > now for 7+ days. > > The problem is that my code produces different matrices if run in > parallel or single cpu. > > I do a manual partitioning of the mesh by using metis by hand. > Thereafter, there is a list of finite-volume elements that I want to > be > stored on the individual cpu and a renumbering that is manged somehow. > I create my matrix with > > MatCreateMPIAIJ(PETSC_COMM_WORLD,mycount,mycount, > PETSC_DETERMINE,PETSC_DETERMINE,50,PETSC_NULL,50,PETSC_NULL,&A), > > where mycount is different on each cpu, and is the mentioned number of > elements I wish to have there locally. > > > for each local row/element at the same time let a user calculate the > matrix elements and column positions, and the right hand side values > for > this row. 
I output those for debugging. Within the loop for the local > rows, I call > MatSetValues(A,1,&i,nrEntries,entries,v,INSERT_VALUES) > VecSetValue(rhs,i,rhsval,INSERT_VALUES) > in this order > > When I run on one cpu, everything works nicely. A 3d mesh of a > 10-element long bar with each element having volume 1, creates the > following matrix: > -2. 1. 0. 0. 0. 0. 0. 0. 0. 1. > 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. > 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. > 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. > 0. 0. 0. 1. -3. 0. 0. 0. 0. 0. > 0. 0. 0. 0. 0. -3. 1. 0. 0. 0. > 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. > 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. > 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. > 1. 0. 0. 0. 0. 0. 0. 0. 1. -2. > > rhs > 0 > 0 > 0 > 0 > 0 > -2 > 0 > 0 > 0 > 0 > The CPU sets the matrix and rhs like this ( global matrix row: > column/value ...column/value | rhs-value ) > Row 0 cols:0/-2 9/1 1/1 | 0 > Row 1 cols:1/-2 0/1 2/1 | 0 > Row 2 cols:2/-2 1/1 3/1 | 0 > Row 3 cols:3/-2 2/1 4/1 | 0 > Row 4 cols:4/-3 3/1 | 0 > Row 5 cols:5/-3 6/1 | -2 > Row 6 cols:6/-2 5/1 7/1 | 0 > Row 7 cols:7/-2 6/1 8/1 | 0 > Row 8 cols:8/-2 7/1 9/1 | 0 > Row 9 cols:9/-2 0/1 8/1 | 0 > because of the meshing the central rows in the matrix are the most > exterior elements, on which wall boundary condition 0 and 1 are set > (laplace equation). > > one 2 cpus, > the matrix looses is different, although the global-local element > renumbering is defacto nonexisting (cpu 0: rows 0-4, cpu 1: rows 5-9): > 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. > 1. -2. 1. 0. 0. 0. 0. 0. 0. 0. > 0. 1. -2. 1. 0. 0. 0. 0. 0. 0. > 0. 0. 1. -2. 1. 0. 0. 0. 0. 0. > 0. 0. 0. 1. -3. 0. 0. 0. 0. 0. > 0. 0. 0. 0. 0. -3. 1. 0. 0. 0. > 0. 0. 0. 0. 0. 1. -2. 1. 0. 0. > 0. 0. 0. 0. 0. 0. 1. -2. 1. 0. > 0. 0. 0. 0. 0. 0. 0. 1. -2. 1. > 1. 0. 0. 0. 0. 0. 0. 0. 1. -2. > Process [0] > 0 > 0 > 0 > 0 > 0 > Process [1] > -2 > 0 > 0 > 0 > 0 > here cpu 0 sets: > Row 0 cols:0/-2 9/1 1/1 |0 > Row 1 cols:1/-2 0/1 2/1 |0 > Row 2 cols:2/-2 1/1 3/1 |0 > Row 3 cols:3/-2 2/1 4/1 |0 > Row 4 cols:4/-3 3/1 |0 > > and cpu 1: > Row 5 cols:5/-3 6/1|-2 > Row 6 cols:6/-2 5/1 7/1 |0 > Row 7 cols:7/-2 6/1 8/1 |0 > Row 8 cols:8/-2 7/1 9/1 |0 > Row 9 cols:9/-2 0/1 8/1 |0 > I triple verified that ***SetValues is called with the exactly same > values as on one cpu, and that nothing is set twice, and that every > cpu > sets it's correct columns. Also for more sophisticated renumberings > > Attached are the outputs when run with with -info. > > My current guess is that I create the matrix falsely, or that I cannot > mix the setting of Vec and Mat values before their > respective ???AssemblyBegin/Ends. > > If anyone has any idea where the problem is, is would be extremely > nice > to help me here. > > Thank you very much, even for the slightest help > Bernhard Kubicek > > ------ > Physics Doctorate Student Techn. University of Vienna, Austria > Freelancer arsenal research, Vienna Austria > > > > > > From jens.madsen at risoe.dk Tue Feb 19 08:21:15 2008 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Tue, 19 Feb 2008 15:21:15 +0100 Subject: Poor performance with BoomerAMG? In-Reply-To: <30BAACA0-5EAE-4A7C-8BD7-2E34071CAE57@mcs.anl.gov> References: <20080215163600.ABA57782@batman.int.colorado.edu> <30BAACA0-5EAE-4A7C-8BD7-2E34071CAE57@mcs.anl.gov> Message-ID: Hi Barry Two questions. 1) What do you mean with "volume" and "wrong scaling"? Could translate this to some other terms? I have a book by Ulrich Trottenberg "Multigrid" and the book by Saad, but could not find similar. 
2) Do you know of any summerschools in scientific computing, focusing on Krylov methods, multigrids and preconditioning(all parallel)? Kind Regards Jens Madsen Ph.d.-studerende Phone direct +45 4677 4560 Mobile jens.madsen at risoe.dk Optics and Plasma Research Department Ris? National Laboratory Technical University of Denmark - DTU Building 128, P.O. Box 49 DK-4000 Roskilde, Denmark Tel +45 4677 4500 Fax +45 4677 4565 www.risoe.dk >From 1 January 2007, Ris? National Laboratory, the Danish Institute for Food and Veterinary Research, the Danish Institute for Fisheries Research, the Danish National Space Center and the Danish Transport Research Institute have been merged with the Technical University of Denmark (DTU) with DTU as the continuing unit. -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: Saturday, February 16, 2008 6:49 PM To: petsc-users at mcs.anl.gov Subject: Re: Poor performance with BoomerAMG? All multigrid solvers depend on proper scaling of the variables. For example for a Laplacian operator the matrix entries are \integral \grad \phi_i dot \grad \phi_j now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the volume is O(h^3) meaning the matrix entries are O(h). Now say you impose a Dirichlet boundary conditions by just saying u_k = g_k. In 2d this is ok but in 3d you need to use h*u_k = h*g_k otherwise when you restrict to the coarser grid the resulting matrix entries for the boundary are "out of whack" with the matrix entries for the interior of the domain. Actually most preconditioners and Krylov methods behavior does depend on the row scaling; multigrid is just particularly sensitive. Barry On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote: > > >> Be careful how you handle boundary conditions; you need to make sure >> they have the same scaling as the other equations. > > Could you clarify what you mean? Is boomerAMG sensitive to scaling > of matrix rows in a way that other solvers/preconditioners are not? > > Andrew > >> >> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: >> >>> Hi Ben, >>> >>> Thank you for answering. With gmres and boomeramg I get a run time >>> of >>> 2s, so that is much better. However, if I increase the grid size to >>> 513x513, I get a run time of one minute. With richardson, it fails >>> to converge. >>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for >>> the 513x513 problem. >>> >>> When using the DMMG framework, I just used the default solvers. >>> I use the Galerkin process to generate the coarse matrices for >>> the multigrid cycle. >>> >>> Best, >>> Knut >>> >>> Siterer Ben Tay : >>> >>>> Hi Knut, >>>> >>>> I'm currently using boomeramg to solve my poisson eqn too. I'm >>>> using it >>>> on my structured C-grid. I found it to be faster than LU, >>>> especially as >>>> the grid size increases. However I use it as a preconditioner with >>>> GMRES as the solver. Have you tried this option? Although it's >>>> faster, >>>> the speed increase is usually less than double. It seems to be >>>> worse if >>>> there is a lot of stretching in the grid. >>>> >>>> Btw, your mention using the DMMG framework and it takes less than a >>>> sec. What solver or preconditioner did you use? It's 4 times faster >>>> than GMRES... >>>> >>>> thanks! 
>>>> >>>> knutert at stud.ntnu.no wrote: >>>>> Hello, >>>>> >>>>> I am trying to use the hypre multigrid solver to solve a Poisson >>>>> equation. >>>>> However, on a test case with grid size 257x257 it takes 40 >>>>> seconds to converge >>>>> on one processor when I run with >>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >>>>> >>>>> Using the DMMG framework, the same problem takes less than a >>>>> second, >>>>> and the default gmres solver uses only four seconds. >>>>> >>>>> Am I somehow using the solver the wrong way, or is this >>>>> performance expected? >>>>> >>>>> Regards >>>>> Knut Erik Teigen >>>>> >>>>> >>> >>> >>> >> > From bsmith at mcs.anl.gov Tue Feb 19 15:56:18 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Feb 2008 15:56:18 -0600 Subject: Poor performance with BoomerAMG? In-Reply-To: <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no> References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov> <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no> Message-ID: BoomerAMG works like a charm. Likely you forgot the -pc_hypre_type boomeramg Hmm, I think I'll change the default solver to boomeramg Barry barry-smiths-macbook-pro-17:ksp/examples/tutorials] bsmith% ./ex1f - ksp_monitor -pc_type hypre -pc_hypre_type boomeramg -m 513 -n 513 - ksp_type richardson -ksp_view p= 1 0 KSP Residual norm 4.213878296084e+03 1 KSP Residual norm 2.135189837330e+02 2 KSP Residual norm 1.225934028865e+01 3 KSP Residual norm 7.255859884400e-01 4 KSP Residual norm 4.353504737395e-02 5 KSP Residual norm 2.643035146258e-03 6 KSP Residual norm 1.628271972668e-04 KSP Object: type: richardson Richardson: damping factor=1 maximum iterations=10000, initial guess is zero tolerances: relative=1e-07, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.25 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type Falgout HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=263169, cols=263169 total: nonzeros=1313793, allocated nonzeros=1315845 not using I-node routines Iterations: 7 [barry-smiths-macbook-pro-17:ksp/examples/tutorials] bsmith% ./ex1f - ksp_monitor -pc_type hypre -pc_hypre_type boomeramg -m 513 -n 513 - ksp_type gmres -ksp_view p= 1 0 KSP Residual norm 4.213878296084e+03 1 KSP Residual norm 5.272381634094e+01 2 KSP Residual norm 8.107668116258e-01 3 KSP Residual norm 1.807380875232e-02 4 KSP Residual norm 4.068259191532e-04 
KSP Object: type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-07, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.25 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type Falgout HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=263169, cols=263169 total: nonzeros=1313793, allocated nonzeros=1315845 not using I-node routines Iterations: 4 On Feb 18, 2008, at 1:57 AM, knutert at stud.ntnu.no wrote: > Thank you for the reply, Barry. > > The same thing happens if I use hypre with the DMMG solver. > As you say, with hypre, the convergence is extremely slow, requiring > a lot of iterations, 1413 iterations (1820 if I use richardson) for > a 257x257 > problem, while the default only needs 5. > > I use the same way of handling boundary conditions in the two codes. > I've also compared the coeff matrix and rhs, and they are equal. > > -Knut Erik- > > Siterer Barry Smith : > >> >> Run with the DMMG solver with the option -pc_type hypre >> What happens? Then run again with the additional option -ksp_type >> richardson >> >> Is hypre taking many, many iterations which is causing the slow >> speed? >> >> I expect there is something wrong with your code that does not use >> DMMG. >> Be careful how you handle boundary conditions; you need to make sure >> they have the same scaling as the other equations. >> >> Barry >> >> >> >> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: >> >>> Hi Ben, >>> >>> Thank you for answering. With gmres and boomeramg I get a run time >>> of >>> 2s, so that is much better. However, if I increase the grid size to >>> 513x513, I get a run time of one minute. With richardson, it >>> fails to converge. >>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s >>> for the 513x513 problem. >>> >>> When using the DMMG framework, I just used the default solvers. >>> I use the Galerkin process to generate the coarse matrices for >>> the multigrid cycle. >>> >>> Best, >>> Knut >>> >>> Siterer Ben Tay : >>> >>>> Hi Knut, >>>> >>>> I'm currently using boomeramg to solve my poisson eqn too. I'm >>>> using it >>>> on my structured C-grid. I found it to be faster than LU, >>>> especially as >>>> the grid size increases. However I use it as a preconditioner with >>>> GMRES as the solver. Have you tried this option? Although it's >>>> faster, >>>> the speed increase is usually less than double. 
It seems to be >>>> worse if >>>> there is a lot of stretching in the grid. >>>> >>>> Btw, your mention using the DMMG framework and it takes less than a >>>> sec. What solver or preconditioner did you use? It's 4 times faster >>>> than GMRES... >>>> >>>> thanks! >>>> >>>> knutert at stud.ntnu.no wrote: >>>>> Hello, >>>>> >>>>> I am trying to use the hypre multigrid solver to solve a Poisson >>>>> equation. >>>>> However, on a test case with grid size 257x257 it takes 40 >>>>> seconds to converge >>>>> on one processor when I run with >>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >>>>> >>>>> Using the DMMG framework, the same problem takes less than a >>>>> second, >>>>> and the default gmres solver uses only four seconds. >>>>> >>>>> Am I somehow using the solver the wrong way, or is this >>>>> performance expected? >>>>> >>>>> Regards >>>>> Knut Erik Teigen >>>>> >>>>> >>> >>> >>> > > > From bsmith at mcs.anl.gov Tue Feb 19 17:04:14 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Feb 2008 17:04:14 -0600 Subject: Poor performance with BoomerAMG? In-Reply-To: References: <20080215163600.ABA57782@batman.int.colorado.edu> <30BAACA0-5EAE-4A7C-8BD7-2E34071CAE57@mcs.anl.gov> Message-ID: <9CC301B6-B41A-49EF-B9E2-43CF7890CF10@mcs.anl.gov> Trottenberg has a discussion page 178; see the box that begins at the bottom of the page and continues onto the next one). See also the discussion at the bottom of page 182 with equations 5.6.14 and 5.6.15, I totally disagree with his suggestion of interpolating boundary nodes differently from interior nodes. It makes the code unnecessarily complicated. So long as you have the boundary equations suitably scaled you can simply interpolate everywhere identically. Barry On Feb 19, 2008, at 8:21 AM, jens.madsen at risoe.dk wrote: > Hi Barry > > Two questions. > > 1) What do you mean with "volume" and "wrong scaling"? Could > translate this to some other terms? I have a book by Ulrich > Trottenberg "Multigrid" and the book by Saad, but could not find > similar. > > 2) Do you know of any summerschools in scientific computing, > focusing on Krylov methods, multigrids and preconditioning(all > parallel)? > > Kind Regards > > Jens Madsen > Ph.d.-studerende > Phone direct +45 4677 4560 > Mobile > jens.madsen at risoe.dk > > Optics and Plasma Research Department > Ris? National Laboratory > Technical University of Denmark - DTU > Building 128, P.O. Box 49 > DK-4000 Roskilde, Denmark > Tel +45 4677 4500 > Fax +45 4677 4565 > www.risoe.dk > > From 1 January 2007, Ris? National Laboratory, the Danish Institute > for Food and Veterinary Research, > the Danish Institute for Fisheries Research, the Danish National > Space Center and > the Danish Transport Research Institute have been merged with > the Technical University of Denmark (DTU) with DTU as the continuing > unit. > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] On Behalf Of Barry Smith > Sent: Saturday, February 16, 2008 6:49 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Poor performance with BoomerAMG? > > > All multigrid solvers depend on proper scaling of the variables. > For example > for a Laplacian operator the matrix entries are > > \integral \grad \phi_i dot \grad \phi_j > > now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms > in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the > volume is O(h^3) > meaning the matrix entries are O(h). 
Now say you impose a Dirichlet > boundary > conditions by just saying u_k = g_k. In 2d this is ok but in 3d > you need to > use h*u_k = h*g_k otherwise when you restrict to the coarser grid the > resulting matrix entries for the boundary are "out of whack" with the > matrix > entries for the interior of the domain. > > Actually most preconditioners and Krylov methods behavior does depend > on the row scaling; multigrid is just particularly sensitive. > > Barry > > > On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote: > >> >> >>> Be careful how you handle boundary conditions; you need to make sure >>> they have the same scaling as the other equations. >> >> Could you clarify what you mean? Is boomerAMG sensitive to scaling >> of matrix rows in a way that other solvers/preconditioners are not? >> >> Andrew >> >>> >>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: >>> >>>> Hi Ben, >>>> >>>> Thank you for answering. With gmres and boomeramg I get a run time >>>> of >>>> 2s, so that is much better. However, if I increase the grid size to >>>> 513x513, I get a run time of one minute. With richardson, it fails >>>> to converge. >>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for >>>> the 513x513 problem. >>>> >>>> When using the DMMG framework, I just used the default solvers. >>>> I use the Galerkin process to generate the coarse matrices for >>>> the multigrid cycle. >>>> >>>> Best, >>>> Knut >>>> >>>> Siterer Ben Tay : >>>> >>>>> Hi Knut, >>>>> >>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm >>>>> using it >>>>> on my structured C-grid. I found it to be faster than LU, >>>>> especially as >>>>> the grid size increases. However I use it as a preconditioner with >>>>> GMRES as the solver. Have you tried this option? Although it's >>>>> faster, >>>>> the speed increase is usually less than double. It seems to be >>>>> worse if >>>>> there is a lot of stretching in the grid. >>>>> >>>>> Btw, your mention using the DMMG framework and it takes less >>>>> than a >>>>> sec. What solver or preconditioner did you use? It's 4 times >>>>> faster >>>>> than GMRES... >>>>> >>>>> thanks! >>>>> >>>>> knutert at stud.ntnu.no wrote: >>>>>> Hello, >>>>>> >>>>>> I am trying to use the hypre multigrid solver to solve a Poisson >>>>>> equation. >>>>>> However, on a test case with grid size 257x257 it takes 40 >>>>>> seconds to converge >>>>>> on one processor when I run with >>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre >>>>>> boomeramg >>>>>> >>>>>> Using the DMMG framework, the same problem takes less than a >>>>>> second, >>>>>> and the default gmres solver uses only four seconds. >>>>>> >>>>>> Am I somehow using the solver the wrong way, or is this >>>>>> performance expected? >>>>>> >>>>>> Regards >>>>>> Knut Erik Teigen >>>>>> >>>>>> >>>> >>>> >>>> >>> >> > > From knutert at stud.ntnu.no Wed Feb 20 01:47:22 2008 From: knutert at stud.ntnu.no (knutert at stud.ntnu.no) Date: Wed, 20 Feb 2008 08:47:22 +0100 Subject: Poor performance with BoomerAMG? In-Reply-To: References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov> <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no> Message-ID: <20080220084722.xljveqv9zc44gsw4@webmail.ntnu.no> Wow, that is embarrassing...I had put -pc_type_hypre instead of _pc_hypre_type. Thanks! -Knut Erik- Siterer Barry Smith : > > BoomerAMG works like a charm. 
Likely you forgot the -pc_hypre_type > boomeramg > > Hmm, I think I'll change the default solver to boomeramg > > Barry > > barry-smiths-macbook-pro-17:ksp/examples/tutorials] bsmith% ./ex1f > -ksp_monitor -pc_type hypre -pc_hypre_type boomeramg -m 513 -n 513 > -ksp_type richardson -ksp_view > p= 1 > 0 KSP Residual norm 4.213878296084e+03 > 1 KSP Residual norm 2.135189837330e+02 > 2 KSP Residual norm 1.225934028865e+01 > 3 KSP Residual norm 7.255859884400e-01 > 4 KSP Residual norm 4.353504737395e-02 > 5 KSP Residual norm 2.643035146258e-03 > 6 KSP Residual norm 1.628271972668e-04 > KSP Object: > type: richardson > Richardson: damping factor=1 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-07, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type Falgout > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=263169, cols=263169 > total: nonzeros=1313793, allocated nonzeros=1315845 > not using I-node routines > Iterations: 7 > > [barry-smiths-macbook-pro-17:ksp/examples/tutorials] bsmith% ./ex1f > -ksp_monitor -pc_type hypre -pc_hypre_type boomeramg -m 513 -n 513 > -ksp_type gmres -ksp_view > p= 1 > 0 KSP Residual norm 4.213878296084e+03 > 1 KSP Residual norm 5.272381634094e+01 > 2 KSP Residual norm 8.107668116258e-01 > 3 KSP Residual norm 1.807380875232e-02 > 4 KSP Residual norm 4.068259191532e-04 > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-07, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax 
up symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type Falgout > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=263169, cols=263169 > total: nonzeros=1313793, allocated nonzeros=1315845 > not using I-node routines > Iterations: 4 > > > On Feb 18, 2008, at 1:57 AM, knutert at stud.ntnu.no wrote: > >> Thank you for the reply, Barry. >> >> The same thing happens if I use hypre with the DMMG solver. >> As you say, with hypre, the convergence is extremely slow, requiring >> a lot of iterations, 1413 iterations (1820 if I use richardson) for >> a 257x257 >> problem, while the default only needs 5. >> >> I use the same way of handling boundary conditions in the two codes. >> I've also compared the coeff matrix and rhs, and they are equal. >> >> -Knut Erik- >> >> Siterer Barry Smith : >> >>> >>> Run with the DMMG solver with the option -pc_type hypre >>> What happens? Then run again with the additional option -ksp_type >>> richardson >>> >>> Is hypre taking many, many iterations which is causing the slow speed? >>> >>> I expect there is something wrong with your code that does not use DMMG. >>> Be careful how you handle boundary conditions; you need to make sure >>> they have the same scaling as the other equations. >>> >>> Barry >>> >>> >>> >>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: >>> >>>> Hi Ben, >>>> >>>> Thank you for answering. With gmres and boomeramg I get a run time of >>>> 2s, so that is much better. However, if I increase the grid size to >>>> 513x513, I get a run time of one minute. With richardson, it >>>> fails to converge. >>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s >>>> for the 513x513 problem. >>>> >>>> When using the DMMG framework, I just used the default solvers. >>>> I use the Galerkin process to generate the coarse matrices for >>>> the multigrid cycle. >>>> >>>> Best, >>>> Knut >>>> >>>> Siterer Ben Tay : >>>> >>>>> Hi Knut, >>>>> >>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm using it >>>>> on my structured C-grid. I found it to be faster than LU, especially as >>>>> the grid size increases. However I use it as a preconditioner with >>>>> GMRES as the solver. Have you tried this option? Although it's faster, >>>>> the speed increase is usually less than double. It seems to be worse if >>>>> there is a lot of stretching in the grid. >>>>> >>>>> Btw, your mention using the DMMG framework and it takes less than a >>>>> sec. What solver or preconditioner did you use? It's 4 times faster >>>>> than GMRES... >>>>> >>>>> thanks! >>>>> >>>>> knutert at stud.ntnu.no wrote: >>>>>> Hello, >>>>>> >>>>>> I am trying to use the hypre multigrid solver to solve a >>>>>> Poisson equation. >>>>>> However, on a test case with grid size 257x257 it takes 40 >>>>>> seconds to converge >>>>>> on one processor when I run with >>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg >>>>>> >>>>>> Using the DMMG framework, the same problem takes less than a second, >>>>>> and the default gmres solver uses only four seconds. >>>>>> >>>>>> Am I somehow using the solver the wrong way, or is this >>>>>> performance expected? 
>>>>>> >>>>>> Regards >>>>>> Knut Erik Teigen >>>>>> >>>>>> >>>> >>>> >>>> >> >> >> From jens.madsen at risoe.dk Wed Feb 20 14:54:18 2008 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Wed, 20 Feb 2008 21:54:18 +0100 Subject: Poor performance with BoomerAMG? In-Reply-To: <9CC301B6-B41A-49EF-B9E2-43CF7890CF10@mcs.anl.gov> Message-ID: Thank you Barry. I'll take a look at it:-) Did you have any summerschool suggestions? Kind Regards Jens -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, February 20, 2008 12:04 AM To: petsc-users at mcs.anl.gov Subject: Re: Poor performance with BoomerAMG? Trottenberg has a discussion page 178; see the box that begins at the bottom of the page and continues onto the next one). See also the discussion at the bottom of page 182 with equations 5.6.14 and 5.6.15, I totally disagree with his suggestion of interpolating boundary nodes differently from interior nodes. It makes the code unnecessarily complicated. So long as you have the boundary equations suitably scaled you can simply interpolate everywhere identically. Barry On Feb 19, 2008, at 8:21 AM, jens.madsen at risoe.dk wrote: > Hi Barry > > Two questions. > > 1) What do you mean with "volume" and "wrong scaling"? Could > translate this to some other terms? I have a book by Ulrich > Trottenberg "Multigrid" and the book by Saad, but could not find > similar. > > 2) Do you know of any summerschools in scientific computing, > focusing on Krylov methods, multigrids and preconditioning(all > parallel)? > > Kind Regards > > Jens Madsen > Ph.d.-studerende > Phone direct +45 4677 4560 > Mobile > jens.madsen at risoe.dk > > Optics and Plasma Research Department > Ris? National Laboratory > Technical University of Denmark - DTU > Building 128, P.O. Box 49 > DK-4000 Roskilde, Denmark > Tel +45 4677 4500 > Fax +45 4677 4565 > www.risoe.dk > > From 1 January 2007, Ris? National Laboratory, the Danish Institute > for Food and Veterinary Research, > the Danish Institute for Fisheries Research, the Danish National > Space Center and > the Danish Transport Research Institute have been merged with > the Technical University of Denmark (DTU) with DTU as the continuing > unit. > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] On Behalf Of Barry Smith > Sent: Saturday, February 16, 2008 6:49 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Poor performance with BoomerAMG? > > > All multigrid solvers depend on proper scaling of the variables. > For example > for a Laplacian operator the matrix entries are > > \integral \grad \phi_i dot \grad \phi_j > > now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms > in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the > volume is O(h^3) > meaning the matrix entries are O(h). Now say you impose a Dirichlet > boundary > conditions by just saying u_k = g_k. In 2d this is ok but in 3d > you need to > use h*u_k = h*g_k otherwise when you restrict to the coarser grid the > resulting matrix entries for the boundary are "out of whack" with the > matrix > entries for the interior of the domain. > > Actually most preconditioners and Krylov methods behavior does depend > on the row scaling; multigrid is just particularly sensitive. 
> > Barry > > > On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote: > >> >> >>> Be careful how you handle boundary conditions; you need to make sure >>> they have the same scaling as the other equations. >> >> Could you clarify what you mean? Is boomerAMG sensitive to scaling >> of matrix rows in a way that other solvers/preconditioners are not? >> >> Andrew >> >>> >>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: >>> >>>> Hi Ben, >>>> >>>> Thank you for answering. With gmres and boomeramg I get a run time >>>> of >>>> 2s, so that is much better. However, if I increase the grid size to >>>> 513x513, I get a run time of one minute. With richardson, it fails >>>> to converge. >>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for >>>> the 513x513 problem. >>>> >>>> When using the DMMG framework, I just used the default solvers. >>>> I use the Galerkin process to generate the coarse matrices for >>>> the multigrid cycle. >>>> >>>> Best, >>>> Knut >>>> >>>> Siterer Ben Tay : >>>> >>>>> Hi Knut, >>>>> >>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm >>>>> using it >>>>> on my structured C-grid. I found it to be faster than LU, >>>>> especially as >>>>> the grid size increases. However I use it as a preconditioner with >>>>> GMRES as the solver. Have you tried this option? Although it's >>>>> faster, >>>>> the speed increase is usually less than double. It seems to be >>>>> worse if >>>>> there is a lot of stretching in the grid. >>>>> >>>>> Btw, your mention using the DMMG framework and it takes less >>>>> than a >>>>> sec. What solver or preconditioner did you use? It's 4 times >>>>> faster >>>>> than GMRES... >>>>> >>>>> thanks! >>>>> >>>>> knutert at stud.ntnu.no wrote: >>>>>> Hello, >>>>>> >>>>>> I am trying to use the hypre multigrid solver to solve a Poisson >>>>>> equation. >>>>>> However, on a test case with grid size 257x257 it takes 40 >>>>>> seconds to converge >>>>>> on one processor when I run with >>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre >>>>>> boomeramg >>>>>> >>>>>> Using the DMMG framework, the same problem takes less than a >>>>>> second, >>>>>> and the default gmres solver uses only four seconds. >>>>>> >>>>>> Am I somehow using the solver the wrong way, or is this >>>>>> performance expected? >>>>>> >>>>>> Regards >>>>>> Knut Erik Teigen >>>>>> >>>>>> >>>> >>>> >>>> >>> >> > > From bsmith at mcs.anl.gov Wed Feb 20 18:57:57 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Feb 2008 18:57:57 -0600 Subject: Poor performance with BoomerAMG? In-Reply-To: References: Message-ID: On Feb 20, 2008, at 2:54 PM, jens.madsen at risoe.dk wrote: > Thank you Barry. I'll take a look at it:-) > > Did you have any summerschool suggestions? Sorry I don't know of any, Barry > > > Kind Regards > > Jens > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] On Behalf Of Barry Smith > Sent: Wednesday, February 20, 2008 12:04 AM > To: petsc-users at mcs.anl.gov > Subject: Re: Poor performance with BoomerAMG? > > > Trottenberg has a discussion page 178; see the box that begins at > the bottom of the page and > continues onto the next one). See also the discussion at the bottom of > page 182 with equations > 5.6.14 and 5.6.15, > > I totally disagree with his suggestion of interpolating boundary > nodes differently from > interior nodes. It makes the code unnecessarily complicated. 
So long > as you have the > boundary equations suitably scaled you can simply interpolate > everywhere identically. > > Barry > > > > On Feb 19, 2008, at 8:21 AM, jens.madsen at risoe.dk wrote: > >> Hi Barry >> >> Two questions. >> >> 1) What do you mean with "volume" and "wrong scaling"? Could >> translate this to some other terms? I have a book by Ulrich >> Trottenberg "Multigrid" and the book by Saad, but could not find >> similar. >> >> 2) Do you know of any summerschools in scientific computing, >> focusing on Krylov methods, multigrids and preconditioning(all >> parallel)? >> >> Kind Regards >> >> Jens Madsen >> Ph.d.-studerende >> Phone direct +45 4677 4560 >> Mobile >> jens.madsen at risoe.dk >> >> Optics and Plasma Research Department >> Ris? National Laboratory >> Technical University of Denmark - DTU >> Building 128, P.O. Box 49 >> DK-4000 Roskilde, Denmark >> Tel +45 4677 4500 >> Fax +45 4677 4565 >> www.risoe.dk >> >> From 1 January 2007, Ris? National Laboratory, the Danish Institute >> for Food and Veterinary Research, >> the Danish Institute for Fisheries Research, the Danish National >> Space Center and >> the Danish Transport Research Institute have been merged with >> the Technical University of Denmark (DTU) with DTU as the continuing >> unit. >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov >> ] On Behalf Of Barry Smith >> Sent: Saturday, February 16, 2008 6:49 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: Poor performance with BoomerAMG? >> >> >> All multigrid solvers depend on proper scaling of the variables. >> For example >> for a Laplacian operator the matrix entries are >> >> \integral \grad \phi_i dot \grad \phi_j >> >> now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms >> in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the >> volume is O(h^3) >> meaning the matrix entries are O(h). Now say you impose a Dirichlet >> boundary >> conditions by just saying u_k = g_k. In 2d this is ok but in 3d >> you need to >> use h*u_k = h*g_k otherwise when you restrict to the coarser grid the >> resulting matrix entries for the boundary are "out of whack" with the >> matrix >> entries for the interior of the domain. >> >> Actually most preconditioners and Krylov methods behavior does depend >> on the row scaling; multigrid is just particularly sensitive. >> >> Barry >> >> >> On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote: >> >>> >>> >>>> Be careful how you handle boundary conditions; you need to make >>>> sure >>>> they have the same scaling as the other equations. >>> >>> Could you clarify what you mean? Is boomerAMG sensitive to scaling >>> of matrix rows in a way that other solvers/preconditioners are not? >>> >>> Andrew >>> >>>> >>>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote: >>>> >>>>> Hi Ben, >>>>> >>>>> Thank you for answering. With gmres and boomeramg I get a run time >>>>> of >>>>> 2s, so that is much better. However, if I increase the grid size >>>>> to >>>>> 513x513, I get a run time of one minute. With richardson, it fails >>>>> to converge. >>>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s >>>>> for >>>>> the 513x513 problem. >>>>> >>>>> When using the DMMG framework, I just used the default solvers. >>>>> I use the Galerkin process to generate the coarse matrices for >>>>> the multigrid cycle. 
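For reference, the "Galerkin process" mentioned here is the triple product A_coarse = P^T A_fine P, with P the interpolation from the coarse grid to the fine grid; PETSc exposes it directly as MatPtAP. A minimal sketch, with an illustrative function name and a guessed fill estimate of 1.0:

#include "petscmat.h"

/* Illustrative only: form the Galerkin coarse-grid operator Acoarse = P^T * Afine * P. */
PetscErrorCode FormGalerkinCoarseOperator(Mat Afine,Mat P,Mat *Acoarse)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatPtAP(Afine,P,MAT_INITIAL_MATRIX,1.0,Acoarse);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}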
>>>>> >>>>> Best, >>>>> Knut >>>>> >>>>> Siterer Ben Tay : >>>>> >>>>>> Hi Knut, >>>>>> >>>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm >>>>>> using it >>>>>> on my structured C-grid. I found it to be faster than LU, >>>>>> especially as >>>>>> the grid size increases. However I use it as a preconditioner >>>>>> with >>>>>> GMRES as the solver. Have you tried this option? Although it's >>>>>> faster, >>>>>> the speed increase is usually less than double. It seems to be >>>>>> worse if >>>>>> there is a lot of stretching in the grid. >>>>>> >>>>>> Btw, your mention using the DMMG framework and it takes less >>>>>> than a >>>>>> sec. What solver or preconditioner did you use? It's 4 times >>>>>> faster >>>>>> than GMRES... >>>>>> >>>>>> thanks! >>>>>> >>>>>> knutert at stud.ntnu.no wrote: >>>>>>> Hello, >>>>>>> >>>>>>> I am trying to use the hypre multigrid solver to solve a Poisson >>>>>>> equation. >>>>>>> However, on a test case with grid size 257x257 it takes 40 >>>>>>> seconds to converge >>>>>>> on one processor when I run with >>>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre >>>>>>> boomeramg >>>>>>> >>>>>>> Using the DMMG framework, the same problem takes less than a >>>>>>> second, >>>>>>> and the default gmres solver uses only four seconds. >>>>>>> >>>>>>> Am I somehow using the solver the wrong way, or is this >>>>>>> performance expected? >>>>>>> >>>>>>> Regards >>>>>>> Knut Erik Teigen >>>>>>> >>>>>>> >>>>> >>>>> >>>>> >>>> >>> >> >> > > From recrusader at gmail.com Sun Feb 24 14:07:28 2008 From: recrusader at gmail.com (Yujie) Date: Mon, 25 Feb 2008 04:07:28 +0800 Subject: about MatMat*() functions Message-ID: <7ff0ee010802241207g728a3edcia610ef7e8f16e492@mail.gmail.com> hi, I am wondering whether all the MatMat*() only are suitable for sequential matrix. I know MatMatSolve() is for sequential matrix. How about MatMatMult()? Thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Sun Feb 24 16:00:19 2008 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 25 Feb 2008 09:00:19 +1100 Subject: about MatMat*() functions In-Reply-To: <7ff0ee010802241207g728a3edcia610ef7e8f16e492@mail.gmail.com> References: <7ff0ee010802241207g728a3edcia610ef7e8f16e492@mail.gmail.com> Message-ID: <956373f0802241400w10d13fauf96b0bf47ab51966@mail.gmail.com> Hey, MatMatMult() will work for MPIAIJ matrices. So will MatPtAP(). If you are ever in doubt, the easiest way (I find) to check whether I certain operation is supported is to just look at the source and see which ops. are defined. I can usually find the answer with the online docs. In this case, starting with the type (MATMPIAIJ), and then searching mpiaij.c for MatMatMult. If you see something like your desired op. in struct _MatOps (i.e. MatMatMult_MPIAIJ_MPIAIJ) then it's supported. Now you know the function defining your operation and you can search for it (with grep as it might be in another file not online) to find out exactly what it does. Cheers, Dave On Mon, Feb 25, 2008 at 7:07 AM, Yujie wrote: > hi, > > I am wondering whether all the MatMat*() only are suitable for sequential > matrix. I know MatMatSolve() is for sequential matrix. How about > MatMatMult()? Thanks a lot. > > Regards, > Yujie > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From recrusader at gmail.com Tue Feb 26 19:46:59 2008 From: recrusader at gmail.com (Yujie) Date: Tue, 26 Feb 2008 17:46:59 -0800 Subject: any examples to demonstrate how to Spooles package? Message-ID: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> Hi, everyone I have compiled PETSc with spooles. However, I try to find how to use this package in PETSc directory. I can't find any examples for it. Could you give me some advice? I want to use spooles to inverse a sparse matrix. thanks a lot. Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From sanjay at ce.berkeley.edu Wed Feb 27 01:10:31 2008 From: sanjay at ce.berkeley.edu (Sanjay Govindjee) Date: Wed, 27 Feb 2008 08:10:31 +0100 Subject: any examples to demonstrate how to Spooles package? In-Reply-To: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> Message-ID: <47C50CE7.5060703@ce.berkeley.edu> from my make file -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary -on_error_attach_debugger -mat_spooles_symmetryflag 0 -options_left -sg Yujie wrote: > Hi, everyone > > I have compiled PETSc with spooles. However, I try to find how to use > this package in PETSc directory. I can't find any examples for it. > Could you give me some advice? I want to use spooles to inverse a > sparse matrix. thanks a lot. > > Regards, > Yujie From amjad11 at gmail.com Wed Feb 27 01:11:14 2008 From: amjad11 at gmail.com (amjad ali) Date: Wed, 27 Feb 2008 12:11:14 +0500 Subject: few questions Message-ID: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com> Hello all, Please answer the following, 1) What is the difference between static and dynamic versions of petsc? 2) How to check that which version (static or dynamic) is installed on a system? 3) Plz comment on if there is any effect of static/dynamic version while using/calling petsc from some external package? 4) how to update an already installed petsc version with newerer/latest version of petsc? Thanks to all. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aja2111 at columbia.edu Wed Feb 27 02:05:24 2008 From: aja2111 at columbia.edu (Aron Ahmadia) Date: Wed, 27 Feb 2008 03:05:24 -0500 Subject: few questions In-Reply-To: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com> References: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com> Message-ID: <37604ab40802270005y22b27007o247d0bae0c8817fd@mail.gmail.com> On Wed, Feb 27, 2008 at 2:11 AM, amjad ali wrote: > Hello all, > > Please answer the following, > > 1) What is the difference between static and dynamic versions of petsc? > Start here: http://en.wikipedia.org/wiki/Library_(computer_science)#Static_libraries In PETSc the primary differences end up being the size and link-time. Statically-linked executables need all the possible code that they could contain in the actual file, so they can be up to several MB in size. Dynamically-linked executables are much leaner for the small price of a little extra load time. Now if you're talking about Dynamically-Loaded code, that's a bit hairier... > 2) How to check that which version (static or dynamic) is installed on a > system? > The fastest way is probably to look in $PETSC_DIR/$PETSC_ARCH/ If you see .a files, you've got static libraries, if you see .so or .dylib files you've got dynamic libraries. 
> 3) Plz comment on if there is any effect of static/dynamic version while > using/calling petsc from some external package? > I'm not sure what you're asking here. If you mean "Is there a difference between calling dynamically compiled PETSc from statically compiled PETSc" the answer is no. There are differences in how you compile and link the two version but your actual code would look the same. Again, if we're talking about dynamically loaded code (using something like dl_open), then your code will look different. > 4) how to update an already installed petsc version with newerer/latest > version of petsc? > Doing this in place is more trouble than it's worth if you're not using a development copy . I just grab the latest copy of PETSc from their webpage, then re-build and re-install. ~A From tstitt at cscs.ch Wed Feb 27 04:09:30 2008 From: tstitt at cscs.ch (Timothy Stitt) Date: Wed, 27 Feb 2008 11:09:30 +0100 Subject: Error with -log_history Message-ID: <47C536DA.2030002@cscs.ch> Hi PETSc users/developers, I am having some difficulties with the -log_history option on example PETSc codes at my local installation (FYI: Cray XT architecture). When executing the ex2.c code (for example) on multiple processors with the -log_history option I keep getting: Signal number 11 SEGV: Segmentation Violation, probably memory access out of range The log history file is created though but contains the following: Petsc Release Version 2.3.3, Patch 3, Fri Jun 15 16:51:25 CDT 2007 HG revision: f051789beadcd36f77fb6111d20225e26ed7cc0d Wed Feb 27 10:35:55 2008 [8191]PETSC ERROR: [18945200]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [59]PETSC ER ROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [ 23118368]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [19014780]PETSC ERROR: Without the -log_history option the code runs as expected. Is this a build/architecture issue at my end? Thanks in advance for any advice given. Tim. -- Timothy Stitt HPC Applications Analyst Swiss National Supercomputing Centre (CSCS) Galleria 2 - Via Cantonale CH-6928 Manno, Switzerland +41 (0) 91 610 8233 stitt at cscs.ch From bsmith at mcs.anl.gov Wed Feb 27 11:29:27 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Feb 2008 11:29:27 -0600 Subject: Error with -log_history In-Reply-To: <47C536DA.2030002@cscs.ch> References: <47C536DA.2030002@cscs.ch> Message-ID: Was the file opened? That is did you still get an empty file or a file with a few lines (if so please send the lines to petsc-maint at mcs.anl.gov). Is there are problem on ONE process? Barry On Feb 27, 2008, at 4:09 AM, Timothy Stitt wrote: > Hi PETSc users/developers, > > I am having some difficulties with the -log_history option on > example PETSc codes at my local installation (FYI: Cray XT > architecture). 
When executing the ex2.c code (for example) on > multiple processors with the -log_history option I keep getting: > > Signal number 11 SEGV: Segmentation Violation, probably memory > access out of range > > The log history file is created though but contains the following: > > Petsc Release Version 2.3.3, Patch 3, Fri Jun 15 16:51:25 CDT 2007 > HG revision: f051789beadcd36f77fb6111d20225e26ed7cc0d Wed Feb 27 > 10:35:55 2008 > [8191]PETSC ERROR: [18945200]PETSC ERROR: [118163408]PETSC ERROR: > [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC > ERROR: [118163408]PETSC ERROR: [59]PETSC ER > ROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: > [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC > ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [ > 23118368]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC > ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: > [19014780]PETSC ERROR: > > Without the -log_history option the code runs as expected. > > Is this a build/architecture issue at my end? > > Thanks in advance for any advice given. > > Tim. > > -- > Timothy Stitt > HPC Applications Analyst > > Swiss National Supercomputing Centre (CSCS) > Galleria 2 - Via Cantonale > CH-6928 Manno, Switzerland > > +41 (0) 91 610 8233 > stitt at cscs.ch > From recrusader at gmail.com Wed Feb 27 11:05:47 2008 From: recrusader at gmail.com (Yujie) Date: Wed, 27 Feb 2008 09:05:47 -0800 Subject: any examples to demonstrate how to Spooles package? In-Reply-To: <47C50CE7.5060703@ce.berkeley.edu> References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> <47C50CE7.5060703@ce.berkeley.edu> Message-ID: <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com> Dear Sanjay: Thank you for your reply. I don't understand what you said. Now, I want to use spooles package to inverse a sparse SPD matrix. I have further checked the inferface about spooles in PETSc. I find although spooles can deal with AX=B (B may be a dense matrix) with parallel LU factorization. However, PETSc only provide the following: 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) I don't set b to a matrix even if I use 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo *info,Mat *F) for LU factorization. Could you have any suggestions about this? thanks a lot. Regards, Yujie On 2/26/08, Sanjay Govindjee wrote: > > from my make file > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary > -on_error_attach_debugger -mat_spooles_symmetryflag 0 -options_left > > > -sg > > > Yujie wrote: > > Hi, everyone > > > > I have compiled PETSc with spooles. However, I try to find how to use > > this package in PETSc directory. I can't find any examples for it. > > Could you give me some advice? I want to use spooles to inverse a > > sparse matrix. thanks a lot. > > > > Regards, > > Yujie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 27 13:12:54 2008 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Feb 2008 13:12:54 -0600 Subject: any examples to demonstrate how to Spooles package? 
In-Reply-To: <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com> References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> <47C50CE7.5060703@ce.berkeley.edu> <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com> Message-ID: On Wed, Feb 27, 2008 at 11:05 AM, Yujie wrote: > Dear Sanjay: > > Thank you for your reply. I don't understand what you said. Now, I want to > use spooles package to inverse a sparse SPD matrix. I have further checked > the inferface about spooles in PETSc. I find although spooles can deal with > AX=B (B may be a dense matrix) with parallel LU factorization. > However, PETSc only provide the following: > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) > I don't set b to a matrix even if I use > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo > *info,Mat *F) for LU factorization. > > Could you have any suggestions about this? thanks a lot. MatMatSolve() http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html Matt > Regards, > Yujie > > On 2/26/08, Sanjay Govindjee wrote: > > from my make file > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary > > -on_error_attach_debugger -mat_spooles_symmetryflag 0 -options_left > > > > > > -sg > > > > > > Yujie wrote: > > > Hi, everyone > > > > > > I have compiled PETSc with spooles. However, I try to find how to use > > > this package in PETSc directory. I can't find any examples for it. > > > Could you give me some advice? I want to use spooles to inverse a > > > sparse matrix. thanks a lot. > > > > > > Regards, > > > Yujie > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From recrusader at gmail.com Wed Feb 27 13:21:39 2008 From: recrusader at gmail.com (Yujie) Date: Wed, 27 Feb 2008 11:21:39 -0800 Subject: any examples to demonstrate how to Spooles package? In-Reply-To: References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> <47C50CE7.5060703@ce.berkeley.edu> <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com> Message-ID: <7ff0ee010802271121l4075e3cbl65bcd4e7a104e4f0@mail.gmail.com> Dear Matt: I checked the codes about MatMatSolve(). However, currently, PETSc didn't realize its parallel version. Is it right? I want to inverse the matrix parallelly. could you give me some examples about it? thanks a lot. Regards, Yujie On 2/27/08, Matthew Knepley wrote: > > On Wed, Feb 27, 2008 at 11:05 AM, Yujie wrote: > > Dear Sanjay: > > > > Thank you for your reply. I don't understand what you said. Now, I want > to > > use spooles package to inverse a sparse SPD matrix. I have further > checked > > the inferface about spooles in PETSc. I find although spooles can deal > with > > AX=B (B may be a dense matrix) with parallel LU factorization. > > However, PETSc only provide the following: > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) > > I don't set b to a matrix even if I use > > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo > > *info,Mat *F) for LU factorization. > > > > Could you have any suggestions about this? thanks a lot. 
> > > MatMatSolve() > > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html > > Matt > > > > Regards, > > Yujie > > > > On 2/26/08, Sanjay Govindjee wrote: > > > from my make file > > > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0 -options_left > > > > > > > > > -sg > > > > > > > > > Yujie wrote: > > > > Hi, everyone > > > > > > > > I have compiled PETSc with spooles. However, I try to find how to > use > > > > this package in PETSc directory. I can't find any examples for it. > > > > Could you give me some advice? I want to use spooles to inverse a > > > > sparse matrix. thanks a lot. > > > > > > > > Regards, > > > > Yujie > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aja2111 at columbia.edu Wed Feb 27 13:26:08 2008 From: aja2111 at columbia.edu (Aron Ahmadia) Date: Wed, 27 Feb 2008 14:26:08 -0500 Subject: any examples to demonstrate how to Spooles package? In-Reply-To: References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> <47C50CE7.5060703@ce.berkeley.edu> <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com> Message-ID: <37604ab40802271126l7773640axba21eaf0e3d3e928@mail.gmail.com> Hey Matt, You should probably clean up the documentation for MatMatSolve while you're at it, it's indicating that x and b are vectors... Also, should you reference the factor routine you need to use to get a factored matrix? ~A On Wed, Feb 27, 2008 at 2:12 PM, Matthew Knepley wrote: > On Wed, Feb 27, 2008 at 11:05 AM, Yujie wrote: > > Dear Sanjay: > > > > Thank you for your reply. I don't understand what you said. Now, I want to > > use spooles package to inverse a sparse SPD matrix. I have further checked > > the inferface about spooles in PETSc. I find although spooles can deal with > > AX=B (B may be a dense matrix) with parallel LU factorization. > > However, PETSc only provide the following: > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) > > I don't set b to a matrix even if I use > > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo > > *info,Mat *F) for LU factorization. > > > > Could you have any suggestions about this? thanks a lot. > > MatMatSolve() > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html > > Matt > > > > > Regards, > > Yujie > > > > On 2/26/08, Sanjay Govindjee wrote: > > > from my make file > > > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0 -options_left > > > > > > > > > -sg > > > > > > > > > Yujie wrote: > > > > Hi, everyone > > > > > > > > I have compiled PETSc with spooles. However, I try to find how to use > > > > this package in PETSc directory. I can't find any examples for it. > > > > Could you give me some advice? I want to use spooles to inverse a > > > > sparse matrix. thanks a lot. 
> > > > > > > > Regards, > > > > Yujie > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > From jens.madsen at risoe.dk Wed Feb 27 13:31:43 2008 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Wed, 27 Feb 2008 20:31:43 +0100 Subject: MG question Message-ID: Hi I hope that this question is not outside the scope of this mailinglist. As far as I understand PETSc uses preconditioned GMRES(or another KSP method) as pre- and postsmoother on all multigrid levels? I was just wondering why and where in the literature I can read about that method? I thought that a fast method would be to use MG (with Gauss-Seidel RB/zebra smothers) as a preconditioner for GMRES? I have looked at papers written by Oosterlee etc. Kind Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 27 13:32:11 2008 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Feb 2008 13:32:11 -0600 Subject: any examples to demonstrate how to Spooles package? In-Reply-To: <7ff0ee010802271121l4075e3cbl65bcd4e7a104e4f0@mail.gmail.com> References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> <47C50CE7.5060703@ce.berkeley.edu> <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com> <7ff0ee010802271121l4075e3cbl65bcd4e7a104e4f0@mail.gmail.com> Message-ID: On Wed, Feb 27, 2008 at 1:21 PM, Yujie wrote: > Dear Matt: > > I checked the codes about MatMatSolve(). However, currently, PETSc didn't > realize its parallel version. Is it right? I want to inverse the matrix > parallelly. could you give me some examples about it? thanks a lot. Thats right. The parallel version is not implemented. It looks like this would take significant work. Matt > Regards, > Yujie > > > > On 2/27/08, Matthew Knepley wrote: > > On Wed, Feb 27, 2008 at 11:05 AM, Yujie wrote: > > > Dear Sanjay: > > > > > > Thank you for your reply. I don't understand what you said. Now, I want > to > > > use spooles package to inverse a sparse SPD matrix. I have further > checked > > > the inferface about spooles in PETSc. I find although spooles can deal > with > > > AX=B (B may be a dense matrix) with parallel LU factorization. > > > However, PETSc only provide the following: > > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) > > > I don't set b to a matrix even if I use > > > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo > > > *info,Mat *F) for LU factorization. > > > > > > Could you have any suggestions about this? thanks a lot. > > > > > > MatMatSolve() > > > > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html > > > > Matt > > > > > > > Regards, > > > Yujie > > > > > > On 2/26/08, Sanjay Govindjee wrote: > > > > from my make file > > > > > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly > > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary > > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0 -options_left > > > > > > > > > > > > -sg > > > > > > > > > > > > Yujie wrote: > > > > > Hi, everyone > > > > > > > > > > I have compiled PETSc with spooles. However, I try to find how to > use > > > > > this package in PETSc directory. I can't find any examples for it. > > > > > Could you give me some advice? I want to use spooles to inverse a > > > > > sparse matrix. thanks a lot. 
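Since the parallel MatMatSolve discussed above is not available, one workaround for getting at inv(A) for a sparse SPD A is to solve A x_j = e_j column by column with a direct factorization, which does run in parallel. The sketch below is illustrative only: the function and variable names are invented, the calls follow the 2.3.x-era signatures (the four-argument KSPSetOperators, and VecDestroy/KSPDestroy without pointer arguments), and the factorization itself would be selected at run time, for example with the options from Sanjay's makefile (-ksp_type preonly -pc_type cholesky -mat_type mpisbaijspooles) so that Spooles does the work.

#include "petscksp.h"

/* Illustrative only: fill cols[j] with the j-th column of inv(A) by solving
   A x = e_j.  cols[] must already hold ncols vectors laid out like A's rows. */
PetscErrorCode InverseColumns(Mat A,Vec *cols,PetscInt ncols)
{
  PetscErrorCode ierr;
  KSP            ksp;
  Vec            e;
  PetscInt       j,low,high;

  PetscFunctionBegin;
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);      /* picks up -ksp_type and -pc_type */
  ierr = VecDuplicate(cols[0],&e);CHKERRQ(ierr);
  ierr = VecGetOwnershipRange(e,&low,&high);CHKERRQ(ierr);
  for (j=0; j<ncols; j++) {
    ierr = VecSet(e,0.0);CHKERRQ(ierr);
    if (j >= low && j < high) {                     /* only the owning process inserts the 1 */
      ierr = VecSetValue(e,j,1.0,INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = VecAssemblyBegin(e);CHKERRQ(ierr);
    ierr = VecAssemblyEnd(e);CHKERRQ(ierr);
    ierr = KSPSolve(ksp,e,cols[j]);CHKERRQ(ierr);   /* cols[j] becomes the j-th column of inv(A) */
  }
  ierr = VecDestroy(e);CHKERRQ(ierr);
  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Note that this is n separate solves for an n-by-n matrix, so it is only sensible when the explicit inverse is really needed.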
> > > > > > > > > > Regards, > > > > > Yujie > > > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From knepley at gmail.com Wed Feb 27 13:33:57 2008 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Feb 2008 13:33:57 -0600 Subject: any examples to demonstrate how to Spooles package? In-Reply-To: <37604ab40802271126l7773640axba21eaf0e3d3e928@mail.gmail.com> References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> <47C50CE7.5060703@ce.berkeley.edu> <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com> <37604ab40802271126l7773640axba21eaf0e3d3e928@mail.gmail.com> Message-ID: On Wed, Feb 27, 2008 at 1:26 PM, Aron Ahmadia wrote: > Hey Matt, > > You should probably clean up the documentation for MatMatSolve while > you're at it, it's indicating that x and b are vectors... Also, > should you reference the factor routine you need to use to get a > factored matrix? The dev has the correct args. I added links to LU and Cholesky. Matt > ~A > > > > On Wed, Feb 27, 2008 at 2:12 PM, Matthew Knepley wrote: > > On Wed, Feb 27, 2008 at 11:05 AM, Yujie wrote: > > > Dear Sanjay: > > > > > > Thank you for your reply. I don't understand what you said. Now, I want to > > > use spooles package to inverse a sparse SPD matrix. I have further checked > > > the inferface about spooles in PETSc. I find although spooles can deal with > > > AX=B (B may be a dense matrix) with parallel LU factorization. > > > However, PETSc only provide the following: > > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) > > > I don't set b to a matrix even if I use > > > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo > > > *info,Mat *F) for LU factorization. > > > > > > Could you have any suggestions about this? thanks a lot. > > > > MatMatSolve() > > > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html > > > > Matt > > > > > > > > > Regards, > > > Yujie > > > > > > On 2/26/08, Sanjay Govindjee wrote: > > > > from my make file > > > > > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly > > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary > > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0 -options_left > > > > > > > > > > > > -sg > > > > > > > > > > > > Yujie wrote: > > > > > Hi, everyone > > > > > > > > > > I have compiled PETSc with spooles. However, I try to find how to use > > > > > this package in PETSc directory. I can't find any examples for it. > > > > > Could you give me some advice? I want to use spooles to inverse a > > > > > sparse matrix. thanks a lot. > > > > > > > > > > Regards, > > > > > Yujie > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From recrusader at gmail.com Wed Feb 27 13:40:52 2008 From: recrusader at gmail.com (Yujie) Date: Wed, 27 Feb 2008 11:40:52 -0800 Subject: any examples to demonstrate how to Spooles package? In-Reply-To: References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com> <47C50CE7.5060703@ce.berkeley.edu> <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com> <7ff0ee010802271121l4075e3cbl65bcd4e7a104e4f0@mail.gmail.com> Message-ID: <7ff0ee010802271140l3852e586sbc45180cf2ea23ba@mail.gmail.com> This is why I have recompiled PETSc with spooles. spooles can deal with AX=Y(Y is a matrix). However, PETSc only provide the following: 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) I don't set b to a matrix even if I use 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo *info,Mat *F) for LU factorization Could you give me some advice or examples? thanks a lot. Regards, Yujie On Wed, Feb 27, 2008 at 11:32 AM, Matthew Knepley wrote: > On Wed, Feb 27, 2008 at 1:21 PM, Yujie wrote: > > Dear Matt: > > > > I checked the codes about MatMatSolve(). However, currently, PETSc > didn't > > realize its parallel version. Is it right? I want to inverse the matrix > > parallelly. could you give me some examples about it? thanks a lot. > > Thats right. The parallel version is not implemented. It looks like this > would > take significant work. > > Matt > > > Regards, > > Yujie > > > > > > > > On 2/27/08, Matthew Knepley wrote: > > > On Wed, Feb 27, 2008 at 11:05 AM, Yujie wrote: > > > > Dear Sanjay: > > > > > > > > Thank you for your reply. I don't understand what you said. Now, I > want > > to > > > > use spooles package to inverse a sparse SPD matrix. I have further > > checked > > > > the inferface about spooles in PETSc. I find although spooles can > deal > > with > > > > AX=B (B may be a dense matrix) with parallel LU factorization. > > > > However, PETSc only provide the following: > > > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) > > > > I don't set b to a matrix even if I use > > > > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo > > > > *info,Mat *F) for LU factorization. > > > > > > > > Could you have any suggestions about this? thanks a lot. > > > > > > > > > MatMatSolve() > > > > > > > > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html > > > > > > Matt > > > > > > > > > > Regards, > > > > Yujie > > > > > > > > On 2/26/08, Sanjay Govindjee wrote: > > > > > from my make file > > > > > > > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly > > > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles > -log_summary > > > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0 > -options_left > > > > > > > > > > > > > > > -sg > > > > > > > > > > > > > > > Yujie wrote: > > > > > > Hi, everyone > > > > > > > > > > > > I have compiled PETSc with spooles. However, I try to find how > to > > use > > > > > > this package in PETSc directory. I can't find any examples for > it. > > > > > > Could you give me some advice? I want to use spooles to inverse > a > > > > > > sparse matrix. thanks a lot. > > > > > > > > > > > > Regards, > > > > > > Yujie > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to which > > > their experiments lead. 
> > > -- Norbert Wiener > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 27 13:40:59 2008 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Feb 2008 13:40:59 -0600 Subject: MG question In-Reply-To: References: Message-ID: On Wed, Feb 27, 2008 at 1:31 PM, wrote: > Hi > > I hope that this question is not outside the scope of this mailinglist. > > As far as I understand PETSc uses preconditioned GMRES(or another KSP > method) as pre- and postsmoother on all multigrid levels? I was just This is the default. However, you can use any combination of KSP/PC on any given level with options. For instance, -mg_level_ksp_type richardson -mg_level_pc_type sor gives "regulation" MG. We default to GMRES because it is more robust. > wondering why and where in the literature I can read about that method? I > thought that a fast method would be to use MG (with Gauss-Seidel RB/zebra > smothers) as a preconditioner for GMRES? I have looked at papers written by > Oosterlee etc. In order to prove something about GMRES/MG, you would need to prove something about the convergence of GMRES on the operators at each level. Good luck. GMRES is the enemy of all convergence proofs. See paper by Greenbaum, Strakos, & Ptak. If SOR works, great and it is much faster. However, GMRES/ILU(0) tends to be more robust. Matt > Kind Regards -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Wed Feb 27 13:48:33 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Feb 2008 13:48:33 -0600 Subject: MG question In-Reply-To: References: Message-ID: <32E0E8CE-D7AF-4B90-89A9-D43EFD17556B@mcs.anl.gov> The reason we default to these "very strong" (gmres + ILU(0)) smoothers is robustness, we'd rather have the solver "just work" for our users and be a little bit slower than have it often fail but be optimal for special cases. Most of the MG community has a mental block about using Krylov methods, this is why you find few papers that discuss their use with multigrid. Note also that using several iterations of GMRES (with or without ILU(0)) is still order n work so you still get the optimal convergence of mutligrid methods (when they work, of course). Barry On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote: > On Wed, Feb 27, 2008 at 1:31 PM, wrote: >> Hi >> >> I hope that this question is not outside the scope of this >> mailinglist. >> >> As far as I understand PETSc uses preconditioned GMRES(or another KSP >> method) as pre- and postsmoother on all multigrid levels? I was just > > This is the default. However, you can use any combination of KSP/PC > on any > given level with options. For instance, > > -mg_level_ksp_type richardson -mg_level_pc_type sor > > gives "regulation" MG. We default to GMRES because it is more robust. > >> wondering why and where in the literature I can read about that >> method? I >> thought that a fast method would be to use MG (with Gauss-Seidel RB/ >> zebra >> smothers) as a preconditioner for GMRES? I have looked at papers >> written by >> Oosterlee etc. 
> > In order to prove something about GMRES/MG, you would need to prove > something > about the convergence of GMRES on the operators at each level. Good > luck. GMRES > is the enemy of all convergence proofs. See paper by Greenbaum, > Strakos, & Ptak. > If SOR works, great and it is much faster. However, GMRES/ILU(0) tends > to be more > robust. > > Matt > >> Kind Regards > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > From jens.madsen at risoe.dk Wed Feb 27 14:22:02 2008 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Wed, 27 Feb 2008 21:22:02 +0100 Subject: MG question In-Reply-To: <32E0E8CE-D7AF-4B90-89A9-D43EFD17556B@mcs.anl.gov> Message-ID: Ok Thanks Matthew and Barry First I solve 2d boundary value problems of size 512^2 - 2048^2. Typically either kind of problem(solve for phi) I) poisson type equation: \nabla^2 \phi(x,y) = f(x,y) II) \nabla \cdot (g(x,y) \nabla\phi(x,y)) = f(x,y) Successively with new f and g functions Do you know where to read about the smoothing properties of GMRES and CG? All refs that I find are only describing smoothing with GS-RB etc. My vague idea on how a fast solver is to use a (preconditioned ILU?) krylov (CG for spd ie. problem I, GMRES for II)) method with additional MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)? As my problems are not that big I fear that I will get no MG speedup if I use krylov methods as smoothers? Kind Regards Jens -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, February 27, 2008 8:49 PM To: petsc-users at mcs.anl.gov Subject: Re: MG question The reason we default to these "very strong" (gmres + ILU(0)) smoothers is robustness, we'd rather have the solver "just work" for our users and be a little bit slower than have it often fail but be optimal for special cases. Most of the MG community has a mental block about using Krylov methods, this is why you find few papers that discuss their use with multigrid. Note also that using several iterations of GMRES (with or without ILU(0)) is still order n work so you still get the optimal convergence of mutligrid methods (when they work, of course). Barry On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote: > On Wed, Feb 27, 2008 at 1:31 PM, wrote: >> Hi >> >> I hope that this question is not outside the scope of this >> mailinglist. >> >> As far as I understand PETSc uses preconditioned GMRES(or another KSP >> method) as pre- and postsmoother on all multigrid levels? I was just > > This is the default. However, you can use any combination of KSP/PC > on any > given level with options. For instance, > > -mg_level_ksp_type richardson -mg_level_pc_type sor > > gives "regulation" MG. We default to GMRES because it is more robust. > >> wondering why and where in the literature I can read about that >> method? I >> thought that a fast method would be to use MG (with Gauss-Seidel RB/ >> zebra >> smothers) as a preconditioner for GMRES? I have looked at papers >> written by >> Oosterlee etc. > > In order to prove something about GMRES/MG, you would need to prove > something > about the convergence of GMRES on the operators at each level. Good > luck. GMRES > is the enemy of all convergence proofs. See paper by Greenbaum, > Strakos, & Ptak. > If SOR works, great and it is much faster. 
However, GMRES/ILU(0) tends > to be more > robust. > > Matt > >> Kind Regards > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > From knepley at gmail.com Wed Feb 27 14:29:44 2008 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Feb 2008 14:29:44 -0600 Subject: MG question In-Reply-To: References: <32E0E8CE-D7AF-4B90-89A9-D43EFD17556B@mcs.anl.gov> Message-ID: On Wed, Feb 27, 2008 at 2:22 PM, wrote: > Ok > > Thanks Matthew and Barry > > First I solve 2d boundary value problems of size 512^2 - 2048^2. > > Typically either kind of problem(solve for phi) > > I) poisson type equation: > > \nabla^2 \phi(x,y) = f(x,y) > > II) > > \nabla \cdot (g(x,y) \nabla\phi(x,y)) = f(x,y) > > Successively with new f and g functions > > > Do you know where to read about the smoothing properties of GMRES and > CG? All refs that I find are only describing smoothing with GS-RB etc. > > My vague idea on how a fast solver is to use a (preconditioned ILU?) > krylov (CG for spd ie. problem I, GMRES for II)) method with additional > MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)? > > As my problems are not that big I fear that I will get no MG speedup if > I use krylov methods as smoothers? Well, you might need to prove things, but I would not worry about that first. It is so easy to code up, just run everything and see what actually works. Then sit down and try to show it. Matt > Kind Regards Jens > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, February 27, 2008 8:49 PM > To: petsc-users at mcs.anl.gov > Subject: Re: MG question > > > The reason we default to these "very strong" (gmres + ILU(0)) > smoothers is robustness, we'd rather have > the solver "just work" for our users and be a little bit slower than > have it often fail but be optimal > for special cases. > > Most of the MG community has a mental block about using Krylov > methods, this is > why you find few papers that discuss their use with multigrid. Note > also that using several iterations > of GMRES (with or without ILU(0)) is still order n work so you still > get the optimal convergence of > mutligrid methods (when they work, of course). > > Barry > > > On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote: > > > On Wed, Feb 27, 2008 at 1:31 PM, wrote: > >> Hi > >> > >> I hope that this question is not outside the scope of this > >> mailinglist. > >> > >> As far as I understand PETSc uses preconditioned GMRES(or another KSP > >> method) as pre- and postsmoother on all multigrid levels? I was just > > > > This is the default. However, you can use any combination of KSP/PC > > on any > > given level with options. For instance, > > > > -mg_level_ksp_type richardson -mg_level_pc_type sor > > > > gives "regulation" MG. We default to GMRES because it is more robust. > > > >> wondering why and where in the literature I can read about that > >> method? I > >> thought that a fast method would be to use MG (with Gauss-Seidel RB/ > >> zebra > >> smothers) as a preconditioner for GMRES? I have looked at papers > >> written by > >> Oosterlee etc. > > > > In order to prove something about GMRES/MG, you would need to prove > > something > > about the convergence of GMRES on the operators at each level. Good > > luck. GMRES > > is the enemy of all convergence proofs. 
See paper by Greenbaum, > > Strakos, & Ptak. > > If SOR works, great and it is much faster. However, GMRES/ILU(0) tends > > to be more > > robust. > > > > Matt > > > >> Kind Regards > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From balay at mcs.anl.gov Wed Feb 27 14:40:41 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 27 Feb 2008 14:40:41 -0600 (CST) Subject: few questions In-Reply-To: <37604ab40802270005y22b27007o247d0bae0c8817fd@mail.gmail.com> References: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com> <37604ab40802270005y22b27007o247d0bae0c8817fd@mail.gmail.com> Message-ID: Just a note on terminology. The difference between shared & dynamic is a bit confusing [esp across windows/linux/mac etc..]. I like to use 'shared-libraries' name instead of 'dynamic-libraries', as thats the primary feature of .so/.dylib/.dll etc. PETSc configure supports the following options --with-shared=0/1 --with-dynamic=0/1 The dynamic option refers to the using dlopen() to look for function in a sharedlibrary [instead of resolving these functions at link-time] If petsc is built with dynamic usage- then PETSC_USE_DYNAMIC_LIBRARIES flag is set in petscconf.h. Shared libs can be identified by looking at the library names. Satish On Wed, 27 Feb 2008, Aron Ahmadia wrote: > On Wed, Feb 27, 2008 at 2:11 AM, amjad ali wrote: > > Hello all, > > > > Please answer the following, > > > > 1) What is the difference between static and dynamic versions of petsc? > > > > Start here: http://en.wikipedia.org/wiki/Library_(computer_science)#Static_libraries > > In PETSc the primary differences end up being the size and link-time. > Statically-linked executables need all the possible code that they > could contain in the actual file, so they can be up to several MB in > size. Dynamically-linked executables are much leaner for the small > price of a little extra load time. > > Now if you're talking about Dynamically-Loaded code, that's a bit hairier... > > > 2) How to check that which version (static or dynamic) is installed on a > > system? > > > > The fastest way is probably to look in $PETSC_DIR/$PETSC_ARCH/ > > If you see .a files, you've got static libraries, if you see .so or > .dylib files you've got dynamic libraries. > > > 3) Plz comment on if there is any effect of static/dynamic version while > > using/calling petsc from some external package? > > > > I'm not sure what you're asking here. If you mean "Is there a > difference between calling dynamically compiled PETSc from statically > compiled PETSc" the answer is no. There are differences in how you > compile and link the two version but your actual code would look the > same. > > Again, if we're talking about dynamically loaded code (using something > like dl_open), then your code will look different. > > > 4) how to update an already installed petsc version with newerer/latest > > version of petsc? > > > > Doing this in place is more trouble than it's worth if you're not > using a development copy . I just grab the latest copy of PETSc from > their webpage, then re-build and re-install. 
> > ~A > > From bsmith at mcs.anl.gov Wed Feb 27 14:45:07 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Feb 2008 14:45:07 -0600 Subject: MG question In-Reply-To: References: Message-ID: <2EF58CAE-5270-45A8-8EB9-1FC9D519BCC2@mcs.anl.gov> On Feb 27, 2008, at 2:22 PM, jens.madsen at risoe.dk wrote: > Ok > > Thanks Matthew and Barry > > First I solve 2d boundary value problems of size 512^2 - 2048^2. > > Typically either kind of problem(solve for phi) > > I) poisson type equation: > > \nabla^2 \phi(x,y) = f(x,y) There is no reason to use GMRES here, use -ksp_type richardson -mg_levels_pc_type sor -mg_levels_ksp_type richardson should require about 5-10 outter iterations to get reasonable convergence on the norm of the residual. > > > II) > > \nabla \cdot (g(x,y) \nabla\phi(x,y)) = f(x,y) > If g(x,y) is smooth and not highly varying again you should not need GMRES. If it is a crazy function than the whole kitchen sink will likely give better convergence. I do not understand your questions. If you don't need GMRES/CG then don't use it and if you think you might need it just try it and see if it helps. Barry > Successively with new f and g functions > > > Do you know where to read about the smoothing properties of GMRES and > CG? All refs that I find are only describing smoothing with GS-RB etc. > > My vague idea on how a fast solver is to use a (preconditioned ILU?) > krylov (CG for spd ie. problem I, GMRES for II)) method with > additional > MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)? > > As my problems are not that big I fear that I will get no MG speedup > if > I use krylov methods as smoothers? > > Kind Regards Jens > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, February 27, 2008 8:49 PM > To: petsc-users at mcs.anl.gov > Subject: Re: MG question > > > The reason we default to these "very strong" (gmres + ILU(0)) > smoothers is robustness, we'd rather have > the solver "just work" for our users and be a little bit slower than > have it often fail but be optimal > for special cases. > > Most of the MG community has a mental block about using Krylov > methods, this is > why you find few papers that discuss their use with multigrid. Note > also that using several iterations > of GMRES (with or without ILU(0)) is still order n work so you still > get the optimal convergence of > mutligrid methods (when they work, of course). > > Barry > > > On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote: > >> On Wed, Feb 27, 2008 at 1:31 PM, wrote: >>> Hi >>> >>> I hope that this question is not outside the scope of this >>> mailinglist. >>> >>> As far as I understand PETSc uses preconditioned GMRES(or another >>> KSP >>> method) as pre- and postsmoother on all multigrid levels? I was just >> >> This is the default. However, you can use any combination of KSP/PC >> on any >> given level with options. For instance, >> >> -mg_level_ksp_type richardson -mg_level_pc_type sor >> >> gives "regulation" MG. We default to GMRES because it is more robust. >> >>> wondering why and where in the literature I can read about that >>> method? I >>> thought that a fast method would be to use MG (with Gauss-Seidel RB/ >>> zebra >>> smothers) as a preconditioner for GMRES? I have looked at papers >>> written by >>> Oosterlee etc. 
>> >> In order to prove something about GMRES/MG, you would need to prove >> something >> about the convergence of GMRES on the operators at each level. Good >> luck. GMRES >> is the enemy of all convergence proofs. See paper by Greenbaum, >> Strakos, & Ptak. >> If SOR works, great and it is much faster. However, GMRES/ILU(0) >> tends >> to be more >> robust. >> >> Matt >> >>> Kind Regards >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> > > From aja2111 at columbia.edu Wed Feb 27 14:48:34 2008 From: aja2111 at columbia.edu (Aron Ahmadia) Date: Wed, 27 Feb 2008 15:48:34 -0500 Subject: few questions In-Reply-To: References: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com> <37604ab40802270005y22b27007o247d0bae0c8817fd@mail.gmail.com> Message-ID: <37604ab40802271248s4bfe85f9p71f1a9959e8da18@mail.gmail.com> Thanks Satish, That's definitely an easier way to think about it... ~A On Wed, Feb 27, 2008 at 3:40 PM, Satish Balay wrote: > Just a note on terminology. The difference between shared & dynamic is > a bit confusing [esp across windows/linux/mac etc..]. I like to use > 'shared-libraries' name instead of 'dynamic-libraries', as thats the > primary feature of .so/.dylib/.dll etc. > > PETSc configure supports the following options > > --with-shared=0/1 --with-dynamic=0/1 > > The dynamic option refers to the using dlopen() to look for function > in a sharedlibrary [instead of resolving these functions at link-time] > > If petsc is built with dynamic usage- then PETSC_USE_DYNAMIC_LIBRARIES > flag is set in petscconf.h. Shared libs can be identified by looking > at the library names. > > Satish > > > > On Wed, 27 Feb 2008, Aron Ahmadia wrote: > > > On Wed, Feb 27, 2008 at 2:11 AM, amjad ali wrote: > > > Hello all, > > > > > > Please answer the following, > > > > > > 1) What is the difference between static and dynamic versions of petsc? > > > > > > > Start here: http://en.wikipedia.org/wiki/Library_(computer_science)#Static_libraries > > > > In PETSc the primary differences end up being the size and link-time. > > Statically-linked executables need all the possible code that they > > could contain in the actual file, so they can be up to several MB in > > size. Dynamically-linked executables are much leaner for the small > > price of a little extra load time. > > > > Now if you're talking about Dynamically-Loaded code, that's a bit hairier... > > > > > 2) How to check that which version (static or dynamic) is installed on a > > > system? > > > > > > > The fastest way is probably to look in $PETSC_DIR/$PETSC_ARCH/ > > > > If you see .a files, you've got static libraries, if you see .so or > > .dylib files you've got dynamic libraries. > > > > > 3) Plz comment on if there is any effect of static/dynamic version while > > > using/calling petsc from some external package? > > > > > > > I'm not sure what you're asking here. If you mean "Is there a > > difference between calling dynamically compiled PETSc from statically > > compiled PETSc" the answer is no. There are differences in how you > > compile and link the two version but your actual code would look the > > same. > > > > Again, if we're talking about dynamically loaded code (using something > > like dl_open), then your code will look different. > > > > > 4) how to update an already installed petsc version with newerer/latest > > > version of petsc? 
> > > > > > > Doing this in place is more trouble than it's worth if you're not > > using a development copy . I just grab the latest copy of PETSc from > > their webpage, then re-build and re-install. > > > > ~A > > > > > > From jens.madsen at risoe.dk Wed Feb 27 15:21:49 2008 From: jens.madsen at risoe.dk (jens.madsen at risoe.dk) Date: Wed, 27 Feb 2008 22:21:49 +0100 Subject: MG question In-Reply-To: <2EF58CAE-5270-45A8-8EB9-1FC9D519BCC2@mcs.anl.gov> Message-ID: Thanks again :-) The reason why I ask is that my code is actually much faster without GMRES.. I thought that MG accelerated Krylov methods were always the fastest methods.... I am no expert, so I was just wondering why the default in DMMG is GMRES/ILU. In the articles I have been able to find, PCG/MG(GS-RB/zebra)(SPD) and GMRES/ MG(GS-RB/zebra) on the problems I) and II) respectively, seems to be faster than (one level) preconditioned Krylov methods and MG. I am new in this field and find it very difficult even to choose which methods to test and compare(there are so many possibilities). :-D I will keep on testing :-) Thanks you very much for your answers. Jens -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, February 27, 2008 9:45 PM To: petsc-users at mcs.anl.gov Subject: Re: MG question On Feb 27, 2008, at 2:22 PM, jens.madsen at risoe.dk wrote: > Ok > > Thanks Matthew and Barry > > First I solve 2d boundary value problems of size 512^2 - 2048^2. > > Typically either kind of problem(solve for phi) > > I) poisson type equation: > > \nabla^2 \phi(x,y) = f(x,y) There is no reason to use GMRES here, use -ksp_type richardson -mg_levels_pc_type sor -mg_levels_ksp_type richardson should require about 5-10 outter iterations to get reasonable convergence on the norm of the residual. > > > II) > > \nabla \cdot (g(x,y) \nabla\phi(x,y)) = f(x,y) > If g(x,y) is smooth and not highly varying again you should not need GMRES. If it is a crazy function than the whole kitchen sink will likely give better convergence. I do not understand your questions. If you don't need GMRES/CG then don't use it and if you think you might need it just try it and see if it helps. Barry > Successively with new f and g functions > > > Do you know where to read about the smoothing properties of GMRES and > CG? All refs that I find are only describing smoothing with GS-RB etc. > > My vague idea on how a fast solver is to use a (preconditioned ILU?) > krylov (CG for spd ie. problem I, GMRES for II)) method with > additional > MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)? > > As my problems are not that big I fear that I will get no MG speedup > if > I use krylov methods as smoothers? > > Kind Regards Jens > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, February 27, 2008 8:49 PM > To: petsc-users at mcs.anl.gov > Subject: Re: MG question > > > The reason we default to these "very strong" (gmres + ILU(0)) > smoothers is robustness, we'd rather have > the solver "just work" for our users and be a little bit slower than > have it often fail but be optimal > for special cases. > > Most of the MG community has a mental block about using Krylov > methods, this is > why you find few papers that discuss their use with multigrid. 
Note > also that using several iterations > of GMRES (with or without ILU(0)) is still order n work so you still > get the optimal convergence of > mutligrid methods (when they work, of course). > > Barry > > > On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote: > >> On Wed, Feb 27, 2008 at 1:31 PM, wrote: >>> Hi >>> >>> I hope that this question is not outside the scope of this >>> mailinglist. >>> >>> As far as I understand PETSc uses preconditioned GMRES(or another >>> KSP >>> method) as pre- and postsmoother on all multigrid levels? I was just >> >> This is the default. However, you can use any combination of KSP/PC >> on any >> given level with options. For instance, >> >> -mg_level_ksp_type richardson -mg_level_pc_type sor >> >> gives "regulation" MG. We default to GMRES because it is more robust. >> >>> wondering why and where in the literature I can read about that >>> method? I >>> thought that a fast method would be to use MG (with Gauss-Seidel RB/ >>> zebra >>> smothers) as a preconditioner for GMRES? I have looked at papers >>> written by >>> Oosterlee etc. >> >> In order to prove something about GMRES/MG, you would need to prove >> something >> about the convergence of GMRES on the operators at each level. Good >> luck. GMRES >> is the enemy of all convergence proofs. See paper by Greenbaum, >> Strakos, & Ptak. >> If SOR works, great and it is much faster. However, GMRES/ILU(0) >> tends >> to be more >> robust. >> >> Matt >> >>> Kind Regards >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> > > From bsmith at mcs.anl.gov Wed Feb 27 15:39:36 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Feb 2008 15:39:36 -0600 Subject: MG question In-Reply-To: References: Message-ID: On Feb 27, 2008, at 3:21 PM, jens.madsen at risoe.dk wrote: > Thanks again :-) > > The reason why I ask is that my code is actually much faster without > GMRES.. I thought that MG accelerated Krylov methods were always the > fastest methods.... I am no expert, so I was just wondering why the > default in DMMG is GMRES/ILU. It is just for robustness, not for speed. > > > In the articles I have been able to find, PCG/MG(GS-RB/zebra)(SPD) and > GMRES/ MG(GS-RB/zebra) on the problems I) and II) respectively, > seems to > be faster than (one level) preconditioned Krylov methods and MG. > > I am new in this field and find it very difficult even to choose which > methods to test and compare(there are so many possibilities). :-D > > I will keep on testing :-) > > Thanks you very much for your answers. > > Jens > > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, February 27, 2008 9:45 PM > To: petsc-users at mcs.anl.gov > Subject: Re: MG question > > > On Feb 27, 2008, at 2:22 PM, jens.madsen at risoe.dk wrote: > >> Ok >> >> Thanks Matthew and Barry >> >> First I solve 2d boundary value problems of size 512^2 - 2048^2. >> >> Typically either kind of problem(solve for phi) >> >> I) poisson type equation: >> >> \nabla^2 \phi(x,y) = f(x,y) > > There is no reason to use GMRES here, use > > -ksp_type richardson -mg_levels_pc_type sor -mg_levels_ksp_type > richardson > should require about 5-10 outter iterations to get reasonable > convergence > on the norm of the residual. 
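(For reference, a minimal sketch of setting the options Barry suggests above from inside a program rather than on the command line. The option names and PetscOptionsSetValue() are standard PETSc; where these calls sit in your own code, and the surrounding DMMG/KSP setup and the ierr/CHKERRQ error handling, are assumed and not shown.)

    /* Sketch: equivalent of running with
         -ksp_type richardson -mg_levels_ksp_type richardson -mg_levels_pc_type sor
       Call these before KSPSetFromOptions() (or the DMMG setup) so the
       options are in the database when the solver is configured. */
    ierr = PetscOptionsSetValue("-ksp_type","richardson");CHKERRQ(ierr);
    ierr = PetscOptionsSetValue("-mg_levels_ksp_type","richardson");CHKERRQ(ierr);
    ierr = PetscOptionsSetValue("-mg_levels_pc_type","sor");CHKERRQ(ierr);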
> >> >> >> II) >> >> \nabla \cdot (g(x,y) \nabla\phi(x,y)) = f(x,y) >> > If g(x,y) is smooth and not highly varying again you should not > need GMRES. > If it is a crazy function than the whole kitchen sink will likely give > better convergence. > > I do not understand your questions. If you don't need GMRES/CG > then don't use > it and if you think you might need it just try it and see if it helps. > > Barry > >> Successively with new f and g functions >> >> >> Do you know where to read about the smoothing properties of GMRES and >> CG? All refs that I find are only describing smoothing with GS-RB >> etc. >> >> My vague idea on how a fast solver is to use a (preconditioned ILU?) >> krylov (CG for spd ie. problem I, GMRES for II)) method with >> additional >> MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)? >> >> As my problems are not that big I fear that I will get no MG speedup >> if >> I use krylov methods as smoothers? >> >> Kind Regards Jens >> >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov >> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Wednesday, February 27, 2008 8:49 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: MG question >> >> >> The reason we default to these "very strong" (gmres + ILU(0)) >> smoothers is robustness, we'd rather have >> the solver "just work" for our users and be a little bit slower than >> have it often fail but be optimal >> for special cases. >> >> Most of the MG community has a mental block about using Krylov >> methods, this is >> why you find few papers that discuss their use with multigrid. Note >> also that using several iterations >> of GMRES (with or without ILU(0)) is still order n work so you still >> get the optimal convergence of >> mutligrid methods (when they work, of course). >> >> Barry >> >> >> On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote: >> >>> On Wed, Feb 27, 2008 at 1:31 PM, wrote: >>>> Hi >>>> >>>> I hope that this question is not outside the scope of this >>>> mailinglist. >>>> >>>> As far as I understand PETSc uses preconditioned GMRES(or another >>>> KSP >>>> method) as pre- and postsmoother on all multigrid levels? I was >>>> just >>> >>> This is the default. However, you can use any combination of KSP/PC >>> on any >>> given level with options. For instance, >>> >>> -mg_level_ksp_type richardson -mg_level_pc_type sor >>> >>> gives "regulation" MG. We default to GMRES because it is more >>> robust. >>> >>>> wondering why and where in the literature I can read about that >>>> method? I >>>> thought that a fast method would be to use MG (with Gauss-Seidel >>>> RB/ >>>> zebra >>>> smothers) as a preconditioner for GMRES? I have looked at papers >>>> written by >>>> Oosterlee etc. >>> >>> In order to prove something about GMRES/MG, you would need to prove >>> something >>> about the convergence of GMRES on the operators at each level. Good >>> luck. GMRES >>> is the enemy of all convergence proofs. See paper by Greenbaum, >>> Strakos, & Ptak. >>> If SOR works, great and it is much faster. However, GMRES/ILU(0) >>> tends >>> to be more >>> robust. >>> >>> Matt >>> >>>> Kind Regards >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. 
>>> -- Norbert Wiener >>> >> >> > > From recrusader at gmail.com Wed Feb 27 16:47:46 2008 From: recrusader at gmail.com (Yujie) Date: Wed, 27 Feb 2008 14:47:46 -0800 Subject: any good ideas for me from you? thanks a lot.Fwd: any examples to demonstrate how to Spooles package? In-Reply-To: References: <7ff0ee010802271344s497ce960g89e99d4ff00ad3dd@mail.gmail.com> Message-ID: <7ff0ee010802271447y6d755a4dn7dee083042b5de65@mail.gmail.com> What do you mean about 3)? I am considering to use MatSolve_MPIAIJSpooles with setting b to 1 0 0 . . .0 0 1 0 0 0 0 1 0 . . . . . . . . . 1 for solving Ax=b. After finishing all, I will rearrange X=[x1,x2,x3,x4], which is the inversion of A. whether is it similar with 1) you mentioned? Practically, If I use such method, I may use some iterative methods to solve it, not direct inversion method. What is the time difference or time complexity regarding using spooles (direct inversion method) or other iterative methods? thanks a lot. Regards, Yujie On Wed, Feb 27, 2008 at 2:33 PM, Matthew Knepley wrote: > This seems like it would involve significant programming time. Therefore, > I suggest > > 1) Solving each vector in a loop > > 2) Taking a look at MatMatSolve_SeqAIJ() and > MatSolve_MPIAIJSpooles() and trying to implement it yourself for > Spooles > > 3) Reformulating your problem so as not use an inverse, but rather just > solves > > Thanks, > > Matt > > On Wed, Feb 27, 2008 at 3:44 PM, Yujie wrote: > > > > > > > > ---------- Forwarded message ---------- > > From: Yujie > > Date: Wed, Feb 27, 2008 at 11:40 AM > > Subject: Re: any examples to demonstrate how to Spooles package? > > To: petsc-users at mcs.anl.gov > > > > > > > > This is why I have recompiled PETSc with spooles. spooles can deal with > > AX=Y(Y is a matrix). However, PETSc only provide the following: > > > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) > > I don't set b to a matrix even if I use > > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo > > *info,Mat *F) for LU factorization > > > > Could you give me some advice or examples? thanks a lot. > > > > Regards, > > Yujie > > > > > > > > > > > > On Wed, Feb 27, 2008 at 11:32 AM, Matthew Knepley > wrote: > > > > > > > > On Wed, Feb 27, 2008 at 1:21 PM, Yujie wrote: > > > > Dear Matt: > > > > > > > > I checked the codes about MatMatSolve(). However, currently, PETSc > > didn't > > > > realize its parallel version. Is it right? I want to inverse the > matrix > > > > parallelly. could you give me some examples about it? thanks a lot. > > > > > > Thats right. The parallel version is not implemented. It looks like > this > > would > > > take significant work. > > > > > > > > > > > > > > > Matt > > > > > > > Regards, > > > > Yujie > > > > > > > > > > > > > > > > On 2/27/08, Matthew Knepley wrote: > > > > > On Wed, Feb 27, 2008 at 11:05 AM, Yujie > wrote: > > > > > > Dear Sanjay: > > > > > > > > > > > > Thank you for your reply. I don't understand what you said. Now, > I > > want > > > > to > > > > > > use spooles package to inverse a sparse SPD matrix. I have > further > > > > checked > > > > > > the inferface about spooles in PETSc. I find although spooles > can > > deal > > > > with > > > > > > AX=B (B may be a dense matrix) with parallel LU factorization. 
> > > > > > However, PETSc only provide the following: > > > > > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x) > > > > > > I don't set b to a matrix even if I use > > > > > > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat > A,MatFactorInfo > > > > > > *info,Mat *F) for LU factorization. > > > > > > > > > > > > Could you have any suggestions about this? thanks a lot. > > > > > > > > > > > > > > > MatMatSolve() > > > > > > > > > > > > > > > > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html > > > > > > > > > > Matt > > > > > > > > > > > > > > > > Regards, > > > > > > Yujie > > > > > > > > > > > > On 2/26/08, Sanjay Govindjee wrote: > > > > > > > from my make file > > > > > > > > > > > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly > > > > > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles > > -log_summary > > > > > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0 > > -options_left > > > > > > > > > > > > > > > > > > > > > -sg > > > > > > > > > > > > > > > > > > > > > Yujie wrote: > > > > > > > > Hi, everyone > > > > > > > > > > > > > > > > I have compiled PETSc with spooles. However, I try to find > how > > to > > > > use > > > > > > > > this package in PETSc directory. I can't find any examples > for > > it. > > > > > > > > Could you give me some advice? I want to use spooles to > inverse > > a > > > > > > > > sparse matrix. thanks a lot. > > > > > > > > > > > > > > > > Regards, > > > > > > > > Yujie > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before they begin their > > > > > experiments is infinitely more interesting than any results to > which > > > > > their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Amit.Itagi at seagate.com Thu Feb 28 13:07:36 2008 From: Amit.Itagi at seagate.com (Amit.Itagi at seagate.com) Date: Thu, 28 Feb 2008 14:07:36 -0500 Subject: Direct LU solver In-Reply-To: <200802281846.m1SIkPA31406@mcs.anl.gov> Message-ID: Hi, I need to do direct LU solves (repeatedly, with the same matrix) in one of my MPI applications. I am having trouble implementing the solver. To identify the problem, I wrote a short toy code to run with 2 processes. I can run it with either the spooles parallel matrix or the superlu_dist matrix. I am using C++ and complex matrices. 
Here is the code listing: #include #include #include #include "petsc.h" #include "petscmat.h" #include "petscvec.h" #include "petscksp.h" using namespace std; int main( int argc, char *argv[] ) { int rank, size; Mat A; PetscErrorCode ierr; PetscInt loc; PetscScalar val; Vec x, y; KSP solver; PC prec; MPI_Comm comm; // Number of non-zeros in each row int d_nnz=1, o_nnz=1; ierr=PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); // Initialization (including MPI) comm=PETSC_COMM_WORLD; ierr=MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); ierr=MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); // Assemble matrix A if(rank==0) { ierr=MatCreateMPIAIJ(comm,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); CHKERRQ(ierr); val=complex(1.0,0.0); ierr=MatSetValue(A,0,0,val,INSERT_VALUES);CHKERRQ(ierr); val=complex(0.0,1.0); ierr=MatSetValue(A,0,1,val,INSERT_VALUES);CHKERRQ(ierr); } else { ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); CHKERRQ(ierr); val=complex(1.0,1.0); ierr=MatSetValue(A,1,1,val,INSERT_VALUES);CHKERRQ(ierr); val=complex(0.0,-1.0); ierr=MatSetValue(A,1,0,val,INSERT_VALUES);CHKERRQ(ierr); } ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); cout << "============ Mat A ==================" << endl; ierr=MatView(A,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); cout << "======================================" << endl; // For spooles //ierr=MatConvert(A,MATMPIAIJSPOOLES,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr); // For superlu_dist ierr=MatConvert(A,MATSUPERLU_DIST,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr); // Direct LU solver ierr=KSPCreate(comm,&solver); CHKERRQ(ierr); ierr=KSPSetType(solver,KSPPREONLY); CHKERRQ(ierr); ierr=KSPSetOperators(solver,A,A,SAME_NONZERO_PATTERN); CHKERRQ(ierr); ierr=KSPGetPC(solver,&prec); CHKERRQ(ierr); ierr=PCSetType(prec,PCLU); CHKERRQ(ierr); ierr=KSPSetFromOptions(solver); CHKERRQ(ierr); //============ Vector assembly ======================== if(rank==0) { ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr); val=complex(1.0,0.0); loc=0; ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr); } else { ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr); val=complex(-1.0,0.0); loc=1; ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr); } ierr=VecAssemblyBegin(x); CHKERRQ(ierr); ierr=VecAssemblyEnd(x); CHKERRQ(ierr); cout << "============== Vec x ==================" << endl; ierr=VecView(x,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); cout << "======================================" << endl; VecDuplicate(x,&y); // Duplicate the matrix storage // Solve the matrix equation ierr=KSPSolve(solver,x,y); CHKERRQ(ierr); cout << "============== Vec y =================" << endl; ierr=VecView(y,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); cout << "======================================" << endl; // Destructors ierr=KSPDestroy(solver); CHKERRQ(ierr); ierr=VecDestroy(x); CHKERRQ(ierr); ierr=VecDestroy(y); CHKERRQ(ierr); ierr=MatDestroy(A); CHKERRQ(ierr); // Finalize ierr=PetscFinalize(); CHKERRQ(ierr); return 0; } When I run the program with spooles, I get the following output. 
============ Mat A ================== ============ Mat A ================== ====================================== row 0: (0, 1) (1, 0 + 1 i) row 1: (0, 0 - 1 i) (1, 1 + 1 i) ====================================== ============== Vec x ================== ============== Vec x ================== Process [0] 1 ====================================== Process [1] -1 ====================================== fatal error in InpMtx_MPI_split() firsttag = 0, tagbound = -1 fatal error in InpMtx_MPI_split() firsttag = 0, tagbound = -1 ----------------------------------------------------------------------------- One of the processes started by mpirun has exited with a nonzero exit code. This typically indicates that the process finished in error. If your process did not finish in error, be sure to include a "return 0" or "exit(0)" in your C code before exiting the application. PID 22881 failed on node n0 (127.0.0.1) with exit status 255. ----------------------------------------------------------------------------- mpirun failed with exit status 255 When run in the debugger, there is no stack trace. The error is Program exited with code 0377 With superlu_dist, the output is ============ Mat A ================== ============ Mat A ================== row 0: (0, 1) (1, 0 + 1 i) row 1: (0, 0 - 1 i) (1, 1 + 1 i) ====================================== ====================================== ============== Vec x ================== ============== Vec x ================== Process [0] 1 Process [1] -1 ====================================== ====================================== [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: lu on a linux-gnu named tabla by amit Thu Feb 28 13:48:54 2008 [0]PETSC ERROR: Libraries linked from /home/amit/programs/ParEM/petsc-2.3.3-p8/lib/linux-gnu-c-debug [0]PETSC ERROR: Configure run at Thu Feb 28 12:19:39 2008 [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-debugging=no --with-fortran-kernels=generic --with-clanguage=cxx --with-metis=1 --download-metis=1 --with-parmetis=1 --download-parmetis=1 --with-superlu_dist=1 --download-superlu_dist=1 --with-spooles=1 --with-spooles-dir=/home/amit/programs/ParEM/spooles-2.2 COPTFLAGS="-O3 -march=p4 -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe -fomit-frame-pointer -finline-functions -msse2" CXXOPTFLAGS="-O3 -march=p4 -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe -fomit-frame-pointer -finline-functions -msse2" FOPTS="-O3 -qarch=p4 -qtune=p4" --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file ----------------------------------------------------------------------------- One of the processes started by mpirun has exited with a nonzero exit code. This typically indicates that the process finished in error. If your process did not finish in error, be sure to include a "return 0" or "exit(0)" in your C code before exiting the application. PID 22998 failed on node n0 (127.0.0.1) with exit status 1. ----------------------------------------------------------------------------- mpirun failed with exit status 1 The debugger tracks the segmentation violation to main -> KSPSolve -> KSPSetUp -> PCSetUp -> PCSetUp_LU -> MatLUFactorNumeric -> MatLUFactorNumeric_SUperLU_DIST -> pzgssvx Could someone kindly point out what I am missing ? Thanks Rgds, Amit From knepley at gmail.com Thu Feb 28 14:45:48 2008 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Feb 2008 14:45:48 -0600 Subject: Direct LU solver In-Reply-To: References: <200802281846.m1SIkPA31406@mcs.anl.gov> Message-ID: I would recommend using KSP ex10 and then customizing with options. That way we know it should work. For SuperLU_dist, -ksp_type preonly -pc_type lu -mat_type superlu_dist Matt On Thu, Feb 28, 2008 at 1:07 PM, wrote: > Hi, > > I need to do direct LU solves (repeatedly, with the same matrix) in one of > my MPI applications. I am having trouble implementing the solver. To > identify the problem, I wrote a short toy code to run with 2 processes. I > can run it with either the spooles parallel matrix or the superlu_dist > matrix. I am using C++ and complex matrices. 
> > Here is the code listing: > > #include > #include > #include > #include "petsc.h" > #include "petscmat.h" > #include "petscvec.h" > #include "petscksp.h" > > using namespace std; > > int main( int argc, char *argv[] ) { > > int rank, size; > Mat A; > PetscErrorCode ierr; > PetscInt loc; > PetscScalar val; > Vec x, y; > KSP solver; > PC prec; > MPI_Comm comm; > > // Number of non-zeros in each row > int d_nnz=1, o_nnz=1; > > ierr=PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); > // Initialization (including MPI) > > comm=PETSC_COMM_WORLD; > > ierr=MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); > ierr=MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); > > // Assemble matrix A > > if(rank==0) { > ierr=MatCreateMPIAIJ(comm,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); CHKERRQ(ierr); > val=complex(1.0,0.0); > ierr=MatSetValue(A,0,0,val,INSERT_VALUES);CHKERRQ(ierr); > val=complex(0.0,1.0); > ierr=MatSetValue(A,0,1,val,INSERT_VALUES);CHKERRQ(ierr); > } > else { > ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); > CHKERRQ(ierr); > val=complex(1.0,1.0); > ierr=MatSetValue(A,1,1,val,INSERT_VALUES);CHKERRQ(ierr); > val=complex(0.0,-1.0); > ierr=MatSetValue(A,1,0,val,INSERT_VALUES);CHKERRQ(ierr); > } > > ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > > cout << "============ Mat A ==================" << endl; > ierr=MatView(A,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); > cout << "======================================" << endl; > > // For spooles > //ierr=MatConvert(A,MATMPIAIJSPOOLES,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr); > > // For superlu_dist > ierr=MatConvert(A,MATSUPERLU_DIST,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr); > > // Direct LU solver > ierr=KSPCreate(comm,&solver); CHKERRQ(ierr); > ierr=KSPSetType(solver,KSPPREONLY); CHKERRQ(ierr); > ierr=KSPSetOperators(solver,A,A,SAME_NONZERO_PATTERN); CHKERRQ(ierr); > ierr=KSPGetPC(solver,&prec); CHKERRQ(ierr); > ierr=PCSetType(prec,PCLU); CHKERRQ(ierr); > ierr=KSPSetFromOptions(solver); CHKERRQ(ierr); > > //============ Vector assembly ======================== > > if(rank==0) { > ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr); > val=complex(1.0,0.0); > loc=0; > ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr); > } > else { > ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr); > val=complex(-1.0,0.0); > loc=1; > ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr); > } > > ierr=VecAssemblyBegin(x); CHKERRQ(ierr); > ierr=VecAssemblyEnd(x); CHKERRQ(ierr); > > cout << "============== Vec x ==================" << endl; > ierr=VecView(x,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); > cout << "======================================" << endl; > > VecDuplicate(x,&y); // Duplicate the matrix storage > > // Solve the matrix equation > ierr=KSPSolve(solver,x,y); CHKERRQ(ierr); > > cout << "============== Vec y =================" << endl; > ierr=VecView(y,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); > cout << "======================================" << endl; > > > // Destructors > ierr=KSPDestroy(solver); CHKERRQ(ierr); > ierr=VecDestroy(x); CHKERRQ(ierr); > ierr=VecDestroy(y); CHKERRQ(ierr); > ierr=MatDestroy(A); CHKERRQ(ierr); > > // Finalize > ierr=PetscFinalize(); CHKERRQ(ierr); > > > return 0; > > } > > > When I run the program with spooles, I get the following output. 
> > > ============ Mat A ================== > ============ Mat A ================== > ====================================== > row 0: (0, 1) (1, 0 + 1 i) > row 1: (0, 0 - 1 i) (1, 1 + 1 i) > ====================================== > ============== Vec x ================== > ============== Vec x ================== > Process [0] > 1 > ====================================== > Process [1] > -1 > ====================================== > > fatal error in InpMtx_MPI_split() > firsttag = 0, tagbound = -1 > > fatal error in InpMtx_MPI_split() > firsttag = 0, tagbound = -1 > ----------------------------------------------------------------------------- > One of the processes started by mpirun has exited with a nonzero exit > code. This typically indicates that the process finished in error. > If your process did not finish in error, be sure to include a "return > 0" or "exit(0)" in your C code before exiting the application. > > PID 22881 failed on node n0 (127.0.0.1) with exit status 255. > ----------------------------------------------------------------------------- > mpirun failed with exit status 255 > > > When run in the debugger, there is no stack trace. The error is > > Program exited with code 0377 > > > With superlu_dist, the output is > > ============ Mat A ================== > ============ Mat A ================== > row 0: (0, 1) (1, 0 + 1 i) > row 1: (0, 0 - 1 i) (1, 1 + 1 i) > ====================================== > ====================================== > ============== Vec x ================== > ============== Vec x ================== > Process [0] > 1 > Process [1] > -1 > ====================================== > ====================================== > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC > ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to > find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 > CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: lu on a linux-gnu named tabla by amit Thu Feb 28 13:48:54 > 2008 > [0]PETSC ERROR: Libraries linked from > /home/amit/programs/ParEM/petsc-2.3.3-p8/lib/linux-gnu-c-debug > [0]PETSC ERROR: Configure run at Thu Feb 28 12:19:39 2008 > [0]PETSC ERROR: Configure options --with-scalar-type=complex > --with-debugging=no --with-fortran-kernels=generic --with-clanguage=cxx > --with-metis=1 --download-metis=1 --with-parmetis=1 --download-parmetis=1 > --with-superlu_dist=1 --download-superlu_dist=1 --with-spooles=1 > --with-spooles-dir=/home/amit/programs/ParEM/spooles-2.2 COPTFLAGS="-O3 > -march=p4 -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe > -fomit-frame-pointer -finline-functions -msse2" CXXOPTFLAGS="-O3 -march=p4 > -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe > -fomit-frame-pointer -finline-functions -msse2" FOPTS="-O3 -qarch=p4 > -qtune=p4" --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > ----------------------------------------------------------------------------- > One of the processes started by mpirun has exited with a nonzero exit > code. This typically indicates that the process finished in error. > If your process did not finish in error, be sure to include a "return > 0" or "exit(0)" in your C code before exiting the application. > > PID 22998 failed on node n0 (127.0.0.1) with exit status 1. > ----------------------------------------------------------------------------- > mpirun failed with exit status 1 > > > The debugger tracks the segmentation violation to > > > main -> KSPSolve -> KSPSetUp -> PCSetUp -> PCSetUp_LU -> MatLUFactorNumeric > -> MatLUFactorNumeric_SUperLU_DIST -> pzgssvx > > > Could someone kindly point out what I am missing ? > > > Thanks > > Rgds, > Amit > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From hzhang at mcs.anl.gov Thu Feb 28 14:57:24 2008 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 28 Feb 2008 14:57:24 -0600 (CST) Subject: Direct LU solver In-Reply-To: References: <200802281846.m1SIkPA31406@mcs.anl.gov> Message-ID: or ~petsc/src/ksp/ksp/examples/tutorials/ex5.c to avoid using matrix data file, e.g. mpiexec -np 2 ./ex5 -ksp_type preonly -pc_type lu -mat_type superlu_dist Hong On Thu, 28 Feb 2008, Matthew Knepley wrote: > I would recommend using KSP ex10 and then customizing with options. > That way we know it should work. For SuperLU_dist, > > -ksp_type preonly -pc_type lu -mat_type superlu_dist > > Matt > > On Thu, Feb 28, 2008 at 1:07 PM, wrote: >> Hi, >> >> I need to do direct LU solves (repeatedly, with the same matrix) in one of >> my MPI applications. I am having trouble implementing the solver. To >> identify the problem, I wrote a short toy code to run with 2 processes. I >> can run it with either the spooles parallel matrix or the superlu_dist >> matrix. I am using C++ and complex matrices. 
>> >> Here is the code listing: >> >> #include >> #include >> #include >> #include "petsc.h" >> #include "petscmat.h" >> #include "petscvec.h" >> #include "petscksp.h" >> >> using namespace std; >> >> int main( int argc, char *argv[] ) { >> >> int rank, size; >> Mat A; >> PetscErrorCode ierr; >> PetscInt loc; >> PetscScalar val; >> Vec x, y; >> KSP solver; >> PC prec; >> MPI_Comm comm; >> >> // Number of non-zeros in each row >> int d_nnz=1, o_nnz=1; >> >> ierr=PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> // Initialization (including MPI) >> >> comm=PETSC_COMM_WORLD; >> >> ierr=MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr); >> ierr=MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr); >> >> // Assemble matrix A >> >> if(rank==0) { >> ierr=MatCreateMPIAIJ(comm,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); CHKERRQ(ierr); >> val=complex(1.0,0.0); >> ierr=MatSetValue(A,0,0,val,INSERT_VALUES);CHKERRQ(ierr); >> val=complex(0.0,1.0); >> ierr=MatSetValue(A,0,1,val,INSERT_VALUES);CHKERRQ(ierr); >> } >> else { >> ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); >> CHKERRQ(ierr); >> val=complex(1.0,1.0); >> ierr=MatSetValue(A,1,1,val,INSERT_VALUES);CHKERRQ(ierr); >> val=complex(0.0,-1.0); >> ierr=MatSetValue(A,1,0,val,INSERT_VALUES);CHKERRQ(ierr); >> } >> >> ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); >> >> cout << "============ Mat A ==================" << endl; >> ierr=MatView(A,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); >> cout << "======================================" << endl; >> >> // For spooles >> //ierr=MatConvert(A,MATMPIAIJSPOOLES,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr); >> >> // For superlu_dist >> ierr=MatConvert(A,MATSUPERLU_DIST,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr); >> >> // Direct LU solver >> ierr=KSPCreate(comm,&solver); CHKERRQ(ierr); >> ierr=KSPSetType(solver,KSPPREONLY); CHKERRQ(ierr); >> ierr=KSPSetOperators(solver,A,A,SAME_NONZERO_PATTERN); CHKERRQ(ierr); >> ierr=KSPGetPC(solver,&prec); CHKERRQ(ierr); >> ierr=PCSetType(prec,PCLU); CHKERRQ(ierr); >> ierr=KSPSetFromOptions(solver); CHKERRQ(ierr); >> >> //============ Vector assembly ======================== >> >> if(rank==0) { >> ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr); >> val=complex(1.0,0.0); >> loc=0; >> ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr); >> } >> else { >> ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr); >> val=complex(-1.0,0.0); >> loc=1; >> ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr); >> } >> >> ierr=VecAssemblyBegin(x); CHKERRQ(ierr); >> ierr=VecAssemblyEnd(x); CHKERRQ(ierr); >> >> cout << "============== Vec x ==================" << endl; >> ierr=VecView(x,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); >> cout << "======================================" << endl; >> >> VecDuplicate(x,&y); // Duplicate the matrix storage >> >> // Solve the matrix equation >> ierr=KSPSolve(solver,x,y); CHKERRQ(ierr); >> >> cout << "============== Vec y =================" << endl; >> ierr=VecView(y,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); >> cout << "======================================" << endl; >> >> >> // Destructors >> ierr=KSPDestroy(solver); CHKERRQ(ierr); >> ierr=VecDestroy(x); CHKERRQ(ierr); >> ierr=VecDestroy(y); CHKERRQ(ierr); >> ierr=MatDestroy(A); CHKERRQ(ierr); >> >> // Finalize >> ierr=PetscFinalize(); CHKERRQ(ierr); >> >> >> return 0; >> >> } >> >> >> When I run the program with spooles, I get the following output. 
>> >> >> ============ Mat A ================== >> ============ Mat A ================== >> ====================================== >> row 0: (0, 1) (1, 0 + 1 i) >> row 1: (0, 0 - 1 i) (1, 1 + 1 i) >> ====================================== >> ============== Vec x ================== >> ============== Vec x ================== >> Process [0] >> 1 >> ====================================== >> Process [1] >> -1 >> ====================================== >> >> fatal error in InpMtx_MPI_split() >> firsttag = 0, tagbound = -1 >> >> fatal error in InpMtx_MPI_split() >> firsttag = 0, tagbound = -1 >> ----------------------------------------------------------------------------- >> One of the processes started by mpirun has exited with a nonzero exit >> code. This typically indicates that the process finished in error. >> If your process did not finish in error, be sure to include a "return >> 0" or "exit(0)" in your C code before exiting the application. >> >> PID 22881 failed on node n0 (127.0.0.1) with exit status 255. >> ----------------------------------------------------------------------------- >> mpirun failed with exit status 255 >> >> >> When run in the debugger, there is no stack trace. The error is >> >> Program exited with code 0377 >> >> >> With superlu_dist, the output is >> >> ============ Mat A ================== >> ============ Mat A ================== >> row 0: (0, 1) (1, 0 + 1 i) >> row 1: (0, 0 - 1 i) (1, 1 + 1 i) >> ====================================== >> ====================================== >> ============== Vec x ================== >> ============== Vec x ================== >> Process [0] >> 1 >> Process [1] >> -1 >> ====================================== >> ====================================== >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC >> ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to >> find memory corruption errors >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and >> run >> [0]PETSC ERROR: to get more information on the crash. >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Signal received! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 >> CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: lu on a linux-gnu named tabla by amit Thu Feb 28 13:48:54 >> 2008 >> [0]PETSC ERROR: Libraries linked from >> /home/amit/programs/ParEM/petsc-2.3.3-p8/lib/linux-gnu-c-debug >> [0]PETSC ERROR: Configure run at Thu Feb 28 12:19:39 2008 >> [0]PETSC ERROR: Configure options --with-scalar-type=complex >> --with-debugging=no --with-fortran-kernels=generic --with-clanguage=cxx >> --with-metis=1 --download-metis=1 --with-parmetis=1 --download-parmetis=1 >> --with-superlu_dist=1 --download-superlu_dist=1 --with-spooles=1 >> --with-spooles-dir=/home/amit/programs/ParEM/spooles-2.2 COPTFLAGS="-O3 >> -march=p4 -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe >> -fomit-frame-pointer -finline-functions -msse2" CXXOPTFLAGS="-O3 -march=p4 >> -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe >> -fomit-frame-pointer -finline-functions -msse2" FOPTS="-O3 -qarch=p4 >> -qtune=p4" --with-shared=0 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> ----------------------------------------------------------------------------- >> One of the processes started by mpirun has exited with a nonzero exit >> code. This typically indicates that the process finished in error. >> If your process did not finish in error, be sure to include a "return >> 0" or "exit(0)" in your C code before exiting the application. >> >> PID 22998 failed on node n0 (127.0.0.1) with exit status 1. >> ----------------------------------------------------------------------------- >> mpirun failed with exit status 1 >> >> >> The debugger tracks the segmentation violation to >> >> >> main -> KSPSolve -> KSPSetUp -> PCSetUp -> PCSetUp_LU -> MatLUFactorNumeric >> -> MatLUFactorNumeric_SUperLU_DIST -> pzgssvx >> >> >> Could someone kindly point out what I am missing ? >> >> >> Thanks >> >> Rgds, >> Amit >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > From Amit.Itagi at seagate.com Thu Feb 28 16:32:13 2008 From: Amit.Itagi at seagate.com (Amit.Itagi at seagate.com) Date: Thu, 28 Feb 2008 17:32:13 -0500 Subject: Direct LU solver In-Reply-To: Message-ID: Matt and Hong, I will try to customize the example. However, since my application involves multiple ksp solvers (using different algorithms), I would really like to set the options with in the code, instead of on the command line. Is there a way of doing this ? Thanks Rgds, Amit From balay at mcs.anl.gov Thu Feb 28 16:58:16 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 28 Feb 2008 16:58:16 -0600 (CST) Subject: Direct LU solver In-Reply-To: References: Message-ID: On Thu, 28 Feb 2008, Amit.Itagi at seagate.com wrote: > Matt and Hong, > > I will try to customize the example. However, since my application involves > multiple ksp solvers (using different algorithms), I would really like to > set the options with in the code, instead of on the command line. Is there > a way of doing this ? -ksp_type preonly -pc_type lu -mat_type superlu_dist There are 2 ways of doing this. One is within the code: MatCreate(&mat); MatSetType(mat,MATSUPERLU_DIST); MatSetFromOptions(mat); KSPSetType(ksp,KSPPREONLY); KSPGetPC(ksp,&pc); PCSetType(pc,PCLU); KSPSetFromOptions(ksp); etc.. 
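(A minimal sketch of how the in-code calls just listed might fit together for the direct-solve case. It assumes an already assembled matrix A of type MATSUPERLU_DIST and conforming vectors b and x created elsewhere, and uses the same PETSc 2.3.x calling sequences that appear in the rest of this thread.)

    KSP ksp;
    PC  pc;
    PetscErrorCode ierr;
    ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);   /* no Krylov iterations, just apply the PC */
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);           /* LU factorization handled by SuperLU_DIST */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);       /* command-line options can still override */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = KSPDestroy(ksp);CHKERRQ(ierr);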
Another way is to give each object a prefix [if you have multiple objects of the same type. You can use this prefix with the command line options. For eg: KSPCreate(&ksp1) KSPCreate(&ksp2) KSPSetOptionsPrefix(ksp1,"a_") KSPSetOptionsPrefix(ksp2,"b_") Now you can use -a_ksp_type gmres -b_ksp_type cg etc.. Satish From Amit.Itagi at seagate.com Fri Feb 29 08:23:22 2008 From: Amit.Itagi at seagate.com (Amit.Itagi at seagate.com) Date: Fri, 29 Feb 2008 09:23:22 -0500 Subject: Direct LU solver In-Reply-To: Message-ID: Matt/Hong/Satish, My toy-problem would run with the command line options. However, the in-code options were still giving a problem. I also found that I had a Petsc version compiled with the debugging flag off. On recompiling Petsc by turning the debugging flag on, the in-code options worked. I am wondering about the cause for this behavior. Thanks for your help. I will now fiddle around with the actual application. Rgds, Amit Satish Balay To Sent by: petsc-users at mcs.anl.gov owner-petsc-users cc @mcs.anl.gov No Phone Info Subject Available Re: Direct LU solver 02/28/2008 05:58 PM Please respond to petsc-users at mcs.a nl.gov On Thu, 28 Feb 2008, Amit.Itagi at seagate.com wrote: > Matt and Hong, > > I will try to customize the example. However, since my application involves > multiple ksp solvers (using different algorithms), I would really like to > set the options with in the code, instead of on the command line. Is there > a way of doing this ? -ksp_type preonly -pc_type lu -mat_type superlu_dist There are 2 ways of doing this. One is within the code: MatCreate(&mat); MatSetType(mat,MATSUPERLU_DIST); MatSetFromOptions(mat); KSPSetType(ksp,KSPPREONLY); KSPGetPC(ksp,&pc); PCSetType(pc,PCLU); KSPSetFromOptions(ksp); etc.. Another way is to give each object a prefix [if you have multiple objects of the same type. You can use this prefix with the command line options. For eg: KSPCreate(&ksp1) KSPCreate(&ksp2) KSPSetOptionsPrefix(ksp1,"a_") KSPSetOptionsPrefix(ksp2,"b_") Now you can use -a_ksp_type gmres -b_ksp_type cg etc.. Satish From balay at mcs.anl.gov Fri Feb 29 09:09:17 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 29 Feb 2008 09:09:17 -0600 (CST) Subject: Direct LU solver In-Reply-To: References: Message-ID: On Fri, 29 Feb 2008, Amit.Itagi at seagate.com wrote: > Matt/Hong/Satish, > > My toy-problem would run with the command line options. However, the > in-code options were still giving a problem. I also found that I had a > Petsc version compiled with the debugging flag off. On recompiling Petsc by > turning the debugging flag on, the in-code options worked. I am wondering > about the cause for this behavior. > > Thanks for your help. I will now fiddle around with the actual application. Hmm - there should be some example usages in src/ksp/ksp/examples/tutorials [like ex2.c, ex30.c etc..]. You can verify if these work fine for you without debuging, and then see if your usage is same as these examples. Satish From knepley at gmail.com Fri Feb 29 09:14:20 2008 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 29 Feb 2008 09:14:20 -0600 Subject: Direct LU solver In-Reply-To: References: Message-ID: On Fri, Feb 29, 2008 at 8:23 AM, wrote: > Matt/Hong/Satish, > > My toy-problem would run with the command line options. However, the > in-code options were still giving a problem. I also found that I had a > Petsc version compiled with the debugging flag off. On recompiling Petsc by > turning the debugging flag on, the in-code options worked. 
I am wondering > about the cause for this behavior. I am sure this is a misinterpretation. The code just does not work that way. Something you have not notices changed between those versions of your code. When you say "giving a problem", I assume you mean the option does not take effect. The most common cause is a misunderstanding of the mechanism. If you call a function to set something, but subsequently call SetFromOptions(), it will be overridden by command line arguments Matt > Thanks for your help. I will now fiddle around with the actual application. > > Rgds, > Amit -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From Amit.Itagi at seagate.com Fri Feb 29 12:22:00 2008 From: Amit.Itagi at seagate.com (Amit.Itagi at seagate.com) Date: Fri, 29 Feb 2008 13:22:00 -0500 Subject: Direct LU solver In-Reply-To: Message-ID: owner-petsc-users at mcs.anl.gov wrote on 02/29/2008 10:14:20 AM: > On Fri, Feb 29, 2008 at 8:23 AM, wrote: > > Matt/Hong/Satish, > > > > My toy-problem would run with the command line options. However, the > > in-code options were still giving a problem. I also found that I had a > > Petsc version compiled with the debugging flag off. On recompiling Petsc by > > turning the debugging flag on, the in-code options worked. I am wondering > > about the cause for this behavior. > > I am sure this is a misinterpretation. The code just does not work that way. > Something you have not notices changed between those versions of your code. > When you say "giving a problem", I assume you mean the option does not take > effect. The most common cause is a misunderstanding of the mechanism. If you > call a function to set something, but subsequently call > SetFromOptions(), it will > be overridden by command line arguments > > Matt > Hi, My woes continue. Based on the earlier discussions, I implemented the matrix as //========================================================================= // Option 1 ierr=MatCreate(PETSC_COMM_WORLD,&A); CHKERRQ(ierr); ierr=MatSetSizes(A,1,1,2,2); CHKERRQ(ierr); /* Option 2 PetscInt d_nnz=1, o_nnz=1; ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); CHKERRQ(ierr); */ /* Option 3 ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,PETSC_NULL,0,PETSC_NULL,&A); CHKERRQ(ierr); */ ierr=MatSetType(A,MATSUPERLU_DIST); CHKERRQ(ierr); ierr=MatSetFromOptions(A); CHKERRQ(ierr); // (After this, I set the values and do the assembly). I then use the direct LU solver. //============================================================================ Note: I have a simple 2 by 2 matrix (with non-zero values in all 4 places). If I use "option 1" (based on Satish's email), the program executes successfully. If instead of "option 1", I use "option 2" or "option 3", I get a crash. If I am not mistaken, options 1 and 3 are the same. Option 2, additionally, does a pre-allocation. Am I correct ? Thanks Rgds, Amit From balay at mcs.anl.gov Fri Feb 29 13:06:06 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 29 Feb 2008 13:06:06 -0600 (CST) Subject: Direct LU solver In-Reply-To: References: Message-ID: On Fri, 29 Feb 2008, Amit.Itagi at seagate.com wrote: > > My woes continue. 
Based on the earlier discussions, I implemented the > matrix as > > //========================================================================= > > // Option 1 > ierr=MatCreate(PETSC_COMM_WORLD,&A); CHKERRQ(ierr); > ierr=MatSetSizes(A,1,1,2,2); CHKERRQ(ierr); > > > /* Option 2 > PetscInt d_nnz=1, o_nnz=1; > ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); > CHKERRQ(ierr); > */ > > /* Option 3 > > ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,PETSC_NULL,0,PETSC_NULL,&A); > CHKERRQ(ierr); > */ > > ierr=MatSetType(A,MATSUPERLU_DIST); CHKERRQ(ierr); > ierr=MatSetFromOptions(A); CHKERRQ(ierr); > > // (After this, I set the values and do the assembly). I then use the > direct LU solver. > > //============================================================================ > > Note: I have a simple 2 by 2 matrix (with non-zero values in all 4 places). > If I use "option 1" (based on Satish's email), the program executes > successfully. If instead of "option 1", I use "option 2" or "option 3", I > get a crash. > If I am not mistaken, options 1 and 3 are the same. Option 2, additionally, > does a pre-allocation. Am I correct ? Nope - Option 3 is same as: MatCreate() MatSetType(MPIAIJ) MatMPIAIJSetPreallocation() MatSetType(MATSUPERLU_DIST) [i.e first you are setting type as MPIAIJ, and then changing to MATSUPERLU_DIST] What you want is: MatCreate() MatSetType(MATSUPERLU_DIST) MatMPIAIJSetPreallocation() [Ideally you need MatSuerLU_DistSetPreallocation() - but that would be same as MatMPIAIJSetPreallocation()] Satish From recrusader at gmail.com Fri Feb 29 16:45:44 2008 From: recrusader at gmail.com (Yujie) Date: Fri, 29 Feb 2008 14:45:44 -0800 Subject: --with-clanguage: c and c++ Message-ID: <7ff0ee010802291445g544b30f7xf89493105635eb84@mail.gmail.com> Hi, everyone When PETSc is compiled with "--with-clanguage=C" or "--with-clanguage=C++", what is the difference between them, the parameters in the functions are adjusted or else? thanks, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Feb 29 16:51:11 2008 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 29 Feb 2008 16:51:11 -0600 (CST) Subject: --with-clanguage: c and c++ In-Reply-To: <7ff0ee010802291445g544b30f7xf89493105635eb84@mail.gmail.com> References: <7ff0ee010802291445g544b30f7xf89493105635eb84@mail.gmail.com> Message-ID: On Fri, 29 Feb 2008, Yujie wrote: > Hi, everyone > > When PETSc is compiled with "--with-clanguage=C" or "--with-clanguage=C++", > what is the difference between them, the parameters in the functions are > adjusted or else? Primary difference is that the default compiler used to compile the sources is c vs c++. [So, if the user is developing with c++, its easiest to build PETSc with c++, and use the default makefiles to compile user code aswell] There are some components of PETSc [sieve] that are coded in c++, and can be built only in the c++ mode. Satish From recrusader at gmail.com Fri Feb 29 18:46:27 2008 From: recrusader at gmail.com (Yujie) Date: Fri, 29 Feb 2008 16:46:27 -0800 Subject: how to add a parallel MatMatSolve() function? Message-ID: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> Hi, everyone I am considering to add a parallel MatMatSolve() into PETSc based on SuperLu_DIST or Spooles. If I want to use it like current sequential MatMatSolve() in the application codes, how to do it? Could you give me some examples about how to add a new function? thanks a lot. 
Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Feb 29 19:46:19 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Feb 2008 19:46:19 -0600 Subject: how to add a parallel MatMatSolve() function? In-Reply-To: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> Message-ID: <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> Please edit the file src/mat/interface/matrix.c and remove the function MatMatSolve(). Replace it with the following two functions, then run "make lib shared" in that directory. Please let us know at petsc-maint at mcs.anl.gov if it crashes or produces incorrect results. Barry #undef __FUNCT__ #define __FUNCT__ "MatMatSolve_Basic" PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X) { PetscErrorCode ierr; Vec b,x; PetscInt m,N,i; PetscScalar *bb,*xx; PetscFunctionBegin; ierr = MatGetArray(B,&bb);CHKERRQ(ierr); ierr = MatGetArray(X,&xx);CHKERRQ(ierr); ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr); /* number local rows */ ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr); /* total columns in dense matrix */ ierr = VecCreateMPIWithArray(A- >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr); ierr = VecCreateMPIWithArray(A- >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr); for (i=0; ifactor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored matrix"); if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat X: global dim %D %D",A->cmap.N,X->rmap.N); if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat B: global dim %D %D",A->rmap.N,B->rmap.N); if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat B: local dim %D %D",A->rmap.n,B->rmap.n); if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0); ierr = MatPreallocated(A);CHKERRQ(ierr); ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); if (!A->ops->matsolve) { ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve", ((PetscObject)A)->type_name);CHKERRQ(ierr); ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr); } else { ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr); } ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr); PetscFunctionReturn(0); } On Feb 29, 2008, at 6:46 PM, Yujie wrote: > Hi, everyone > > I am considering to add a parallel MatMatSolve() into PETSc based on > SuperLu_DIST or Spooles. If I want to use it like current sequential > MatMatSolve() in the application codes, how to do it? Could you give > me some examples about how to add a new function? > thanks a lot. > > Regards, > Yujie From recrusader at gmail.com Fri Feb 29 20:01:59 2008 From: recrusader at gmail.com (Yujie) Date: Fri, 29 Feb 2008 18:01:59 -0800 Subject: how to add a parallel MatMatSolve() function? In-Reply-To: <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> Message-ID: <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com> Dear Barry: Thank you for your help. I check the codes roughly, the method in the codes is to use MatSolve() to solve AX=B in a loop. I also consider such a method. However, I am wondering whether it is slower than the method that directly solves AX=B? thanks again. 
Regards, Yujie On 2/29/08, Barry Smith wrote: > > > Please edit the file src/mat/interface/matrix.c and remove the > function MatMatSolve(). Replace it with the following two functions, > then run "make lib shared" in that directory. Please let us know at > petsc-maint at mcs.anl.gov > if it crashes or produces incorrect > results. > > Barry > > > > #undef __FUNCT__ > #define __FUNCT__ "MatMatSolve_Basic" > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X) > { > PetscErrorCode ierr; > Vec b,x; > PetscInt m,N,i; > PetscScalar *bb,*xx; > > PetscFunctionBegin; > ierr = MatGetArray(B,&bb);CHKERRQ(ierr); > ierr = MatGetArray(X,&xx);CHKERRQ(ierr); > ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr); /* number > local rows */ > ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr); /* total > columns in dense matrix */ > ierr = VecCreateMPIWithArray(A- > >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr); > ierr = VecCreateMPIWithArray(A- > >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr); > for (i=0; i ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr); > ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr); > ierr = MatSolve(A,b,x);CHKERRQ(ierr); > } > ierr = VecDestroy(b);CHKERRQ(ierr); > ierr = VecDestroy(x);CHKERRQ(ierr); > PetscFunctionReturn(0); > } > > #undef __FUNCT__ > #define __FUNCT__ "MatMatSolve" > /*@ > MatMatSolve - Solves A X = B, given a factored matrix. > > Collective on Mat > > Input Parameters: > + mat - the factored matrix > - B - the right-hand-side matrix (dense matrix) > > Output Parameter: > . B - the result matrix (dense matrix) > > Notes: > The matrices b and x cannot be the same. I.e., one cannot > call MatMatSolve(A,x,x). > > Notes: > Most users should usually employ the simplified KSP interface for > linear solvers > instead of working directly with matrix algebra routines such as > this. > See, e.g., KSPCreate(). However KSP can only solve for one vector > (column of X) > at a time. 
> > Level: developer > > Concepts: matrices^triangular solves > > .seealso: MatMatSolveAdd(), MatMatSolveTranspose(), > MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor() > @*/ > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X) > { > PetscErrorCode ierr; > > PetscFunctionBegin; > PetscValidHeaderSpecific(A,MAT_COOKIE,1); > PetscValidType(A,1); > PetscValidHeaderSpecific(B,MAT_COOKIE,2); > PetscValidHeaderSpecific(X,MAT_COOKIE,3); > PetscCheckSameComm(A,1,B,2); > PetscCheckSameComm(A,1,X,3); > if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different > matrices"); > if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored > matrix"); > if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > X: global dim %D %D",A->cmap.N,X->rmap.N); > if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > B: global dim %D %D",A->rmap.N,B->rmap.N); > if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > B: local dim %D %D",A->rmap.n,B->rmap.n); > if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0); > ierr = MatPreallocated(A);CHKERRQ(ierr); > > ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > if (!A->ops->matsolve) { > ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve", > ((PetscObject)A)->type_name);CHKERRQ(ierr); > ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr); > } else { > ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr); > } > ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr); > PetscFunctionReturn(0); > > } > > On Feb 29, 2008, at 6:46 PM, Yujie wrote: > > > Hi, everyone > > > > I am considering to add a parallel MatMatSolve() into PETSc based on > > SuperLu_DIST or Spooles. If I want to use it like current sequential > > MatMatSolve() in the application codes, how to do it? Could you give > > me some examples about how to add a new function? > > thanks a lot. > > > > Regards, > > Yujie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Feb 29 20:15:16 2008 From: recrusader at gmail.com (Yujie) Date: Fri, 29 Feb 2008 18:15:16 -0800 Subject: how to add a parallel MatMatSolve() function? 
In-Reply-To: <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>
References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com>
	<363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>
Message-ID: <7ff0ee010802291815i42daaf82kbb3d97500542f80c@mail.gmail.com>

Dear Barry:

The following are the compile errors:

/home/yujie/mpich127/bin/mpicxx -o matrix.o -c -Wall -Wwrite-strings -g -fPIC
-I/home/yujie/codes/petsc-2.3.3-p8 -I/home/yujie/codes/petsc-2.3.3-p8/bmake/linux
-I/home/yujie/codes/petsc-2.3.3-p8/include
-I/home/yujie/codes/petsc-2.3.3-p8/externalpackages/spooles-2.2/linux/
-I/home/yujie/mpich127/include -I/usr/X11R6/include
-D__SDIR__='"src/mat/interface/"' matrix.c
matrix.c: In function `PetscErrorCode MatMatSolve_Basic(_p_Mat*, _p_Mat*, _p_Mat*)':
matrix.c:2531: `struct _p_Mat' has no member named `hdr'
matrix.c:2532: `struct _p_Mat' has no member named `hdr'
matrix.c:2588:41: warning: multi-line string literals are deprecated
matrix.c:2590:52: warning: multi-line string literals are deprecated
matrix.c:2592:58: warning: multi-line string literals are deprecated
matrix.c:2594:58: warning: multi-line string literals are deprecated
matrix.c:2596:58: warning: multi-line string literals are deprecated
make[1]: [/home/yujie/codes/petsc-2.3.3-p8/lib/linux/libpetscmat.a(matrix.o)] Error 1 (ignored)
/usr/bin/ar cr /home/yujie/codes/petsc-2.3.3-p8/lib/linux/libpetscmat.a matrix.o
/usr/bin/ar: matrix.o: No such file or directory
make[1]: [/home/yujie/codes/petsc-2.3.3-p8/lib/linux/libpetscmat.a(matrix.o)] Error 1 (ignored)
if test -n ""; then /usr/bin/ar cr matrix.lo; fi
/bin/rm -f matrix.o matrix.lo
making shared libraries in /home/yujie/codes/petsc-2.3.3-p8/lib/linux
building libpetscmat.so

thanks,
Yujie

On 2/29/08, Barry Smith wrote:
>
>     Please edit the file src/mat/interface/matrix.c and remove the
> function MatMatSolve(). Replace it with the following two functions,
> then run "make lib shared" in that directory. Please let us know at
> petsc-maint at mcs.anl.gov if it crashes or produces incorrect
> results.
>
>     Barry
>
>
> #undef __FUNCT__
> #define __FUNCT__ "MatMatSolve_Basic"
> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X)
> {
>   PetscErrorCode ierr;
>   Vec            b,x;
>   PetscInt       m,N,i;
>   PetscScalar    *bb,*xx;
>
>   PetscFunctionBegin;
>   ierr = MatGetArray(B,&bb);CHKERRQ(ierr);
>   ierr = MatGetArray(X,&xx);CHKERRQ(ierr);
>   ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* number local rows */
>   ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* total columns in dense matrix */
>   ierr = VecCreateMPIWithArray(A->hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
>   ierr = VecCreateMPIWithArray(A->hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
>   for (i=0; i<N; i++) {
>     ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);
>     ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);
>     ierr = MatSolve(A,b,x);CHKERRQ(ierr);
>   }
>   ierr = VecDestroy(b);CHKERRQ(ierr);
>   ierr = VecDestroy(x);CHKERRQ(ierr);
>   PetscFunctionReturn(0);
> }
>
> #undef __FUNCT__
> #define __FUNCT__ "MatMatSolve"
> /*@
>    MatMatSolve - Solves A X = B, given a factored matrix.
>
>    Collective on Mat
>
>    Input Parameters:
> +  mat - the factored matrix
> -  B - the right-hand-side matrix  (dense matrix)
>
>    Output Parameter:
> .  B - the result matrix (dense matrix)
>
>    Notes:
>    The matrices b and x cannot be the same.  I.e., one cannot
>    call MatMatSolve(A,x,x).
> > Notes: > Most users should usually employ the simplified KSP interface for > linear solvers > instead of working directly with matrix algebra routines such as > this. > See, e.g., KSPCreate(). However KSP can only solve for one vector > (column of X) > at a time. > > Level: developer > > Concepts: matrices^triangular solves > > .seealso: MatMatSolveAdd(), MatMatSolveTranspose(), > MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor() > @*/ > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X) > { > PetscErrorCode ierr; > > PetscFunctionBegin; > PetscValidHeaderSpecific(A,MAT_COOKIE,1); > PetscValidType(A,1); > PetscValidHeaderSpecific(B,MAT_COOKIE,2); > PetscValidHeaderSpecific(X,MAT_COOKIE,3); > PetscCheckSameComm(A,1,B,2); > PetscCheckSameComm(A,1,X,3); > if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different > matrices"); > if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored > matrix"); > if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > X: global dim %D %D",A->cmap.N,X->rmap.N); > if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > B: global dim %D %D",A->rmap.N,B->rmap.N); > if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > B: local dim %D %D",A->rmap.n,B->rmap.n); > if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0); > ierr = MatPreallocated(A);CHKERRQ(ierr); > > ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > if (!A->ops->matsolve) { > ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve", > ((PetscObject)A)->type_name);CHKERRQ(ierr); > ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr); > } else { > ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr); > } > ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr); > PetscFunctionReturn(0); > > } > > On Feb 29, 2008, at 6:46 PM, Yujie wrote: > > > Hi, everyone > > > > I am considering to add a parallel MatMatSolve() into PETSc based on > > SuperLu_DIST or Spooles. If I want to use it like current sequential > > MatMatSolve() in the application codes, how to do it? Could you give > > me some examples about how to add a new function? > > thanks a lot. > > > > Regards, > > Yujie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Feb 29 20:50:51 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Feb 2008 20:50:51 -0600 Subject: how to add a parallel MatMatSolve() function? In-Reply-To: <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com> References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com> Message-ID: <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov> Some direct solver packages have support for solving directly with several right hand sides at the same time. They could be a bit faster than solving one at a time; maybe 30% faster at most, not 10 times faster. What is more important solving the problem you want to solve in a reasonable time or solving the problem a bit faster after spending several weeks writing the much more complicated code? Barry On Feb 29, 2008, at 8:01 PM, Yujie wrote: > Dear Barry: > > Thank you for your help. I check the codes roughly, the method in > the codes is to use MatSolve() to solve AX=B in a loop. I also > consider such a method. > However, I am wondering whether it is slower than the method that > directly solves AX=B? 
thanks again. > > Regards, > Yujie > > On 2/29/08, Barry Smith wrote: > > Please edit the file src/mat/interface/matrix.c and remove the > function MatMatSolve(). Replace it with the following two functions, > then run "make lib shared" in that directory. Please let us know at petsc-maint at mcs.anl.gov > if it crashes or produces incorrect > results. > > Barry > > > > #undef __FUNCT__ > #define __FUNCT__ "MatMatSolve_Basic" > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X) > { > PetscErrorCode ierr; > Vec b,x; > PetscInt m,N,i; > PetscScalar *bb,*xx; > > PetscFunctionBegin; > ierr = MatGetArray(B,&bb);CHKERRQ(ierr); > ierr = MatGetArray(X,&xx);CHKERRQ(ierr); > ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr); /* number > local rows */ > ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr); /* total > columns in dense matrix */ > ierr = VecCreateMPIWithArray(A- > >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr); > ierr = VecCreateMPIWithArray(A- > >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr); > for (i=0; i ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr); > ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr); > ierr = MatSolve(A,b,x);CHKERRQ(ierr); > } > ierr = VecDestroy(b);CHKERRQ(ierr); > ierr = VecDestroy(x);CHKERRQ(ierr); > PetscFunctionReturn(0); > } > > #undef __FUNCT__ > #define __FUNCT__ "MatMatSolve" > /*@ > MatMatSolve - Solves A X = B, given a factored matrix. > > Collective on Mat > > Input Parameters: > + mat - the factored matrix > - B - the right-hand-side matrix (dense matrix) > > Output Parameter: > . B - the result matrix (dense matrix) > > Notes: > The matrices b and x cannot be the same. I.e., one cannot > call MatMatSolve(A,x,x). > > Notes: > Most users should usually employ the simplified KSP interface for > linear solvers > instead of working directly with matrix algebra routines such as > this. > See, e.g., KSPCreate(). However KSP can only solve for one vector > (column of X) > at a time. 
> > Level: developer > > Concepts: matrices^triangular solves > > .seealso: MatMatSolveAdd(), MatMatSolveTranspose(), > MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor() > @*/ > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X) > { > PetscErrorCode ierr; > > PetscFunctionBegin; > PetscValidHeaderSpecific(A,MAT_COOKIE,1); > PetscValidType(A,1); > PetscValidHeaderSpecific(B,MAT_COOKIE,2); > PetscValidHeaderSpecific(X,MAT_COOKIE,3); > PetscCheckSameComm(A,1,B,2); > PetscCheckSameComm(A,1,X,3); > if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different > matrices"); > if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored > matrix"); > if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > X: global dim %D %D",A->cmap.N,X->rmap.N); > if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > B: global dim %D %D",A->rmap.N,B->rmap.N); > if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > B: local dim %D %D",A->rmap.n,B->rmap.n); > if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0); > ierr = MatPreallocated(A);CHKERRQ(ierr); > > ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > if (!A->ops->matsolve) { > ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve", > ((PetscObject)A)->type_name);CHKERRQ(ierr); > ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr); > } else { > ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr); > } > ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr); > PetscFunctionReturn(0); > > } > > On Feb 29, 2008, at 6:46 PM, Yujie wrote: > > > Hi, everyone > > > > I am considering to add a parallel MatMatSolve() into PETSc based on > > SuperLu_DIST or Spooles. If I want to use it like current sequential > > MatMatSolve() in the application codes, how to do it? Could you give > > me some examples about how to add a new function? > > thanks a lot. > > > > Regards, > > Yujie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Feb 29 21:35:46 2008 From: recrusader at gmail.com (Yujie) Date: Sat, 1 Mar 2008 11:35:46 +0800 Subject: how to add a parallel MatMatSolve() function? In-Reply-To: <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov> References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com> <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov> Message-ID: <7ff0ee010802291935o65002d30o4a8189c730f829d6@mail.gmail.com> Dear Barry: I have checked SuperLU_Dist codes. It looks like relative easy to write codes for AX=B based on MatSolve(). This is why I ask you the above problem. how about your advice? thanks a lot. Regards, Yujie On 3/1/08, Barry Smith wrote: > > > Some direct solver packages have support for solving directly with > several right hand sides at the same time.They could be a bit faster than > solving one at a time; maybe 30% faster at most, not 10 times faster. What > is more > important solving the problem you want to solve in a reasonable time or > solving the problem a bit faster > after spending several weeks writing the much more complicated code? > > Barry > > On Feb 29, 2008, at 8:01 PM, Yujie wrote: > > Dear Barry: > > Thank you for your help. I check the codes > roughly, the method in the codes is to use MatSolve() to solve AX=B in a loop. I also consider such a method. 
> However, I am wondering whether it is slower than the method that directly > solves AX=B? thanks again. > > Regards, > Yujie > > On 2/29/08, Barry Smith wrote: > > > > > > Please edit the file src/mat/interface/matrix.c and remove the > > function MatMatSolve(). Replace it with the following two functions, > > then run "make lib shared" in that directory. Please let us know at > > petsc-maint at mcs.anl.gov > > if it crashes or produces incorrect > > results. > > > > Barry > > > > > > > > #undef __FUNCT__ > > #define __FUNCT__ "MatMatSolve_Basic" > > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X) > > { > > PetscErrorCode ierr; > > Vec b,x; > > PetscInt m,N,i; > > PetscScalar *bb,*xx; > > > > PetscFunctionBegin; > > ierr = MatGetArray(B,&bb);CHKERRQ(ierr); > > ierr = MatGetArray(X,&xx);CHKERRQ(ierr); > > ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr); /* number > > local rows */ > > ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr); /* total > > columns in dense matrix */ > > ierr = VecCreateMPIWithArray(A- > > >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr); > > ierr = VecCreateMPIWithArray(A- > > >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr); > > for (i=0; i > ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr); > > ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr); > > ierr = MatSolve(A,b,x);CHKERRQ(ierr); > > } > > ierr = VecDestroy(b);CHKERRQ(ierr); > > ierr = VecDestroy(x);CHKERRQ(ierr); > > PetscFunctionReturn(0); > > } > > > > #undef __FUNCT__ > > #define __FUNCT__ "MatMatSolve" > > /*@ > > MatMatSolve - Solves A X = B, given a factored matrix. > > > > Collective on Mat > > > > Input Parameters: > > + mat - the factored matrix > > - B - the right-hand-side matrix (dense matrix) > > > > Output Parameter: > > . B - the result matrix (dense matrix) > > > > Notes: > > The matrices b and x cannot be the same. I.e., one cannot > > call MatMatSolve(A,x,x). > > > > Notes: > > Most users should usually employ the simplified KSP interface for > > linear solvers > > instead of working directly with matrix algebra routines such as > > this. > > See, e.g., KSPCreate(). However KSP can only solve for one vector > > (column of X) > > at a time. 
> > > > Level: developer > > > > Concepts: matrices^triangular solves > > > > .seealso: MatMatSolveAdd(), MatMatSolveTranspose(), > > MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor() > > @*/ > > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X) > > { > > PetscErrorCode ierr; > > > > PetscFunctionBegin; > > PetscValidHeaderSpecific(A,MAT_COOKIE,1); > > PetscValidType(A,1); > > PetscValidHeaderSpecific(B,MAT_COOKIE,2); > > PetscValidHeaderSpecific(X,MAT_COOKIE,3); > > PetscCheckSameComm(A,1,B,2); > > PetscCheckSameComm(A,1,X,3); > > if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different > > matrices"); > > if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored > > matrix"); > > if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > > X: global dim %D %D",A->cmap.N,X->rmap.N); > > if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > > B: global dim %D %D",A->rmap.N,B->rmap.N); > > if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > > B: local dim %D %D",A->rmap.n,B->rmap.n); > > if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0); > > ierr = MatPreallocated(A);CHKERRQ(ierr); > > > > ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > > if (!A->ops->matsolve) { > > ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve", > > ((PetscObject)A)->type_name);CHKERRQ(ierr); > > ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr); > > } else { > > ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr); > > } > > ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > > ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr); > > PetscFunctionReturn(0); > > > > } > > > > On Feb 29, 2008, at 6:46 PM, Yujie wrote: > > > > > Hi, everyone > > > > > > I am considering to add a parallel MatMatSolve() into PETSc based on > > > SuperLu_DIST or Spooles. If I want to use it like current sequential > > > MatMatSolve() in the application codes, how to do it? Could you give > > > me some examples about how to add a new function? > > > thanks a lot. > > > > > > Regards, > > > Yujie > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Feb 29 21:48:33 2008 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Feb 2008 21:48:33 -0600 Subject: how to add a parallel MatMatSolve() function? In-Reply-To: <7ff0ee010802291935o65002d30o4a8189c730f829d6@mail.gmail.com> References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com> <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov> <7ff0ee010802291935o65002d30o4a8189c730f829d6@mail.gmail.com> Message-ID: <61DD008D-868D-4B76-BBDE-734E08E0E6E7@mcs.anl.gov> The code I provided works for any LU solver; the way the PETSc code is written you can customize a routine for any specific matrix format, like the PETSc SuperLU_Dist format. You are certainly free to try to write a custom one for SuperLU_dist() (see MatMatSolve_SeqAIJ() for how to do this). It is your time, not mine. Personally I'd rather have the computer run a few more minutes then spend my time looking at code :-) Barry On Feb 29, 2008, at 9:35 PM, Yujie wrote: > Dear Barry: > > I have checked SuperLU_Dist codes. It looks like relative easy to > write codes for AX=B based on MatSolve(). This is why I ask you the > above problem. > how about your advice? > thanks a lot. 
> > Regards, > Yujie > > On 3/1/08, Barry Smith wrote: > > Some direct solver packages have support for solving directly with > several right hand sides at the same time. > They could be a bit faster than solving one at a time; maybe 30% > faster at most, not 10 times faster. What is more > important solving the problem you want to solve in a reasonable time > or solving the problem a bit faster > after spending several weeks writing the much more complicated code? > > Barry > > On Feb 29, 2008, at 8:01 PM, Yujie wrote: > >> Dear Barry: >> >> Thank you for your help. I check the codes roughly, the method in >> the codes is to use MatSolve() to solve AX=B in a loop. I also >> consider such a method. >> However, I am wondering whether it is slower than the method that >> directly solves AX=B? thanks again. >> >> Regards, >> Yujie >> >> On 2/29/08, Barry Smith wrote: >> >> Please edit the file src/mat/interface/matrix.c and remove the >> function MatMatSolve(). Replace it with the following two functions, >> then run "make lib shared" in that directory. Please let us know at petsc-maint at mcs.anl.gov >> if it crashes or produces incorrect >> results. >> >> Barry >> >> >> >> #undef __FUNCT__ >> #define __FUNCT__ "MatMatSolve_Basic" >> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat >> X) >> { >> PetscErrorCode ierr; >> Vec b,x; >> PetscInt m,N,i; >> PetscScalar *bb,*xx; >> >> PetscFunctionBegin; >> ierr = MatGetArray(B,&bb);CHKERRQ(ierr); >> ierr = MatGetArray(X,&xx);CHKERRQ(ierr); >> ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr); /* number >> local rows */ >> ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr); /* total >> columns in dense matrix */ >> ierr = VecCreateMPIWithArray(A- >> >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr); >> ierr = VecCreateMPIWithArray(A- >> >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr); >> for (i=0; i> ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr); >> ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr); >> ierr = MatSolve(A,b,x);CHKERRQ(ierr); >> } >> ierr = VecDestroy(b);CHKERRQ(ierr); >> ierr = VecDestroy(x);CHKERRQ(ierr); >> PetscFunctionReturn(0); >> } >> >> #undef __FUNCT__ >> #define __FUNCT__ "MatMatSolve" >> /*@ >> MatMatSolve - Solves A X = B, given a factored matrix. >> >> Collective on Mat >> >> Input Parameters: >> + mat - the factored matrix >> - B - the right-hand-side matrix (dense matrix) >> >> Output Parameter: >> . B - the result matrix (dense matrix) >> >> Notes: >> The matrices b and x cannot be the same. I.e., one cannot >> call MatMatSolve(A,x,x). >> >> Notes: >> Most users should usually employ the simplified KSP interface for >> linear solvers >> instead of working directly with matrix algebra routines such as >> this. >> See, e.g., KSPCreate(). However KSP can only solve for one vector >> (column of X) >> at a time. 
>> >> Level: developer >> >> Concepts: matrices^triangular solves >> >> .seealso: MatMatSolveAdd(), MatMatSolveTranspose(), >> MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor() >> @*/ >> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X) >> { >> PetscErrorCode ierr; >> >> PetscFunctionBegin; >> PetscValidHeaderSpecific(A,MAT_COOKIE,1); >> PetscValidType(A,1); >> PetscValidHeaderSpecific(B,MAT_COOKIE,2); >> PetscValidHeaderSpecific(X,MAT_COOKIE,3); >> PetscCheckSameComm(A,1,B,2); >> PetscCheckSameComm(A,1,X,3); >> if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different >> matrices"); >> if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored >> matrix"); >> if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat >> X: global dim %D %D",A->cmap.N,X->rmap.N); >> if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat >> B: global dim %D %D",A->rmap.N,B->rmap.N); >> if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat >> B: local dim %D %D",A->rmap.n,B->rmap.n); >> if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0); >> ierr = MatPreallocated(A);CHKERRQ(ierr); >> >> ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); >> if (!A->ops->matsolve) { >> ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve", >> ((PetscObject)A)->type_name);CHKERRQ(ierr); >> ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr); >> } else { >> ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr); >> } >> ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); >> ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr); >> PetscFunctionReturn(0); >> >> } >> >> On Feb 29, 2008, at 6:46 PM, Yujie wrote: >> >> > Hi, everyone >> > >> > I am considering to add a parallel MatMatSolve() into PETSc based >> on >> > SuperLu_DIST or Spooles. If I want to use it like current >> sequential >> > MatMatSolve() in the application codes, how to do it? Could you >> give >> > me some examples about how to add a new function? >> > thanks a lot. >> > >> > Regards, >> > Yujie >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From recrusader at gmail.com Fri Feb 29 21:59:48 2008 From: recrusader at gmail.com (Yujie) Date: Sat, 1 Mar 2008 11:59:48 +0800 Subject: how to add a parallel MatMatSolve() function? In-Reply-To: <61DD008D-868D-4B76-BBDE-734E08E0E6E7@mcs.anl.gov> References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com> <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov> <7ff0ee010802291935o65002d30o4a8189c730f829d6@mail.gmail.com> <61DD008D-868D-4B76-BBDE-734E08E0E6E7@mcs.anl.gov> Message-ID: <7ff0ee010802291959i35e2aa0fyb38af45644115121@mail.gmail.com> Thanks a lot:). Regards, Yujie On 3/1/08, Barry Smith wrote: > > > The code I provided works for any LU solver; the way the PETSc code is > written you can customizea routine for any specific matrix format, like > the PETSc SuperLU_Dist format. You are certainly > free to try to write a custom one for SuperLU_dist() (see > MatMatSolve_SeqAIJ() for how to do this). > It is your time, not mine. Personally I'd rather have the computer run a > few more minutes then spend > my time looking at code :-) > > Barry > > On Feb 29, 2008, at 9:35 PM, Yujie wrote: > > Dear Barry: > > I have checked SuperLU_Dist codes. It looks like relative easy to write > codes for AX=B based on MatSolve(). This is why I ask you the above problem. 
> > how about your advice? > thanks a lot. > > Regards, > Yujie > > On 3/1/08, Barry Smith wrote: > > > > > > Some direct solver packages have support for solving directly with > > several right hand sides at the same time.They could be a bit faster > > than solving one at a time; maybe 30% faster at most, not 10 times faster. > > What is more > > important solving the problem you want to solve in a reasonable time or > > solving the problem a bit faster > > after spending several weeks writing the much more complicated code? > > > > Barry > > > > On Feb 29, 2008, at 8:01 PM, Yujie wrote: > > > > Dear Barry: > > > > Thank you for your help. I check the codes > > roughly, the method in the codes is to use MatSolve() to solve AX=B in a loop. I also consider such a method. > > However, I am wondering whether it is slower than the method that > > directly solves AX=B? thanks again. > > > > Regards, > > Yujie > > > > On 2/29/08, Barry Smith wrote: > > > > > > > > > Please edit the file src/mat/interface/matrix.c and remove the > > > function MatMatSolve(). Replace it with the following two functions, > > > then run "make lib shared" in that directory. Please let us know at > > > petsc-maint at mcs.anl.gov > > > if it crashes or produces incorrect > > > results. > > > > > > Barry > > > > > > > > > > > > #undef __FUNCT__ > > > #define __FUNCT__ "MatMatSolve_Basic" > > > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X) > > > { > > > PetscErrorCode ierr; > > > Vec b,x; > > > PetscInt m,N,i; > > > PetscScalar *bb,*xx; > > > > > > PetscFunctionBegin; > > > ierr = MatGetArray(B,&bb);CHKERRQ(ierr); > > > ierr = MatGetArray(X,&xx);CHKERRQ(ierr); > > > ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr); /* number > > > local rows */ > > > ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr); /* total > > > columns in dense matrix */ > > > ierr = VecCreateMPIWithArray(A- > > > >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr); > > > ierr = VecCreateMPIWithArray(A- > > > >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr); > > > for (i=0; i > > ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr); > > > ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr); > > > ierr = MatSolve(A,b,x);CHKERRQ(ierr); > > > } > > > ierr = VecDestroy(b);CHKERRQ(ierr); > > > ierr = VecDestroy(x);CHKERRQ(ierr); > > > PetscFunctionReturn(0); > > > } > > > > > > #undef __FUNCT__ > > > #define __FUNCT__ "MatMatSolve" > > > /*@ > > > MatMatSolve - Solves A X = B, given a factored matrix. > > > > > > Collective on Mat > > > > > > Input Parameters: > > > + mat - the factored matrix > > > - B - the right-hand-side matrix (dense matrix) > > > > > > Output Parameter: > > > . B - the result matrix (dense matrix) > > > > > > Notes: > > > The matrices b and x cannot be the same. I.e., one cannot > > > call MatMatSolve(A,x,x). > > > > > > Notes: > > > Most users should usually employ the simplified KSP interface for > > > linear solvers > > > instead of working directly with matrix algebra routines such as > > > this. > > > See, e.g., KSPCreate(). However KSP can only solve for one vector > > > (column of X) > > > at a time. 
> > > > > > Level: developer > > > > > > Concepts: matrices^triangular solves > > > > > > .seealso: MatMatSolveAdd(), MatMatSolveTranspose(), > > > MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor() > > > @*/ > > > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X) > > > { > > > PetscErrorCode ierr; > > > > > > PetscFunctionBegin; > > > PetscValidHeaderSpecific(A,MAT_COOKIE,1); > > > PetscValidType(A,1); > > > PetscValidHeaderSpecific(B,MAT_COOKIE,2); > > > PetscValidHeaderSpecific(X,MAT_COOKIE,3); > > > PetscCheckSameComm(A,1,B,2); > > > PetscCheckSameComm(A,1,X,3); > > > if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different > > > matrices"); > > > if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored > > > matrix"); > > > if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > > > X: global dim %D %D",A->cmap.N,X->rmap.N); > > > if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > > > B: global dim %D %D",A->rmap.N,B->rmap.N); > > > if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat > > > B: local dim %D %D",A->rmap.n,B->rmap.n); > > > if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0); > > > ierr = MatPreallocated(A);CHKERRQ(ierr); > > > > > > ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > > > if (!A->ops->matsolve) { > > > ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve", > > > ((PetscObject)A)->type_name);CHKERRQ(ierr); > > > ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr); > > > } else { > > > ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr); > > > } > > > ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr); > > > ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr); > > > PetscFunctionReturn(0); > > > > > > } > > > > > > On Feb 29, 2008, at 6:46 PM, Yujie wrote: > > > > > > > Hi, everyone > > > > > > > > I am considering to add a parallel MatMatSolve() into PETSc based on > > > > SuperLu_DIST or Spooles. If I want to use it like current sequential > > > > MatMatSolve() in the application codes, how to do it? Could you give > > > > me some examples about how to add a new function? > > > > thanks a lot. > > > > > > > > Regards, > > > > Yujie > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
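For reference, below is a minimal user-level sketch of the column-by-column approach discussed in this thread: solving A X = B by looping a single-vector solve over the columns of a dense right-hand-side matrix, in the spirit of Barry's MatMatSolve_Basic(). It is only a sketch under several assumptions: it uses the PETSc 2.3.3-era calling sequences that appear in the messages above (MatGetArray, VecCreateMPIWithArray, VecPlaceArray, VecDestroy taking the object itself); the helper name SolveColumnsWithKSP is made up for illustration; B and X are assumed to be parallel dense matrices whose local row layout matches A; and the KSP is assumed to be already configured with A and a direct solver such as SuperLU_DIST or Spooles. The communicator is obtained with PetscObjectGetComm() instead of the A->hdr.comm field that failed to compile against 2.3.3-p8.

#include "petscksp.h"

/* Sketch only (hypothetical helper): solve A X = B one column at a time by
   wrapping each local column of the dense matrices B and X in a Vec and
   calling KSPSolve(), mirroring MatMatSolve_Basic() from this thread. */
PetscErrorCode SolveColumnsWithKSP(KSP ksp,Mat A,Mat B,Mat X)
{
  PetscErrorCode ierr;
  MPI_Comm       comm;
  Vec            b,x;
  PetscInt       m,N,i;
  PetscScalar    *bb,*xx;

  PetscFunctionBegin;
  /* PetscObjectGetComm() avoids the A->hdr.comm member missing in 2.3.3-p8 */
  ierr = PetscObjectGetComm((PetscObject)A,&comm);CHKERRQ(ierr);
  ierr = MatGetArray(B,&bb);CHKERRQ(ierr);                /* local array of B's columns */
  ierr = MatGetArray(X,&xx);CHKERRQ(ierr);                /* local array of X's columns */
  ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* local rows */
  ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* global number of columns */
  ierr = VecCreateMPIWithArray(comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
  ierr = VecCreateMPIWithArray(comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
  for (i=0; i<N; i++) {
    ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);       /* wrap column i of B */
    ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);       /* wrap column i of X */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);               /* one right-hand side at a time */
  }
  ierr = VecDestroy(b);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = MatRestoreArray(B,&bb);CHKERRQ(ierr);
  ierr = MatRestoreArray(X,&xx);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

As Barry points out above, a native multiple-right-hand-side solve inside the factorization package would at best save a modest constant factor over such a loop, because the expensive factorization of A is reused unchanged for every column.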