From ibarletta at inogs.it Thu Sep 1 04:01:45 2016 From: ibarletta at inogs.it (Ivano Barletta) Date: Thu, 1 Sep 2016 11:01:45 +0200 Subject: [petsc-users] Number of Iteration of KSP and relative tolerance In-Reply-To: References: Message-ID: Thanks for the replies Adding these two lines before calling KSPSolve did the job CALL KSPSetInitialGuessNonzero(ksp, PETSC_TRUE, ierr) CALL KSPSetNormType(ksp, KSP_NORM_UNPRECONDITIONED, ierr) Regards Ivano 2016-08-31 19:51 GMT+02:00 Barry Smith : > > From the KSP view > > using PRECONDITIONED norm type for convergence test > > it is using the ratio of the preconditioned residual norms for the > convergence test, not the true residual norms. > If you want to use the true residual norm use -ksp_pc_side right See > also KSPSetNormType() and KSPSetPCSide() > > Barry > > > > On Aug 31, 2016, at 10:32 AM, Ivano Barletta wrote: > > > > Dear Petsc Users > > > > I'm using Petsc to solve an elliptic equation > > > > The code can be run in parallel but I'm running > > some tests in sequential by the moment > > > > When I look at the output, what it looks odd to > > me is that the relative tolerance that I set is > > not fulfilled. > > I've set -ksp_rtol 1e-8 in my runtime options > > but the solver stops when the ratio > > || r || / || b || is still 9e-8, then almost > > one order of magnitude greater of the rtol > > that I set (as you can see in the txt in attachment). > > > > My question is, isn't the solver supposed to > > make other few iterations to reach the relative tolerance? > > > > Thanks in advance for replies and suggestions > > Kind Regards > > Ivano > > > > P.S. my runtime options are these: > > -ksp_monitor_true_residual -ksp_type cg -ksp_converged_reason -ksp_view > -ksp_rtol 1e-8 > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Thu Sep 1 08:04:01 2016 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Thu, 1 Sep 2016 09:04:01 -0400 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 In-Reply-To: References: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> <33b3cb0d-78f8-fb84-2ad5-a447f5cdce9e@giref.ulaval.ca> Message-ID: <976fb6ef-fefd-9efc-9962-270935a6e401@giref.ulaval.ca> Just to "close" this thread, the offending bug has been found and corrected (was with MPI I/O implementation) (see https://github.com/open-mpi/ompi/issues/1875). So with forthcoming OpenMPI 2.0.1 everyhting is fine with PETSc for me. have a nice day! Eric On 25/07/16 03:53 PM, Matthew Knepley wrote: > On Mon, Jul 25, 2016 at 12:44 PM, Eric Chamberland > > wrote: > > Ok, > > here is the 2 points answered: > > #1) got valgrind output... here is the fatal free operation: > > > Okay, this is not the MatMult scatter, this is for local representations > of ghosted vectors. However, to me > it looks like OpenMPI mistakenly frees its built-in type for MPI_DOUBLE. 
> > > ==107156== Invalid free() / delete / delete[] / realloc() > ==107156== at 0x4C2A37C: free (in > /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) > ==107156== by 0x1E63CD5F: opal_free (malloc.c:184) > ==107156== by 0x27622627: mca_pml_ob1_recv_request_fini > (pml_ob1_recvreq.h:133) > ==107156== by 0x27622C4F: mca_pml_ob1_recv_request_free > (pml_ob1_recvreq.c:90) > ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) > ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) > ==107156== by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219) > ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) > ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) > ==107156== by 0x14A33809: VecDestroy (vector.c:432) > ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) > (girefConfigurationPETSc.h:115) > ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() > (VecteurPETSc.cc:2292) > ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:287) > ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:281) > ==107156== by 0x1135A57B: > PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() > (PPReactionsAppuiEL3D.cc:216) > ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in > /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) > ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) > ==107156== Address 0x1d6acbc0 is 0 bytes inside data symbol > "ompi_mpi_double" > --107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to > 0x4c2f330 (__GI_stpcpy) > ==107156== > ==107156== Process terminating with default action of signal 6 > (SIGABRT): dumping core > ==107156== at 0x1DD520C7: raise (in /lib64/libc-2.19.so > ) > ==107156== by 0x1DD53534: abort (in /lib64/libc-2.19.so > ) > ==107156== by 0x1DD4B145: __assert_fail_base (in > /lib64/libc-2.19.so ) > ==107156== by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so > ) > ==107156== by 0x27626D12: mca_pml_ob1_send_request_fini > (pml_ob1_sendreq.h:221) > ==107156== by 0x276274C9: mca_pml_ob1_send_request_free > (pml_ob1_sendreq.c:117) > ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) > ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) > ==107156== by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225) > ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) > ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) > ==107156== by 0x14A33809: VecDestroy (vector.c:432) > ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) > (girefConfigurationPETSc.h:115) > ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() > (VecteurPETSc.cc:2292) > ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:287) > ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:281) > ==107156== by 0x1135A57B: > PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() > (PPReactionsAppuiEL3D.cc:216) > ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in > /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) > ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) > > > #2) For the run with -vecscatter_alltoall it works...! > > As an "end user", should I ever modify these VecScatterCreate > options? How do they change the performances of the code on large > problems? > > > Yep, those options are there because the different variants are better > on different architectures, and you can't know which one to pick until > runtime, > (and without experimentation). 
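For what it's worth, a variant can also be selected from code before any scatters are created, which makes it easy to flip between them while experimenting; a minimal skeleton (assuming PETSc 3.7, where PetscOptionsSetValue() takes the options database as its first argument, with NULL meaning the global database):

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  /* Pick the all-to-all VecScatter variant for every scatter created from
     here on; same effect as passing -vecscatter_alltoall at run time. */
  ierr = PetscOptionsSetValue(NULL, "-vecscatter_alltoall", NULL);CHKERRQ(ierr);
  /* ... create vectors/matrices, solve, and compare timings with -log_view ... */
  ierr = PetscFinalize();
  return ierr;
}

Comparing a few such runs with -log_view is usually the quickest way to see which variant wins on a given machine.
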
> > Thanks, > > Matt > > > Thanks, > > Eric > > On 25/07/16 02:57 PM, Matthew Knepley wrote: > > On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland > > >> wrote: > > Hi, > > has someone tried OpenMPI 2.0 with Petsc 3.7.2? > > I am having some errors with petsc, maybe someone have them too? > > Here are the configure logs for PETSc: > > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log > > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log > > And for OpenMPI: > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log > > (in fact, I am testing the ompi-release branch, a sort of > petsc-master branch, since I need the commit 9ba6678156). > > For a set of parallel tests, I have 104 that works on 124 > total tests. > > > It appears that the fault happens when freeing the VecScatter we > build > for MatMult, which contains Request structures > for the ISends and IRecvs. These looks like internal OpenMPI > errors to > me since the Request should be opaque. > I would try at least two things: > > 1) Run under valgrind. > > 2) Switch the VecScatter implementation. All the options are here, > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate > > but maybe use alltoall. > > Thanks, > > Matt > > > And the typical error: > *** Error in > > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': > free(): invalid pointer: > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] > /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] > /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] > > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] > > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] > > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] > > 
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] > > a similar one: > *** Error in > > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': > free(): invalid pointer: 0x00007f382a7c5bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] > /lib64/libc.so.6(+0x78026)[0x7f3829f22026] > /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] > > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] > > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] > > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] > > another one: > > *** Error in > > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': > free(): invalid pointer: 0x00007f67b6d37bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] > /lib64/libc.so.6(+0x78026)[0x7f67b6494026] > /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] > > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] > > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] > > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] > > I feel like I should wait until someone else from Petsc have > tested > it too... > > Thanks, > > Eric > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener From eugenio.aulisa at ttu.edu Thu Sep 1 09:02:34 2016 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 1 Sep 2016 14:02:34 +0000 Subject: [petsc-users] How to use rtol as stopping criteria for PCMG subksp solver Message-ID: Hi, I have ksp GMRES->preconditioned with PCMG and at each level subksp GMRES -> preconditioned with different PCs I would like to control the stopping criteria of each level-subksp using either the relative tolerance or the npre/npost number of smoothings I set at each level KSPSetNormType(subksp, KSP_NORM_PRECONDITIONED); // assuming left preconditioner KSPSetTolerances(subksp, rtol, PETSC_DEFAULT, PETSC_DEFAULT, npre); It seams (but I am not sure how to check it properly) that rtol is completely uninfluential, rather it always uses the fix number of iterations fixed by npre. When I run with the option -ksp_view I see that the norm for the subksp has effectively changed from NONE to PRECONDITIONED for example with rtol=0.1 and npre=10 I get %%%%%%%%%%%%%%%%%%% Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10 tolerances: relative=0.1, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test %%%%%%%%%%%%%%%%%%%%% however when I run the code with -ksp_monitor_true_residual I only see the iteration info of the external ksp GMRES, but I do not see any iteration info relative to the subksps. I assume (but I am not sure) that subksp is behaving as the NONE norm were set. Any idea if I set something wrong? How do I effectively check how many iterations subksp does? thanks, Eugenio Eugenio Aulisa Department of Mathematics and Statistics, Texas Tech University Lubbock TX, 79409-1042 room: 226 http://www.math.ttu.edu/~eaulisa/ phone: (806) 834-6684 fax: (806) 742-1112 From patrick.sanan at gmail.com Thu Sep 1 09:40:28 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Thu, 1 Sep 2016 16:40:28 +0200 Subject: [petsc-users] How to use rtol as stopping criteria for PCMG subksp solver In-Reply-To: References: Message-ID: On Thu, Sep 1, 2016 at 4:02 PM, Aulisa, Eugenio wrote: > Hi, > > I have > > ksp GMRES->preconditioned with PCMG > > and at each level > subksp GMRES -> preconditioned with different PCs > > I would like to control the stopping criteria of each level-subksp > using either the relative tolerance or the npre/npost number of smoothings > > I set at each level > > KSPSetNormType(subksp, KSP_NORM_PRECONDITIONED); // assuming left preconditioner > KSPSetTolerances(subksp, rtol, PETSC_DEFAULT, PETSC_DEFAULT, npre); > > It seams (but I am not sure how to check it properly) that rtol is completely uninfluential, > rather it always uses the fix number of iterations fixed by npre. 
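One way to see directly what a level solver is doing is to pull it out of the PCMG object after the outer solve and query it; a rough sketch (the function name is made up, the outer KSP is assumed to be called ksp, and level 2 is hard-coded as the level of interest):

#include <petscksp.h>

/* Sketch: report how many iterations the level-2 smoother has accumulated
   over all of its applications inside the outer solve. */
PetscErrorCode ReportSmootherIterations(KSP ksp)
{
  PC             pc;
  KSP            smoother;
  PetscInt       its;
  PetscErrorCode ierr;

  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCMGGetSmoother(pc, 2, &smoother);CHKERRQ(ierr);
  ierr = KSPGetTotalIterations(smoother, &its);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "level-2 smoother: %D total iterations\n", its);CHKERRQ(ierr);
  return 0;
}

If separate down/up smoothers were configured, PCMGGetSmootherDown()/PCMGGetSmootherUp() can be queried the same way.
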
> > When I run with the option -ksp_view I see that the norm for the subksp > has effectively changed from NONE to PRECONDITIONED > for example with rtol=0.1 and npre=10 I get > > %%%%%%%%%%%%%%%%%%% > > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10 > tolerances: relative=0.1, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > > %%%%%%%%%%%%%%%%%%%%% > > however when I run the code with -ksp_monitor_true_residual > I only see the iteration info of the external ksp GMRES, > but I do not see any iteration info relative to the subksps. Note that each subksp has its own prefix which you can use to control its behavior. It looks like you set this to "level-2", so (assuming that you can have dashes in options prefixes, which I'm not completely certain of), you should be able to do things like -level-2_ksp_converged_reason -level-2_ksp_monitor_true_residual To see what's going on with the sub solver. > > I assume (but I am not sure) that subksp is behaving > as the NONE norm were set. > > Any idea if I set something wrong? > How do I effectively check how many iterations subksp does? > > thanks, > Eugenio > > > > > > > > > > Eugenio Aulisa > > Department of Mathematics and Statistics, > Texas Tech University > Lubbock TX, 79409-1042 > room: 226 > http://www.math.ttu.edu/~eaulisa/ > phone: (806) 834-6684 > fax: (806) 742-1112 > > > From mvalera at mail.sdsu.edu Thu Sep 1 17:42:06 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Thu, 1 Sep 2016 15:42:06 -0700 Subject: [petsc-users] Sorted CSR Matrix and Multigrid PC. Message-ID: Hello everyone, As my first intervention in this list i want to congratulate the PETSc devs, since ive been working with the library for a couple months and i am very impressed with it. Up until now, ive been developing a module to solve a big laplacian CSR matrix by multigrid pc and GCR ksp, i think i have a basic understanding of whats happening, but i cannot seem to find a good multigrid implementation for my case. I need to use some kind of automatic coarsener/interpolator and so far turning galerkin on does not seem to be working. The part of my code where i try to implement multigrid looks like this: call PCSetType(mg,PCJACOBI,ierr) call PCSetOperators(mg,pmat,pmat,ierr) !!trying to implement interpolators from ex42 nl = 2 call PetscMalloc(nl,da_list,ierr) da_list = PETSC_NULL_OBJECT call PetscMalloc(nl,daclist,ierr) daclist = PETSC_NULL_OBJECT ! ???? not sure what to do here... call PCMGSetLevels(mg,2,PETSC_NULL_OBJECT,ierr) !trying two levels first call PCMGSetType(mg,PC_MG_FULL,ierr) call PCMGSetGalerkin(mg,PETSC_TRUE,ierr) call PCMGSetCycleType(mg,PC_MG_CYCLE_V,ierr) call PCMGSetNumberSmoothDown(mg,1,ierr) call PCMGSetNumberSmoothUp(mg,1,ierr) call PCMGSetInterpolation(mg,1,pmat,ierr) call PCMGSetRestriction(mg,1,pmat,ierr) call PCMGSetX(mg,0,xp,ierr) !Probably and error to set these two call PCMGSetRhs(mg,0,bp,ierr) call PCSetUp(mg,ierr) .... I also have an alternative setup try, less developed as: call PCCreate(PETSC_COMM_WORLD,mg,ierr) !Create preconditioner context call PCSetType(mg,PCGAMG,ierr) call PCGAMGSetType(mg,PCGAMGAGG,ierr) !ERROR: 11 SEGV: Segmentation Violation | IS MY PETSC CORRUPTED? 
gamg.h empty call PCGAMGSetNLevels(mg,3,PETSC_NULL_OBJECT,ierr) call PCGAMGSetNSmooths(mg,1,ierr) call PCGAMGSetThreshold(mg,0.0,ierr) So far i've also read ML/trilinos multigrid solver is probably easier to implement, but i cant seem to configure petsc correctly to download and install it >From the code ive sent, can you spot any glaring errors? im sorry my knowledge of multigrid is very small. .-.-.- Finally, a second question, My CSR Column identifier (JA) array is not sorted for each row, can you give me an idea on how to sort it with PetscSortInt() for each row, as suggested in previous mails from this list? so far ive figured that i can loop over the row chunk using the spacing give by the row pointer array (IA) but i still don't know how to sort the row chunk in JA along with the real matrix values array (App). Any help will be apreciated, thanks so much, Manuel Valera. Computational Science Reseach Center - SDSU -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 1 17:52:56 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 Sep 2016 17:52:56 -0500 Subject: [petsc-users] Sorted CSR Matrix and Multigrid PC. In-Reply-To: References: Message-ID: <836C9DDB-75AC-480C-8000-290C1E6205DC@mcs.anl.gov> > On Sep 1, 2016, at 5:42 PM, Manuel Valera wrote: > > Hello everyone, > > As my first intervention in this list i want to congratulate the PETSc devs, since ive been working with the library for a couple months and i am very impressed with it. > > Up until now, ive been developing a module to solve a big laplacian CSR matrix by multigrid pc and GCR ksp, i think i have a basic understanding of whats happening, but i cannot seem to find a good multigrid implementation for my case. I need to use some kind of automatic coarsener/interpolator and so far turning galerkin on does not seem to be working. > > The part of my code where i try to implement multigrid looks like this: > > > call PCSetType(mg,PCJACOBI,ierr) > call PCSetOperators(mg,pmat,pmat,ierr) > > !!trying to implement interpolators from ex42 > nl = 2 > call PetscMalloc(nl,da_list,ierr) > da_list = PETSC_NULL_OBJECT > call PetscMalloc(nl,daclist,ierr) > daclist = PETSC_NULL_OBJECT > ! ???? not sure what to do here... > > > call PCMGSetLevels(mg,2,PETSC_NULL_OBJECT,ierr) !trying two levels first > call PCMGSetType(mg,PC_MG_FULL,ierr) > call PCMGSetGalerkin(mg,PETSC_TRUE,ierr) > > call PCMGSetCycleType(mg,PC_MG_CYCLE_V,ierr) > call PCMGSetNumberSmoothDown(mg,1,ierr) > call PCMGSetNumberSmoothUp(mg,1,ierr) > > call PCMGSetInterpolation(mg,1,pmat,ierr) > call PCMGSetRestriction(mg,1,pmat,ierr) > call PCMGSetX(mg,0,xp,ierr) !Probably and error to set these two > call PCMGSetRhs(mg,0,bp,ierr) > > call PCSetUp(mg,ierr) > .... Don't do the above. > > I also have an alternative setup try, less developed as: > > call PCCreate(PETSC_COMM_WORLD,mg,ierr) !Create preconditioner context > call PCSetType(mg,PCGAMG,ierr) > call PCGAMGSetType(mg,PCGAMGAGG,ierr) !ERROR: 11 SEGV: Segmentation Violation | IS MY PETSC CORRUPTED? gamg.h empty This should work ok. Send a complete file we can compile and run that crashes like this. 
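For comparison, the more common pattern is to attach GAMG to the PC that the KSP already owns, instead of building a standalone PC object; a minimal C sketch (the function name is made up, and Mat A and Vec b, x are assumed to be assembled elsewhere; this is not taken from the attached code):

#include <petscksp.h>

/* Sketch: Krylov solve preconditioned with GAMG (smoothed aggregation). */
PetscErrorCode SolveWithGAMG(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);            /* use the PC owned by the KSP */
  ierr = PCSetType(pc, PCGAMG);CHKERRQ(ierr);
  ierr = PCGAMGSetType(pc, PCGAMGAGG);CHKERRQ(ierr);  /* smoothed aggregation */
  ierr = PCGAMGSetNSmooths(pc, 1);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);        /* e.g. -ksp_type gcr -ksp_monitor */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  return 0;
}

With KSPSetFromOptions() in the picture, settings such as -pc_gamg_threshold or -ksp_view can still be changed at run time without touching the code.
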
> call PCGAMGSetNLevels(mg,3,PETSC_NULL_OBJECT,ierr) > call PCGAMGSetNSmooths(mg,1,ierr) > call PCGAMGSetThreshold(mg,0.0,ierr) > > > > So far i've also read ML/trilinos multigrid solver is probably easier to implement, but i cant seem to configure petsc correctly to download and install it Use --download-ml with your ./configure command and send the resulting configure.log to petsc-maint at mcs.anl.gov if it does not work. > > From the code ive sent, can you spot any glaring errors? im sorry my knowledge of multigrid is very small. > > .-.-.- > > Finally, a second question, > > My CSR Column identifier (JA) array is not sorted for each row, can you give me an idea on how to sort it with PetscSortInt() for each row, as suggested in previous mails from this list? > > so far ive figured that i can loop over the row chunk using the spacing give by the row pointer array (IA) but i still don't know how to sort the row chunk in JA along with the real matrix values array (App). PetscSortIntWithScalarArray() > > Any help will be apreciated, > > thanks so much, > > Manuel Valera. > Computational Science Reseach Center - SDSU > > > > > From mfadams at lbl.gov Fri Sep 2 12:34:24 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 2 Sep 2016 13:34:24 -0400 Subject: [petsc-users] Sorted CSR Matrix and Multigrid PC. In-Reply-To: <836C9DDB-75AC-480C-8000-290C1E6205DC@mcs.anl.gov> References: <836C9DDB-75AC-480C-8000-290C1E6205DC@mcs.anl.gov> Message-ID: > > > > > > call PCCreate(PETSC_COMM_WORLD,mg,ierr) !Create preconditioner context > > call PCSetType(mg,PCGAMG,ierr) > > call PCGAMGSetType(mg,PCGAMGAGG,ierr) !ERROR: 11 SEGV: Segmentation > Violation | IS MY PETSC CORRUPTED? gamg.h empty > > This should work ok. Send a complete file we can compile and run that > crashes like this. > > I would reconfigure your PETSc from scratch by deleting the whole architecture directory $PETSC_DIR/$PETSC_ARCH, the configure, make. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sospinar at unal.edu.co Fri Sep 2 19:36:43 2016 From: sospinar at unal.edu.co (Santiago Ospina De Los Rios) Date: Fri, 2 Sep 2016 19:36:43 -0500 Subject: [petsc-users] Questions about SAWs Message-ID: Hello there! I'm learning how to use SAWs; three questions: 1. Why do I have this output coming out constantly in my terminal while I'm looking the web page? (only when I use -saws_options) /MyLibs/PETSc/src/sys/objects/aoptions.c:91: __FUNCT__="PetscOptionCreate_Private" does not agree with __func__="PetscOptionItemCreate_Private"-- 2. Do I have to do something special to use -saws_options when I'm running in parallel? At the moment hasn't been able to get it running. (only when I use -saws_options) 3. Where can I find the complete list of options of SAWs? It doesn't appear when one uses -help and I didn't find any user manual or reference relating SAWs with PETSc Thank you, Santiago -- Att: Santiago Ospina De Los R?os National University of Colombia -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Sep 3 16:01:47 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 3 Sep 2016 16:01:47 -0500 Subject: [petsc-users] Questions about SAWs In-Reply-To: References: Message-ID: <46191037-9BC6-4A6B-9C3E-24004770EF6B@mcs.anl.gov> > On Sep 2, 2016, at 7:36 PM, Santiago Ospina De Los Rios wrote: > > Hello there! > > I'm learning how to use SAWs; three questions: > > 1. 
Why do I have this output coming out constantly in my terminal while I'm looking the web page? (only when I use -saws_options) > > /MyLibs/PETSc/src/sys/objects/aoptions.c:91: __FUNCT__="PetscOptionCreate_Private" does not agree with __func__="PetscOptionItemCreate_Private"-- This is our mistake, thanks for letting us know. I have fixed it in the maint and master branches. > > 2. Do I have to do something special to use -saws_options when I'm running in parallel? At the moment hasn't been able to get it running. (only when I use -saws_options) We haven't tested it in parallel. It is tricky because the SAWs web server runs only with MPI process zero. You may need to do some coding to get it working well in parallel. > > 3. Where can I find the complete list of options of SAWs? It doesn't appear when one uses -help and I didn't find any user manual or reference relating SAWs with PETSc You can run the git command git grep PetscStackCallSAWs which will show all the source files that use SAWS and then find the options from there. I think the main two are -snes_monitor saws and -ksp_monitor_saws Barry > > Thank you, > Santiago > -- > Att: > > Santiago Ospina De Los R?os > National University of Colombia From jychang48 at gmail.com Mon Sep 5 03:43:16 2016 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 5 Sep 2016 03:43:16 -0500 Subject: [petsc-users] Coloring of -mat_view draw Message-ID: Hi all, So i used the following command-line options to view the non-zero structure of my assembled matrix: -mat_view draw -draw_pause -1 And I got an image filled with cyan squares and dots. However, if I right-click on the image a couple times, I now get cyan, blue, and red squares. What do the different colors mean? Attached are the images of the two cases: Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: MatNZ1.png Type: image/png Size: 3485 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: MatNZ2.png Type: image/png Size: 4319 bytes Desc: not available URL: From dave.mayhem23 at gmail.com Mon Sep 5 04:31:45 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 5 Sep 2016 11:31:45 +0200 Subject: [petsc-users] Coloring of -mat_view draw In-Reply-To: References: Message-ID: On 5 September 2016 at 10:43, Justin Chang wrote: > Hi all, > > So i used the following command-line options to view the non-zero > structure of my assembled matrix: > > -mat_view draw -draw_pause -1 > > And I got an image filled with cyan squares and dots. However, if I > right-click on the image a couple times, I now get cyan, blue, and red > squares. What do the different colors mean? > Red represents positive numbers. Blue represents negative numbers. I believe cyan represents allocated non-zero entries which were never populated with entries (explicit zeroes). Someone will correct me if I am wrong here regarding cyan... Thanks, Dave > Attached are the images of the two cases: > > Thanks, > Justin > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Sep 5 06:11:56 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 Sep 2016 06:11:56 -0500 Subject: [petsc-users] Coloring of -mat_view draw In-Reply-To: References: Message-ID: On Mon, Sep 5, 2016 at 4:31 AM, Dave May wrote: > On 5 September 2016 at 10:43, Justin Chang wrote: > >> Hi all, >> >> So i used the following command-line options to view the non-zero >> structure of my assembled matrix: >> >> -mat_view draw -draw_pause -1 >> >> And I got an image filled with cyan squares and dots. However, if I >> right-click on the image a couple times, I now get cyan, blue, and red >> squares. What do the different colors mean? >> > > Red represents positive numbers. > Blue represents negative numbers. > > I believe cyan represents allocated non-zero entries which were never > populated with entries (explicit zeroes). Someone will correct me if I am > wrong here regarding cyan... > When DMMatrix() is called, it first preallocates a matrix, then fills it with zeros and assembles. This is the cyan matrix. Then you fillit with nonzeros. This is the second picture with red and blue. There is an option for turning off the zero filling. Matt > Thanks, > Dave > > >> Attached are the images of the two cases: >> >> Thanks, >> Justin >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Mon Sep 5 11:21:11 2016 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 5 Sep 2016 10:21:11 -0600 Subject: [petsc-users] questions on hypre preconditioner Message-ID: Hi Developers, There are two questions on the hypre preconditioner. (1) How to set different relax types on different levels? It looks to use the SAME relax type on all levels except the coarse level which we could set it to a different solver. Especially, could I set the smoother type on the finest level as NONE? (2) How could I know how many levels have been actually created in hypre, and how many unknowns on different levels? The "-pc_view" can not tell me this information: type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.25 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type Falgout HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Fande, -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Mon Sep 5 12:26:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 5 Sep 2016 12:26:14 -0500 Subject: [petsc-users] questions on hypre preconditioner In-Reply-To: References: Message-ID: <1348C765-E4DA-4A31-9EBC-73EEB599B4F0@mcs.anl.gov> > On Sep 5, 2016, at 11:21 AM, Fande Kong wrote: > > Hi Developers, > > There are two questions on the hypre preconditioner. > > (1) How to set different relax types on different levels? It looks to use the SAME relax type on all levels except the coarse level which we could set it to a different solver. Especially, could I set the smoother type on the finest level as NONE? I don't think this is possible through the PETSc interface; it may or may not be possible by adding additional hypre calls. You need to check the hypre documentation. > > (2) How could I know how many levels have been actually created in hypre, and how many unknowns on different levels? The "-pc_view" can not tell me this information: -pc_hypre_boomeramg_print_statistics integer different integers give different amounts of detail, I don't know what the integers mean. > > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type Falgout > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > > > > Fande, From tonioherrmann at gmail.com Mon Sep 5 12:52:02 2016 From: tonioherrmann at gmail.com (Tonio Herrmann) Date: Mon, 5 Sep 2016 19:52:02 +0200 Subject: [petsc-users] beginner's, technical programming manual? Message-ID: Hello, I am new to PETSc, and I am struggling to use it for some numerics problems. The mathematical capabilities are well explained in the manual, in several tutorials and examples. But I am stuck at every tiny step, because I cannot find the required functions for all the very basic technical details, like getting the vertex coordinates of a DMPlex, the face areas and cell volumes (if available through PETSc?). Merging two DMPlexes, that share a common boundary, into one. Extracting a boundary of one DMPlex as a new DMPlex. Etc. Is there any technical introduction that shows how to deal with the data structures on a basic, geometrical and topological level without necessarily discussing the numerics of PDEs and equation systems? Thank you Hermann -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From huyaoyu1986 at gmail.com Tue Sep 6 02:48:27 2016 From: huyaoyu1986 at gmail.com (Yaoyu Hu) Date: Tue, 6 Sep 2016 15:48:27 +0800 Subject: [petsc-users] Problems of Viewing Mat with MATLAB format Message-ID: Hi everyone, I want to view a parallel dense Mat with MATLAB format. I use the following method. ========= Sample codes. ========= /* Set the name of the Mat.*/ PetscObjectSetName() /* View the Mat. */ PetscViewerASCIIOpen() PetscViewerPushFormat() MatView() PetscViewerDestroy() ====== End of sample codes. ======= However, after the Mat has been viewed, the MATLAB object?s name is not the same with that specified by PetscObjectSetName(). The resultant MATLAB file looks like this: ========= Sample viewed file in MATLAB format. ========== %Mat Object:Mat_ElementLastFaceVelocity 8 MPI processes % type: mpidense % Size = 16000 18 Mat_0x1370b20_7 = zeros(16000,18); Mat_0x1370b20_7 = [ ======= End of sample viewed file in MATLAB format. ======== The Mat Object has the right name but the MATLAB object has the name Mat_0xxxxxxx. How can I make the MATLAB object has the same name with the Mat object? I also checkout the output for Vec object. The resultant MATLAB file has the right names for both Vec object and MATLAB object. Thanks! P.S.: I am using Version 3.7.0 HU Yaoyu From jychang48 at gmail.com Tue Sep 6 02:50:02 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 6 Sep 2016 02:50:02 -0500 Subject: [petsc-users] Coloring of -mat_view draw In-Reply-To: References: Message-ID: Thanks all, How do you turn off the zero-filling? Justin On Monday, September 5, 2016, Matthew Knepley wrote: > On Mon, Sep 5, 2016 at 4:31 AM, Dave May > wrote: > >> On 5 September 2016 at 10:43, Justin Chang > > wrote: >> >>> Hi all, >>> >>> So i used the following command-line options to view the non-zero >>> structure of my assembled matrix: >>> >>> -mat_view draw -draw_pause -1 >>> >>> And I got an image filled with cyan squares and dots. However, if I >>> right-click on the image a couple times, I now get cyan, blue, and red >>> squares. What do the different colors mean? >>> >> >> Red represents positive numbers. >> Blue represents negative numbers. >> >> I believe cyan represents allocated non-zero entries which were never >> populated with entries (explicit zeroes). Someone will correct me if I am >> wrong here regarding cyan... >> > > When DMMatrix() is called, it first preallocates a matrix, then fills it > with zeros and assembles. This is the cyan matrix. Then > you fillit with nonzeros. This is the second picture with red and blue. > There is an option for turning off the zero filling. > > Matt > > >> Thanks, >> Dave >> >> >>> Attached are the images of the two cases: >>> >>> Thanks, >>> Justin >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 6 05:21:55 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Sep 2016 05:21:55 -0500 Subject: [petsc-users] Coloring of -mat_view draw In-Reply-To: References: Message-ID: On Tue, Sep 6, 2016 at 2:50 AM, Justin Chang wrote: > Thanks all, > > How do you turn off the zero-filling? 
> -dm_preallocate_only Matt > Justin > > On Monday, September 5, 2016, Matthew Knepley wrote: > >> On Mon, Sep 5, 2016 at 4:31 AM, Dave May wrote: >> >>> On 5 September 2016 at 10:43, Justin Chang wrote: >>> >>>> Hi all, >>>> >>>> So i used the following command-line options to view the non-zero >>>> structure of my assembled matrix: >>>> >>>> -mat_view draw -draw_pause -1 >>>> >>>> And I got an image filled with cyan squares and dots. However, if I >>>> right-click on the image a couple times, I now get cyan, blue, and red >>>> squares. What do the different colors mean? >>>> >>> >>> Red represents positive numbers. >>> Blue represents negative numbers. >>> >>> I believe cyan represents allocated non-zero entries which were never >>> populated with entries (explicit zeroes). Someone will correct me if I am >>> wrong here regarding cyan... >>> >> >> When DMMatrix() is called, it first preallocates a matrix, then fills it >> with zeros and assembles. This is the cyan matrix. Then >> you fillit with nonzeros. This is the second picture with red and blue. >> There is an option for turning off the zero filling. >> >> Matt >> >> >>> Thanks, >>> Dave >>> >>> >>>> Attached are the images of the two cases: >>>> >>>> Thanks, >>>> Justin >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 6 10:56:05 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 6 Sep 2016 10:56:05 -0500 Subject: [petsc-users] Problems of Viewing Mat with MATLAB format In-Reply-To: References: Message-ID: Thanks for reporting this. It is our error. Please find attached the patch that resolves the problem -------------- next part -------------- A non-text attachment was scrubbed... Name: matviewmatlab.patch Type: application/octet-stream Size: 900 bytes Desc: not available URL: -------------- next part -------------- I have also fixed this in the maint and master branches. Barry > On Sep 6, 2016, at 2:48 AM, Yaoyu Hu wrote: > > Hi everyone, > > I want to view a parallel dense Mat with MATLAB format. I use the > following method. > > ========= Sample codes. ========= > > /* Set the name of the Mat.*/ > PetscObjectSetName() > > /* View the Mat. */ > PetscViewerASCIIOpen() > PetscViewerPushFormat() > MatView() > PetscViewerDestroy() > > ====== End of sample codes. ======= > > However, after the Mat has been viewed, the MATLAB object?s name is > not the same with that specified by PetscObjectSetName(). The > resultant MATLAB file looks like this: > > ========= Sample viewed file in MATLAB format. ========== > > %Mat Object:Mat_ElementLastFaceVelocity 8 MPI processes > % type: mpidense > % Size = 16000 18 > Mat_0x1370b20_7 = zeros(16000,18); > Mat_0x1370b20_7 = [ > > ======= End of sample viewed file in MATLAB format. ======== > > The Mat Object has the right name but the MATLAB object has the name > Mat_0xxxxxxx. How can I make the MATLAB object has the same name with > the Mat object? > > I also checkout the output for Vec object. The resultant MATLAB file > has the right names for both Vec object and MATLAB object. > > Thanks! 
> > P.S.: I am using Version 3.7.0 > > HU Yaoyu From huyaoyu1986 at gmail.com Wed Sep 7 01:42:01 2016 From: huyaoyu1986 at gmail.com (Yaoyu Hu) Date: Wed, 7 Sep 2016 14:42:01 +0800 Subject: [petsc-users] Problems of Viewing Mat with MATLAB format Message-ID: Thank you Barry. I put the new codes in the source by hand and it works fine. HU Yaoyu. > > Thanks for reporting this. It is our error. Please find attached the patch that resolves the problem > -------------- next part -------------- > A non-text attachment was scrubbed... > Name: matviewmatlab.patch > Type: application/octet-stream > Size: 900 bytes > Desc: not available > URL: > -------------- next part -------------- > > > I have also fixed this in the maint and master branches. > > Barry > >> On Sep 6, 2016, at 2:48 AM, Yaoyu Hu wrote: >> >> Hi everyone, >> >> I want to view a parallel dense Mat with MATLAB format. I use the >> following method. >> >> ========= Sample codes. ========= >> >> /* Set the name of the Mat.*/ >> PetscObjectSetName() >> >> /* View the Mat. */ >> PetscViewerASCIIOpen() >> PetscViewerPushFormat() >> MatView() >> PetscViewerDestroy() >> >> ====== End of sample codes. ======= >> >> However, after the Mat has been viewed, the MATLAB object?s name is >> not the same with that specified by PetscObjectSetName(). The >> resultant MATLAB file looks like this: >> >> ========= Sample viewed file in MATLAB format. ========== >> >> %Mat Object:Mat_ElementLastFaceVelocity 8 MPI processes >> % type: mpidense >> % Size = 16000 18 >> Mat_0x1370b20_7 = zeros(16000,18); >> Mat_0x1370b20_7 = [ >> >> ======= End of sample viewed file in MATLAB format. ======== >> >> The Mat Object has the right name but the MATLAB object has the name >> Mat_0xxxxxxx. How can I make the MATLAB object has the same name with >> the Mat object? >> >> I also checkout the output for Vec object. The resultant MATLAB file >> has the right names for both Vec object and MATLAB object. >> >> Thanks! >> >> P.S.: I am using Version 3.7.0 >> >> HU Yaoyu > From knepley at gmail.com Wed Sep 7 09:39:55 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Sep 2016 09:39:55 -0500 Subject: [petsc-users] beginner's, technical programming manual? In-Reply-To: References: Message-ID: On Mon, Sep 5, 2016 at 12:52 PM, Tonio Herrmann wrote: > Hello, > I am new to PETSc, and I am struggling to use it for some numerics > problems. The mathematical capabilities are well explained in the manual, > in several tutorials and examples. > > But I am stuck at every tiny step, because I cannot find the required > functions for all the very basic technical details, like getting the vertex > coordinates of a DMPlex, the face areas and cell volumes (if available > through PETSc?). Merging two DMPlexes, that share a common boundary, into > one. Extracting a boundary of one DMPlex as a new DMPlex. Etc. > Hi Hermann, There is currently no introduction of the kind you want. Everything I have written is geared towards solving PDEs because that it what I do with most of my time, and it is what I get the most questions about. 
However, there are some resources: a) Papers http://arxiv.org/abs/0908.4427 http://arxiv.org/abs/1506.07749 http://arxiv.org/abs/1508.02470 http://arxiv.org/abs/1506.06194 b) Manpages For example, you can get the coordinates of a DM using http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMGetCoordinates.html http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMGetCoordinatesLocal.html#DMGetCoordinatesLocal or compute things like volumes or face areas, http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMPlexComputeCellGeometryFVM.html or extracting the boundary http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMPlexCreateSubmesh.html c) Examples There are a lot of examples, like SNES ex12, ex62, ex77 where we use these operations to solve PDEs. There is no operation for merging two Plex objects, but it would not be hard to write, if you marked the common boundary in both using a DMLabel. Plex is intended to be transparent enough for users to write new operations like these. Thanks, Matt Is there any technical introduction that shows how to deal with the data > structures on a basic, geometrical and topological level without > necessarily discussing the numerics of PDEs and equation systems? > > Thank you > Hermann > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonioherrmann at gmail.com Wed Sep 7 11:38:51 2016 From: tonioherrmann at gmail.com (Tonio Herrmann) Date: Wed, 7 Sep 2016 18:38:51 +0200 Subject: [petsc-users] beginner's, technical programming manual? In-Reply-To: References: Message-ID: Thank you, Matt, the source codes of DMPlexComputeCellGeometryFVM and your other links seem to be very helpful. I think they answer most of my questions. I might come back with a few more detailed questions in 1-2 weeks if that is ok with you. Thanks Herrmann On Wed, Sep 7, 2016 at 4:39 PM, Matthew Knepley wrote: > On Mon, Sep 5, 2016 at 12:52 PM, Tonio Herrmann > wrote: > >> Hello, >> I am new to PETSc, and I am struggling to use it for some numerics >> problems. The mathematical capabilities are well explained in the manual, >> in several tutorials and examples. >> >> But I am stuck at every tiny step, because I cannot find the required >> functions for all the very basic technical details, like getting the vertex >> coordinates of a DMPlex, the face areas and cell volumes (if available >> through PETSc?). Merging two DMPlexes, that share a common boundary, into >> one. Extracting a boundary of one DMPlex as a new DMPlex. Etc. >> > > Hi Hermann, > > There is currently no introduction of the kind you want. Everything I have > written is geared towards solving PDEs because that > it what I do with most of my time, and it is what I get the most questions > about. 
However, there are some resources: > > a) Papers > > http://arxiv.org/abs/0908.4427 > http://arxiv.org/abs/1506.07749 > http://arxiv.org/abs/1508.02470 > http://arxiv.org/abs/1506.06194 > > b) Manpages > > For example, you can get the coordinates of a DM using > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/ > DMGetCoordinates.html > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/ > DMGetCoordinatesLocal.html#DMGetCoordinatesLocal > > or compute things like volumes or face areas, > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/ > DMPlexComputeCellGeometryFVM.html > > or extracting the boundary > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/ > DMPlexCreateSubmesh.html > > c) Examples > > There are a lot of examples, like SNES ex12, ex62, ex77 where we use these > operations to solve PDEs. > > There is no operation for merging two Plex objects, but it would not be > hard to write, if you marked the common > boundary in both using a DMLabel. Plex is intended to be transparent > enough for users to write new operations > like these. > > Thanks, > > Matt > > Is there any technical introduction that shows how to deal with the data >> structures on a basic, geometrical and topological level without >> necessarily discussing the numerics of PDEs and equation systems? >> >> Thank you >> Hermann >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 7 12:02:59 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Sep 2016 12:02:59 -0500 Subject: [petsc-users] beginner's, technical programming manual? In-Reply-To: References: Message-ID: On Wed, Sep 7, 2016 at 11:38 AM, Tonio Herrmann wrote: > Thank you, Matt, > the source codes of DMPlexComputeCellGeometryFVM and your other links seem > to be very helpful. I think they answer most of my questions. I might come > back with a few more detailed questions in 1-2 weeks if that is ok with you. > Yep, that is fine. Thanks, Matt > Thanks > Herrmann > > > On Wed, Sep 7, 2016 at 4:39 PM, Matthew Knepley wrote: > >> On Mon, Sep 5, 2016 at 12:52 PM, Tonio Herrmann >> wrote: >> >>> Hello, >>> I am new to PETSc, and I am struggling to use it for some numerics >>> problems. The mathematical capabilities are well explained in the manual, >>> in several tutorials and examples. >>> >>> But I am stuck at every tiny step, because I cannot find the required >>> functions for all the very basic technical details, like getting the vertex >>> coordinates of a DMPlex, the face areas and cell volumes (if available >>> through PETSc?). Merging two DMPlexes, that share a common boundary, into >>> one. Extracting a boundary of one DMPlex as a new DMPlex. Etc. >>> >> >> Hi Hermann, >> >> There is currently no introduction of the kind you want. Everything I >> have written is geared towards solving PDEs because that >> it what I do with most of my time, and it is what I get the most >> questions about. 
However, there are some resources: >> >> a) Papers >> >> http://arxiv.org/abs/0908.4427 >> http://arxiv.org/abs/1506.07749 >> http://arxiv.org/abs/1508.02470 >> http://arxiv.org/abs/1506.06194 >> >> b) Manpages >> >> For example, you can get the coordinates of a DM using >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >> s/DM/DMGetCoordinates.html >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >> s/DM/DMGetCoordinatesLocal.html#DMGetCoordinatesLocal >> >> or compute things like volumes or face areas, >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >> s/DM/DMPlexComputeCellGeometryFVM.html >> >> or extracting the boundary >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >> s/DM/DMPlexCreateSubmesh.html >> >> c) Examples >> >> There are a lot of examples, like SNES ex12, ex62, ex77 where we use >> these operations to solve PDEs. >> >> There is no operation for merging two Plex objects, but it would not be >> hard to write, if you marked the common >> boundary in both using a DMLabel. Plex is intended to be transparent >> enough for users to write new operations >> like these. >> >> Thanks, >> >> Matt >> >> Is there any technical introduction that shows how to deal with the data >>> structures on a basic, geometrical and topological level without >>> necessarily discussing the numerics of PDEs and equation systems? >>> >>> Thank you >>> Hermann >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jshen25 at jhu.edu Wed Sep 7 20:37:02 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Wed, 7 Sep 2016 21:37:02 -0400 Subject: [petsc-users] PETSc parallel scalability Message-ID: Hi, I am trying to test the parallel scalablity of iterative solver (CG with BJacobi preconditioner) in PETSc. Since the iteration number increases with more processors, I calculated the single iteration time by dividing the total KSPSolve time by number of iteration in this test. The linear system I'm solving has 315342 unknowns. Only KSPSolve cost is analyzed. The results show that the parallelism works well with small number of processes (less than 32 in my case), and is almost perfect parallel within first 10 processors. However, the effect of parallelization degrades if I use more processors. The wired thing is that with more than 100 processors, the single iteration cost is slightly increasing. To investigate this issue, I then looked into the composition of KSPSolve time. It seems KSPSolve consists of MatMult, VecTDot(min),VecNorm(min), VecAXPY(max),VecAXPX(max),ApplyPC. Please correct me if I'm wrong. And I found for small number of processors, all these components scale well. However, using more processors(roughly larger than 40), MatMult, VecTDot(min),VecNorm(min) behaves worse, and even increasing after 100 processors, while the other three parts parallel well even for 1000 processors. Since MatMult composed major cost in single iteration, the total single iteration cost increases as well.(See the below figure). My question: 1. Is such situation reasonable? 
Could anyone explain why MatMult scales poor after certain number of processors? I heard some about different algorithms for matrix multiplication. Is that the bottleneck? 2. Is the parallelism dependent of matrix size? If I use larger system size,e.g. million , can the solver scale better? 3. Do you have any idea to improve the parallel performance? Thank you very much. JInlei [image: Inline image 1] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 39477 bytes Desc: not available URL: From mvalera at mail.sdsu.edu Wed Sep 7 20:46:36 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Wed, 7 Sep 2016 18:46:36 -0700 Subject: [petsc-users] Fwd: Sorted CSR Matrix and Multigrid PC. In-Reply-To: References: <836C9DDB-75AC-480C-8000-290C1E6205DC@mcs.anl.gov> Message-ID: ---------- Forwarded message ---------- From: Manuel Valera Date: Wed, Sep 7, 2016 at 6:40 PM Subject: Re: [petsc-users] Sorted CSR Matrix and Multigrid PC. To: Cc: PETSc users list Hello, I was able to sort the data but the PCGAMG does not seem to be working. I reconfigured everything from scratch as suggested and updated to the latest PETSc version, same results, I get the following error: [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [...] [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] PetscStrcmp line 524 /home/valera/sergcemv4/ bitbucket/serucoamv4/petsc-3.7.3/src/sys/utils/str.c [0]PETSC ERROR: [0] PetscFunctionListFind_Private line 352 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/sys/dll/reg.c [0]PETSC ERROR: [0] PCGAMGSetType_GAMG line 1157 /home/valera/sergcemv4/ bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c [0]PETSC ERROR: [0] PCGAMGSetType line 1102 /home/valera/sergcemv4/ bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- .... You can find the working program attached, sorry for the large files and messy makefile, Many thanks, Manuel Valera. ? MGpetscSolver.tar.gz ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Sep 7 21:09:16 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 7 Sep 2016 21:09:16 -0500 Subject: [petsc-users] PETSc parallel scalability In-Reply-To: References: Message-ID: Lots of topics to discuss here... - 315,342 unknowns is a very small problem. The PETSc gurus require at minimum 10,000 unknowns per process for the computation time to outweigh communication time (although 20,000 unknowns or more is preferable). So when using 32 MPI processes and more, you're going to have ~10k unknowns or less so that's one reason why you're going to see less speedup. - Another reason you get poor parallel scalability is that PETSc is limited by the memory-bandwidth. Meaning you have to use the optimal number of cores per compute node or whatever it is you're running on. The PETSc gurus talk about this issue in depth . 
So not only do you need proper MPI process bindings, but it is likely that you will not want to saturate all available cores on a single node (the STREAMS Benchmark can tell you this). In other words, 16 cores spread across 2 nodes is going to outperform 16 cores on 1 node. - If operations like MatMult are not scaling, this is likely due to the memory bandwidth limitations. If operations like VecDot or VecNorm are not scaling, this is likely due to the network latency between compute nodes. - What kind of problem are you solving? CG/BJacobi is a mediocre solver/preconditioner combination, and solver iterations will increase with MPI processes if your tolerances are too lax. You can try using CG with any of the multi-grid preconditioners like GAMG if you have something nice like the poission equation. - The best way to improve parallel performance is to make your code really inefficient and crappy. - And most importantly, always send -log_view if you want people to help identify performance related issues with your application :) Justin On Wed, Sep 7, 2016 at 8:37 PM, Jinlei Shen wrote: > Hi, > > I am trying to test the parallel scalablity of iterative solver (CG with > BJacobi preconditioner) in PETSc. > > Since the iteration number increases with more processors, I calculated > the single iteration time by dividing the total KSPSolve time by number of > iteration in this test. > > The linear system I'm solving has 315342 unknowns. Only KSPSolve cost is > analyzed. > > The results show that the parallelism works well with small number of > processes (less than 32 in my case), and is almost perfect parallel within > first 10 processors. > > However, the effect of parallelization degrades if I use more processors. > The wired thing is that with more than 100 processors, the single iteration > cost is slightly increasing. > > To investigate this issue, I then looked into the composition of KSPSolve > time. > It seems KSPSolve consists of MatMult, VecTDot(min),VecNorm(min),VecAXPY(max),VecAXPX(max),ApplyPC. > Please correct me if I'm wrong. > > And I found for small number of processors, all these components scale > well. > However, using more processors(roughly larger than 40), MatMult, > VecTDot(min),VecNorm(min) behaves worse, and even increasing after 100 > processors, while the other three parts parallel well even for 1000 > processors. > Since MatMult composed major cost in single iteration, the total single > iteration cost increases as well.(See the below figure). > > My question: > 1. Is such situation reasonable? Could anyone explain why MatMult scales > poor after certain number of processors? I heard some about different > algorithms for matrix multiplication. Is that the bottleneck? > > 2. Is the parallelism dependent of matrix size? If I use larger system > size,e.g. million , can the solver scale better? > > 3. Do you have any idea to improve the parallel performance? > > Thank you very much. > > JInlei > > [image: Inline image 1] > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 39477 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Sep 7 21:26:57 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 7 Sep 2016 21:26:57 -0500 Subject: [petsc-users] PETSc parallel scalability In-Reply-To: References: Message-ID: <870D8020-DAF8-4414-8B6D-B40A425382C8@mcs.anl.gov> > On Sep 7, 2016, at 8:37 PM, Jinlei Shen wrote: > > Hi, > > I am trying to test the parallel scalablity of iterative solver (CG with BJacobi preconditioner) in PETSc. > > Since the iteration number increases with more processors, I calculated the single iteration time by dividing the total KSPSolve time by number of iteration in this test. > > The linear system I'm solving has 315342 unknowns. Only KSPSolve cost is analyzed. > > The results show that the parallelism works well with small number of processes (less than 32 in my case), and is almost perfect parallel within first 10 processors. > > However, the effect of parallelization degrades if I use more processors. The wired thing is that with more than 100 processors, the single iteration cost is slightly increasing. > > To investigate this issue, I then looked into the composition of KSPSolve time. > It seems KSPSolve consists of MatMult, VecTDot(min),VecNorm(min),VecAXPY(max),VecAXPX(max),ApplyPC. Please correct me if I'm wrong. > > And I found for small number of processors, all these components scale well. > However, using more processors(roughly larger than 40), MatMult, VecTDot(min),VecNorm(min) behaves worse, and even increasing after 100 processors, while the other three parts parallel well even for 1000 processors. > Since MatMult composed major cost in single iteration, the total single iteration cost increases as well.(See the below figure). > > My question: > 1. Is such situation reasonable? Yes > Could anyone explain why MatMult scales poor after certain number of processors? I heard some about different algorithms for matrix multiplication. Is that the bottleneck? The MatMult inherently requires communication, as the number of processors increases the amount of communication increases while the total work needed remains the same. This is true regardless of the particular details of the algorithms used. Similar computations like norms and inner products require a communication among all processes, as the increase the number of processes the communication time starts to dominate for norms and inner products hence they begin to take a great deal of time for large numbers of processes. > > 2. Is the parallelism dependent of matrix size? If I use larger system size,e.g. million , can the solver scale better? Absolutely. You should look up the concepts of strong scaling and weak scaling. These are important concepts to understand to understand parallel computing. > > 3. Do you have any idea to improve the parallel performance? Worrying about parallel performance should never be a priority, the priorities should be "can I solve the size problems I need to solve in a reasonable amount of time to accomplish whatever science or engineering I am working on". To be able to do this first depends on using good discretization methods for your model (i.e. not just first or second order whenever possible) and am I using efficient algebraic solvers when I need to solve algebraic systems. (As Justin noted bjacobi GMRES is not a particularly efficient algebraic solver). 
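As a rough strong-scaling estimate using the numbers already quoted in this thread (315342 unknowns and the guideline of 10,000-20,000 unknowns per process): 315342/32 is roughly 9,850 unknowns per process, which is already at the bottom of that range, and 315342/100 is roughly 3,150, so beyond a few tens of processes the communication in MatMult, VecTDot, and VecNorm can be expected to outweigh the local per-process work, which matches what you observed.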
Only after you have done all that does improving parallel performance come into play; you know you have an efficient discretization and solver and now you want to make it run faster in parallel. Since each type of simulation is unique you need to work through the process for the problem YOU need to solve, you can't just run it for a model problem and then try to reuse the same discretizations and solvers for your "real" problem. Barry > > Thank you very much. > > JInlei > > > > From bsmith at mcs.anl.gov Wed Sep 7 22:22:08 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 7 Sep 2016 22:22:08 -0500 Subject: [petsc-users] Fwd: Sorted CSR Matrix and Multigrid PC. In-Reply-To: References: <836C9DDB-75AC-480C-8000-290C1E6205DC@mcs.anl.gov> Message-ID: <9F710A22-1A5D-4CA3-8D28-6C7F1EEBCB1D@mcs.anl.gov> Sorry, this was due to our bug, the fortran function for PCGAMGSetType() was wrong. I have fixed this in the maint and master branch of PETSc in the git repository. But you can simply remove the call to PCGAMGSetType() from your code since what you are setting is the default type. BTW: there are other problems with your code after that call that you will have to work through. Barry > On Sep 7, 2016, at 8:46 PM, Manuel Valera wrote: > > > ---------- Forwarded message ---------- > From: Manuel Valera > Date: Wed, Sep 7, 2016 at 6:40 PM > Subject: Re: [petsc-users] Sorted CSR Matrix and Multigrid PC. > To: > Cc: PETSc users list > > > Hello, > > I was able to sort the data but the PCGAMG does not seem to be working. > > I reconfigured everything from scratch as suggested and updated to the latest PETSc version, same results, > > I get the following error: > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [...] > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] PetscStrcmp line 524 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/sys/utils/str.c > [0]PETSC ERROR: [0] PetscFunctionListFind_Private line 352 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/sys/dll/reg.c > [0]PETSC ERROR: [0] PCGAMGSetType_GAMG line 1157 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: [0] PCGAMGSetType line 1102 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > .... > > You can find the working program attached, sorry for the large files and messy makefile, > > Many thanks, > > Manuel Valera. > > > ? > MGpetscSolver.tar.gz > ? > From hsahasra at purdue.edu Thu Sep 8 10:38:29 2016 From: hsahasra at purdue.edu (Harshad Sahasrabudhe) Date: Thu, 8 Sep 2016 11:38:29 -0400 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH Message-ID: Hi, I'm using GAMG + GMRES for my Poisson problem. The solver converges with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which is much higher than what I need (I need a tolerance of at least 1E-8). I am not able to figure out which tolerance I need to set to avoid convergence due to CONVERGED_STEP_LENGTH. Any help is appreciated! 
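For reference, a minimal sketch of how the outer solver and its tolerances would be set up programmatically (this is a hypothetical fragment, not the actual application code: it assumes the matrix A and the vectors b, x are already assembled, that a PetscErrorCode ierr is in scope, and that the GAMG preconditioner is selected at runtime with -pc_type gamg):

KSP ksp;
ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
ierr = KSPSetType(ksp,KSPGMRES);CHKERRQ(ierr);
/* rtol, abstol, dtol, maxits -- the same values that appear in the -ksp_view output below */
ierr = KSPSetTolerances(ksp,1.e-8,1.e-50,1.e4,10000);CHKERRQ(ierr);
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* also picks up -ksp_rtol, -pc_type gamg, etc. */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

As far as I can tell, none of these arguments corresponds to a "step length" criterion, which is why the early CONVERGED_STEP_LENGTH exit is confusing.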
Output of -ksp_view and -ksp_monitor: 0 KSP Residual norm 3.121347818142e+00 1 KSP Residual norm 9.773346857844e-02 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5, needed 1.91048 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=284, cols=284 package used to perform factorization: petsc total: nonzeros=7726, allocated nonzeros=7726 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 133 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=284, cols=284 total: nonzeros=4044, allocated nonzeros=4044 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=284, cols=284 total: nonzeros=4044, allocated nonzeros=4044 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.195339, max = 4.10212 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 linear system matrix = precond matrix: Mat Object: () 1 MPI processes type: seqaij rows=9036, cols=9036 total: nonzeros=192256, allocated nonzeros=192256 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: () 1 MPI processes type: seqaij rows=9036, cols=9036 total: nonzeros=192256, allocated nonzeros=192256 total number of mallocs used during MatSetValues calls =0 not using I-node routines Thanks, Harshad -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tonioherrmann at gmail.com Thu Sep 8 10:43:29 2016 From: tonioherrmann at gmail.com (Tonio Herrmann) Date: Thu, 8 Sep 2016 17:43:29 +0200 Subject: [petsc-users] VecGetValues and VecGetArray return nothing but zeros Message-ID: Hello, during my attempts to learn PETSc, I am trying to understand the relation between global and local vectors on a DMDA. There is a problem in my code below, which only returns zeros when I try to access the local values, although VecView shows that the values are correctly transferred from the global vector g to the local vector l. Is it obvious, what I am doing wrong? Another question, I was expecting that "VecView(l,PETSC_VIEWER_STDOUT_WORLD);" would print l synchronized from each process, but only l from process #0 is shown. How can I see the other parts of l? Thank you Herrmann The example code: #include "petsc.h" int main(int argc,char **argv){ PetscInt sz; int ierr,rank; Vec g,l; DM da; ierr=PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr); MPI_Comm_rank(PETSC_COMM_WORLD,&rank); int nx=2,ny=4; //create DA ierr=DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_ BOUNDARY_GHOSTED,DMDA_STENCIL_BOX,nx,ny,PETSC_DECIDE,PETSC_ DECIDE,1,1,NULL,NULL,&da);CHKERRQ(ierr); //create global vector and fill with numbers 101,102,... ierr=DMCreateGlobalVector(da,&g);CHKERRQ(ierr); ierr=VecGetSize(g,&sz);CHKERRQ(ierr); for (int k=0;k From jed at jedbrown.org Thu Sep 8 11:28:08 2016 From: jed at jedbrown.org (Jed Brown) Date: Thu, 08 Sep 2016 10:28:08 -0600 Subject: [petsc-users] VecGetValues and VecGetArray return nothing but zeros In-Reply-To: References: Message-ID: <87bmzywe3r.fsf@jedbrown.org> Tonio Herrmann writes: > Hello, > during my attempts to learn PETSc, I am trying to understand the relation > between global and local vectors on a DMDA. There is a problem in my code > below, which only returns zeros when I try to access the local values, > although VecView shows that the values are correctly transferred from the > global vector g to the local vector l. > Is it obvious, what I am doing wrong? > > Another question, I was expecting that "VecView(l,PETSC_VIEWER_STDOUT_WORLD);" > would print l synchronized from each process, but only l from process #0 is > shown. How can I see the other parts of l? They're there, you just botched the format string. > PetscSynchronizedPrintf(PETSC_COMM_WORLD,"VecGetValues on rank > %d:\n",rank); > for (int k=0;k ",data[k]); %d is for integers, but you're passing scalars. Use %g or %f, for example. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From bsmith at mcs.anl.gov Thu Sep 8 12:59:20 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 8 Sep 2016 12:59:20 -0500 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: Message-ID: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> This is very odd. CONVERGED_STEP_LENGTH for KSP is very specialized and should never occur with GMRES. Can you run with valgrind to make sure there is no memory corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind Is your code fortran or C? Barry > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe wrote: > > Hi, > > I'm using GAMG + GMRES for my Poisson problem. The solver converges with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which is much higher than what I need (I need a tolerance of at least 1E-8). 
I am not able to figure out which tolerance I need to set to avoid convergence due to CONVERGED_STEP_LENGTH. > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: > > 0 KSP Residual norm 3.121347818142e+00 > 1 KSP Residual norm 9.773346857844e-02 > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: gamg > MG: type is MULTIPLICATIVE, levels=2 cycles=v > Cycles per PCApply=1 > Using Galerkin computed coarse grid matrices > Coarse grid solver -- level ------------------------------- > KSP Object: (mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_coarse_) 1 MPI processes > type: bjacobi > block Jacobi: number of blocks = 1 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: nd > factor fill ratio given 5, needed 1.91048 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=284, cols=284 > package used to perform factorization: petsc > total: nonzeros=7726, allocated nonzeros=7726 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 133 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=284, cols=284 > total: nonzeros=4044, allocated nonzeros=4044 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=284, cols=284 > total: nonzeros=4044, allocated nonzeros=4044 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (mg_levels_1_) 1 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.195339, max = 4.10212 > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (mg_levels_1_) 1 MPI processes > type: sor > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 > linear system matrix = precond matrix: > Mat Object: () 1 MPI processes > type: seqaij > rows=9036, cols=9036 > total: nonzeros=192256, allocated nonzeros=192256 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Mat Object: () 1 MPI processes > type: seqaij > 
rows=9036, cols=9036 > total: nonzeros=192256, allocated nonzeros=192256 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > Thanks, > Harshad From bsmith at mcs.anl.gov Thu Sep 8 13:30:41 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 8 Sep 2016 13:30:41 -0500 Subject: [petsc-users] VecGetValues and VecGetArray return nothing but zeros In-Reply-To: References: Message-ID: <33BA16E2-CF35-423C-9371-99799CEC733D@mcs.anl.gov> > On Sep 8, 2016, at 10:43 AM, Tonio Herrmann wrote: > > Hello, > during my attempts to learn PETSc, I am trying to understand the relation between global and local vectors on a DMDA. There is a problem in my code below, which only returns zeros when I try to access the local values, although VecView shows that the values are correctly transferred from the global vector g to the local vector l. > Is it obvious, what I am doing wrong? > > Another question, I was expecting that "VecView(l,PETSC_VIEWER_STDOUT_WORLD);" would print l synchronized from each process, but only l from process #0 is shown. This will not work; when you do a view the MPI communicator of the object must be the same as the MPI communicator in the viewer. I don't really recommend ever looking at all the subdomains local vectors but if you really want to you can use PetscViewerGetSubViewer() to get a sub viewer for each MPI process one at a time and call the VecView() one at at time with each viewer and local vector. Barry > How can I see the other parts of l? > > Thank you > Herrmann > > The example code: > > #include "petsc.h" > > int main(int argc,char **argv){ > PetscInt sz; > int ierr,rank; > Vec g,l; > DM da; > > ierr=PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr); > MPI_Comm_rank(PETSC_COMM_WORLD,&rank); > int nx=2,ny=4; > //create DA > ierr=DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED,DMDA_STENCIL_BOX,nx,ny,PETSC_DECIDE,PETSC_DECIDE,1,1,NULL,NULL,&da);CHKERRQ(ierr); > //create global vector and fill with numbers 101,102,... > ierr=DMCreateGlobalVector(da,&g);CHKERRQ(ierr); > ierr=VecGetSize(g,&sz);CHKERRQ(ierr); > for (int k=0;k VecSetValue(g,k,100+k,INSERT_VALUES); > } > VecAssemblyBegin(g); > VecAssemblyEnd(g); > //create local vector and transfer global values into local > ierr=DMCreateLocalVector(da,&l);CHKERRQ(ierr); > ierr=VecGetSize(l,&sz);CHKERRQ(ierr); > DMGlobalToLocalBegin(da,g,INSERT_VALUES,l); > DMGlobalToLocalEnd(da,g,INSERT_VALUES,l); > VecAssemblyBegin(l); > VecAssemblyEnd(l); > > //view local vector, ok > VecView(l,PETSC_VIEWER_STDOUT_WORLD); > > //access local values by VecGetValues, fails (only zeros!) > double *data=(double*)malloc(100*sizeof(double)); > PetscInt ix[100]; > ierr=VecGetSize(l,&sz);CHKERRQ(ierr); > for (int k=0;k ierr=VecGetValues(l,sz,ix,data);CHKERRQ(ierr); > > PetscSynchronizedPrintf(PETSC_COMM_WORLD,"VecGetValues on rank %d:\n",rank); > for (int k=0;k PetscSynchronizedPrintf(PETSC_COMM_WORLD,"\n"); > > //access local values by VecGetArrays, fails (only zeros!) 
> ierr=VecGetArray(l,&data);CHKERRQ(ierr); > PetscSynchronizedPrintf(PETSC_COMM_WORLD,"VecGetArray on rank %d:\n",rank); > for (int k=0;k PetscSynchronizedPrintf(PETSC_COMM_WORLD,"%d ",data[k]); > } > PetscSynchronizedPrintf(PETSC_COMM_WORLD,"\n"); > PetscSynchronizedFlush(PETSC_COMM_WORLD,NULL); > VecRestoreArray(l,&data);CHKERRQ(ierr); > > ierr = PetscFinalize();CHKERRQ(ierr); > return ierr; > } From tonioherrmann at gmail.com Thu Sep 8 14:54:24 2016 From: tonioherrmann at gmail.com (Tonio Herrmann) Date: Thu, 8 Sep 2016 21:54:24 +0200 Subject: [petsc-users] VecGetValues and VecGetArray return nothing but zeros In-Reply-To: <87bmzywe3r.fsf@jedbrown.org> References: <87bmzywe3r.fsf@jedbrown.org> Message-ID: What a stupid mistake of mine :( Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Thu Sep 8 17:30:35 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Fri, 9 Sep 2016 00:30:35 +0200 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: Message-ID: Sorry I slept quite a while in this thread. Now I start to look at it again. In the last try, the previous setting doesn't work either (in fact diverge). So I would speculate if the Schur complement in my case is actually not invertible. It's also possible that the code is wrong somewhere. However, before looking at that, I want to understand thoroughly the settings for Schur complement I experimented ex42 with the settings: mpirun -np 1 ex42 \ -stokes_ksp_monitor \ -stokes_ksp_type fgmres \ -stokes_pc_type fieldsplit \ -stokes_pc_fieldsplit_type schur \ -stokes_pc_fieldsplit_schur_fact_type full \ -stokes_pc_fieldsplit_schur_precondition selfp \ -stokes_fieldsplit_u_ksp_type preonly \ -stokes_fieldsplit_u_pc_type lu \ -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ -stokes_fieldsplit_p_ksp_type gmres \ -stokes_fieldsplit_p_ksp_monitor_true_residual \ -stokes_fieldsplit_p_ksp_max_it 300 \ -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ -stokes_fieldsplit_p_ksp_gmres_restart 300 \ -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ -stokes_fieldsplit_p_pc_type lu \ -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps In my understanding, the solver should converge in 1 (outer) step. Execution gives: Residual norms for stokes_ solve. 0 KSP Residual norm 1.327791371202e-02 Residual norms for stokes_fieldsplit_p_ solve. 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan 1 KSP Residual norm 7.656238881621e-04 Residual norms for stokes_fieldsplit_p_ solve. 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 2 KSP Residual norm 9.895645456398e-06 Residual norms for stokes_fieldsplit_p_ solve. 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 3 KSP Residual norm 1.608206702571e-06 Residual norms for stokes_fieldsplit_p_ solve. 
0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 4 KSP Residual norm 2.453618999882e-07 Residual norms for stokes_fieldsplit_p_ solve. 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 5 KSP Residual norm 4.230399264750e-08 Looks like the "selfp" does construct the Schur nicely. But does "full" really construct the full block preconditioner? Giang P/S: I'm also generating a smaller size of the previous problem for checking again. On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley wrote: > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui > wrote: > >> >> >>> It could be taking time in the MatMatMult() here if that matrix is >>> dense. Is there any reason to >>> believe that is a good preconditioner for your problem? >>> >> >> This is the first approach to the problem, so I chose the most simple >> setting. Do you have any other recommendation? >> > > This is in no way the simplest PC. We need to make it simpler first. > > 1) Run on only 1 proc > > 2) Use -pc_fieldsplit_schur_fact_type full > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_tru > e_residual > > This should converge in 1 outer iteration, but we will see how good your > Schur complement preconditioner > is for this problem. > > You need to start out from something you understand and then start making > approximations. > > Matt > > >> For any solver question, please send us the output of >>> >>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>> >>> >> I sent here the full output (after changed to fgmres), again it takes >> long at the first iteration but after that, it does not converge >> >> -ksp_type fgmres >> -ksp_max_it 300 >> -ksp_gmres_restart 300 >> -ksp_gmres_modifiedgramschmidt >> -pc_fieldsplit_type schur >> -pc_fieldsplit_schur_fact_type diag >> -pc_fieldsplit_schur_precondition selfp >> -pc_fieldsplit_detect_saddle_point >> -fieldsplit_u_ksp_type preonly >> -fieldsplit_u_pc_type lu >> -fieldsplit_u_pc_factor_mat_solver_package mumps >> -fieldsplit_lu_ksp_type preonly >> -fieldsplit_lu_pc_type lu >> -fieldsplit_lu_pc_factor_mat_solver_package mumps >> >> 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm >> 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm >> 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 >> 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm >> 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 >> 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm >> 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 >> 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm >> 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 >> 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm >> 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 >> 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm >> 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 >> 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm >> 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 >> 8 KSP unpreconditioned resid norm 1.892906351597e+02 true 
resid norm >> 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 >> 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm >> 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 >> 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm >> 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 >> 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm >> 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 >> 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm >> 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 >> 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm >> 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 >> 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm >> 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 >> 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm >> 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 >> 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm >> 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 >> 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm >> 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 >> 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm >> 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 >> 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm >> 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 >> 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm >> 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 >> 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm >> 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 >> 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm >> 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 >> 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm >> 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 >> 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm >> 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 >> 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm >> 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 >> 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm >> 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 >> 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm >> 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 >> 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm >> 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 >> 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm >> 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 >> 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm >> 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 >> 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm >> 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 >> 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm >> 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 >> 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm >> 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 >> 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm >> 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 >> 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm >> 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 >> 36 
KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm >> 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 >> 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm >> 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 >> 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm >> 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 >> 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm >> 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 >> 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm >> 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 >> 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm >> 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 >> 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm >> 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 >> 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm >> 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 >> 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm >> 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 >> 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm >> 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 >> 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm >> 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 >> 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm >> 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 >> 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm >> 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 >> 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm >> 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 >> 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm >> 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 >> 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm >> 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 >> 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm >> 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 >> 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm >> 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 >> 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm >> 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 >> 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm >> 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 >> 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm >> 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 >> 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm >> 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 >> 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm >> 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 >> 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm >> 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 >> 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm >> 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 >> 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm >> 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 >> 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm >> 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 >> 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm >> 
1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 >> 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm >> 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 >> 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm >> 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 >> 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm >> 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 >> 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm >> 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 >> 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm >> 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 >> 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm >> 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 >> 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm >> 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 >> 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm >> 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 >> 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm >> 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 >> 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm >> 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 >> 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm >> 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 >> 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm >> 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 >> 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm >> 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 >> 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm >> 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 >> 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm >> 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 >> 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm >> 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 >> 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm >> 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 >> 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm >> 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 >> 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm >> 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 >> 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm >> 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 >> 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm >> 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 >> 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm >> 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 >> 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm >> 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 >> 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm >> 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 >> 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm >> 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 >> 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm >> 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 >> 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm >> 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 >> 91 KSP 
unpreconditioned resid norm 1.469363056078e+02 true resid norm >> 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 >> 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm >> 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 >> 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm >> 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 >> 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm >> 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 >> 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm >> 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 >> 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm >> 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 >> 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm >> 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 >> 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm >> 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 >> 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm >> 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 >> 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm >> 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 >> 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm >> 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 >> 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm >> 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 >> 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm >> 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 >> 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm >> 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 >> 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm >> 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 >> 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm >> 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 >> 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm >> 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 >> 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm >> 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 >> 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm >> 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 >> 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm >> 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 >> 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm >> 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 >> 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm >> 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 >> 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm >> 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 >> 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm >> 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 >> 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm >> 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 >> 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm >> 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 >> 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm >> 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 >> 118 KSP unpreconditioned resid norm 1.464192893188e+02 true 
resid norm >> 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 >> 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm >> 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 >> 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm >> 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 >> 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm >> 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 >> 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm >> 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 >> 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm >> 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 >> 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm >> 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 >> 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm >> 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 >> 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm >> 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 >> 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm >> 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 >> 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm >> 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 >> 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm >> 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 >> 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm >> 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 >> 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm >> 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 >> 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm >> 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 >> 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm >> 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 >> 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm >> 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 >> 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm >> 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 >> 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm >> 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 >> 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm >> 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 >> 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm >> 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 >> 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm >> 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 >> 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm >> 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 >> 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm >> 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 >> 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm >> 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 >> 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm >> 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 >> 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm >> 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 >> 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm >> 1.471733741230e+02 
||r(i)||/||b|| 4.844779402032e-05 >> 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm >> 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 >> 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm >> 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 >> 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm >> 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 >> 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm >> 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 >> 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm >> 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 >> 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm >> 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 >> 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm >> 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 >> 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm >> 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 >> 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm >> 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 >> 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm >> 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 >> 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm >> 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 >> 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm >> 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 >> 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm >> 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 >> 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm >> 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 >> 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm >> 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 >> 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm >> 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 >> 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm >> 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 >> 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm >> 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 >> 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm >> 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 >> 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm >> 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 >> 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm >> 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 >> 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm >> 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 >> 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm >> 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 >> 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm >> 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 >> 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm >> 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 >> 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm >> 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 >> 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm >> 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 >> 173 KSP 
unpreconditioned resid norm 1.400618325859e+02 true resid norm >> 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 >> 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm >> 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 >> 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm >> 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 >> 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm >> 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 >> 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm >> 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 >> 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm >> 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 >> 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm >> 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 >> 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm >> 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 >> 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm >> 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 >> 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm >> 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 >> 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm >> 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 >> 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm >> 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 >> 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm >> 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 >> 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm >> 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 >> 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm >> 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 >> 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm >> 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 >> 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm >> 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 >> 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm >> 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 >> 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm >> 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 >> 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm >> 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 >> 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm >> 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 >> 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm >> 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 >> 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm >> 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 >> 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm >> 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 >> 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm >> 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 >> 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm >> 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 >> 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm >> 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 >> 200 KSP unpreconditioned resid norm 
1.265165992897e+02 true resid norm >> 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 >> 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm >> 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 >> 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm >> 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 >> 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm >> 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 >> 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm >> 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 >> 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm >> 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 >> 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm >> 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 >> 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm >> 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 >> 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm >> 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 >> 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm >> 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 >> 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm >> 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 >> 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm >> 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 >> 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm >> 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 >> 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm >> 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 >> 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm >> 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 >> 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm >> 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 >> 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm >> 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 >> 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm >> 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 >> 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm >> 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 >> 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm >> 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 >> 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm >> 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 >> 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm >> 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 >> 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm >> 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 >> 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm >> 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 >> 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm >> 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 >> 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm >> 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 >> 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm >> 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 >> 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm >> 
7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 >> 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm >> 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 >> 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm >> 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 >> 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm >> 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 >> 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm >> 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 >> 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm >> 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 >> 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm >> 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 >> 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm >> 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 >> 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm >> 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 >> 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm >> 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 >> 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm >> 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 >> 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm >> 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 >> 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm >> 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 >> 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm >> 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 >> 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm >> 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 >> 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm >> 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 >> 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm >> 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 >> 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm >> 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 >> 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm >> 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 >> 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm >> 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 >> 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm >> 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 >> 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm >> 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 >> 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm >> 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 >> 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm >> 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 >> 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm >> 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 >> 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm >> 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 >> 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm >> 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 >> 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm >> 7.801932499175e+01 ||r(i)||/||b|| 
2.568307079543e-05 >> 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm >> 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 >> 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm >> 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 >> 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm >> 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 >> 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm >> 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 >> 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm >> 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 >> 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm >> 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 >> 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm >> 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 >> 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm >> 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 >> 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm >> 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 >> 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm >> 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 >> 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm >> 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 >> 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm >> 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 >> 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm >> 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 >> 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm >> 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 >> 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm >> 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 >> 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm >> 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 >> 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm >> 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 >> 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm >> 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 >> 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm >> 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 >> 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm >> 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 >> 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm >> 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 >> 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm >> 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 >> 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm >> 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 >> 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm >> 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 >> 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm >> 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 >> 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm >> 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 >> 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm >> 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 >> 282 KSP 
unpreconditioned resid norm 1.099417185996e+00 true resid norm >> 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 >> 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm >> 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 >> 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm >> 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 >> 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm >> 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 >> 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm >> 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 >> 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm >> 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 >> 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm >> 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 >> 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm >> 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 >> 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm >> 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 >> 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm >> 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 >> 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm >> 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 >> 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm >> 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 >> 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm >> 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 >> 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm >> 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 >> 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm >> 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 >> 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm >> 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 >> 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm >> 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 >> 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm >> 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 >> 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm >> 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 >> Linear solve did not converge due to DIVERGED_ITS iterations 300 >> KSP Object: 2 MPI processes >> type: fgmres >> GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=300, initial guess is zero >> tolerances: relative=1e-09, absolute=1e-20, divergence=10000 >> right preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 2 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization DIAG >> Preconditioner for the Schur complement formed from Sp, an assembled >> approximation to S, which uses (lumped, if requested) A00's diagonal's >> inverse >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (fieldsplit_u_) 2 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_u_) 2 MPI processes >> 
type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 0, needed 0 >> Factored matrix follows: >> Mat Object: 2 MPI processes >> type: mpiaij >> rows=184326, cols=184326 >> package used to perform factorization: mumps >> total: nonzeros=4.03041e+08, allocated >> nonzeros=4.03041e+08 >> total number of mallocs used during MatSetValues calls =0 >> MUMPS run parameters: >> SYM (matrix type): 0 >> PAR (host participation): 1 >> ICNTL(1) (output for error): 6 >> ICNTL(2) (output of diagnostic msg): 0 >> ICNTL(3) (output for global info): 0 >> ICNTL(4) (level of printing): 0 >> ICNTL(5) (input mat struct): 0 >> ICNTL(6) (matrix prescaling): 7 >> ICNTL(7) (sequentia matrix ordering):7 >> ICNTL(8) (scalling strategy): 77 >> ICNTL(10) (max num of refinements): 0 >> ICNTL(11) (error analysis): 0 >> ICNTL(12) (efficiency control): >> 1 >> ICNTL(13) (efficiency control): >> 0 >> ICNTL(14) (percentage of estimated workspace >> increase): 20 >> ICNTL(18) (input mat struct): >> 3 >> ICNTL(19) (Shur complement info): >> 0 >> ICNTL(20) (rhs sparse pattern): >> 0 >> ICNTL(21) (solution struct): >> 1 >> ICNTL(22) (in-core/out-of-core facility): >> 0 >> ICNTL(23) (max size of memory can be allocated >> locally):0 >> ICNTL(24) (detection of null pivot rows): >> 0 >> ICNTL(25) (computation of a null space basis): >> 0 >> ICNTL(26) (Schur options for rhs or solution): >> 0 >> ICNTL(27) (experimental parameter): >> -24 >> ICNTL(28) (use parallel or sequential ordering): >> 1 >> ICNTL(29) (parallel ordering): >> 0 >> ICNTL(30) (user-specified set of entries in inv(A)): >> 0 >> ICNTL(31) (factors is discarded in the solve phase): >> 0 >> ICNTL(33) (compute determinant): >> 0 >> CNTL(1) (relative pivoting threshold): 0.01 >> CNTL(2) (stopping criterion of refinement): >> 1.49012e-08 >> CNTL(3) (absolute pivoting threshold): 0 >> CNTL(4) (value of static pivoting): -1 >> CNTL(5) (fixation for null pivots): 0 >> RINFO(1) (local estimated flops for the elimination >> after analysis): >> [0] 5.59214e+11 >> [1] 5.35237e+11 >> RINFO(2) (local estimated flops for the assembly >> after factorization): >> [0] 4.2839e+08 >> [1] 3.799e+08 >> RINFO(3) (local estimated flops for the elimination >> after factorization): >> [0] 5.59214e+11 >> [1] 5.35237e+11 >> INFO(15) (estimated size of (in MB) MUMPS internal >> data for running numerical factorization): >> [0] 2621 >> [1] 2649 >> INFO(16) (size of (in MB) MUMPS internal data used >> during numerical factorization): >> [0] 2621 >> [1] 2649 >> INFO(23) (num of pivots eliminated on this processor >> after factorization): >> [0] 90423 >> [1] 93903 >> RINFOG(1) (global estimated flops for the elimination >> after analysis): 1.09445e+12 >> RINFOG(2) (global estimated flops for the assembly >> after factorization): 8.0829e+08 >> RINFOG(3) (global estimated flops for the elimination >> after factorization): 1.09445e+12 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >> (0,0)*(2^0) >> INFOG(3) (estimated real workspace for factors on all >> processors after analysis): 403041366 >> INFOG(4) (estimated integer workspace for factors on >> all processors after analysis): 2265748 >> INFOG(5) (estimated maximum front size in the >> complete tree): 6663 >> INFOG(6) (number of nodes in the complete tree): 2812 >> INFOG(7) (ordering option effectively use after >> analysis): 5 >> INFOG(8) (structural symmetry in percent of the >> permuted matrix after analysis): 100 >> INFOG(9) (total real/complex workspace 
to store the >> matrix factors after factorization): 403041366 >> INFOG(10) (total integer space store the matrix >> factors after factorization): 2265766 >> INFOG(11) (order of largest frontal matrix after >> factorization): 6663 >> INFOG(12) (number of off-diagonal pivots): 0 >> INFOG(13) (number of delayed pivots after >> factorization): 0 >> INFOG(14) (number of memory compress after >> factorization): 0 >> INFOG(15) (number of steps of iterative refinement >> after solution): 0 >> INFOG(16) (estimated size (in MB) of all MUMPS >> internal data for factorization after analysis: value on the most memory >> consuming processor): 2649 >> INFOG(17) (estimated size of all MUMPS internal data >> for factorization after analysis: sum over all processors): 5270 >> INFOG(18) (size of all MUMPS internal data allocated >> during factorization: value on the most memory consuming processor): 2649 >> INFOG(19) (size of all MUMPS internal data allocated >> during factorization: sum over all processors): 5270 >> INFOG(20) (estimated number of entries in the >> factors): 403041366 >> INFOG(21) (size in MB of memory effectively used >> during factorization - value on the most memory consuming processor): 2121 >> INFOG(22) (size in MB of memory effectively used >> during factorization - sum over all processors): 4174 >> INFOG(23) (after analysis: value of ICNTL(6) >> effectively used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) >> effectively used): 1 >> INFOG(25) (after factorization: number of pivots >> modified by static pivoting): 0 >> INFOG(28) (after factorization: number of null pivots >> encountered): 0 >> INFOG(29) (after factorization: effective number of >> entries in the factors (sum over all processors)): 403041366 >> INFOG(30, 31) (after solution: size in Mbytes of >> memory used during solution phase): 2467, 4922 >> INFOG(32) (after analysis: type of analysis done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if determinant >> is requested): 0 >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_u_) 2 MPI processes >> type: mpiaij >> rows=184326, cols=184326, bs=3 >> total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 26829 nodes, >> limit used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_lu_) 2 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_lu_) 2 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 0, needed 0 >> Factored matrix follows: >> Mat Object: 2 MPI processes >> type: mpiaij >> rows=2583, cols=2583 >> package used to perform factorization: mumps >> total: nonzeros=2.17621e+06, allocated >> nonzeros=2.17621e+06 >> total number of mallocs used during MatSetValues calls =0 >> MUMPS run parameters: >> SYM (matrix type): 0 >> PAR (host participation): 1 >> ICNTL(1) (output for error): 6 >> ICNTL(2) (output of diagnostic msg): 0 >> ICNTL(3) (output for global info): 0 >> ICNTL(4) (level of printing): 0 >> ICNTL(5) (input mat struct): 0 >> ICNTL(6) (matrix prescaling): 7 >> ICNTL(7) (sequentia matrix ordering):7 >> ICNTL(8) (scalling strategy): 77 >> ICNTL(10) (max num of refinements): 
0 >> ICNTL(11) (error analysis): 0 >> ICNTL(12) (efficiency control): >> 1 >> ICNTL(13) (efficiency control): >> 0 >> ICNTL(14) (percentage of estimated workspace >> increase): 20 >> ICNTL(18) (input mat struct): >> 3 >> ICNTL(19) (Shur complement info): >> 0 >> ICNTL(20) (rhs sparse pattern): >> 0 >> ICNTL(21) (solution struct): >> 1 >> ICNTL(22) (in-core/out-of-core facility): >> 0 >> ICNTL(23) (max size of memory can be allocated >> locally):0 >> ICNTL(24) (detection of null pivot rows): >> 0 >> ICNTL(25) (computation of a null space basis): >> 0 >> ICNTL(26) (Schur options for rhs or solution): >> 0 >> ICNTL(27) (experimental parameter): >> -24 >> ICNTL(28) (use parallel or sequential ordering): >> 1 >> ICNTL(29) (parallel ordering): >> 0 >> ICNTL(30) (user-specified set of entries in inv(A)): >> 0 >> ICNTL(31) (factors is discarded in the solve phase): >> 0 >> ICNTL(33) (compute determinant): >> 0 >> CNTL(1) (relative pivoting threshold): 0.01 >> CNTL(2) (stopping criterion of refinement): >> 1.49012e-08 >> CNTL(3) (absolute pivoting threshold): 0 >> CNTL(4) (value of static pivoting): -1 >> CNTL(5) (fixation for null pivots): 0 >> RINFO(1) (local estimated flops for the elimination >> after analysis): >> [0] 5.12794e+08 >> [1] 5.02142e+08 >> RINFO(2) (local estimated flops for the assembly >> after factorization): >> [0] 815031 >> [1] 745263 >> RINFO(3) (local estimated flops for the elimination >> after factorization): >> [0] 5.12794e+08 >> [1] 5.02142e+08 >> INFO(15) (estimated size of (in MB) MUMPS internal >> data for running numerical factorization): >> [0] 34 >> [1] 34 >> INFO(16) (size of (in MB) MUMPS internal data used >> during numerical factorization): >> [0] 34 >> [1] 34 >> INFO(23) (num of pivots eliminated on this processor >> after factorization): >> [0] 1158 >> [1] 1425 >> RINFOG(1) (global estimated flops for the elimination >> after analysis): 1.01494e+09 >> RINFOG(2) (global estimated flops for the assembly >> after factorization): 1.56029e+06 >> RINFOG(3) (global estimated flops for the elimination >> after factorization): 1.01494e+09 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >> (0,0)*(2^0) >> INFOG(3) (estimated real workspace for factors on all >> processors after analysis): 2176209 >> INFOG(4) (estimated integer workspace for factors on >> all processors after analysis): 14427 >> INFOG(5) (estimated maximum front size in the >> complete tree): 699 >> INFOG(6) (number of nodes in the complete tree): 15 >> INFOG(7) (ordering option effectively use after >> analysis): 2 >> INFOG(8) (structural symmetry in percent of the >> permuted matrix after analysis): 100 >> INFOG(9) (total real/complex workspace to store the >> matrix factors after factorization): 2176209 >> INFOG(10) (total integer space store the matrix >> factors after factorization): 14427 >> INFOG(11) (order of largest frontal matrix after >> factorization): 699 >> INFOG(12) (number of off-diagonal pivots): 0 >> INFOG(13) (number of delayed pivots after >> factorization): 0 >> INFOG(14) (number of memory compress after >> factorization): 0 >> INFOG(15) (number of steps of iterative refinement >> after solution): 0 >> INFOG(16) (estimated size (in MB) of all MUMPS >> internal data for factorization after analysis: value on the most memory >> consuming processor): 34 >> INFOG(17) (estimated size of all MUMPS internal data >> for factorization after analysis: sum over all processors): 68 >> INFOG(18) (size of all MUMPS internal data allocated >> during factorization: value on the most memory 
consuming processor): 34 >> INFOG(19) (size of all MUMPS internal data allocated >> during factorization: sum over all processors): 68 >> INFOG(20) (estimated number of entries in the >> factors): 2176209 >> INFOG(21) (size in MB of memory effectively used >> during factorization - value on the most memory consuming processor): 30 >> INFOG(22) (size in MB of memory effectively used >> during factorization - sum over all processors): 59 >> INFOG(23) (after analysis: value of ICNTL(6) >> effectively used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) >> effectively used): 1 >> INFOG(25) (after factorization: number of pivots >> modified by static pivoting): 0 >> INFOG(28) (after factorization: number of null pivots >> encountered): 0 >> INFOG(29) (after factorization: effective number of >> entries in the factors (sum over all processors)): 2176209 >> INFOG(30, 31) (after solution: size in Mbytes of >> memory used during solution phase): 16, 32 >> INFOG(32) (after analysis: type of analysis done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if determinant >> is requested): 0 >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_lu_) 2 MPI processes >> type: schurcomplement >> rows=2583, cols=2583 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (fieldsplit_lu_) 2 >> MPI processes >> type: mpiaij >> rows=2583, cols=2583, bs=3 >> total: nonzeros=117369, allocated nonzeros=117369 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> A10 >> Mat Object: 2 MPI processes >> type: mpiaij >> rows=2583, cols=184326, rbs=3, cbs = 1 >> total: nonzeros=292770, allocated nonzeros=292770 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> KSP of A00 >> KSP Object: (fieldsplit_u_) 2 >> MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_u_) 2 MPI >> processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 0, needed 0 >> Factored matrix follows: >> Mat Object: 2 MPI processes >> type: mpiaij >> rows=184326, cols=184326 >> package used to perform factorization: mumps >> total: nonzeros=4.03041e+08, allocated >> nonzeros=4.03041e+08 >> total number of mallocs used during MatSetValues >> calls =0 >> MUMPS run parameters: >> SYM (matrix type): 0 >> PAR (host participation): 1 >> ICNTL(1) (output for error): 6 >> ICNTL(2) (output of diagnostic msg): 0 >> ICNTL(3) (output for global info): 0 >> ICNTL(4) (level of printing): 0 >> ICNTL(5) (input mat struct): 0 >> ICNTL(6) (matrix prescaling): 7 >> ICNTL(7) (sequentia matrix ordering):7 >> ICNTL(8) (scalling strategy): 77 >> ICNTL(10) (max num of refinements): 0 >> ICNTL(11) (error analysis): 0 >> ICNTL(12) (efficiency control): >> 1 >> ICNTL(13) (efficiency control): >> 0 >> ICNTL(14) (percentage of estimated workspace >> increase): 20 >> ICNTL(18) (input mat struct): >> 3 >> ICNTL(19) (Shur complement info): >> 0 >> ICNTL(20) (rhs sparse pattern): >> 0 >> ICNTL(21) (solution struct): >> 1 >> ICNTL(22) (in-core/out-of-core facility): >> 0 >> ICNTL(23) (max size of memory can be >> allocated locally):0 >> ICNTL(24) (detection of null pivot rows): >> 0 >> ICNTL(25) (computation of a 
null space >> basis): 0 >> ICNTL(26) (Schur options for rhs or >> solution): 0 >> ICNTL(27) (experimental parameter): >> -24 >> ICNTL(28) (use parallel or sequential >> ordering): 1 >> ICNTL(29) (parallel ordering): >> 0 >> ICNTL(30) (user-specified set of entries in >> inv(A)): 0 >> ICNTL(31) (factors is discarded in the solve >> phase): 0 >> ICNTL(33) (compute determinant): >> 0 >> CNTL(1) (relative pivoting threshold): >> 0.01 >> CNTL(2) (stopping criterion of refinement): >> 1.49012e-08 >> CNTL(3) (absolute pivoting threshold): 0 >> CNTL(4) (value of static pivoting): >> -1 >> CNTL(5) (fixation for null pivots): 0 >> RINFO(1) (local estimated flops for the >> elimination after analysis): >> [0] 5.59214e+11 >> [1] 5.35237e+11 >> RINFO(2) (local estimated flops for the >> assembly after factorization): >> [0] 4.2839e+08 >> [1] 3.799e+08 >> RINFO(3) (local estimated flops for the >> elimination after factorization): >> [0] 5.59214e+11 >> [1] 5.35237e+11 >> INFO(15) (estimated size of (in MB) MUMPS >> internal data for running numerical factorization): >> [0] 2621 >> [1] 2649 >> INFO(16) (size of (in MB) MUMPS internal data >> used during numerical factorization): >> [0] 2621 >> [1] 2649 >> INFO(23) (num of pivots eliminated on this >> processor after factorization): >> [0] 90423 >> [1] 93903 >> RINFOG(1) (global estimated flops for the >> elimination after analysis): 1.09445e+12 >> RINFOG(2) (global estimated flops for the >> assembly after factorization): 8.0829e+08 >> RINFOG(3) (global estimated flops for the >> elimination after factorization): 1.09445e+12 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) >> (determinant): (0,0)*(2^0) >> INFOG(3) (estimated real workspace for >> factors on all processors after analysis): 403041366 >> INFOG(4) (estimated integer workspace for >> factors on all processors after analysis): 2265748 >> INFOG(5) (estimated maximum front size in the >> complete tree): 6663 >> INFOG(6) (number of nodes in the complete >> tree): 2812 >> INFOG(7) (ordering option effectively use >> after analysis): 5 >> INFOG(8) (structural symmetry in percent of >> the permuted matrix after analysis): 100 >> INFOG(9) (total real/complex workspace to >> store the matrix factors after factorization): 403041366 >> INFOG(10) (total integer space store the >> matrix factors after factorization): 2265766 >> INFOG(11) (order of largest frontal matrix >> after factorization): 6663 >> INFOG(12) (number of off-diagonal pivots): 0 >> INFOG(13) (number of delayed pivots after >> factorization): 0 >> INFOG(14) (number of memory compress after >> factorization): 0 >> INFOG(15) (number of steps of iterative >> refinement after solution): 0 >> INFOG(16) (estimated size (in MB) of all >> MUMPS internal data for factorization after analysis: value on the most >> memory consuming processor): 2649 >> INFOG(17) (estimated size of all MUMPS >> internal data for factorization after analysis: sum over all processors): >> 5270 >> INFOG(18) (size of all MUMPS internal data >> allocated during factorization: value on the most memory consuming >> processor): 2649 >> INFOG(19) (size of all MUMPS internal data >> allocated during factorization: sum over all processors): 5270 >> INFOG(20) (estimated number of entries in the >> factors): 403041366 >> INFOG(21) (size in MB of memory effectively >> used during factorization - value on the most memory consuming processor): >> 2121 >> INFOG(22) (size in MB of memory effectively >> used during factorization - sum over all processors): 4174 >> INFOG(23) (after analysis: value 
of ICNTL(6) >> effectively used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) >> effectively used): 1 >> INFOG(25) (after factorization: number of >> pivots modified by static pivoting): 0 >> INFOG(28) (after factorization: number of >> null pivots encountered): 0 >> INFOG(29) (after factorization: effective >> number of entries in the factors (sum over all processors)): 403041366 >> INFOG(30, 31) (after solution: size in Mbytes >> of memory used during solution phase): 2467, 4922 >> INFOG(32) (after analysis: type of analysis >> done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if >> determinant is requested): 0 >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_u_) >> 2 MPI processes >> type: mpiaij >> rows=184326, cols=184326, bs=3 >> total: nonzeros=3.32649e+07, allocated >> nonzeros=3.32649e+07 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node (on process 0) routines: found 26829 >> nodes, limit used is 5 >> A01 >> Mat Object: 2 MPI processes >> type: mpiaij >> rows=184326, cols=2583, rbs=3, cbs = 1 >> total: nonzeros=292770, allocated nonzeros=292770 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 16098 >> nodes, limit used is 5 >> Mat Object: 2 MPI processes >> type: mpiaij >> rows=2583, cols=2583, rbs=3, cbs = 1 >> total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Mat Object: 2 MPI processes >> type: mpiaij >> rows=186909, cols=186909 >> total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 26829 nodes, limit used >> is 5 >> KSPSolve completed >> >> >> Giang >> >> On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley >> wrote: >> >>> On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui >>> wrote: >>> >>>> Hello >>>> >>>> I'm solving an indefinite problem arising from mesh tying/contact using >>>> Lagrange multiplier, the matrix has the form >>>> >>>> K = [A P^T >>>> P 0] >>>> >>>> I used the FIELDSPLIT preconditioner with one field is the main >>>> variable (displacement) and the other field for dual variable (Lagrange >>>> multiplier). The block size for each field is 3. According to the manual, I >>>> first chose the preconditioner based on Schur complement to treat this >>>> problem. >>>> >>> >>> >> >>> For any solver question, please send us the output of >>> >>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>> >>> >> >>> However, I will comment below >>> >>> >>>> The parameters used for the solve is >>>> -ksp_type gmres >>>> >>> >>> You need 'fgmres' here with the options you have below. >>> >>> >>>> -ksp_max_it 300 >>>> -ksp_gmres_restart 300 >>>> -ksp_gmres_modifiedgramschmidt >>>> -pc_fieldsplit_type schur >>>> -pc_fieldsplit_schur_fact_type diag >>>> -pc_fieldsplit_schur_precondition selfp >>>> >>> >>> >> >> >>> It could be taking time in the MatMatMult() here if that matrix is >>> dense. Is there any reason to >>> believe that is a good preconditioner for your problem? >>> >> >> >>> >>> >>>> -pc_fieldsplit_detect_saddle_point >>>> -fieldsplit_u_pc_type hypre >>>> >>> >>> I would just use MUMPS here to start, especially if it works on the >>> whole problem. Same with the one below. 
>>>
>>> Matt
>>>
>>>
>>>> -fieldsplit_u_pc_hypre_type boomeramg
>>>> -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS
>>>> -fieldsplit_lu_pc_type hypre
>>>> -fieldsplit_lu_pc_hypre_type boomeramg
>>>> -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS
>>>>
>>>> For the test case, a small problem is solved on 2 processes. Due to the
>>>> decomposition, the contact only happens on 1 proc, so the size of Lagrange
>>>> multiplier dofs on proc 0 is 0.
>>>>
>>>> 0: mIndexU.size(): 80490
>>>> 0: mIndexLU.size(): 0
>>>> 1: mIndexU.size(): 103836
>>>> 1: mIndexLU.size(): 2583
>>>>
>>>> However, with this setup the solver takes very long in KSPSolve before it
>>>> starts iterating, and the first iteration seems to take forever, so I have
>>>> to stop the calculation. I guessed that the solver takes time to compute
>>>> the Schur complement, but according to the manual only the diagonal of A is
>>>> used to approximate the Schur complement, so it should not take long to
>>>> compute this.
>>>>
>>>> Note that I ran the same problem with a direct solver (MUMPS) and it is
>>>> able to produce valid results. The parameters for the solve are pretty
>>>> standard:
>>>> -ksp_type preonly
>>>> -pc_type lu
>>>> -pc_factor_mat_solver_package mumps
>>>>
>>>> Hence the matrix/rhs must not have any problem here. Do you have any
>>>> idea or suggestion for this case?
>>>>
>>>>
>>>> Giang
>>>>
>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>
>>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>

From bsmith at mcs.anl.gov Thu Sep 8 18:04:34 2016
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 8 Sep 2016 18:04:34 -0500
Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix
In-Reply-To: 
References: 
Message-ID: 

   Normally you'd be absolutely correct to expect convergence in one iteration. However, in this example note the call

   ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr);

   It is solving the linear system defined by A but building the preconditioner (i.e. the entire fieldsplit process) from a different matrix B. Since A is not B you should not expect convergence in one iteration. If you change the code to

   ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr);

   you will see exactly what you expect: convergence in one iteration.

   Sorry about this; the example is lacking clarity and documentation. Its author obviously knew what he was doing so well that he didn't realize everyone else in the world would need more comments in the code.

   If you change the code to

   ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr);

   it will stop without being able to build the preconditioner, because LU factorization of the Sp matrix will result in a zero pivot. This is why this "auxiliary" matrix B is used to define the preconditioner instead of A.

   Barry

> On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui wrote:
>
> Sorry, I slept on this thread for quite a while. Now I am starting to look at it again. In the last try, the previous setting didn't work either (in fact it diverged). So I would speculate that the Schur complement in my case is actually not invertible. It's also possible that the code is wrong somewhere.
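A minimal C sketch of the Amat/Pmat distinction Barry describes above, assuming a KSP named ksp_S and two already assembled matrices A (the operator defining the linear system) and B (the matrix the preconditioner is built from); the assembly code is elided, and this is only an illustration, not the actual ex42 source:

#include <petscksp.h>

int main(int argc,char **argv)
{
  Mat            A,B;     /* A: operator for the system; B: auxiliary matrix the PC is built from */
  Vec            x,b;
  KSP            ksp_S;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* ... create and assemble A, B, b and x here, as the application already does ... */

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp_S);CHKERRQ(ierr);
  /* Amat = A is what the Krylov method applies; Pmat = B is what the PC
     (here the whole fieldsplit/Schur machinery) is set up from. */
  ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp_S);CHKERRQ(ierr);   /* picks up -pc_fieldsplit_* etc. */
  ierr = KSPSolve(ksp_S,b,x);CHKERRQ(ierr);

  /* With KSPSetOperators(ksp_S,B,B) the preconditioner matches the system being
     solved, so the full Schur factorization converges in one outer iteration;
     with (A,A) the setup fails here with a zero pivot in the LU of Sp. */

  ierr = KSPDestroy(&ksp_S);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Which matrices are passed to KSPSetOperators() is therefore what decides whether one outer iteration can be expected; the -pc_fieldsplit_* options alone do not.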
However, before looking at that, I want to understand thoroughly the settings for Schur complement > > I experimented ex42 with the settings: > mpirun -np 1 ex42 \ > -stokes_ksp_monitor \ > -stokes_ksp_type fgmres \ > -stokes_pc_type fieldsplit \ > -stokes_pc_fieldsplit_type schur \ > -stokes_pc_fieldsplit_schur_fact_type full \ > -stokes_pc_fieldsplit_schur_precondition selfp \ > -stokes_fieldsplit_u_ksp_type preonly \ > -stokes_fieldsplit_u_pc_type lu \ > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > -stokes_fieldsplit_p_ksp_type gmres \ > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > -stokes_fieldsplit_p_ksp_max_it 300 \ > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > -stokes_fieldsplit_p_pc_type lu \ > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > In my understanding, the solver should converge in 1 (outer) step. Execution gives: > Residual norms for stokes_ solve. > 0 KSP Residual norm 1.327791371202e-02 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > 1 KSP Residual norm 7.656238881621e-04 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > 2 KSP Residual norm 9.895645456398e-06 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > 3 KSP Residual norm 1.608206702571e-06 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > 4 KSP Residual norm 2.453618999882e-07 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > 5 KSP Residual norm 4.230399264750e-08 > > Looks like the "selfp" does construct the Schur nicely. But does "full" really construct the full block preconditioner? > > Giang > P/S: I'm also generating a smaller size of the previous problem for checking again. > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley wrote: > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui wrote: > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > believe that is a good preconditioner for your problem? > > This is the first approach to the problem, so I chose the most simple setting. Do you have any other recommendation? > > This is in no way the simplest PC. We need to make it simpler first. 
> > 1) Run on only 1 proc > > 2) Use -pc_fieldsplit_schur_fact_type full > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_true_residual > > This should converge in 1 outer iteration, but we will see how good your Schur complement preconditioner > is for this problem. > > You need to start out from something you understand and then start making approximations. > > Matt > > For any solver question, please send us the output of > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > I sent here the full output (after changed to fgmres), again it takes long at the first iteration but after that, it does not converge > > -ksp_type fgmres > -ksp_max_it 300 > -ksp_gmres_restart 300 > -ksp_gmres_modifiedgramschmidt > -pc_fieldsplit_type schur > -pc_fieldsplit_schur_fact_type diag > -pc_fieldsplit_schur_precondition selfp > -pc_fieldsplit_detect_saddle_point > -fieldsplit_u_ksp_type preonly > -fieldsplit_u_pc_type lu > -fieldsplit_u_pc_factor_mat_solver_package mumps > -fieldsplit_lu_ksp_type preonly > -fieldsplit_lu_pc_type lu > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid 
norm 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm 1.729436852773e+02 ||r(i)||/||b|| 
5.693108615167e-05 > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > 76 KSP unpreconditioned resid 
norm 1.481010335860e+02 true resid norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm 
1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm 1.498225445439e+02 
||r(i)||/||b|| 4.931987066895e-05 > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm 1.475436433192e+02 ||r(i)||/||b|| 
4.856968241116e-05 > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > 189 
KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > 217 KSP unpreconditioned resid 
norm 7.786400257086e+01 true resid norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > 245 KSP unpreconditioned resid norm 2.363923146996e+00 
true resid norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm 
7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > Linear solve did not converge due to DIVERGED_ITS iterations 300 > KSP Object: 2 MPI processes > 
type: fgmres > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=300, initial guess is zero > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 2 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization DIAG > Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_u_) 2 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_u_) 2 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 2 MPI processes > type: mpiaij > rows=184326, cols=184326 > package used to perform factorization: mumps > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > total number of mallocs used during MatSetValues calls =0 > MUMPS run parameters: > SYM (matrix type): 0 > PAR (host participation): 1 > ICNTL(1) (output for error): 6 > ICNTL(2) (output of diagnostic msg): 0 > ICNTL(3) (output for global info): 0 > ICNTL(4) (level of printing): 0 > ICNTL(5) (input mat struct): 0 > ICNTL(6) (matrix prescaling): 7 > ICNTL(7) (sequentia matrix ordering):7 > ICNTL(8) (scalling strategy): 77 > ICNTL(10) (max num of refinements): 0 > ICNTL(11) (error analysis): 0 > ICNTL(12) (efficiency control): 1 > ICNTL(13) (efficiency control): 0 > ICNTL(14) (percentage of estimated workspace increase): 20 > ICNTL(18) (input mat struct): 3 > ICNTL(19) (Shur complement info): 0 > ICNTL(20) (rhs sparse pattern): 0 > ICNTL(21) (solution struct): 1 > ICNTL(22) (in-core/out-of-core facility): 0 > ICNTL(23) (max size of memory can be allocated locally):0 > ICNTL(24) (detection of null pivot rows): 0 > ICNTL(25) (computation of a null space basis): 0 > ICNTL(26) (Schur options for rhs or solution): 0 > ICNTL(27) (experimental parameter): -24 > ICNTL(28) (use parallel or sequential ordering): 1 > ICNTL(29) (parallel ordering): 0 > ICNTL(30) (user-specified set of entries in inv(A)): 0 > ICNTL(31) (factors is discarded in the solve phase): 0 > ICNTL(33) (compute determinant): 0 > CNTL(1) (relative pivoting threshold): 0.01 > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > CNTL(3) (absolute pivoting threshold): 0 > CNTL(4) (value of static pivoting): -1 > CNTL(5) (fixation for null pivots): 0 > RINFO(1) (local estimated flops for the elimination after analysis): > [0] 5.59214e+11 > [1] 5.35237e+11 > RINFO(2) (local estimated flops for the assembly after factorization): > [0] 4.2839e+08 > [1] 3.799e+08 > RINFO(3) (local estimated flops for the elimination after factorization): > [0] 5.59214e+11 > [1] 5.35237e+11 > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > [0] 2621 > [1] 2649 > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > [0] 2621 > [1] 2649 > INFO(23) (num of pivots eliminated on this processor after factorization): > [0] 90423 > [1] 93903 > RINFOG(1) (global estimated flops for the 
elimination after analysis): 1.09445e+12 > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > INFOG(5) (estimated maximum front size in the complete tree): 6663 > INFOG(6) (number of nodes in the complete tree): 2812 > INFOG(7) (ordering option effectively use after analysis): 5 > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > INFOG(11) (order of largest frontal matrix after factorization): 6663 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): 0 > INFOG(15) (number of steps of iterative refinement after solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > INFOG(20) (estimated number of entries in the factors): 403041366 > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > INFOG(28) (after factorization: number of null pivots encountered): 0 > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is requested): 0 > linear system matrix = precond matrix: > Mat Object: (fieldsplit_u_) 2 MPI processes > type: mpiaij > rows=184326, cols=184326, bs=3 > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (fieldsplit_lu_) 2 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_lu_) 2 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 0, needed 0 > Factored 
matrix follows: > Mat Object: 2 MPI processes > type: mpiaij > rows=2583, cols=2583 > package used to perform factorization: mumps > total: nonzeros=2.17621e+06, allocated nonzeros=2.17621e+06 > total number of mallocs used during MatSetValues calls =0 > MUMPS run parameters: > SYM (matrix type): 0 > PAR (host participation): 1 > ICNTL(1) (output for error): 6 > ICNTL(2) (output of diagnostic msg): 0 > ICNTL(3) (output for global info): 0 > ICNTL(4) (level of printing): 0 > ICNTL(5) (input mat struct): 0 > ICNTL(6) (matrix prescaling): 7 > ICNTL(7) (sequentia matrix ordering):7 > ICNTL(8) (scalling strategy): 77 > ICNTL(10) (max num of refinements): 0 > ICNTL(11) (error analysis): 0 > ICNTL(12) (efficiency control): 1 > ICNTL(13) (efficiency control): 0 > ICNTL(14) (percentage of estimated workspace increase): 20 > ICNTL(18) (input mat struct): 3 > ICNTL(19) (Shur complement info): 0 > ICNTL(20) (rhs sparse pattern): 0 > ICNTL(21) (solution struct): 1 > ICNTL(22) (in-core/out-of-core facility): 0 > ICNTL(23) (max size of memory can be allocated locally):0 > ICNTL(24) (detection of null pivot rows): 0 > ICNTL(25) (computation of a null space basis): 0 > ICNTL(26) (Schur options for rhs or solution): 0 > ICNTL(27) (experimental parameter): -24 > ICNTL(28) (use parallel or sequential ordering): 1 > ICNTL(29) (parallel ordering): 0 > ICNTL(30) (user-specified set of entries in inv(A)): 0 > ICNTL(31) (factors is discarded in the solve phase): 0 > ICNTL(33) (compute determinant): 0 > CNTL(1) (relative pivoting threshold): 0.01 > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > CNTL(3) (absolute pivoting threshold): 0 > CNTL(4) (value of static pivoting): -1 > CNTL(5) (fixation for null pivots): 0 > RINFO(1) (local estimated flops for the elimination after analysis): > [0] 5.12794e+08 > [1] 5.02142e+08 > RINFO(2) (local estimated flops for the assembly after factorization): > [0] 815031 > [1] 745263 > RINFO(3) (local estimated flops for the elimination after factorization): > [0] 5.12794e+08 > [1] 5.02142e+08 > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > [0] 34 > [1] 34 > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > [0] 34 > [1] 34 > INFO(23) (num of pivots eliminated on this processor after factorization): > [0] 1158 > [1] 1425 > RINFOG(1) (global estimated flops for the elimination after analysis): 1.01494e+09 > RINFOG(2) (global estimated flops for the assembly after factorization): 1.56029e+06 > RINFOG(3) (global estimated flops for the elimination after factorization): 1.01494e+09 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > INFOG(3) (estimated real workspace for factors on all processors after analysis): 2176209 > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 14427 > INFOG(5) (estimated maximum front size in the complete tree): 699 > INFOG(6) (number of nodes in the complete tree): 15 > INFOG(7) (ordering option effectively use after analysis): 2 > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 2176209 > INFOG(10) (total integer space store the matrix factors after factorization): 14427 > INFOG(11) (order of largest frontal matrix after factorization): 699 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress 
after factorization): 0 > INFOG(15) (number of steps of iterative refinement after solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 34 > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 68 > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 34 > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 68 > INFOG(20) (estimated number of entries in the factors): 2176209 > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 30 > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 59 > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > INFOG(28) (after factorization: number of null pivots encountered): 0 > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 2176209 > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 16, 32 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is requested): 0 > linear system matrix followed by preconditioner matrix: > Mat Object: (fieldsplit_lu_) 2 MPI processes > type: schurcomplement > rows=2583, cols=2583 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (fieldsplit_lu_) 2 MPI processes > type: mpiaij > rows=2583, cols=2583, bs=3 > total: nonzeros=117369, allocated nonzeros=117369 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > A10 > Mat Object: 2 MPI processes > type: mpiaij > rows=2583, cols=184326, rbs=3, cbs = 1 > total: nonzeros=292770, allocated nonzeros=292770 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > KSP of A00 > KSP Object: (fieldsplit_u_) 2 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_u_) 2 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 2 MPI processes > type: mpiaij > rows=184326, cols=184326 > package used to perform factorization: mumps > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > total number of mallocs used during MatSetValues calls =0 > MUMPS run parameters: > SYM (matrix type): 0 > PAR (host participation): 1 > ICNTL(1) (output for error): 6 > ICNTL(2) (output of diagnostic msg): 0 > ICNTL(3) (output for global info): 0 > ICNTL(4) (level of printing): 0 > ICNTL(5) (input mat struct): 0 > ICNTL(6) (matrix prescaling): 7 > ICNTL(7) (sequentia matrix ordering):7 > ICNTL(8) (scalling strategy): 77 > ICNTL(10) (max num of refinements): 0 > ICNTL(11) (error analysis): 0 > ICNTL(12) (efficiency control): 1 > ICNTL(13) (efficiency control): 0 > ICNTL(14) (percentage of estimated workspace increase): 20 > ICNTL(18) (input 
mat struct): 3 > ICNTL(19) (Shur complement info): 0 > ICNTL(20) (rhs sparse pattern): 0 > ICNTL(21) (solution struct): 1 > ICNTL(22) (in-core/out-of-core facility): 0 > ICNTL(23) (max size of memory can be allocated locally):0 > ICNTL(24) (detection of null pivot rows): 0 > ICNTL(25) (computation of a null space basis): 0 > ICNTL(26) (Schur options for rhs or solution): 0 > ICNTL(27) (experimental parameter): -24 > ICNTL(28) (use parallel or sequential ordering): 1 > ICNTL(29) (parallel ordering): 0 > ICNTL(30) (user-specified set of entries in inv(A)): 0 > ICNTL(31) (factors is discarded in the solve phase): 0 > ICNTL(33) (compute determinant): 0 > CNTL(1) (relative pivoting threshold): 0.01 > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > CNTL(3) (absolute pivoting threshold): 0 > CNTL(4) (value of static pivoting): -1 > CNTL(5) (fixation for null pivots): 0 > RINFO(1) (local estimated flops for the elimination after analysis): > [0] 5.59214e+11 > [1] 5.35237e+11 > RINFO(2) (local estimated flops for the assembly after factorization): > [0] 4.2839e+08 > [1] 3.799e+08 > RINFO(3) (local estimated flops for the elimination after factorization): > [0] 5.59214e+11 > [1] 5.35237e+11 > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > [0] 2621 > [1] 2649 > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > [0] 2621 > [1] 2649 > INFO(23) (num of pivots eliminated on this processor after factorization): > [0] 90423 > [1] 93903 > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > INFOG(5) (estimated maximum front size in the complete tree): 6663 > INFOG(6) (number of nodes in the complete tree): 2812 > INFOG(7) (ordering option effectively use after analysis): 5 > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > INFOG(11) (order of largest frontal matrix after factorization): 6663 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): 0 > INFOG(15) (number of steps of iterative refinement after solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > INFOG(20) (estimated number of entries in the factors): 403041366 > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > INFOG(22) (size in 
MB of memory effectively used during factorization - sum over all processors): 4174 > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > INFOG(28) (after factorization: number of null pivots encountered): 0 > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is requested): 0 > linear system matrix = precond matrix: > Mat Object: (fieldsplit_u_) 2 MPI processes > type: mpiaij > rows=184326, cols=184326, bs=3 > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > A01 > Mat Object: 2 MPI processes > type: mpiaij > rows=184326, cols=2583, rbs=3, cbs = 1 > total: nonzeros=292770, allocated nonzeros=292770 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 16098 nodes, limit used is 5 > Mat Object: 2 MPI processes > type: mpiaij > rows=2583, cols=2583, rbs=3, cbs = 1 > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Mat Object: 2 MPI processes > type: mpiaij > rows=186909, cols=186909 > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > KSPSolve completed > > > Giang > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley wrote: > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui wrote: > Hello > > I'm solving an indefinite problem arising from mesh tying/contact using Lagrange multiplier, the matrix has the form > > K = [A P^T > P 0] > > I used the FIELDSPLIT preconditioner with one field is the main variable (displacement) and the other field for dual variable (Lagrange multiplier). The block size for each field is 3. According to the manual, I first chose the preconditioner based on Schur complement to treat this problem. > > > For any solver question, please send us the output of > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > However, I will comment below > > The parameters used for the solve is > -ksp_type gmres > > You need 'fgmres' here with the options you have below. > > -ksp_max_it 300 > -ksp_gmres_restart 300 > -ksp_gmres_modifiedgramschmidt > -pc_fieldsplit_type schur > -pc_fieldsplit_schur_fact_type diag > -pc_fieldsplit_schur_precondition selfp > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > believe that is a good preconditioner for your problem? > > > -pc_fieldsplit_detect_saddle_point > -fieldsplit_u_pc_type hypre > > I would just use MUMPS here to start, especially if it works on the whole problem. Same with the one below. 
> > Matt > > -fieldsplit_u_pc_hypre_type boomeramg > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_lu_pc_type hypre > -fieldsplit_lu_pc_hypre_type boomeramg > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > For the test case, a small problem is solved on 2 processes. Due to the decomposition, the contact only happens in 1 proc, so the size of Lagrange multiplier dofs on proc 0 is 0. > > 0: mIndexU.size(): 80490 > 0: mIndexLU.size(): 0 > 1: mIndexU.size(): 103836 > 1: mIndexLU.size(): 2583 > > However, with this setup the solver takes very long at KSPSolve before going to iteration, and the first iteration seems forever so I have to stop the calculation. I guessed that the solver takes time to compute the Schur complement, but according to the manual only the diagonal of A is used to approximate the Schur complement, so it should not take long to compute this. > > Note that I ran the same problem with direct solver (MUMPS) and it's able to produce the valid results. The parameter for the solve is pretty standard > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package mumps > > Hence the matrix/rhs must not have any problem here. Do you have any idea or suggestion for this case? > > > Giang > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From jshen25 at jhu.edu Thu Sep 8 19:45:36 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Thu, 8 Sep 2016 20:45:36 -0400 Subject: [petsc-users] PETSc parallel scalability In-Reply-To: <870D8020-DAF8-4414-8B6D-B40A425382C8@mcs.anl.gov> References: <870D8020-DAF8-4414-8B6D-B40A425382C8@mcs.anl.gov> Message-ID: Hi, Thanks a lot for the replies. They are really helpful. I just used the ksp ex2.c as an example to test the parallelization on my cluster, since the example have flexible number of unknowns. I do observe that for larger size of the problem, the speed-up shows better with more processors. Now, I'm moving on incorporating PETSC into our real CPFEM code, and investigating the suitable solver and preconditioners for the specific system. Thanks again. Bests, Jinlei On Wed, Sep 7, 2016 at 10:26 PM, Barry Smith wrote: > > > On Sep 7, 2016, at 8:37 PM, Jinlei Shen wrote: > > > > Hi, > > > > I am trying to test the parallel scalablity of iterative solver (CG with > BJacobi preconditioner) in PETSc. > > > > Since the iteration number increases with more processors, I calculated > the single iteration time by dividing the total KSPSolve time by number of > iteration in this test. > > > > The linear system I'm solving has 315342 unknowns. Only KSPSolve cost is > analyzed. > > > > The results show that the parallelism works well with small number of > processes (less than 32 in my case), and is almost perfect parallel within > first 10 processors. > > > > However, the effect of parallelization degrades if I use more > processors. The wired thing is that with more than 100 processors, the > single iteration cost is slightly increasing. > > > > To investigate this issue, I then looked into the composition of > KSPSolve time. > > It seems KSPSolve consists of MatMult, VecTDot(min),VecNorm(min), > VecAXPY(max),VecAXPX(max),ApplyPC. Please correct me if I'm wrong. 
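For readers following this thread, the communication pattern behind those kernels can be made concrete with a rough sketch of one preconditioned CG iteration written as PETSc calls. This is only an illustration, assuming the matrix A, the preconditioner pc, and the work vectors p, q, r, z, x already exist and that alpha, beta, rz, rznew are PetscScalar locals; error checking is omitted and this is not a transcription of PETSc's internal KSPSolve_CG.

    MatMult(A, p, q);             /* q = A*p: halo exchange with neighbouring ranks only      */
    VecTDot(p, q, &ptq);          /* global reduction (MPI_Allreduce) over all ranks          */
    alpha = rz / ptq;
    VecAXPY(x,  alpha, p);        /* purely local, no communication                           */
    VecAXPY(r, -alpha, q);        /* purely local, no communication                           */
    PCApply(pc, r, z);            /* block Jacobi: independent local solves, no communication */
    VecTDot(r, z, &rznew);        /* global reduction                                         */
    VecNorm(r, NORM_2, &rnorm);   /* global reduction, used for the convergence test          */
    beta = rznew / rz;  rz = rznew;
    VecAYPX(p, beta, z);          /* purely local: p = z + beta*p                             */

So at fixed problem size the local VecAXPY/VecAYPX work shrinks as ranks are added, the MatMult halo exchange involves only neighbouring ranks, and the two or three all-reduce latencies per iteration become an ever larger fraction of the time, which is the behaviour discussed in the reply that follows.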
> > > > And I found for small number of processors, all these components scale > well. > > However, using more processors(roughly larger than 40), MatMult, > VecTDot(min),VecNorm(min) behaves worse, and even increasing after 100 > processors, while the other three parts parallel well even for 1000 > processors. > > Since MatMult composed major cost in single iteration, the total single > iteration cost increases as well.(See the below figure). > > > > My question: > > 1. Is such situation reasonable? > > Yes > > > Could anyone explain why MatMult scales poor after certain number of > processors? I heard some about different algorithms for matrix > multiplication. Is that the bottleneck? > > The MatMult inherently requires communication, as the number of > processors increases the amount of communication increases while the total > work needed remains the same. This is true regardless of the particular > details of the algorithms used. > > Similar computations like norms and inner products require a > communication among all processes, as the increase the number of processes > the communication time starts to dominate for norms and inner products > hence they begin to take a great deal of time for large numbers of > processes. > > > > 2. Is the parallelism dependent of matrix size? If I use larger system > size,e.g. million , can the solver scale better? > > Absolutely. You should look up the concepts of strong scaling and weak > scaling. These are important concepts to understand to understand parallel > computing. > > > > > 3. Do you have any idea to improve the parallel performance? > > Worrying about parallel performance should never be a priority, the > priorities should be "can I solve the size problems I need to solve in a > reasonable amount of time to accomplish whatever science or engineering I > am working on". To be able to do this first depends on using good > discretization methods for your model (i.e. not just first or second order > whenever possible) and am I using efficient algebraic solvers when I need > to solve algebraic systems. (As Justin noted bjacobi GMRES is not a > particularly efficient algebraic solver). Only after you have done all that > does improving parallel performance come into play; you know you have an > efficient discretization and solver and now you want to make it run faster > in parallel. Since each type of simulation is unique you need to work > through the process for the problem YOU need to solve, you can't just run > it for a model problem and then try to reuse the same discretizations and > solvers for your "real" problem. > > Barry > > > > > Thank you very much. > > > > JInlei > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 8 22:26:35 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 8 Sep 2016 22:26:35 -0500 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> Message-ID: <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> Install your MPI with --download-mpich as a PETSc ./configure option, this will eliminate all the MPICH valgrind errors. Then send as an attachment the resulting valgrind file. I do not 100 % trust any code that produces such valgrind errors. Barry > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe wrote: > > Hi Barry, > > Thanks for the reply. My code is in C. 
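For anyone reproducing this, the recipe on the FAQ page Barry links above amounts to running the application under valgrind roughly like the line below (the exact flags are on that page; these are quoted from memory, and ./your_app is a placeholder for the actual executable and its usual options):

    mpiexec -n 2 valgrind --tool=memcheck -q --num-callers=20 \
        --log-file=valgrind.log.%p ./your_app -malloc off <usual options>

Adding --error-limit=no, as valgrind itself suggests further down, keeps the full report instead of cutting it off, and the per-process log files can then be sent as attachments as Barry asks.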
I ran with Valgrind and found many "Conditional jump or move depends on uninitialized value(s)", "Invalid read" and "Use of uninitialized value" errors. I think all of them are from the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure what I'm looking for in the Valgrind output. At the end of the file, I get: > > ==40223== More than 10000000 total errors detected. I'm not reporting any more. > ==40223== Final error counts will be inaccurate. Go fix your program! > ==40223== Rerun with --error-limit=no to disable this cutoff. Note > ==40223== that errors may occur in your program without prior warning from > ==40223== Valgrind, because errors are no longer being displayed. > > Can you give some suggestions on how I should proceed? > > Thanks, > Harshad > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith wrote: > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very specialized and should never occur with GMRES. > > Can you run with valgrind to make sure there is no memory corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > Is your code fortran or C? > > Barry > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe wrote: > > > > Hi, > > > > I'm using GAMG + GMRES for my Poisson problem. The solver converges with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which is much higher than what I need (I need a tolerance of at least 1E-8). I am not able to figure out which tolerance I need to set to avoid convergence due to CONVERGED_STEP_LENGTH. > > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: > > > > 0 KSP Residual norm 3.121347818142e+00 > > 1 KSP Residual norm 9.773346857844e-02 > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > > KSP Object: 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: 1 MPI processes > > type: gamg > > MG: type is MULTIPLICATIVE, levels=2 cycles=v > > Cycles per PCApply=1 > > Using Galerkin computed coarse grid matrices > > Coarse grid solver -- level ------------------------------- > > KSP Object: (mg_coarse_) 1 MPI processes > > type: preonly > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (mg_coarse_) 1 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 1 > > Local solve is same for all blocks, in the following KSP and PC objects: > > KSP Object: (mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > > matrix ordering: nd > > factor fill ratio given 5, needed 1.91048 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=284, cols=284 > > package used to perform factorization: petsc > > total: nonzeros=7726, allocated nonzeros=7726 > > total 
number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 133 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=284, cols=284 > > total: nonzeros=4044, allocated nonzeros=4044 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=284, cols=284 > > total: nonzeros=4044, allocated nonzeros=4044 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Down solver (pre-smoother) on level 1 ------------------------------- > > KSP Object: (mg_levels_1_) 1 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.195339, max = 4.10212 > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (mg_levels_1_) 1 MPI processes > > type: sor > > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 > > linear system matrix = precond matrix: > > Mat Object: () 1 MPI processes > > type: seqaij > > rows=9036, cols=9036 > > total: nonzeros=192256, allocated nonzeros=192256 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > linear system matrix = precond matrix: > > Mat Object: () 1 MPI processes > > type: seqaij > > rows=9036, cols=9036 > > total: nonzeros=192256, allocated nonzeros=192256 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > > > Thanks, > > Harshad > > From jed at jedbrown.org Fri Sep 9 00:29:57 2016 From: jed at jedbrown.org (Jed Brown) Date: Thu, 08 Sep 2016 23:29:57 -0600 Subject: [petsc-users] PETSc parallel scalability In-Reply-To: References: <870D8020-DAF8-4414-8B6D-B40A425382C8@mcs.anl.gov> Message-ID: <87wpiltzca.fsf@jedbrown.org> Jinlei Shen writes: > Hi, > > Thanks a lot for the replies. They are really helpful. > > I just used the ksp ex2.c as an example to test the parallelization on my > cluster, since the example have flexible number of unknowns. > I do observe that for larger size of the problem, the speed-up shows better > with more processors. That example is intended to be readable by people new to PETSc, but it is not a good choice for benchmarking because it uses a naive distribution and solves an extremely easy problem. Perhaps a large deformation elasticity example (SNES ex16) would be more representative of your target application? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From mono at dtu.dk Fri Sep 9 04:04:12 2016 From: mono at dtu.dk (=?utf-8?B?TW9ydGVuIE5vYmVsLUrDuHJnZW5zZW4=?=) Date: Fri, 9 Sep 2016 09:04:12 +0000 Subject: [petsc-users] DMPlex problem Message-ID: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> Dear PETSc developers and users, Last week we posted a question regarding an error with DMPlex and multiple dofs and have not gotten any feedback yet. This is uncharted waters for us, since we have gotten used to an extremely fast feedback from the PETSc crew. 
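Since the thread here concerns multiple dofs per mesh point on a DMPlex, a minimal sketch of the generic setup may help readers follow the discussion. This is not the attached ex18k.cc (which is not reproduced in the archive), only the usual pattern, assuming one field with 3 dofs on every vertex of an existing DMPlex dm and omitting error checking:

    PetscSection s;
    PetscInt     pStart, pEnd, vStart, vEnd, v;
    Mat          A;

    PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s);
    PetscSectionSetNumFields(s, 1);
    PetscSectionSetFieldComponents(s, 0, 3);
    DMPlexGetChart(dm, &pStart, &pEnd);
    PetscSectionSetChart(s, pStart, pEnd);
    DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);   /* depth 0 = vertices */
    for (v = vStart; v < vEnd; ++v) {
      PetscSectionSetDof(s, v, 3);                  /* 3 dofs on each vertex */
      PetscSectionSetFieldDof(s, v, 0, 3);
    }
    PetscSectionSetUp(s);
    DMSetDefaultSection(dm, s);
    PetscSectionDestroy(&s);
    DMCreateMatrix(dm, &A);   /* preallocation and local-to-global map come from the section */

The report below is, apparently, that the last step misbehaves when several dofs per point are involved.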
So - with the chance of sounding impatient and ungrateful - we would like to hear if anybody has any ideas that could point us in the right direction? We have created a small example problem that demonstrates the error in the matrix assembly. Thanks, Morten -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex18k.cc Type: application/octet-stream Size: 4631 bytes Desc: ex18k.cc URL: From knepley at gmail.com Fri Sep 9 05:21:08 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 Sep 2016 05:21:08 -0500 Subject: [petsc-users] DMPlex problem In-Reply-To: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> Message-ID: On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen wrote: > Dear PETSc developers and users, > > Last week we posted a question regarding an error with DMPlex and multiple > dofs and have not gotten any feedback yet. This is uncharted waters for us, > since we have gotten used to an extremely fast feedback from the PETSc > crew. So - with the chance of sounding impatient and ungrateful - we would > like to hear if anybody has any ideas that could point us in the right > direction? > This is my fault. You have not gotten a response because everyone else was waiting for me, and I have been slow because I just moved houses at the same time as term started here. Sorry about that. The example ran for me and I saw your problem. The local-tp-global map is missing for some reason. I am tracking it down now. It should be made by DMCreateMatrix(), so this is mysterious. I hope to have this fixed by early next week. Thanks, Matt > We have created a small example problem that demonstrates the error in the > matrix assembly. > > Thanks, > Morten > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Fri Sep 9 09:32:06 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 9 Sep 2016 16:32:06 +0200 Subject: [petsc-users] Diagnosing a difference between "unpreconditioned" and "true" residual norms Message-ID: I am debugging a linear solver which uses a custom operator and preconditioner, via MATSHELL and PCSHELL. Convergence seems to be fine, except that I unexpectedly see a difference between the "unpreconditioned" and "true" residual norms when I use -ksp_monitor_true_residual with a right-preconditioned Krylov method (FGMRES or right-preconditioned GMRES). 
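A remark that may help in reading the monitor output that follows: with right preconditioning the Krylov method is applied to A M^{-1}, i.e. it solves A (M^{-1} y) = b and recovers the iterate as

    x_k = M^{-1} y_k,    r_k = b - A x_k

so both printed columns refer to the same residual r_k. The "unpreconditioned resid norm" is the estimate of ||r_k|| carried along by the GMRES/FGMRES recurrence, while the "true resid norm" is obtained by explicitly forming b - A x_k and taking its norm. In exact arithmetic the two coincide, so a persistent gap between the columns is exactly the symptom being asked about below.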
0 KSP unpreconditioned resid norm 9.266794204683e+08 true resid norm 9.266794204683e+08 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.317801431974e+04 true resid norm 2.317826550333e+04 ||r(i)||/||b|| 2.501217248530e-05 2 KSP unpreconditioned resid norm 4.453270507534e+00 true resid norm 2.699824780158e+01 ||r(i)||/||b|| 2.913439880638e-08 3 KSP unpreconditioned resid norm 1.015490793887e-03 true resid norm 2.658635801018e+01 ||r(i)||/||b|| 2.868991953738e-08 4 KSP unpreconditioned resid norm 4.710220776105e-07 true resid norm 2.658631616810e+01 ||r(i)||/||b|| 2.868987438467e-08 KSP Object:(mgk_) 1 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-13, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(mgk_) 1 MPI processes type: shell Shell: Custom PC linear system matrix = precond matrix: Mat Object: Custom Operator 1 MPI processes type: shell rows=256, cols=256 has attached null space I have dumped the explicit operator and preconditioned operator, and I can see that the operator and the preconditioned operator each have a 1-dimensional nullspace (a constant-pressure nullspace) which I have accounted for by constructing a normalized, constant-pressure vector and supplying it to the operator via a MatNullSpace. If I disregard the (numerically) zero singular value, the operator has a condition number of 1.5669e+05 and the preconditioned operator has a condition number of 1.01 (strong preconditioner). Has anyone seen this sort of behavior before and if so, is there a common culprit that I am overlooking? Any ideas of what to test next to try to isolate the issue? As I understand it, the unpreconditioned and true residual norms should be identical in exact arithmetic, so I would suspect that somehow I've ended up with a "bad Hessenberg matrix" in some way as I perform this solve (or maybe I have a more subtle bug). From mlohry at princeton.edu Fri Sep 9 09:49:18 2016 From: mlohry at princeton.edu (Mark Lohry) Date: Fri, 9 Sep 2016 08:49:18 -0600 Subject: [petsc-users] DMPlex problem In-Reply-To: References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> Message-ID: <967aa3a1-af2d-8d6f-a15e-8ca65a5fc887@princeton.edu> Regarding DMPlex, I'm unclear on the separation between Index Sets and Plex for unstructured solvers. I've got an existing unstructured serial solver using petsc's newton-krylov solvers. Should I be looking to parallelize this via IS or plex? Is there an interface for either that allows me to specify the partitioning (e.g. by metis' output)? Last but not least, is there a working example of dmplex in the docs? The only unstructured code example I see in the docs is SNES ex10, which uses IS. Thanks, Mark On 09/09/2016 04:21 AM, Matthew Knepley wrote: > On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen > wrote: > > Dear PETSc developers and users, > > Last week we posted a question regarding an error with DMPlex and > multiple dofs and have not gotten any feedback yet. This is > uncharted waters for us, since we have gotten used to an extremely > fast feedback from the PETSc crew. So - with the chance of > sounding impatient and ungrateful - we would like to hear if > anybody has any ideas that could point us in the right direction? > > > This is my fault. 
You have not gotten a response because everyone else > was waiting for me, and I have been > slow because I just moved houses at the same time as term started > here. Sorry about that. > > The example ran for me and I saw your problem. The local-to-global map > is missing for some reason. > I am tracking it down now. It should be made by DMCreateMatrix(), so > this is mysterious. I hope to have > this fixed by early next week. > > Thanks, > > Matt > > We have created a small example problem that demonstrates the > error in the matrix assembly. > > Thanks, > Morten > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hengjiew at uci.edu Fri Sep 9 10:56:35 2016 From: hengjiew at uci.edu (frank) Date: Fri, 9 Sep 2016 08:56:35 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> Message-ID: Hi, I want to continue digging into the memory problem here. I did find a work around in the past, which is to use fewer cores per node so that each core has 8G memory. However, this is inefficient and expensive. I hope to locate the place that uses the most memory. Here is a brief summary of the tests I did in the past: > Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 Maximum (over computational time) process memory: total 7.0727e+08 Current process memory: total 7.0727e+08 Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11 Current space PetscMalloc()ed: total 1.8275e+09 > Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 Maximum (over computational time) process memory: total 5.9431e+09 Current process memory: total 5.9431e+09 Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12 Current space PetscMalloc()ed: total 5.4844e+09 > Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 The OOM (Out Of Memory) killer of the supercomputer terminated the job during "KSPSolve". I attached the output of ksp_view (the third test's output is from ksp_view_pre), memory_view and also the petsc options. In all the tests, each core can access about 2G memory. In test3, there are 4223139840 non-zeros in the matrix. Stored in double precision, these consume about 1.74 MB per process. Even considering the extra memory used to store the integer indices, 2G of memory per core should still be more than enough. Is there a way to find out which part of KSPSolve uses the most memory? Thank you so much. BTW, there are 4 options that remain unused and I don't understand why they are omitted: -mg_coarse_telescope_mg_coarse_ksp_type value: preonly -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 -mg_coarse_telescope_mg_levels_ksp_type value: richardson Regards, Frank On 07/13/2016 05:47 PM, Dave May wrote: > > > On 14 July 2016 at 01:07, frank > wrote: > > Hi Dave, > > Sorry for the late reply. > Thank you so much for your detailed reply. > > I have a question about the estimation of the memory usage. There > are 4223139840 allocated non-zeros and 18432 MPI processes. Double > precision is used. So the memory per process is: > 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ?
> Did I do sth wrong here? Because this seems too small. > > > No - I totally f***ed it up. You are correct. That'll teach me for > fumbling around with my iphone calculator and not using my brain. > (Note that to convert to MB just divide by 1e6, not 1024^2 - although > I apparently cannot convert between units correctly....) > > From the PETSc objects associated with the solver, It looks like it > _should_ run with 2GB per MPI rank. Sorry for my mistake. > Possibilities are: somewhere in your usage of PETSc you've introduced > a memory leak; PETSc is doing a huge over allocation (e.g. as per our > discussion of MatPtAP); or in your application code there are other > objects you have forgotten to log the memory for. > > > > I am running this job on Bluewater > > > I am using the 7 points FD stencil in 3D. > > > I thought so on both counts. > > > I apologize that I made a stupid mistake in computing the memory > per core. My settings render each core can access only 2G memory > on average instead of 8G which I mentioned in previous email. I > re-run the job with 8G memory per core on average and there is no > "Out Of Memory" error. I would do more test to see if there is > still some memory issue. > > > Ok. I'd still like to know where the memory was being used since my > estimates were off. > > > Thanks, > Dave > > > Regards, > Frank > > > > On 07/11/2016 01:18 PM, Dave May wrote: >> Hi Frank, >> >> >> On 11 July 2016 at 19:14, frank > > wrote: >> >> Hi Dave, >> >> I re-run the test using bjacobi as the preconditioner on the >> coarse mesh of telescope. The Grid is 3072*256*768 and >> process mesh is 96*8*24. The petsc option file is attached. >> I still got the "Out Of Memory" error. The error occurred >> before the linear solver finished one step. So I don't have >> the full info from ksp_view. The info from ksp_view_pre is >> attached. >> >> >> Okay - that is essentially useless (sorry) >> >> >> It seems to me that the error occurred when the decomposition >> was going to be changed. >> >> >> Based on what information? >> Running with -info would give us more clues, but will create a >> ton of output. >> Please try running the case which failed with -info >> >> I had another test with a grid of 1536*128*384 and the same >> process mesh as above. There was no error. The ksp_view info >> is attached for comparison. >> Thank you. >> >> >> >> [3] Here is my crude estimate of your memory usage. >> I'll target the biggest memory hogs only to get an order of >> magnitude estimate >> >> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB >> per MPI rank assuming double precision. >> The indices for the AIJ could amount to another 0.3 GB (assuming >> 32 bit integers) >> >> * You use 5 levels of coarsening, so the other operators should >> represent (collectively) >> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on >> the communicator with 18432 ranks. >> The coarse grid should consume ~ 0.5 MB per MPI rank on the >> communicator with 18432 ranks. >> >> * You use a reduction factor of 64, making the new communicator >> with 288 MPI ranks. >> PCTelescope will first gather a temporary matrix associated with >> your coarse level operator assuming a comm size of 288 living on >> the comm with size 18432. >> This matrix will require approximately 0.5 * 64 = 32 MB per core >> on the 288 ranks. >> This matrix is then used to form a new MPIAIJ matrix on the >> subcomm, thus require another 32 MB per rank. >> The temporary matrix is now destroyed. 
>> >> * Because a DMDA is detected, a permutation matrix is assembled. >> This requires 2 doubles per point in the DMDA. >> Your coarse DMDA contains 92 x 16 x 48 points. >> Thus the permutation matrix will require < 1 MB per MPI rank on >> the sub-comm. >> >> * Lastly, the matrix is permuted. This uses MatPtAP(), but the >> resulting operator will have the same memory footprint as the >> unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 >> operators of size 32 MB are held in memory when the DMDA is provided. >> >> From my rough estimates, the worst case memory foot print for any >> given core, given your options is approximately >> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >> This is way below 8 GB. >> >> Note this estimate completely ignores: >> (1) the memory required for the restriction operator, >> (2) the potential growth in the number of non-zeros per row due >> to Galerkin coarsening (I wished -ksp_view_pre reported the >> output from MatView so we could see the number of non-zeros >> required by the coarse level operators) >> (3) all temporary vectors required by the CG solver, and those >> required by the smoothers. >> (4) internal memory allocated by MatPtAP >> (5) memory associated with IS's used within PCTelescope >> >> So either I am completely off in my estimates, or you have not >> carefully estimated the memory usage of your application code. >> Hopefully others might examine/correct my rough estimates >> >> Since I don't have your code I cannot access the latter. >> Since I don't have access to the same machine you are running on, >> I think we need to take a step back. >> >> [1] What machine are you running on? Send me a URL if its available >> >> [2] What discretization are you using? (I am guessing a scalar 7 >> point FD stencil) >> If it's a 7 point FD stencil, we should be able to examine the >> memory usage of your solver configuration using a standard, light >> weight existing PETSc example, run on your machine at the same >> scale. >> This would hopefully enable us to correctly evaluate the actual >> memory usage required by the solver configuration you are using. >> >> Thanks, >> Dave >> >> >> >> Frank >> >> >> >> >> On 07/08/2016 10:38 PM, Dave May wrote: >>> >>> >>> On Saturday, 9 July 2016, frank >> > wrote: >>> >>> Hi Barry and Dave, >>> >>> Thank both of you for the advice. >>> >>> @Barry >>> I made a mistake in the file names in last email. I >>> attached the correct files this time. >>> For all the three tests, 'Telescope' is used as the >>> coarse preconditioner. >>> >>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>> Part of the memory usage: Vector 125 124 >>> 3971904 0. >>> Matrix 101 101 9462372 0 >>> >>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>> Part of the memory usage: Vector 125 124 >>> 681672 0. >>> Matrix 101 101 1462180 0. >>> >>> In theory, the memory usage in Test1 should be 8 times >>> of Test2. In my case, it is about 6 times. >>> >>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. >>> Sub-domain per process: 32*32*32 >>> Here I get the out of memory error. >>> >>> I tried to use -mg_coarse jacobi. In this way, I don't >>> need to set -mg_coarse_ksp_type and -mg_coarse_pc_type >>> explicitly, right? >>> The linear solver didn't work in this case. Petsc output >>> some errors. >>> >>> @Dave >>> In test3, I use only one instance of 'Telescope'. On the >>> coarse mesh of 'Telescope', I used LU as the >>> preconditioner instead of SVD. 
>>> If my set the levels correctly, then on the last coarse >>> mesh of MG where it calls 'Telescope', the sub-domain >>> per process is 2*2*2. >>> On the last coarse mesh of 'Telescope', there is only >>> one grid point per process. >>> I still got the OOM error. The detailed petsc option >>> file is attached. >>> >>> >>> Do you understand the expected memory usage for the >>> particular parallel LU implementation you are using? I don't >>> (seriously). Replace LU with bjacobi and re-run this >>> test. My point about solver debugging is still valid. >>> >>> And please send the result of KSPView so we can see what is >>> actually used in the computations >>> >>> Thanks >>> Dave >>> >>> >>> >>> Thank you so much. >>> >>> Frank >>> >>> >>> >>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>> >>> On Jul 6, 2016, at 4:19 PM, frank >>> > wrote: >>> >>> Hi Barry, >>> >>> Thank you for you advice. >>> I tried three test. In the 1st test, the grid is >>> 3072*256*768 and the process mesh is 96*8*24. >>> The linear solver is 'cg' the preconditioner is >>> 'mg' and 'telescope' is used as the >>> preconditioner at the coarse mesh. >>> The system gives me the "Out of Memory" error >>> before the linear system is completely solved. >>> The info from '-ksp_view_pre' is attached. I >>> seems to me that the error occurs when it >>> reaches the coarse mesh. >>> >>> The 2nd test uses a grid of 1536*128*384 and >>> process mesh is 96*8*24. The 3rd test uses the >>> same grid but a different process mesh 48*4*12. >>> >>> Are you sure this is right? The total matrix and >>> vector memory usage goes from 2nd test >>> Vector 384 383 >>> 8,193,712 0. >>> Matrix 103 103 >>> 11,508,688 0. >>> to 3rd test >>> Vector 384 383 >>> 1,590,520 0. >>> Matrix 103 103 >>> 3,508,664 0. >>> that is the memory usage got smaller but if you have >>> only 1/8th the processes and the same grid it should >>> have gotten about 8 times bigger. Did you maybe cut >>> the grid by a factor of 8 also? If so that still >>> doesn't explain it because the memory usage changed >>> by a factor of 5 something for the vectors and 3 >>> something for the matrices. >>> >>> >>> The linear solver and petsc options in 2nd and >>> 3rd tests are the same in 1st test. The linear >>> solver works fine in both test. >>> I attached the memory usage of the 2nd and 3rd >>> tests. The memory info is from the option >>> '-log_summary'. I tried to use '-momery_info' as >>> you suggested, but in my case petsc treated it >>> as an unused option. It output nothing about the >>> memory. Do I need to add sth to my code so I can >>> use '-memory_info'? >>> >>> Sorry, my mistake the option is -memory_view >>> >>> Can you run the one case with -memory_view and >>> -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't >>> iterate forever) to see how much memory is used >>> without the telescope? Also run case 2 the same way. >>> >>> Barry >>> >>> >>> >>> In both tests the memory usage is not large. >>> >>> It seems to me that it might be the 'telescope' >>> preconditioner that allocated a lot of memory >>> and caused the error in the 1st test. >>> Is there is a way to show how much memory it >>> allocated? >>> >>> Frank >>> >>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> >>> Frank, >>> >>> You can run with -ksp_view_pre to have >>> it "view" the KSP before the solve so >>> hopefully it gets that far. 
>>> >>> Please run the problem that does fit >>> with -memory_info when the problem completes >>> it will show the "high water mark" for PETSc >>> allocated memory and total memory used. We >>> first want to look at these numbers to see >>> if it is using more memory than you expect. >>> You could also run with say half the grid >>> spacing to see how the memory usage scaled >>> with the increase in grid points. Make the >>> runs also with -log_view and send all the >>> output from these options. >>> >>> Barry >>> >>> On Jul 5, 2016, at 5:23 PM, frank >>> >> > wrote: >>> >>> Hi, >>> >>> I am using the CG ksp solver and >>> Multigrid preconditioner to solve a >>> linear system in parallel. >>> I chose to use the 'Telescope' as the >>> preconditioner on the coarse mesh for >>> its good performance. >>> The petsc options file is attached. >>> >>> The domain is a 3d box. >>> It works well when the grid is >>> 1536*128*384 and the process mesh is >>> 96*8*24. When I double the size of grid >>> and keep the same process mesh and petsc >>> options, I get an "out of memory" error >>> from the super-cluster I am using. >>> Each process has access to at least 8G >>> memory, which should be more than enough >>> for my application. I am sure that all >>> the other parts of my code( except the >>> linear solver ) do not use much memory. >>> So I doubt if there is something wrong >>> with the linear solver. >>> The error occurs before the linear >>> system is completely solved so I don't >>> have the info from ksp view. I am not >>> able to re-produce the error with a >>> smaller problem either. >>> In addition, I tried to use the block >>> jacobi as the preconditioner with the >>> same grid and same decomposition. The >>> linear solver runs extremely slow but >>> there is no memory error. >>> >>> How can I diagnose what exactly cause >>> the error? >>> Thank you so much. >>> >>> Frank >>> >>> >>> >>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- Linear solve converged due to CONVERGED_ATOL iterations 0 KSP Object: 2304 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 2304 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 2304 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 2304 MPI processes type: telescope Telescope: parent comm size reduction factor = 64 Telescope: comm_size = 2304 , subcomm_size = 36 Telescope: subcomm type: interlaced Telescope: DMDA detected DMDA Object: (mg_coarse_telescope_repart_) 36 MPI processes M 192 N 16 P 48 m 12 n 1 p 3 dof 1 overlap 1 KSP Object: (mg_coarse_telescope_) 36 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_telescope_) 36 MPI processes type: mg PC has not been set up so information may be incomplete MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_telescope_mg_coarse_) 36 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_telescope_mg_coarse_) 36 MPI processes type: redundant PC has not been set up so information may be incomplete Redundant preconditioner: Not yet setup Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_1_) 36 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_1_) 36 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_2_) 36 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_2_) 36 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 36 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 2304 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 2304 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 2304 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 2304 MPI processes type: mpiaij rows=1179648, cols=1179648 total: nonzeros=8183808, allocated nonzeros=8183808 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 2304 MPI processes type: richardson Richardson: damping factor=1. 
maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 2304 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 2304 MPI processes type: mpiaij rows=9437184, cols=9437184 total: nonzeros=65765376, allocated nonzeros=65765376 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 2304 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_3_) 2304 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 2304 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 2304 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space -------------- next part -------------- KSP Object: 18432 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg PC has not been set up so information may be incomplete MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: redundant PC has not been set up so information may be incomplete Redundant preconditioner: Not yet setup Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space Linear solve converged due to CONVERGED_RTOL iterations 131 KSP Object: 18432 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: telescope Telescope: parent comm size reduction factor = 64 Telescope: comm_size = 18432 , subcomm_size = 288 Telescope: subcomm type: interlaced Telescope: DMDA detected DMDA Object: (mg_coarse_telescope_repart_) 288 MPI processes M 384 N 32 P 96 m 24 n 2 p 6 dof 1 overlap 1 KSP Object: (mg_coarse_telescope_) 288 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_) 288 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_telescope_mg_coarse_) 288 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_coarse_) 288 MPI processes type: bjacobi block Jacobi: number of blocks = 288 Local solve is same for all blocks, in the following KSP and PC objects: linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=18432, cols=18432 total: nonzeros=124416, allocated nonzeros=124416 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_1_) 288 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_1_) 288 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_2_) 288 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_2_) 288 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=1179648, cols=1179648 total: nonzeros=8183808, allocated nonzeros=8183808 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=1179648, cols=1179648 total: nonzeros=8183808, allocated nonzeros=8183808 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines KSP Object: (mg_coarse_telescope_mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_coarse_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=64, cols=64 package used to perform factorization: petsc total: nonzeros=352, allocated nonzeros=352 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=64, cols=64 total: nonzeros=352, allocated nonzeros=352 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=1179648, cols=1179648 total: nonzeros=8183808, allocated nonzeros=8183808 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=9437184, cols=9437184 total: nonzeros=65765376, allocated nonzeros=65765376 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space -------------- next part -------------- KSP Object: 18432 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg PC has not been set up so information may be incomplete MG: type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: redundant PC has not been set up so information may be incomplete Redundant preconditioner: Not yet setup Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=603979776, cols=603979776 total: nonzeros=4223139840, allocated nonzeros=4223139840 total number of mallocs used during MatSetValues calls =0 has attached null space -------------- next part -------------- Summary of Memory Usage in PETSc Maximum (over computational time) process memory: total 7.0727e+08 max 3.0916e+05 min 3.0554e+05 Current process memory: total 7.0727e+08 max 3.0916e+05 min 3.0554e+05 Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11 max 2.7844e+08 min 2.7726e+08 Current space PetscMalloc()ed: total 1.8275e+09 max 8.1280e+05 min 7.7357e+05 Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 6 5 4160 0. Vector 69 68 3491192 0. Vector Scatter 16 12 46144 0. Matrix 59 59 7940068 0. Matrix Null Space 1 1 592 0. Distributed Mesh 5 1 5104 0. Star Forest Bipartite Graph 10 2 1696 0. Discrete System 5 1 876 0. Index Set 33 33 255408 0. IS L to G Mapping 5 1 21360 0. Krylov Solver 9 9 11112 0. DMKSP interface 3 0 0 0. Preconditioner 9 9 8880 0. -------------- next part -------------- Summary of Memory Usage in PETSc Maximum (over computational time) process memory: total 5.9431e+09 max 3.2733e+05 min 3.1110e+05 Current process memory: total 5.9431e+09 max 3.2733e+05 min 3.1110e+05 Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12 max 2.8973e+08 min 2.8858e+08 Current space PetscMalloc()ed: total 5.4844e+09 max 3.0005e+05 min 2.9005e+05 Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 9 8 6656 0. Vector 480 479 6915440 0. Vector Scatter 19 16 55040 0. Matrix 73 73 2588332 0. Matrix Null Space 1 1 592 0. Distributed Mesh 6 3 15312 0. Star Forest Bipartite Graph 12 6 5088 0. Discrete System 6 3 2628 0. Index Set 42 42 98496 0. IS L to G Mapping 6 3 28224 0. Krylov Solver 9 9 11032 0. DMKSP interface 4 2 1296 0. Preconditioner 9 9 8992 0. 
-------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -log_view -memory_view -matptap_scalable -ksp_view -options_left 1 # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 12 -mg_coarse_telescope_repart_da_processors_y 1 -mg_coarse_telescope_repart_da_processors_z 3 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 3 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type bjacobi -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 3 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -log_view -memory_view -matptap_scalable -ksp_view -options_left 1 # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 24 -mg_coarse_telescope_repart_da_processors_y 2 -mg_coarse_telescope_repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 3 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type bjacobi -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -log_view -memory_view -matptap_scalable -ksp_view -ksp_view_pre -options_left 1 # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 24 -mg_coarse_telescope_repart_da_processors_y 2 -mg_coarse_telescope_repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 3 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type bjacobi From bsmith at mcs.anl.gov Fri Sep 9 14:38:37 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 9 Sep 2016 14:38:37 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> Message-ID: <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt has only one KSPView in it? 
Did you run two different solves in the 2 case but not the one? Barry > On Sep 9, 2016, at 10:56 AM, frank wrote: > > Hi, > > I want to continue digging into the memory problem here. > I did find a work around in the past, which is to use less cores per node so that each core has 8G memory. However this is deficient and expensive. I hope to locate the place that uses the most memory. > > Here is a brief summary of the tests I did in past: > > Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 > Maximum (over computational time) process memory: total 7.0727e+08 > Current process memory: total 7.0727e+08 > Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11 > Current space PetscMalloc()ed: total 1.8275e+09 > > > Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 > Maximum (over computational time) process memory: total 5.9431e+09 > Current process memory: total 5.9431e+09 > Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12 > Current space PetscMalloc()ed: total 5.4844e+09 > > > Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 > OOM( Out Of Memory ) killer of the supercomputer terminated the job during "KSPSolve". > > I attached the output of ksp_view( the third test's output is from ksp_view_pre ), memory_view and also the petsc options. > > In all the tests, each core can access about 2G memory. In test3, there are 4223139840 non-zeros in the matrix. This will consume about 1.74M, using double precision. Considering some extra memory used to store integer index, 2G memory should still be way enough. > > Is there a way to find out which part of KSPSolve uses the most memory? > Thank you so much. > > BTW, there are 4 options remains unused and I don't understand why they are omitted: > -mg_coarse_telescope_mg_coarse_ksp_type value: preonly > -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi > -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 > -mg_coarse_telescope_mg_levels_ksp_type value: richardson > > > Regards, > Frank > > On 07/13/2016 05:47 PM, Dave May wrote: >> >> >> On 14 July 2016 at 01:07, frank wrote: >> Hi Dave, >> >> Sorry for the late reply. >> Thank you so much for your detailed reply. >> >> I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: >> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >> Did I do sth wrong here? Because this seems too small. >> >> No - I totally f***ed it up. You are correct. That'll teach me for fumbling around with my iphone calculator and not using my brain. (Note that to convert to MB just divide by 1e6, not 1024^2 - although I apparently cannot convert between units correctly....) >> >> From the PETSc objects associated with the solver, It looks like it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities are: somewhere in your usage of PETSc you've introduced a memory leak; PETSc is doing a huge over allocation (e.g. as per our discussion of MatPtAP); or in your application code there are other objects you have forgotten to log the memory for. >> >> >> >> I am running this job on Bluewater >> I am using the 7 points FD stencil in 3D. >> >> I thought so on both counts. >> >> >> I apologize that I made a stupid mistake in computing the memory per core. My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. 
I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. I would do more test to see if there is still some memory issue. >> >> Ok. I'd still like to know where the memory was being used since my estimates were off. >> >> >> Thanks, >> Dave >> >> >> Regards, >> Frank >> >> >> >> On 07/11/2016 01:18 PM, Dave May wrote: >>> Hi Frank, >>> >>> >>> On 11 July 2016 at 19:14, frank wrote: >>> Hi Dave, >>> >>> I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. >>> I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. >>> >>> Okay - that is essentially useless (sorry) >>> >>> >>> It seems to me that the error occurred when the decomposition was going to be changed. >>> >>> Based on what information? >>> Running with -info would give us more clues, but will create a ton of output. >>> Please try running the case which failed with -info >>> >>> I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. >>> Thank you. >>> >>> >>> [3] Here is my crude estimate of your memory usage. >>> I'll target the biggest memory hogs only to get an order of magnitude estimate >>> >>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. >>> The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) >>> >>> * You use 5 levels of coarsening, so the other operators should represent (collectively) >>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. >>> The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. >>> >>> * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. >>> PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. >>> This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. >>> This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. >>> The temporary matrix is now destroyed. >>> >>> * Because a DMDA is detected, a permutation matrix is assembled. >>> This requires 2 doubles per point in the DMDA. >>> Your coarse DMDA contains 92 x 16 x 48 points. >>> Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. >>> >>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. >>> >>> From my rough estimates, the worst case memory foot print for any given core, given your options is approximately >>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >>> This is way below 8 GB. 
>>> >>> Note this estimate completely ignores: >>> (1) the memory required for the restriction operator, >>> (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) >>> (3) all temporary vectors required by the CG solver, and those required by the smoothers. >>> (4) internal memory allocated by MatPtAP >>> (5) memory associated with IS's used within PCTelescope >>> >>> So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates >>> >>> Since I don't have your code I cannot access the latter. >>> Since I don't have access to the same machine you are running on, I think we need to take a step back. >>> >>> [1] What machine are you running on? Send me a URL if its available >>> >>> [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) >>> If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. >>> This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. >>> >>> Thanks, >>> Dave >>> >>> >>> >>> Frank >>> >>> >>> >>> >>> On 07/08/2016 10:38 PM, Dave May wrote: >>>> >>>> >>>> On Saturday, 9 July 2016, frank wrote: >>>> Hi Barry and Dave, >>>> >>>> Thank both of you for the advice. >>>> >>>> @Barry >>>> I made a mistake in the file names in last email. I attached the correct files this time. >>>> For all the three tests, 'Telescope' is used as the coarse preconditioner. >>>> >>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>>> Part of the memory usage: Vector 125 124 3971904 0. >>>> Matrix 101 101 9462372 0 >>>> >>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>>> Part of the memory usage: Vector 125 124 681672 0. >>>> Matrix 101 101 1462180 0. >>>> >>>> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. >>>> >>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 >>>> Here I get the out of memory error. >>>> >>>> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>>> The linear solver didn't work in this case. Petsc output some errors. >>>> >>>> @Dave >>>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>>> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >>>> On the last coarse mesh of 'Telescope', there is only one grid point per process. >>>> I still got the OOM error. The detailed petsc option file is attached. >>>> >>>> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. >>>> >>>> And please send the result of KSPView so we can see what is actually used in the computations >>>> >>>> Thanks >>>> Dave >>>> >>>> >>>> >>>> Thank you so much. 
>>>> >>>> Frank >>>> >>>> >>>> >>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>>> >>>> Hi Barry, >>>> >>>> Thank you for you advice. >>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >>>> The system gives me the "Out of Memory" error before the linear system is completely solved. >>>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >>>> >>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >>>> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >>>> Vector 384 383 8,193,712 0. >>>> Matrix 103 103 11,508,688 0. >>>> to 3rd test >>>> Vector 384 383 1,590,520 0. >>>> Matrix 103 103 3,508,664 0. >>>> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >>>> >>>> >>>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >>>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >>>> Sorry, my mistake the option is -memory_view >>>> >>>> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >>>> >>>> Barry >>>> >>>> >>>> >>>> In both tests the memory usage is not large. >>>> >>>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >>>> Is there is a way to show how much memory it allocated? >>>> >>>> Frank >>>> >>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>> Frank, >>>> >>>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>>> >>>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>>> >>>> Barry >>>> >>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>> >>>> Hi, >>>> >>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>>> The petsc options file is attached. >>>> >>>> The domain is a 3d box. >>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. 
When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>>> >>>> How can I diagnose what exactly cause the error? >>>> Thank you so much. >>>> >>>> Frank >>>> >>>> >>>> >>> >>> >> >> > > From bsmith at mcs.anl.gov Fri Sep 9 14:44:56 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 9 Sep 2016 14:44:56 -0500 Subject: [petsc-users] Diagnosing a difference between "unpreconditioned" and "true" residual norms In-Reply-To: References: Message-ID: Patrick, I have only seen this when the "linear" operator turned out to not actually be linear or at least not linear in double precision. Are you using differencing or anything in your MatShell that might make it not be a linear operator in full precision? Since your problem is so small you can compute the Jacobian explicitly via finite differencing and then use that matrix plus your shell preconditioner. I bet if you do this you will see the true and non-true residual norms remain the same; that would likely mean something is wonky with your shell matrix. Barry > On Sep 9, 2016, at 9:32 AM, Patrick Sanan wrote: > > I am debugging a linear solver which uses a custom operator and > preconditioner, via MATSHELL and PCSHELL. Convergence seems to be > fine, except that I unexpectedly see a difference between the > "unpreconditioned" and "true" residual norms when I use > -ksp_monitor_true_residual with a right-preconditioned Krylov method > (FGMRES or right-preconditioned GMRES). > > 0 KSP unpreconditioned resid norm 9.266794204683e+08 true resid norm > 9.266794204683e+08 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.317801431974e+04 true resid norm > 2.317826550333e+04 ||r(i)||/||b|| 2.501217248530e-05 > 2 KSP unpreconditioned resid norm 4.453270507534e+00 true resid norm > 2.699824780158e+01 ||r(i)||/||b|| 2.913439880638e-08 > 3 KSP unpreconditioned resid norm 1.015490793887e-03 true resid norm > 2.658635801018e+01 ||r(i)||/||b|| 2.868991953738e-08 > 4 KSP unpreconditioned resid norm 4.710220776105e-07 true resid norm > 2.658631616810e+01 ||r(i)||/||b|| 2.868987438467e-08 > KSP Object:(mgk_) 1 MPI processes > type: fgmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-13, absolute=1e-50, divergence=10000.
> right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object:(mgk_) 1 MPI processes > type: shell > Shell: Custom PC > linear system matrix = precond matrix: > Mat Object: Custom Operator 1 MPI processes > type: shell > rows=256, cols=256 > has attached null space > > I have dumped the explicit operator and preconditioned operator, and I > can see that the operator and the preconditioned operator each have a > 1-dimensional nullspace (a constant-pressure nullspace) which I have > accounted for by constructing a normalized, constant-pressure vector > and supplying it to the operator via a MatNullSpace. > > If I disregard the (numerically) zero singular value, the operator has > a condition number of 1.5669e+05 and the preconditioned operator has a > condition number of 1.01 (strong preconditioner). > > Has anyone seen this sort of behavior before and if so, is there a > common culprit that I am overlooking? Any ideas of what to test next > to try to isolate the issue? > > As I understand it, the unpreconditioned and true residual norms > should be identical in exact arithmetic, so I would suspect that > somehow I've ended up with a "bad Hessenberg matrix" in some way as I > perform this solve (or maybe I have a more subtle bug). From hengjiew at uci.edu Fri Sep 9 15:11:52 2016 From: hengjiew at uci.edu (frank) Date: Fri, 9 Sep 2016 13:11:52 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> Message-ID: Hi Barry, I think the first KSP view output is from -ksp_view_pre. Before I submitted the test, I was not sure whether there would be OOM error or not. So I added both -ksp_view_pre and -ksp_view. Frank On 09/09/2016 12:38 PM, Barry Smith wrote: > Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt has only one KSPView in it? Did you run two different solves in the 2 case but not the one? > > Barry > > > >> On Sep 9, 2016, at 10:56 AM, frank wrote: >> >> Hi, >> >> I want to continue digging into the memory problem here. >> I did find a work around in the past, which is to use less cores per node so that each core has 8G memory. However this is deficient and expensive. I hope to locate the place that uses the most memory. >> >> Here is a brief summary of the tests I did in past: >>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >> Maximum (over computational time) process memory: total 7.0727e+08 >> Current process memory: total 7.0727e+08 >> Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11 >> Current space PetscMalloc()ed: total 1.8275e+09 >> >>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >> Maximum (over computational time) process memory: total 5.9431e+09 >> Current process memory: total 5.9431e+09 >> Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12 >> Current space PetscMalloc()ed: total 5.4844e+09 >> >>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >> OOM( Out Of Memory ) killer of the supercomputer terminated the job during "KSPSolve". >> >> I attached the output of ksp_view( the third test's output is from ksp_view_pre ), memory_view and also the petsc options. 
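For locating where within the run the memory goes, the same counters that -memory_view prints can also be sampled from code around individual calls. A minimal C sketch follows (error checking omitted; the ksp, b and x objects are assumed to already exist, and PetscMemorySetGetMaximumUsage() should be called near PetscInitialize() so the high-water marks are actually tracked):

#include <petscksp.h>

/* Sample PETSc's memory counters around a solve.  Call
   PetscMemorySetGetMaximumUsage() right after PetscInitialize() so the
   maximum-usage numbers below are collected. */
static PetscErrorCode ReportSolveMemory(KSP ksp, Vec b, Vec x)
{
  PetscLogDouble rss_now, rss_max, mal_now, mal_max;

  PetscMemoryGetCurrentUsage(&rss_now);    /* process resident set size, in bytes */
  PetscMallocGetCurrentUsage(&mal_now);    /* bytes currently PetscMalloc()ed */

  KSPSolve(ksp, b, x);

  PetscMemoryGetMaximumUsage(&rss_max);    /* high-water mark of process memory */
  PetscMallocGetMaximumUsage(&mal_max);    /* high-water mark of PetscMalloc()ed space */

  PetscPrintf(PETSC_COMM_WORLD,
              "before solve: rss %g B, malloc %g B;  max so far: rss %g B, malloc %g B\n",
              rss_now, mal_now, rss_max, mal_max);
  return 0;
}

Sampling once around KSPSetUp() and again around KSPSolve() separates setup-time allocations (for example the Galerkin coarse operators) from solve-time ones.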
>> >> In all the tests, each core can access about 2G memory. In test3, there are 4223139840 non-zeros in the matrix. This will consume about 1.74M, using double precision. Considering some extra memory used to store integer index, 2G memory should still be way enough. >> >> Is there a way to find out which part of KSPSolve uses the most memory? >> Thank you so much. >> >> BTW, there are 4 options remains unused and I don't understand why they are omitted: >> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >> >> >> Regards, >> Frank >> >> On 07/13/2016 05:47 PM, Dave May wrote: >>> >>> On 14 July 2016 at 01:07, frank wrote: >>> Hi Dave, >>> >>> Sorry for the late reply. >>> Thank you so much for your detailed reply. >>> >>> I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: >>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >>> Did I do sth wrong here? Because this seems too small. >>> >>> No - I totally f***ed it up. You are correct. That'll teach me for fumbling around with my iphone calculator and not using my brain. (Note that to convert to MB just divide by 1e6, not 1024^2 - although I apparently cannot convert between units correctly....) >>> >>> From the PETSc objects associated with the solver, It looks like it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities are: somewhere in your usage of PETSc you've introduced a memory leak; PETSc is doing a huge over allocation (e.g. as per our discussion of MatPtAP); or in your application code there are other objects you have forgotten to log the memory for. >>> >>> >>> >>> I am running this job on Bluewater >>> I am using the 7 points FD stencil in 3D. >>> >>> I thought so on both counts. >>> >>> >>> I apologize that I made a stupid mistake in computing the memory per core. My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. I would do more test to see if there is still some memory issue. >>> >>> Ok. I'd still like to know where the memory was being used since my estimates were off. >>> >>> >>> Thanks, >>> Dave >>> >>> >>> Regards, >>> Frank >>> >>> >>> >>> On 07/11/2016 01:18 PM, Dave May wrote: >>>> Hi Frank, >>>> >>>> >>>> On 11 July 2016 at 19:14, frank wrote: >>>> Hi Dave, >>>> >>>> I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. >>>> I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. >>>> >>>> Okay - that is essentially useless (sorry) >>>> >>>> >>>> It seems to me that the error occurred when the decomposition was going to be changed. >>>> >>>> Based on what information? >>>> Running with -info would give us more clues, but will create a ton of output. >>>> Please try running the case which failed with -info >>>> >>>> I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. >>>> Thank you. 
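Spelling out the corrected arithmetic quoted above for Test3 (grid 3072*256*768 on a 96*8*24 = 18432-rank process mesh):

3072 * 256 * 768 = 603979776 unknowns, and 7 * 603979776 = 4227858432, which matches the reported 4223139840 allocated non-zeros for a scalar 7-point stencil (boundary rows have fewer entries).

4223139840 * 8 bytes = 33.8 GB in total for the matrix values, so 33.8 GB / 18432 ranks = 1.8 MB of values per rank (each rank holds a 32*32*32 subdomain, i.e. 32768 rows). 32-bit AIJ column indices add roughly 4223139840 * 4 bytes / 18432 = 0.9 MB per rank, so the assembled fine-grid operator by itself is only a few MB per rank, as concluded above.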
>>>> >>>> >>>> [3] Here is my crude estimate of your memory usage. >>>> I'll target the biggest memory hogs only to get an order of magnitude estimate >>>> >>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. >>>> The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) >>>> >>>> * You use 5 levels of coarsening, so the other operators should represent (collectively) >>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. >>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. >>>> >>>> * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. >>>> PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. >>>> This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. >>>> This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. >>>> The temporary matrix is now destroyed. >>>> >>>> * Because a DMDA is detected, a permutation matrix is assembled. >>>> This requires 2 doubles per point in the DMDA. >>>> Your coarse DMDA contains 92 x 16 x 48 points. >>>> Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. >>>> >>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. >>>> >>>> From my rough estimates, the worst case memory foot print for any given core, given your options is approximately >>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >>>> This is way below 8 GB. >>>> >>>> Note this estimate completely ignores: >>>> (1) the memory required for the restriction operator, >>>> (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) >>>> (3) all temporary vectors required by the CG solver, and those required by the smoothers. >>>> (4) internal memory allocated by MatPtAP >>>> (5) memory associated with IS's used within PCTelescope >>>> >>>> So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates >>>> >>>> Since I don't have your code I cannot access the latter. >>>> Since I don't have access to the same machine you are running on, I think we need to take a step back. >>>> >>>> [1] What machine are you running on? Send me a URL if its available >>>> >>>> [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) >>>> If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. >>>> This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. >>>> >>>> Thanks, >>>> Dave >>>> >>>> >>>> >>>> Frank >>>> >>>> >>>> >>>> >>>> On 07/08/2016 10:38 PM, Dave May wrote: >>>>> >>>>> On Saturday, 9 July 2016, frank wrote: >>>>> Hi Barry and Dave, >>>>> >>>>> Thank both of you for the advice. 
>>>>> >>>>> @Barry >>>>> I made a mistake in the file names in last email. I attached the correct files this time. >>>>> For all the three tests, 'Telescope' is used as the coarse preconditioner. >>>>> >>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>>>> Part of the memory usage: Vector 125 124 3971904 0. >>>>> Matrix 101 101 9462372 0 >>>>> >>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>>>> Part of the memory usage: Vector 125 124 681672 0. >>>>> Matrix 101 101 1462180 0. >>>>> >>>>> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. >>>>> >>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 >>>>> Here I get the out of memory error. >>>>> >>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>>>> The linear solver didn't work in this case. Petsc output some errors. >>>>> >>>>> @Dave >>>>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>>>> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >>>>> On the last coarse mesh of 'Telescope', there is only one grid point per process. >>>>> I still got the OOM error. The detailed petsc option file is attached. >>>>> >>>>> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. >>>>> >>>>> And please send the result of KSPView so we can see what is actually used in the computations >>>>> >>>>> Thanks >>>>> Dave >>>>> >>>>> >>>>> >>>>> Thank you so much. >>>>> >>>>> Frank >>>>> >>>>> >>>>> >>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>>>> >>>>> Hi Barry, >>>>> >>>>> Thank you for you advice. >>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >>>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >>>>> The system gives me the "Out of Memory" error before the linear system is completely solved. >>>>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >>>>> >>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >>>>> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >>>>> Vector 384 383 8,193,712 0. >>>>> Matrix 103 103 11,508,688 0. >>>>> to 3rd test >>>>> Vector 384 383 1,590,520 0. >>>>> Matrix 103 103 3,508,664 0. >>>>> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >>>>> >>>>> >>>>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >>>>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. 
I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >>>>> Sorry, my mistake the option is -memory_view >>>>> >>>>> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> In both tests the memory usage is not large. >>>>> >>>>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >>>>> Is there is a way to show how much memory it allocated? >>>>> >>>>> Frank >>>>> >>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>>> Frank, >>>>> >>>>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>>>> >>>>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>>>> >>>>> Barry >>>>> >>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>>> >>>>> Hi, >>>>> >>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>>>> The petsc options file is attached. >>>>> >>>>> The domain is a 3d box. >>>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>>>> >>>>> How can I diagnose what exactly cause the error? >>>>> Thank you so much. >>>>> >>>>> Frank >>>>> >>>>> >>>>> >>>> >>> >> From knepley at gmail.com Fri Sep 9 16:51:20 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 Sep 2016 16:51:20 -0500 Subject: [petsc-users] DMPlex problem In-Reply-To: <967aa3a1-af2d-8d6f-a15e-8ca65a5fc887@princeton.edu> References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <967aa3a1-af2d-8d6f-a15e-8ca65a5fc887@princeton.edu> Message-ID: On Fri, Sep 9, 2016 at 9:49 AM, Mark Lohry wrote: > Regarding DMPlex, I'm unclear on the separation between Index Sets and > Plex for unstructured solvers. I've got an existing unstructured serial > solver using petsc's newton-krylov > Index Sets are lists of integers. They can mean whatever you want, but they are just lists. 
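To make that concrete, a minimal C sketch of creating and viewing an IS (the index values here are invented purely for illustration; error checking omitted):

#include <petscis.h>

int main(int argc, char **argv)
{
  PetscInt idx[] = {2, 5, 7, 11};   /* e.g. local cell numbers; the meaning is entirely up to the caller */
  IS       cells;

  PetscInitialize(&argc, &argv, NULL, NULL);
  ISCreateGeneral(PETSC_COMM_SELF, 4, idx, PETSC_COPY_VALUES, &cells);
  ISView(cells, PETSC_VIEWER_STDOUT_SELF);   /* prints just the list of integers */
  ISDestroy(&cells);
  PetscFinalize();
  return 0;
}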
Plex stores mesh topology and has topological query functions. > solvers. Should I be looking to parallelize this via IS or plex? Is there > an interface for either that allows me to specify the partitioning (e.g. by > metis' output)? Last but not least, is > There is DMPlexDistribute() which partitions and distributes, or you can use DMPlexMigrates() if you want to calculate your own partition for some reason. > there a working example of dmplex in the docs? The only unstructured code > example I see in the docs is SNES ex10, which uses IS. > SNES ex12, ex62, ex77 and TS ex11. Thanks, Matt > Thanks, > > Mark > > On 09/09/2016 04:21 AM, Matthew Knepley wrote: > > On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen > wrote: > >> Dear PETSc developers and users, >> >> Last week we posted a question regarding an error with DMPlex and >> multiple dofs and have not gotten any feedback yet. This is uncharted >> waters for us, since we have gotten used to an extremely fast feedback from >> the PETSc crew. So - with the chance of sounding impatient and ungrateful - >> we would like to hear if anybody has any ideas that could point us in the >> right direction? >> > > This is my fault. You have not gotten a response because everyone else was > waiting for me, and I have been > slow because I just moved houses at the same time as term started here. > Sorry about that. > > The example ran for me and I saw your problem. The local-tp-global map is > missing for some reason. > I am tracking it down now. It should be made by DMCreateMatrix(), so this > is mysterious. I hope to have > this fixed by early next week. > > Thanks, > > Matt > > >> We have created a small example problem that demonstrates the error in >> the matrix assembly. >> >> Thanks, >> Morten >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 9 19:19:57 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 9 Sep 2016 19:19:57 -0500 Subject: [petsc-users] Sorted CSR Matrix and Multigrid PC. In-Reply-To: References: <836C9DDB-75AC-480C-8000-290C1E6205DC@mcs.anl.gov> <9F710A22-1A5D-4CA3-8D28-6C7F1EEBCB1D@mcs.anl.gov> Message-ID: The missing third argument to PCApply means that you haven't allocated the vector work? > On Sep 9, 2016, at 3:34 PM, Manuel Valera wrote: > > Hello everyone, > > I'm having an error with my program that i cannot understand, the weird part is that the same implementation in my main model does not show errors and they are virtually identical. 
> > The problematic part of the code is: > > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > call KSPSetOperators(ksp,Ap,Ap,ierr) > call KSPGetPC(ksp,pc,ierr) > tol = 1.e-5 > call KSPSetTolerances(ksp,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > call PCGetOperators(pc,PETSC_NULL_OBJECT,pmat,ierr) > call PCCreate(PETSC_COMM_WORLD,mg,ierr) > call PCSetType(mg,PCJACOBI,ierr) > call PCSetOperators(mg,pmat,pmat,ierr) > call PCSetUp(mg,ierr) > > call PCApply(mg,xp,work,ierr) > > > And the errors i get are: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Object: Parameter # 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./solvelinearmgPETSc on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 9 12:46:02 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #1 PCApply() line 467 in /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Object: Parameter # 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./solvelinearmgPETSc > > > As i said before, the exact same code in a bigger model does not print errors. I'm trying to solve this before moving into multigrid implementation in my prototype, > > Thanks for your time, > > > > On Wed, Sep 7, 2016 at 8:27 PM, Barry Smith wrote: > > > On Sep 7, 2016, at 10:24 PM, Manuel Valera wrote: > > > > Thank you I will try this. What would be the call if I wanted to use other Multigrid option? > > There really isn't any other choices. > > > Don't worry this is a standalone prototype, it should be fine on the main model. Anyway, any hints would be appreciated. Thanks a lot for your time. > > > > > > On Sep 7, 2016 8:22 PM, "Barry Smith" wrote: > > > > Sorry, this was due to our bug, the fortran function for PCGAMGSetType() was wrong. I have fixed this in the maint and master branch of PETSc in the git repository. But you can simply remove the call to PCGAMGSetType() from your code since what you are setting is the default type. > > > > BTW: there are other problems with your code after that call that you will have to work through. > > > > Barry > > > > > On Sep 7, 2016, at 8:46 PM, Manuel Valera wrote: > > > > > > > > > ---------- Forwarded message ---------- > > > From: Manuel Valera > > > Date: Wed, Sep 7, 2016 at 6:40 PM > > > Subject: Re: [petsc-users] Sorted CSR Matrix and Multigrid PC. > > > To: > > > Cc: PETSc users list > > > > > > > > > Hello, > > > > > > I was able to sort the data but the PCGAMG does not seem to be working. > > > > > > I reconfigured everything from scratch as suggested and updated to the latest PETSc version, same results, > > > > > > I get the following error: > > > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > [...] 
> > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > [0]PETSC ERROR: is given. > > > [0]PETSC ERROR: [0] PetscStrcmp line 524 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/sys/utils/str.c > > > [0]PETSC ERROR: [0] PetscFunctionListFind_Private line 352 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/sys/dll/reg.c > > > [0]PETSC ERROR: [0] PCGAMGSetType_GAMG line 1157 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c > > > [0]PETSC ERROR: [0] PCGAMGSetType line 1102 /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > .... > > > > > > You can find the working program attached, sorry for the large files and messy makefile, > > > > > > Many thanks, > > > > > > Manuel Valera. > > > > > > > > > ? > > > MGpetscSolver.tar.gz > > > ? > > > > > > > From mvalera at mail.sdsu.edu Fri Sep 9 19:28:19 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Fri, 9 Sep 2016 17:28:19 -0700 Subject: [petsc-users] Sorted CSR Matrix and Multigrid PC. In-Reply-To: References: <836C9DDB-75AC-480C-8000-290C1E6205DC@mcs.anl.gov> <9F710A22-1A5D-4CA3-8D28-6C7F1EEBCB1D@mcs.anl.gov> Message-ID: Thank you SO much for helping me out on this. Dumb error from my part not to notice. This means the common /mypcs/ elements are preconfigure internally in PETSc? Regards and happy weekend, On Fri, Sep 9, 2016 at 5:19 PM, Barry Smith wrote: > > The missing third argument to PCApply means that you haven't allocated > the vector work? > > > > On Sep 9, 2016, at 3:34 PM, Manuel Valera wrote: > > > > Hello everyone, > > > > I'm having an error with my program that i cannot understand, the weird > part is that the same implementation in my main model does not show errors > and they are virtually identical. > > > > The problematic part of the code is: > > > > > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > call KSPSetOperators(ksp,Ap,Ap,ierr) > > call KSPGetPC(ksp,pc,ierr) > > tol = 1.e-5 > > call KSPSetTolerances(ksp,tol,PETSC_DEFAULT_REAL,PETSC_ > DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > call PCGetOperators(pc,PETSC_NULL_OBJECT,pmat,ierr) > > call PCCreate(PETSC_COMM_WORLD,mg,ierr) > > call PCSetType(mg,PCJACOBI,ierr) > > call PCSetOperators(mg,pmat,pmat,ierr) > > call PCSetUp(mg,ierr) > > > > call PCApply(mg,xp,work,ierr) > > > > > > And the errors i get are: > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > [0]PETSC ERROR: Null Object: Parameter # 3 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > > [0]PETSC ERROR: ./solvelinearmgPETSc > > > on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 9 > 12:46:02 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #1 PCApply() line 467 in /home/valera/sergcemv4/ > bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > [0]PETSC ERROR: Null Object: Parameter # 3 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > > [0]PETSC ERROR: ./solvelinearmgPETSc > > > > > > As i said before, the exact same code in a bigger model does not print > errors. I'm trying to solve this before moving into multigrid > implementation in my prototype, > > > > Thanks for your time, > > > > > > > > On Wed, Sep 7, 2016 at 8:27 PM, Barry Smith wrote: > > > > > On Sep 7, 2016, at 10:24 PM, Manuel Valera > wrote: > > > > > > Thank you I will try this. What would be the call if I wanted to use > other Multigrid option? > > > > There really isn't any other choices. > > > > > Don't worry this is a standalone prototype, it should be fine on the > main model. Anyway, any hints would be appreciated. Thanks a lot for your > time. > > > > > > > > > On Sep 7, 2016 8:22 PM, "Barry Smith" wrote: > > > > > > Sorry, this was due to our bug, the fortran function for > PCGAMGSetType() was wrong. I have fixed this in the maint and master branch > of PETSc in the git repository. But you can simply remove the call to > PCGAMGSetType() from your code since what you are setting is the default > type. > > > > > > BTW: there are other problems with your code after that call that > you will have to work through. > > > > > > Barry > > > > > > > On Sep 7, 2016, at 8:46 PM, Manuel Valera > wrote: > > > > > > > > > > > > ---------- Forwarded message ---------- > > > > From: Manuel Valera > > > > Date: Wed, Sep 7, 2016 at 6:40 PM > > > > Subject: Re: [petsc-users] Sorted CSR Matrix and Multigrid PC. > > > > To: > > > > Cc: PETSc users list > > > > > > > > > > > > Hello, > > > > > > > > I was able to sort the data but the PCGAMG does not seem to be > working. > > > > > > > > I reconfigured everything from scratch as suggested and updated to > the latest PETSc version, same results, > > > > > > > > I get the following error: > > > > > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > [...] > > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > > [0]PETSC ERROR: is given. > > > > [0]PETSC ERROR: [0] PetscStrcmp line 524 /home/valera/sergcemv4/ > bitbucket/serucoamv4/petsc-3.7.3/src/sys/utils/str.c > > > > [0]PETSC ERROR: [0] PetscFunctionListFind_Private line 352 > /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3.7.3/src/sys/dll/reg.c > > > > [0]PETSC ERROR: [0] PCGAMGSetType_GAMG line 1157 > /home/valera/sergcemv4/bitbucket/serucoamv4/petsc-3. 
> 7.3/src/ksp/pc/impls/gamg/gamg.c > > > > [0]PETSC ERROR: [0] PCGAMGSetType line 1102 /home/valera/sergcemv4/ > bitbucket/serucoamv4/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > .... > > > > > > > > You can find the working program attached, sorry for the large files > and messy makefile, > > > > > > > > Many thanks, > > > > > > > > Manuel Valera. > > > > > > > > > > > > ? > > > > MGpetscSolver.tar.gz > > > > ? > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 9 19:28:46 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 9 Sep 2016 19:28:46 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> Message-ID: <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> > On Sep 9, 2016, at 3:11 PM, frank wrote: > > Hi Barry, > > I think the first KSP view output is from -ksp_view_pre. Before I submitted the test, I was not sure whether there would be OOM error or not. So I added both -ksp_view_pre and -ksp_view. But the options file you sent specifically does NOT list the -ksp_view_pre so how could it be from that? Sorry to be pedantic but I've spent too much time in the past trying to debug from incorrect information and want to make sure that the information I have is correct before thinking. Please recheck exactly what happened. Rerun with the exact input file you emailed if that is needed. Barry > > Frank > > > On 09/09/2016 12:38 PM, Barry Smith wrote: >> Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt has only one KSPView in it? Did you run two different solves in the 2 case but not the one? >> >> Barry >> >> >> >>> On Sep 9, 2016, at 10:56 AM, frank wrote: >>> >>> Hi, >>> >>> I want to continue digging into the memory problem here. >>> I did find a work around in the past, which is to use less cores per node so that each core has 8G memory. However this is deficient and expensive. I hope to locate the place that uses the most memory. >>> >>> Here is a brief summary of the tests I did in past: >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >>> Maximum (over computational time) process memory: total 7.0727e+08 >>> Current process memory: total 7.0727e+08 >>> Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11 >>> Current space PetscMalloc()ed: total 1.8275e+09 >>> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >>> Maximum (over computational time) process memory: total 5.9431e+09 >>> Current process memory: total 5.9431e+09 >>> Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12 >>> Current space PetscMalloc()ed: total 5.4844e+09 >>> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >>> OOM( Out Of Memory ) killer of the supercomputer terminated the job during "KSPSolve". >>> >>> I attached the output of ksp_view( the third test's output is from ksp_view_pre ), memory_view and also the petsc options. >>> >>> In all the tests, each core can access about 2G memory. In test3, there are 4223139840 non-zeros in the matrix. 
This will consume about 1.74M, using double precision. Considering some extra memory used to store integer index, 2G memory should still be way enough. >>> >>> Is there a way to find out which part of KSPSolve uses the most memory? >>> Thank you so much. >>> >>> BTW, there are 4 options remains unused and I don't understand why they are omitted: >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >>> >>> >>> Regards, >>> Frank >>> >>> On 07/13/2016 05:47 PM, Dave May wrote: >>>> >>>> On 14 July 2016 at 01:07, frank wrote: >>>> Hi Dave, >>>> >>>> Sorry for the late reply. >>>> Thank you so much for your detailed reply. >>>> >>>> I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >>>> Did I do sth wrong here? Because this seems too small. >>>> >>>> No - I totally f***ed it up. You are correct. That'll teach me for fumbling around with my iphone calculator and not using my brain. (Note that to convert to MB just divide by 1e6, not 1024^2 - although I apparently cannot convert between units correctly....) >>>> >>>> From the PETSc objects associated with the solver, It looks like it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities are: somewhere in your usage of PETSc you've introduced a memory leak; PETSc is doing a huge over allocation (e.g. as per our discussion of MatPtAP); or in your application code there are other objects you have forgotten to log the memory for. >>>> >>>> >>>> >>>> I am running this job on Bluewater >>>> I am using the 7 points FD stencil in 3D. >>>> >>>> I thought so on both counts. >>>> >>>> I apologize that I made a stupid mistake in computing the memory per core. My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. I would do more test to see if there is still some memory issue. >>>> >>>> Ok. I'd still like to know where the memory was being used since my estimates were off. >>>> >>>> >>>> Thanks, >>>> Dave >>>> >>>> Regards, >>>> Frank >>>> >>>> >>>> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >>>>> Hi Frank, >>>>> >>>>> >>>>> On 11 July 2016 at 19:14, frank wrote: >>>>> Hi Dave, >>>>> >>>>> I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. >>>>> I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. >>>>> >>>>> Okay - that is essentially useless (sorry) >>>>> >>>>> It seems to me that the error occurred when the decomposition was going to be changed. >>>>> >>>>> Based on what information? >>>>> Running with -info would give us more clues, but will create a ton of output. >>>>> Please try running the case which failed with -info >>>>> I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. >>>>> Thank you. >>>>> >>>>> >>>>> [3] Here is my crude estimate of your memory usage. 
>>>>> I'll target the biggest memory hogs only to get an order of magnitude estimate >>>>> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) >>>>> >>>>> * You use 5 levels of coarsening, so the other operators should represent (collectively) >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. >>>>> >>>>> * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. >>>>> PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. >>>>> This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. >>>>> The temporary matrix is now destroyed. >>>>> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. >>>>> This requires 2 doubles per point in the DMDA. >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. >>>>> >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. >>>>> >>>>> From my rough estimates, the worst case memory foot print for any given core, given your options is approximately >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >>>>> This is way below 8 GB. >>>>> >>>>> Note this estimate completely ignores: >>>>> (1) the memory required for the restriction operator, >>>>> (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) >>>>> (3) all temporary vectors required by the CG solver, and those required by the smoothers. >>>>> (4) internal memory allocated by MatPtAP >>>>> (5) memory associated with IS's used within PCTelescope >>>>> >>>>> So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates >>>>> >>>>> Since I don't have your code I cannot access the latter. >>>>> Since I don't have access to the same machine you are running on, I think we need to take a step back. >>>>> >>>>> [1] What machine are you running on? Send me a URL if its available >>>>> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) >>>>> If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. >>>>> This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. >>>>> >>>>> Thanks, >>>>> Dave >>>>> >>>>> >>>>> Frank >>>>> >>>>> >>>>> >>>>> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >>>>>> >>>>>> On Saturday, 9 July 2016, frank wrote: >>>>>> Hi Barry and Dave, >>>>>> >>>>>> Thank both of you for the advice. 
>>>>>> >>>>>> @Barry >>>>>> I made a mistake in the file names in last email. I attached the correct files this time. >>>>>> For all the three tests, 'Telescope' is used as the coarse preconditioner. >>>>>> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>>>>> Part of the memory usage: Vector 125 124 3971904 0. >>>>>> Matrix 101 101 9462372 0 >>>>>> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>>>>> Part of the memory usage: Vector 125 124 681672 0. >>>>>> Matrix 101 101 1462180 0. >>>>>> >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. >>>>>> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 >>>>>> Here I get the out of memory error. >>>>>> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>>>>> The linear solver didn't work in this case. Petsc output some errors. >>>>>> >>>>>> @Dave >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>>>>> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >>>>>> On the last coarse mesh of 'Telescope', there is only one grid point per process. >>>>>> I still got the OOM error. The detailed petsc option file is attached. >>>>>> >>>>>> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. >>>>>> >>>>>> And please send the result of KSPView so we can see what is actually used in the computations >>>>>> >>>>>> Thanks >>>>>> Dave >>>>>> >>>>>> >>>>>> Thank you so much. >>>>>> >>>>>> Frank >>>>>> >>>>>> >>>>>> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>>>>> >>>>>> Hi Barry, >>>>>> >>>>>> Thank you for you advice. >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >>>>>> The system gives me the "Out of Memory" error before the linear system is completely solved. >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >>>>>> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >>>>>> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >>>>>> Vector 384 383 8,193,712 0. >>>>>> Matrix 103 103 11,508,688 0. >>>>>> to 3rd test >>>>>> Vector 384 383 1,590,520 0. >>>>>> Matrix 103 103 3,508,664 0. >>>>>> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >>>>>> >>>>>> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. 
I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >>>>>> Sorry, my mistake the option is -memory_view >>>>>> >>>>>> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> In both tests the memory usage is not large. >>>>>> >>>>>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >>>>>> Is there is a way to show how much memory it allocated? >>>>>> >>>>>> Frank >>>>>> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>>>> Frank, >>>>>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>>>>> >>>>>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>>>>> >>>>>> Barry >>>>>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>>>>> The petsc options file is attached. >>>>>> >>>>>> The domain is a 3d box. >>>>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>>>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>>>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>>>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>>>>> >>>>>> How can I diagnose what exactly cause the error? >>>>>> Thank you so much. >>>>>> >>>>>> Frank >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> > From hengjiew at uci.edu Fri Sep 9 20:38:38 2016 From: hengjiew at uci.edu (Hengjie Wang) Date: Fri, 9 Sep 2016 18:38:38 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> Message-ID: Hi Barry, I checked. 
On the supercomputer, I had the option "-ksp_view_pre" but it is not in file I sent you. I am sorry for the confusion. Regards, Frank On Friday, September 9, 2016, Barry Smith wrote: > > > On Sep 9, 2016, at 3:11 PM, frank > > wrote: > > > > Hi Barry, > > > > I think the first KSP view output is from -ksp_view_pre. Before I > submitted the test, I was not sure whether there would be OOM error or not. > So I added both -ksp_view_pre and -ksp_view. > > But the options file you sent specifically does NOT list the > -ksp_view_pre so how could it be from that? > > Sorry to be pedantic but I've spent too much time in the past trying to > debug from incorrect information and want to make sure that the information > I have is correct before thinking. Please recheck exactly what happened. > Rerun with the exact input file you emailed if that is needed. > > Barry > > > > > Frank > > > > > > On 09/09/2016 12:38 PM, Barry Smith wrote: > >> Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt > has only one KSPView in it? Did you run two different solves in the 2 case > but not the one? > >> > >> Barry > >> > >> > >> > >>> On Sep 9, 2016, at 10:56 AM, frank > > wrote: > >>> > >>> Hi, > >>> > >>> I want to continue digging into the memory problem here. > >>> I did find a work around in the past, which is to use less cores per > node so that each core has 8G memory. However this is deficient and > expensive. I hope to locate the place that uses the most memory. > >>> > >>> Here is a brief summary of the tests I did in past: > >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 > >>> Maximum (over computational time) process memory: total > 7.0727e+08 > >>> Current process memory: > total 7.0727e+08 > >>> Maximum (over computational time) space PetscMalloc()ed: total > 6.3908e+11 > >>> Current space PetscMalloc()ed: > total 1.8275e+09 > >>> > >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 > >>> Maximum (over computational time) process memory: total > 5.9431e+09 > >>> Current process memory: > total 5.9431e+09 > >>> Maximum (over computational time) space PetscMalloc()ed: total > 5.3202e+12 > >>> Current space PetscMalloc()ed: > total 5.4844e+09 > >>> > >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 > >>> OOM( Out Of Memory ) killer of the supercomputer terminated the > job during "KSPSolve". > >>> > >>> I attached the output of ksp_view( the third test's output is from > ksp_view_pre ), memory_view and also the petsc options. > >>> > >>> In all the tests, each core can access about 2G memory. In test3, > there are 4223139840 non-zeros in the matrix. This will consume about > 1.74M, using double precision. Considering some extra memory used to store > integer index, 2G memory should still be way enough. > >>> > >>> Is there a way to find out which part of KSPSolve uses the most memory? > >>> Thank you so much. > >>> > >>> BTW, there are 4 options remains unused and I don't understand why > they are omitted: > >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly > >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi > >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 > >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson > >>> > >>> > >>> Regards, > >>> Frank > >>> > >>> On 07/13/2016 05:47 PM, Dave May wrote: > >>>> > >>>> On 14 July 2016 at 01:07, frank > > wrote: > >>>> Hi Dave, > >>>> > >>>> Sorry for the late reply. > >>>> Thank you so much for your detailed reply. > >>>> > >>>> I have a question about the estimation of the memory usage. 
There are > 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is > used. So the memory per process is: > >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? > >>>> Did I do sth wrong here? Because this seems too small. > >>>> > >>>> No - I totally f***ed it up. You are correct. That'll teach me for > fumbling around with my iphone calculator and not using my brain. (Note > that to convert to MB just divide by 1e6, not 1024^2 - although I > apparently cannot convert between units correctly....) > >>>> > >>>> From the PETSc objects associated with the solver, It looks like it > _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities > are: somewhere in your usage of PETSc you've introduced a memory leak; > PETSc is doing a huge over allocation (e.g. as per our discussion of > MatPtAP); or in your application code there are other objects you have > forgotten to log the memory for. > >>>> > >>>> > >>>> > >>>> I am running this job on Bluewater > >>>> I am using the 7 points FD stencil in 3D. > >>>> > >>>> I thought so on both counts. > >>>> > >>>> I apologize that I made a stupid mistake in computing the memory per > core. My settings render each core can access only 2G memory on average > instead of 8G which I mentioned in previous email. I re-run the job with 8G > memory per core on average and there is no "Out Of Memory" error. I would > do more test to see if there is still some memory issue. > >>>> > >>>> Ok. I'd still like to know where the memory was being used since my > estimates were off. > >>>> > >>>> > >>>> Thanks, > >>>> Dave > >>>> > >>>> Regards, > >>>> Frank > >>>> > >>>> > >>>> > >>>> On 07/11/2016 01:18 PM, Dave May wrote: > >>>>> Hi Frank, > >>>>> > >>>>> > >>>>> On 11 July 2016 at 19:14, frank > > wrote: > >>>>> Hi Dave, > >>>>> > >>>>> I re-run the test using bjacobi as the preconditioner on the coarse > mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. > The petsc option file is attached. > >>>>> I still got the "Out Of Memory" error. The error occurred before the > linear solver finished one step. So I don't have the full info from > ksp_view. The info from ksp_view_pre is attached. > >>>>> > >>>>> Okay - that is essentially useless (sorry) > >>>>> > >>>>> It seems to me that the error occurred when the decomposition was > going to be changed. > >>>>> > >>>>> Based on what information? > >>>>> Running with -info would give us more clues, but will create a ton > of output. > >>>>> Please try running the case which failed with -info > >>>>> I had another test with a grid of 1536*128*384 and the same process > mesh as above. There was no error. The ksp_view info is attached for > comparison. > >>>>> Thank you. > >>>>> > >>>>> > >>>>> [3] Here is my crude estimate of your memory usage. > >>>>> I'll target the biggest memory hogs only to get an order of > magnitude estimate > >>>>> > >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB > per MPI rank assuming double precision. > >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming 32 > bit integers) > >>>>> > >>>>> * You use 5 levels of coarsening, so the other operators should > represent (collectively) > >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the > communicator with 18432 ranks. > >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the > communicator with 18432 ranks. > >>>>> > >>>>> * You use a reduction factor of 64, making the new communicator with > 288 MPI ranks. 
> >>>>> PCTelescope will first gather a temporary matrix associated with > your coarse level operator assuming a comm size of 288 living on the comm > with size 18432. > >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core on > the 288 ranks. > >>>>> This matrix is then used to form a new MPIAIJ matrix on the subcomm, > thus require another 32 MB per rank. > >>>>> The temporary matrix is now destroyed. > >>>>> > >>>>> * Because a DMDA is detected, a permutation matrix is assembled. > >>>>> This requires 2 doubles per point in the DMDA. > >>>>> Your coarse DMDA contains 92 x 16 x 48 points. > >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on the > sub-comm. > >>>>> > >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the > resulting operator will have the same memory footprint as the unpermuted > matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB > are held in memory when the DMDA is provided. > >>>>> > >>>>> From my rough estimates, the worst case memory foot print for any > given core, given your options is approximately > >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > >>>>> This is way below 8 GB. > >>>>> > >>>>> Note this estimate completely ignores: > >>>>> (1) the memory required for the restriction operator, > >>>>> (2) the potential growth in the number of non-zeros per row due to > Galerkin coarsening (I wished -ksp_view_pre reported the output from > MatView so we could see the number of non-zeros required by the coarse > level operators) > >>>>> (3) all temporary vectors required by the CG solver, and those > required by the smoothers. > >>>>> (4) internal memory allocated by MatPtAP > >>>>> (5) memory associated with IS's used within PCTelescope > >>>>> > >>>>> So either I am completely off in my estimates, or you have not > carefully estimated the memory usage of your application code. Hopefully > others might examine/correct my rough estimates > >>>>> > >>>>> Since I don't have your code I cannot access the latter. > >>>>> Since I don't have access to the same machine you are running on, I > think we need to take a step back. > >>>>> > >>>>> [1] What machine are you running on? Send me a URL if its available > >>>>> > >>>>> [2] What discretization are you using? (I am guessing a scalar 7 > point FD stencil) > >>>>> If it's a 7 point FD stencil, we should be able to examine the > memory usage of your solver configuration using a standard, light weight > existing PETSc example, run on your machine at the same scale. > >>>>> This would hopefully enable us to correctly evaluate the actual > memory usage required by the solver configuration you are using. > >>>>> > >>>>> Thanks, > >>>>> Dave > >>>>> > >>>>> > >>>>> Frank > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On 07/08/2016 10:38 PM, Dave May wrote: > >>>>>> > >>>>>> On Saturday, 9 July 2016, frank > > wrote: > >>>>>> Hi Barry and Dave, > >>>>>> > >>>>>> Thank both of you for the advice. > >>>>>> > >>>>>> @Barry > >>>>>> I made a mistake in the file names in last email. I attached the > correct files this time. > >>>>>> For all the three tests, 'Telescope' is used as the coarse > preconditioner. > >>>>>> > >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > >>>>>> Part of the memory usage: Vector 125 124 3971904 > 0. > >>>>>> Matrix 101 101 > 9462372 0 > >>>>>> > >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > >>>>>> Part of the memory usage: Vector 125 124 681672 0. > >>>>>> Matrix 101 101 > 1462180 0. 
> >>>>>> > >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. In > my case, it is about 6 times. > >>>>>> > >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain > per process: 32*32*32 > >>>>>> Here I get the out of memory error. > >>>>>> > >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to set > -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > >>>>>> The linear solver didn't work in this case. Petsc output some > errors. > >>>>>> > >>>>>> @Dave > >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse > mesh of 'Telescope', I used LU as the preconditioner instead of SVD. > >>>>>> If my set the levels correctly, then on the last coarse mesh of MG > where it calls 'Telescope', the sub-domain per process is 2*2*2. > >>>>>> On the last coarse mesh of 'Telescope', there is only one grid > point per process. > >>>>>> I still got the OOM error. The detailed petsc option file is > attached. > >>>>>> > >>>>>> Do you understand the expected memory usage for the particular > parallel LU implementation you are using? I don't (seriously). Replace LU > with bjacobi and re-run this test. My point about solver debugging is still > valid. > >>>>>> > >>>>>> And please send the result of KSPView so we can see what is > actually used in the computations > >>>>>> > >>>>>> Thanks > >>>>>> Dave > >>>>>> > >>>>>> > >>>>>> Thank you so much. > >>>>>> > >>>>>> Frank > >>>>>> > >>>>>> > >>>>>> > >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: > >>>>>> On Jul 6, 2016, at 4:19 PM, frank > > wrote: > >>>>>> > >>>>>> Hi Barry, > >>>>>> > >>>>>> Thank you for you advice. > >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and > the process mesh is 96*8*24. > >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and > 'telescope' is used as the preconditioner at the coarse mesh. > >>>>>> The system gives me the "Out of Memory" error before the linear > system is completely solved. > >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that the > error occurs when it reaches the coarse mesh. > >>>>>> > >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is > 96*8*24. The 3rd test uses the > same grid but a different process mesh 48*4*12. > >>>>>> Are you sure this is right? The total matrix and vector memory > usage goes from 2nd test > >>>>>> Vector 384 383 8,193,712 0. > >>>>>> Matrix 103 103 11,508,688 0. > >>>>>> to 3rd test > >>>>>> Vector 384 383 1,590,520 0. > >>>>>> Matrix 103 103 3,508,664 0. > >>>>>> that is the memory usage got smaller but if you have only 1/8th the > processes and the same grid it should have gotten about 8 times bigger. Did > you maybe cut the grid by a factor of 8 also? If so that still doesn't > explain it because the memory usage changed by a factor of 5 something for > the vectors and 3 something for the matrices. > >>>>>> > >>>>>> > >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the > same in 1st test. The linear solver works fine in both test. > >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory > info is from the option '-log_summary'. I tried to use '-momery_info' as > you suggested, but in my case petsc treated it as an unused option. It > output nothing about the memory. Do I need to add sth to my code so I can > use '-memory_info'? 
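No extra source changes are strictly required for the command-line report (see the correction that follows), but the same high-water mark can also be queried programmatically. A minimal C sketch, assuming an ordinary PetscInitialize/PetscFinalize program; the solver body is elided and error checking is omitted:

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscLogDouble current, maximum;

  PetscInitialize(&argc, &argv, NULL, NULL);
  PetscMemorySetGetMaximumUsage();   /* enable tracking of the high-water mark; call this early */

  /* ... set up matrices, run the solver ... */

  PetscMemoryGetCurrentUsage(&current);   /* resident set size right now, in bytes */
  PetscMemoryGetMaximumUsage(&maximum);   /* high-water mark since the call above */
  PetscPrintf(PETSC_COMM_WORLD, "memory: current %g bytes, maximum %g bytes\n",
              (double)current, (double)maximum);

  PetscFinalize();
  return 0;
}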
> >>>>>> Sorry, my mistake the option is -memory_view > >>>>>> > >>>>>> Can you run the one case with -memory_view and -mg_coarse jacobi > -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory > is used without the telescope? Also run case 2 the same way. > >>>>>> > >>>>>> Barry > >>>>>> > >>>>>> > >>>>>> > >>>>>> In both tests the memory usage is not large. > >>>>>> > >>>>>> It seems to me that it might be the 'telescope' preconditioner > that allocated a lot of memory and caused the error in the 1st test. > >>>>>> Is there is a way to show how much memory it allocated? > >>>>>> > >>>>>> Frank > >>>>>> > >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: > >>>>>> Frank, > >>>>>> > >>>>>> You can run with -ksp_view_pre to have it "view" the KSP > before the solve so hopefully it gets that far. > >>>>>> > >>>>>> Please run the problem that does fit with -memory_info when > the problem completes it will show the "high water mark" for PETSc > allocated memory and total memory used. We first want to look at these > numbers to see if it is using more memory than you expect. You could also > run with say half the grid spacing to see how the memory usage scaled with > the increase in grid points. Make the runs also with -log_view and send all > the output from these options. > >>>>>> > >>>>>> Barry > >>>>>> > >>>>>> On Jul 5, 2016, at 5:23 PM, frank > > wrote: > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> I am using the CG ksp solver and Multigrid preconditioner to solve > a linear system in parallel. > >>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse > mesh for its good performance. > >>>>>> The petsc options file is attached. > >>>>>> > >>>>>> The domain is a 3d box. > >>>>>> It works well when the grid is 1536*128*384 and the process mesh > is 96*8*24. When I double the size of grid and > keep the same process mesh and petsc options, I get an > "out of memory" error from the super-cluster I am using. > >>>>>> Each process has access to at least 8G memory, which should be more > than enough for my application. I am sure that all the other parts of my > code( except the linear solver ) do not use much memory. So I doubt if > there is something wrong with the linear solver. > >>>>>> The error occurs before the linear system is completely solved so I > don't have the info from ksp view. I am not able to re-produce the error > with a smaller problem either. > >>>>>> In addition, I tried to use the block jacobi as the preconditioner > with the same grid and same decomposition. The linear solver runs extremely > slow but there is no memory error. > >>>>>> > >>>>>> How can I diagnose what exactly cause the error? > >>>>>> Thank you so much. > >>>>>> > >>>>>> Frank > >>>>>> > >>>>>> < > petsc_options.txt> > >>>>>> > >>>>> > >>>> > >>> < > memory2.txt> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Sat Sep 10 11:39:23 2016 From: fdkong.jd at gmail.com (Fande Kong) Date: Sat, 10 Sep 2016 10:39:23 -0600 Subject: [petsc-users] questions on hypre preconditioner In-Reply-To: <1348C765-E4DA-4A31-9EBC-73EEB599B4F0@mcs.anl.gov> References: <1348C765-E4DA-4A31-9EBC-73EEB599B4F0@mcs.anl.gov> Message-ID: Thanks, Barry. On Mon, Sep 5, 2016 at 11:26 AM, Barry Smith wrote: > > > On Sep 5, 2016, at 11:21 AM, Fande Kong wrote: > > > > Hi Developers, > > > > There are two questions on the hypre preconditioner. > > > > (1) How to set different relax types on different levels? 
It looks to > use the SAME relax type on all levels except the coarse level which we > could set it to a different solver. Especially, could I set the smoother > type on the finest level as NONE? > > I don't think this is possible through the PETSc interface; it may or > may not be possible by adding additional hypre calls. You need to check the > hypre documentation. > I already took a look into the hypre code. It is not easy to change. But there is a way to do that. I already figured out a way to extract interpolation, restriction and coarse operators from hypre. I could construct any algorithms I want with these operators. The question is: any existing ways to convert a parallel hypre matrix to a petsc Mat, and the same operations for vectors? > > > > > (2) How could I know how many levels have been actually created in > hypre, and how many unknowns on different levels? The "-pc_view" can not > tell me this information: > > -pc_hypre_boomeramg_print_statistics integer different integers give > different amounts of detail, I don't know what the integers mean. > we could get all information by extracting data from hypre_solver. Fande, > > > > > > > > > type: hypre > > HYPRE BoomerAMG preconditioning > > HYPRE BoomerAMG: Cycle type V > > HYPRE BoomerAMG: Maximum number of levels 25 > > HYPRE BoomerAMG: Maximum number of iterations > PER hypre call 1 > > HYPRE BoomerAMG: Convergence tolerance PER hypre > call 0 > > HYPRE BoomerAMG: Threshold for strong coupling > 0.25 > > HYPRE BoomerAMG: Interpolation truncation factor > 0 > > HYPRE BoomerAMG: Interpolation: max elements per > row 0 > > HYPRE BoomerAMG: Number of levels of aggressive > coarsening 0 > > HYPRE BoomerAMG: Number of paths for aggressive > coarsening 1 > > HYPRE BoomerAMG: Maximum row sums 0.9 > > HYPRE BoomerAMG: Sweeps down 1 > > HYPRE BoomerAMG: Sweeps up 1 > > HYPRE BoomerAMG: Sweeps on coarse 1 > > HYPRE BoomerAMG: Relax down > symmetric-SOR/Jacobi > > HYPRE BoomerAMG: Relax up > symmetric-SOR/Jacobi > > HYPRE BoomerAMG: Relax on coarse > Gaussian-elimination > > HYPRE BoomerAMG: Relax weight (all) 1 > > HYPRE BoomerAMG: Outer relax weight (all) 1 > > HYPRE BoomerAMG: Using CF-relaxation > > HYPRE BoomerAMG: Measure type local > > HYPRE BoomerAMG: Coarsen type Falgout > > HYPRE BoomerAMG: Interpolation type classical > > linear system matrix = precond matrix: > > > > > > > > Fande, > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Sep 10 11:58:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 10 Sep 2016 11:58:14 -0500 Subject: [petsc-users] questions on hypre preconditioner In-Reply-To: References: <1348C765-E4DA-4A31-9EBC-73EEB599B4F0@mcs.anl.gov> Message-ID: <8BC8E70D-7477-4639-AABC-69CDA94ECC3D@mcs.anl.gov> > On Sep 10, 2016, at 11:39 AM, Fande Kong wrote: > > Thanks, Barry. > > > On Mon, Sep 5, 2016 at 11:26 AM, Barry Smith wrote: > > > On Sep 5, 2016, at 11:21 AM, Fande Kong wrote: > > > > Hi Developers, > > > > There are two questions on the hypre preconditioner. > > > > (1) How to set different relax types on different levels? It looks to use the SAME relax type on all levels except the coarse level which we could set it to a different solver. Especially, could I set the smoother type on the finest level as NONE? > > I don't think this is possible through the PETSc interface; it may or may not be possible by adding additional hypre calls. You need to check the hypre documentation. > > I already took a look into the hypre code. 
It is not easy to change. But there is a way to do that. I already figured out a way to extract interpolation, restriction and coarse operators from hypre. I could construct any algorithms I want with these operators. > > The question is: any existing ways to convert a parallel hypre matrix to a petsc Mat, and the same operations for vectors? In the PETSc hypre interface we go the other direction; it is a little hacky in order to be efficient. So look at that code and it may be possible from that to do the mapping in the other direction. Barry > > > > > > (2) How could I know how many levels have been actually created in hypre, and how many unknowns on different levels? The "-pc_view" can not tell me this information: > > -pc_hypre_boomeramg_print_statistics integer different integers give different amounts of detail, I don't know what the integers mean. > > we could get all information by extracting data from hypre_solver. > > Fande, > > > > > > > > > type: hypre > > HYPRE BoomerAMG preconditioning > > HYPRE BoomerAMG: Cycle type V > > HYPRE BoomerAMG: Maximum number of levels 25 > > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > > HYPRE BoomerAMG: Interpolation truncation factor 0 > > HYPRE BoomerAMG: Interpolation: max elements per row 0 > > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > > HYPRE BoomerAMG: Maximum row sums 0.9 > > HYPRE BoomerAMG: Sweeps down 1 > > HYPRE BoomerAMG: Sweeps up 1 > > HYPRE BoomerAMG: Sweeps on coarse 1 > > HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi > > HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi > > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > > HYPRE BoomerAMG: Relax weight (all) 1 > > HYPRE BoomerAMG: Outer relax weight (all) 1 > > HYPRE BoomerAMG: Using CF-relaxation > > HYPRE BoomerAMG: Measure type local > > HYPRE BoomerAMG: Coarsen type Falgout > > HYPRE BoomerAMG: Interpolation type classical > > linear system matrix = precond matrix: > > > > > > > > Fande, From dominik.brands at uni-due.de Mon Sep 12 11:54:11 2016 From: dominik.brands at uni-due.de (Dr.-Ing. Dominik Brands) Date: Mon, 12 Sep 2016 18:54:11 +0200 Subject: [petsc-users] PETSc 3.2-p7: sources to Prometheus 1.8.9 Message-ID: <57D6DDB3.9060005@uni-due.de> Dear all, due to some compatibilities I must use an older version of PETSc 3.2-p7. In my program I want to use the already outdated solver prometheus. Therefore I need version 1.8.9, but my last stored version is 1.8.8 Has anyone of you the needed version or know at place where I can get it? Bests, Dominik From balay at mcs.anl.gov Mon Sep 12 12:18:59 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 12 Sep 2016 12:18:59 -0500 Subject: [petsc-users] PETSc 3.2-p7: sources to Prometheus 1.8.9 In-Reply-To: <57D6DDB3.9060005@uni-due.de> References: <57D6DDB3.9060005@uni-due.de> Message-ID: On Mon, 12 Sep 2016, Dr.-Ing. Dominik Brands wrote: > Dear all, > > due to some compatibilities I must use an older version of PETSc 3.2-p7. > In my program I want to use the already outdated solver prometheus. > Therefore I need version 1.8.9, but my last stored version is 1.8.8 > > Has anyone of you the needed version or know at place where I can get it? 
PETSc release snapshots are at http://ftp.mcs.anl.gov/pub/petsc/release-snapshots/ We have some prometheus snapshots at http://ftp.mcs.anl.gov/pub/petsc/externalpackages But 3.2-p7 is attempting to use http://www.columbia.edu/~ma2325/Prometheus-1.8.9.tar.gz This URL doesn't work anymore. cc:ing Mark. You might be able to look at compile errors with 1.8.8 - and then fix it up to work with petsc-3.2-p7 Satish From gotofd at gmail.com Mon Sep 12 20:24:11 2016 From: gotofd at gmail.com (Ji Zhang) Date: Tue, 13 Sep 2016 09:24:11 +0800 Subject: [petsc-users] (no subject) Message-ID: Dear all, I'm using petsc4py and now face some problems. I have a number of small petsc dense matrices mij, and I want to construct them to a big matrix M like this: [ m11 m12 m13 ] M = | m21 m22 m23 | , [ m31 m32 m33 ] How could I do it effectively? Now I'm using the code below: # get indexes of matrix mij index1_begin, index1_end = getindex_i( ) index2_begin, index2_end = getindex_j( ) M[index1_begin:index1_end, index2_begin:index2_end] = mij[:, :] which report such error messages: petsc4py.PETSc.Error: error code 56 [0] MatGetValues() line 1818 in /home/zhangji/PycharmProjects/petsc-petsc-31a1859eaff6/src/mat/interface/matrix.c [0] MatGetValues_MPIDense() line 154 in /home/zhangji/PycharmProjects/petsc-petsc-31a1859eaff6/src/mat/impls/dense/mpi/mpidense.c [0] No support for this operation for this object type [0] Only local values currently supported Thanks. 2016-09-13 Best, Regards, Zhang Ji Beijing Computational Science Research Center E-mail: gotofd at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Sep 12 20:30:49 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 12 Sep 2016 20:30:49 -0500 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: On Mon, Sep 12, 2016 at 8:24 PM, Ji Zhang wrote: > Dear all, > > I'm using petsc4py and now face some problems. > I have a number of small petsc dense matrices mij, and I want to construct > them to a big matrix M like this: > > [ m11 m12 m13 ] > M = | m21 m22 m23 | , > [ m31 m32 m33 ] > How could I do it effectively? > > Now I'm using the code below: > > # get indexes of matrix mij > index1_begin, index1_end = getindex_i( ) > index2_begin, index2_end = getindex_j( ) > M[index1_begin:index1_end, index2_begin:index2_end] = mij[:, :] > which report such error messages: > > petsc4py.PETSc.Error: error code 56 > [0] MatGetValues() line 1818 in /home/zhangji/PycharmProjects/ > petsc-petsc-31a1859eaff6/src/mat/interface/matrix.c > [0] MatGetValues_MPIDense() line 154 in /home/zhangji/PycharmProjects/ > petsc-petsc-31a1859eaff6/src/mat/impls/dense/mpi/mpidense.c > Make M a sequential dense matrix. Matt > [0] No support for this operation for this object type > [0] Only local values currently supported > > Thanks. > > > 2016-09-13 > Best, > Regards, > Zhang Ji > Beijing Computational Science Research Center > E-mail: gotofd at gmail.com > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
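At the C level, the block placement that the slice assignment above expresses can be written with MatSetValues(); a minimal sketch, where the block is a bm-by-bn array stored in row-major order and the helper name, offsets, and sizes are all hypothetical (error checking via CHKERRQ only):

#include <petscmat.h>

/* Insert a dense bm-by-bn block with its upper-left corner at global
   position (row0, col0) of M.  block holds bm*bn values in row-major order. */
static PetscErrorCode InsertBlock(Mat M, PetscInt row0, PetscInt col0,
                                  PetscInt bm, PetscInt bn,
                                  const PetscScalar *block)
{
  PetscInt       i, *rows, *cols;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscMalloc2(bm, &rows, bn, &cols);CHKERRQ(ierr);
  for (i = 0; i < bm; i++) rows[i] = row0 + i;
  for (i = 0; i < bn; i++) cols[i] = col0 + i;
  ierr = MatSetValues(M, bm, rows, bn, cols, block, INSERT_VALUES);CHKERRQ(ierr);
  ierr = PetscFree2(rows, cols);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

After all blocks have been inserted, MatAssemblyBegin()/MatAssemblyEnd() with MAT_FINAL_ASSEMBLY must be called before M is used. If M is created with MatCreateSeqDense() as suggested, every value is local, so the "only local values" restriction reported for the MPIDENSE format does not arise.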
URL: From patrick.sanan at gmail.com Tue Sep 13 06:48:14 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 13 Sep 2016 13:48:14 +0200 Subject: [petsc-users] Diagnosing a difference between "unpreconditioned" and "true" residual norms In-Reply-To: References: Message-ID: Hi Barry - You were correct - the problem was indeed a nonlinear (in double precision) operator. My operator in this case wrapped an existing residual evaluation routine (by someone else). I had lazily been computing Ax = (Ax-b)+b; that is, compute the residual and then add the rhs back. The norm of the b was large enough to wipe out a few digits of accuracy, thus introducing floating point noise which rendered the operator nonlinear enough in double precision to induce the observed deviation in the norms. Thanks! On Fri, Sep 9, 2016 at 9:44 PM, Barry Smith wrote: > > Patrick, > > I have only seen this when the "linear" operator turned out to not actually be linear or at least not linear in double precision. > Are you using differencing or anything in your MatShell that might make it not be a linear operator in full precision? > > Since your problem is so small you can compute the Jacobian explicitly via finite differencing and then use that matrix plus your shell preconditioner. I beat if you do this you will see the true and non-true residual norms remain the same, this would likely mean something is wonky with your shell matrix. > > Barry > >> On Sep 9, 2016, at 9:32 AM, Patrick Sanan wrote: >> >> I am debugging a linear solver which uses a custom operator and >> preconditioner, via MATSHELL and PCSHELL. Convergence seems to be >> fine, except that I unexpectedly see a difference between the >> "unpreconditioned" and "true" residual norms when I use >> -ksp_monitor_true_residual with a right-preconditioned Krylov method >> (FGMRES or right-preconditioned GMRES). >> >> 0 KSP unpreconditioned resid norm 9.266794204683e+08 true resid norm >> 9.266794204683e+08 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 2.317801431974e+04 true resid norm >> 2.317826550333e+04 ||r(i)||/||b|| 2.501217248530e-05 >> 2 KSP unpreconditioned resid norm 4.453270507534e+00 true resid norm >> 2.699824780158e+01 ||r(i)||/||b|| 2.913439880638e-08 >> 3 KSP unpreconditioned resid norm 1.015490793887e-03 true resid norm >> 2.658635801018e+01 ||r(i)||/||b|| 2.868991953738e-08 >> 4 KSP unpreconditioned resid norm 4.710220776105e-07 true resid norm >> 2.658631616810e+01 ||r(i)||/||b|| 2.868987438467e-08 >> KSP Object:(mgk_) 1 MPI processes >> type: fgmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-13, absolute=1e-50, divergence=10000. >> right preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object:(mgk_) 1 MPI processes >> type: shell >> Shell: Custom PC >> linear system matrix = precond matrix: >> Mat Object: Custom Operator 1 MPI processes >> type: shell >> rows=256, cols=256 >> has attached null space >> >> I have dumped the explicit operator and preconditioned operator, and I >> can see that the operator and the preconditioned operator each have a >> 1-dimensional nullspace (a constant-pressure nullspace) which I have >> accounted for by constructing a normalized, constant-pressure vector >> and supplying it to the operator via a MatNullSpace. 
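For reference, that construction comes down to a couple of calls; a minimal C sketch, assuming nullvec already holds the normalized constant-pressure vector and A is the shell operator (the function name is hypothetical, error checking via CHKERRQ only):

#include <petscmat.h>

/* Attach the one-dimensional null space spanned by a normalized vector
   (here, the constant-pressure mode) to the operator A; A may be a MATSHELL. */
static PetscErrorCode AttachPressureNullSpace(Mat A, Vec nullvec)
{
  MatNullSpace   nsp;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatNullSpaceCreate(PetscObjectComm((PetscObject)A), PETSC_FALSE, 1, &nullvec, &nsp);CHKERRQ(ierr);
  ierr = MatSetNullSpace(A, nsp);CHKERRQ(ierr);
  ierr = MatNullSpaceDestroy(&nsp);CHKERRQ(ierr);  /* A keeps its own reference */
  PetscFunctionReturn(0);
}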
>> >> If I disregard the (numerically) zero singular value, the operator has >> a condition number of 1.5669e+05 and the preconditioned operator has a >> condition number of 1.01 (strong preconditioner). >> >> Has anyone seen this sort of behavior before and if so, is there a >> common culprit that I am overlooking? Any ideas of what to test next >> to try to isolate the issue? >> >> As I understand it, the unpreconditioned and true residual norms >> should be identical in exact arithmetic, so I would suspect that >> somehow I've ended up with a "bad Hessenberg matrix" in some way as I >> perform this solve (or maybe I have a more subtle bug). > From jason.hou at ncsu.edu Tue Sep 13 11:25:34 2016 From: jason.hou at ncsu.edu (Jason Hou) Date: Tue, 13 Sep 2016 12:25:34 -0400 Subject: [petsc-users] Errors in installing PETSc Message-ID: Hi there, I was trying to install MOOSE, which uses PETSc as the solver; however, I was stuck during the installation of PETSc. [jhou8 at rdfmg petsc-3.6.4]$ pwd /tmp/cluster_temp.FFxzAF/petsc-3.6.4 [jhou8 at rdfmg petsc-3.6.4]$ ./configure --prefix=$PETSC_DIR --download-hypre=1 --with-ssl=0 --with-debugging=1 --with-pic=1 --with-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack=1 --download-metis=1 --download-parmetis=1 --download-superlu_dist=1 --download-scalapack=1 --download-mumps=1 CC=mpicc CXX=mpicxx FC=mpif90 F77=mpif77 F90=mpif90 CFLAGS='-fPIC -fopenmp' CXXFLAGS='-fPIC -fopenmp' FFLAGS='-fPIC -fopenmp' FCFLAGS='-fPIC -fopenmp' F90FLAGS='-fPIC -fopenmp' F77FLAGS='-fPIC -fopenmp' PETSC_DIR=`pwd` (sections removed) =============================================================================== Compiling and installing Scalapack; this may take several minutes ============================== ================================================= TESTING: check from config.libraries(config/BuildSystem/config/libraries.py:146) ************************************************************ ******************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------ ------------------- Downloaded scalapack could not be used. Please check install in /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt ************************************************************ ******************* Also attached is the configure.log file and I hope you could help me with that. Thank you in advance for your effort. Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2902994 bytes Desc: not available URL: From doss0032 at umn.edu Tue Sep 13 11:40:40 2016 From: doss0032 at umn.edu (Scott Dossa) Date: Tue, 13 Sep 2016 11:40:40 -0500 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: Hi Jason, I have two tips for configuring PETSc. While I can't speak for all Cluster Computing, I know that many of the libraries you are attempting to link might require loading when logging into a cluster. For example, here at MSI, one needs to call > module load mkl > to load the intel math kernel library (which includes blas-lapack). Since you are downloading blas-lapack, that example would not be necessary, but it may be for other libraries you are using. Also I found I had to search manually for where the shared-libraries resided and explicitly tell PETSc where they live. 
For example, when I configure PETSc, I had to include options like > ./configure (. . .) > --with-mpi-dir=/panfs/roc/itascasoft/openmpi/el6/1.7.2/intel-2013-update5 > to tell PETSc where openmpi lives. I had to do this for nearly all libraries I wished to link to PETSc. Hope this helps! -Scott Dossa On Tue, Sep 13, 2016 at 11:25 AM, Jason Hou wrote: > Hi there, > > I was trying to install MOOSE, which uses PETSc as the solver; however, I > was stuck during the installation of PETSc. > > [jhou8 at rdfmg petsc-3.6.4]$ pwd > /tmp/cluster_temp.FFxzAF/petsc-3.6.4 > [jhou8 at rdfmg petsc-3.6.4]$ ./configure --prefix=$PETSC_DIR > --download-hypre=1 --with-ssl=0 --with-debugging=1 --with-pic=1 > --with-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx > --with-fc=mpif90 --download-fblaslapack=1 --download-metis=1 > --download-parmetis=1 --download-superlu_dist=1 --download-scalapack=1 > --download-mumps=1 CC=mpicc CXX=mpicxx FC=mpif90 F77=mpif77 F90=mpif90 > CFLAGS='-fPIC -fopenmp' CXXFLAGS='-fPIC -fopenmp' FFLAGS='-fPIC -fopenmp' > FCFLAGS='-fPIC -fopenmp' F90FLAGS='-fPIC -fopenmp' F77FLAGS='-fPIC > -fopenmp' PETSC_DIR=`pwd` > > (sections removed) > > =============================================================================== > > Compiling and installing Scalapack; this may take several > minutes > ============================== > ================================================= > TESTING: > check from config.libraries(config/BuildSystem/config/libraries.py:146) > > ************************************************************ > ******************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > ------------------------------------------------------------ > ------------------- > Downloaded scalapack could not be used. Please check install in > /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt > ************************************************************ > ******************* > > > Also attached is the configure.log file and I hope you could help me with > that. Thank you in advance for your effort. > > Jason > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Tue Sep 13 12:16:31 2016 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 13 Sep 2016 19:16:31 +0200 Subject: [petsc-users] FLAGS in makefile Message-ID: Hi all, I can't seem to figure out how to specify my compilation options with PETSc. For my makefiles, I've always been using Petsc examples inspired makefiles, just tuning them to my needs, and I have never played with compilation options so far. Now, I am trying to add some compilation options, but they are not taken into account by the compiler. 
My makefile looks like this all: energy include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules #FLAGS = -g -O0 -fbounds-check MYFLAGS = -mcmodel=medium -shared-intel OBJS = main.o \ modules.o \ diags.o \ functions.o \ conservation.o \ EXEC = energy main.o: modules.o \ functions.o \ conservation.o \ diags.o \ energy: $(OBJS) chkopts -$(FLINKER) -o $(EXEC) $(MYFLAGS) $(FLAGS) $(OBJS) $(PETSC_SNES_LIB) clean_all: $(RM) $(OBJS) $(EXEC) The compiler then executes things like /opt/mpi/bullxmpi/1.2.8.4/bin/mpif90 -c -fPIC -g -O3 -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/petsc-3.7.3/include -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/petsc-3.7.3/arch-linux2-c-debug/include -I/opt/mpi/bullxmpi/1.2.8.4/include -o modules.o modules.F90 without taking my variable MYFLAGS into account. What may be the reason? Also, what does "chkopts" mean ? Best Timothee -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 13 12:16:51 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Sep 2016 12:16:51 -0500 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: On Tue, Sep 13, 2016 at 11:25 AM, Jason Hou wrote: > Hi there, > > I was trying to install MOOSE, which uses PETSc as the solver; however, I > was stuck during the installation of PETSc. > I believe your problem is that this is old PETSc. In the latest release, BLACS is part of SCALAPACK. Thanks, Matt > [jhou8 at rdfmg petsc-3.6.4]$ pwd > /tmp/cluster_temp.FFxzAF/petsc-3.6.4 > [jhou8 at rdfmg petsc-3.6.4]$ ./configure --prefix=$PETSC_DIR > --download-hypre=1 --with-ssl=0 --with-debugging=1 --with-pic=1 > --with-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx > --with-fc=mpif90 --download-fblaslapack=1 --download-metis=1 > --download-parmetis=1 --download-superlu_dist=1 --download-scalapack=1 > --download-mumps=1 CC=mpicc CXX=mpicxx FC=mpif90 F77=mpif77 F90=mpif90 > CFLAGS='-fPIC -fopenmp' CXXFLAGS='-fPIC -fopenmp' FFLAGS='-fPIC -fopenmp' > FCFLAGS='-fPIC -fopenmp' F90FLAGS='-fPIC -fopenmp' F77FLAGS='-fPIC > -fopenmp' PETSC_DIR=`pwd` > > (sections removed) > > =============================================================================== > > Compiling and installing Scalapack; this may take several > minutes > ============================== > ================================================= > TESTING: > check from config.libraries(config/BuildSystem/config/libraries.py:146) > > ************************************************************ > ******************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > ------------------------------------------------------------ > ------------------- > Downloaded scalapack could not be used. Please check install in > /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt > ************************************************************ > ******************* > > > Also attached is the configure.log file and I hope you could help me with > that. Thank you in advance for your effort. > > Jason > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Sep 13 12:21:57 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Sep 2016 12:21:57 -0500 Subject: [petsc-users] FLAGS in makefile In-Reply-To: References: Message-ID: On Tue, Sep 13, 2016 at 12:16 PM, Timoth?e Nicolas < timothee.nicolas at gmail.com> wrote: > Hi all, > > I can't seem to figure out how to specify my compilation options with > PETSc. For my makefiles, I've always been using Petsc examples inspired > makefiles, just tuning them to my needs, and I have never played with > compilation options so far. Now, I am trying to add some compilation > options, but they are not taken into account by the compiler. My makefile > looks like this > > all: energy > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > #FLAGS = -g -O0 -fbounds-check > > > > > MYFLAGS = -mcmodel=medium -shared-intel > > > OBJS = main.o \ > > modules.o \ > > diags.o \ > > functions.o \ > > conservation.o \ > > > EXEC = energy > > > main.o: modules.o \ > > functions.o \ > > conservation.o \ > > diags.o \ > > > energy: $(OBJS) chkopts > > -$(FLINKER) -o $(EXEC) $(MYFLAGS) $(FLAGS) $(OBJS) $( > PETSC_SNES_LIB) > > > clean_all: > > $(RM) $(OBJS) $(EXEC) > > > The compiler then executes things like > > /opt/mpi/bullxmpi/1.2.8.4/bin/mpif90 -c -fPIC -g -O3 > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/petsc-3.7.3/include > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/ > petsc-3.7.3/arch-linux2-c-debug/include -I/opt/mpi/bullxmpi/1.2.8.4/ > include -o modules.o modules.F90 > > > without taking my variable MYFLAGS into account. What may be the reason? > Also, what does "chkopts" mean ? > 1) You want to change CFLAGS or FFLAGS 2) 'chkopts' is an internal check for PETSc 3) You realize that it is very dangerous to compile with options not configure with. Thanks, Matt > Best > > > Timothee > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Sep 13 12:27:23 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Sep 2016 12:27:23 -0500 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: On Tue, 13 Sep 2016, Scott Dossa wrote: > Also I found I had to search manually for where the shared-libraries > resided and explicitly tell PETSc where they live. For example, when I > configure PETSc, I had to include options like > > > ./configure (. . .) > > --with-mpi-dir=/panfs/roc/itascasoft/openmpi/el6/1.7.2/intel-2013-update5 > > > to tell PETSc where openmpi lives. I had to do this for nearly all > libraries I wished to link to PETSc. To clarify - wrt MPI - when one specifies --with-mpi-dir - then configure looks for mpicc,mpicxx,mpif90 from that location and uses it. Alternative is to specify --with-cc=/panfs/roc/itascasoft/openmpi/el6/1.7.2/intel-2013-update5/bin/mpicc etc.. [or have mpicc etc in your PATH] All the above 3 modes are equivalent. 
IF one mixes up these option - there is ambiguity - and configure stumbles - for eg: "--with-cc=gcc --with-mpi-dir=/location/of/mpich-install" Satish From balay at mcs.anl.gov Tue Sep 13 12:33:45 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Sep 2016 12:33:45 -0500 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: On Tue, 13 Sep 2016, Matthew Knepley wrote: > I believe your problem is that this is old PETSc. In the latest release, > BLACS is part of SCALAPACK. BLACS had been a part of scalapack for a few releases - so thats not the issue. >>>>>>>> stderr: /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/libscalapack.a(pssytrd.o): In function `pssytrd': /tmp/cluster_temp.FFxzAF/petsc-3.6.4/arch-linux2-c-debug/externalpackages/scalapack-2.0.2/SRC/pssytrd.f:259: undefined reference to `blacs_gridinfo__' /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/libscalapack.a(chk1mat.o): In function `chk1mat': <<<<<< Double underscore? >>> mpicc -c -Df77IsF2C -fPIC -fopenmp -fPIC -g -I/cm/shared/apps/openmpi/open64/64/1.10.1/include pzrot.c <<< scalapack is getting compiled with this flag '-Df77IsF2C'. This mode was primarily supported by 'g77' previously - which we hardly ever use anymore - so this mode is not really tested? >>>>> Executing: mpif90 -show stdout: openf90 -I/cm/shared/apps/openmpi/open64/64/1.10.1/include -pthread -I/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 -L/usr/lib64/ -Wl,-rpath -Wl,/usr/lib64/ -Wl,-rpath -Wl,/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 -Wl,--enable-new-dtags -L/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi compilers: Fortran appends an extra underscore to names containing underscores Defined "HAVE_FORTRAN_UNDERSCORE_UNDERSCORE" to "1" <<<< What do you have for: cd /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/ nm -Ao libscalapack.a |grep -i blacs_gridinfo However - as Matt refered to - its best to use latest petsc-3.7 release. Does MOOSE require 3.6? Satish From balay at mcs.anl.gov Tue Sep 13 12:56:11 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Sep 2016 12:56:11 -0500 Subject: [petsc-users] FLAGS in makefile In-Reply-To: References: Message-ID: On Tue, 13 Sep 2016, Matthew Knepley wrote: > On Tue, Sep 13, 2016 at 12:16 PM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > Hi all, > > > > I can't seem to figure out how to specify my compilation options with > > PETSc. For my makefiles, I've always been using Petsc examples inspired > > makefiles, just tuning them to my needs, and I have never played with > > compilation options so far. Now, I am trying to add some compilation > > options, but they are not taken into account by the compiler. 
My makefile > > looks like this > > > > all: energy > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > > > #FLAGS = -g -O0 -fbounds-check > > > > > > > > > > MYFLAGS = -mcmodel=medium -shared-intel > > > > > > OBJS = main.o \ > > > > modules.o \ > > > > diags.o \ > > > > functions.o \ > > > > conservation.o \ > > > > > > EXEC = energy > > > > > > main.o: modules.o \ > > > > functions.o \ > > > > conservation.o \ > > > > diags.o \ > > > > > > energy: $(OBJS) chkopts > > > > -$(FLINKER) -o $(EXEC) $(MYFLAGS) $(FLAGS) $(OBJS) $( > > PETSC_SNES_LIB) > > > > > > clean_all: > > > > $(RM) $(OBJS) $(EXEC) > > > > > > The compiler then executes things like > > > > /opt/mpi/bullxmpi/1.2.8.4/bin/mpif90 -c -fPIC -g -O3 > > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/petsc-3.7.3/include > > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/ > > petsc-3.7.3/arch-linux2-c-debug/include -I/opt/mpi/bullxmpi/1.2.8.4/ > > include -o modules.o modules.F90 > > > > > > without taking my variable MYFLAGS into account. What may be the reason? > > Also, what does "chkopts" mean ? > > > > 1) You want to change CFLAGS or FFLAGS > > 2) 'chkopts' is an internal check for PETSc > > 3) You realize that it is very dangerous to compile with options not > configure with. And generally - you have to declare FFLAGS - before the 'include' statement You can set: FPPFLAGS - for preprocessing [or compile only flags] FFLAGS - for compile & link flags For link only flags - you can add them to the link command - as you've done with MYFLAGS.. Satish From fande.kong at inl.gov Tue Sep 13 12:56:52 2016 From: fande.kong at inl.gov (Kong (Non-US), Fande) Date: Tue, 13 Sep 2016 11:56:52 -0600 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: On Tue, Sep 13, 2016 at 11:33 AM, Satish Balay wrote: > On Tue, 13 Sep 2016, Matthew Knepley wrote: > > > I believe your problem is that this is old PETSc. In the latest release, > > BLACS is part of SCALAPACK. > > BLACS had been a part of scalapack for a few releases - so thats not the > issue. > > >>>>>>>> > stderr: > /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/ > gcc-opt/lib/libscalapack.a(pssytrd.o): In function `pssytrd': > /tmp/cluster_temp.FFxzAF/petsc-3.6.4/arch-linux2-c-debug/externalpackages/ > scalapack-2.0.2/SRC/pssytrd.f:259: undefined reference to > `blacs_gridinfo__' > /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/ > gcc-opt/lib/libscalapack.a(chk1mat.o): In function `chk1mat': > <<<<<< > > Double underscore? > > >>> > mpicc -c -Df77IsF2C -fPIC -fopenmp -fPIC -g -I/cm/shared/apps/openmpi/open64/64/1.10.1/include > pzrot.c > <<< > > scalapack is getting compiled with this flag '-Df77IsF2C'. This mode > was primarily supported by 'g77' previously - which we hardly ever use > anymore - so this mode is not really tested? 
> > >>>>> > Executing: mpif90 -show > stdout: openf90 -I/cm/shared/apps/openmpi/open64/64/1.10.1/include > -pthread -I/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 -L/usr/lib64/ > -Wl,-rpath -Wl,/usr/lib64/ -Wl,-rpath -Wl,/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 > -Wl,--enable-new-dtags -L/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 > -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > > compilers: Fortran appends an extra underscore to names > containing underscores > Defined "HAVE_FORTRAN_UNDERSCORE_UNDERSCORE" to "1" > > <<<< > > What do you have for: > > cd /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/ > nm -Ao libscalapack.a |grep -i blacs_gridinfo > > However - as Matt refered to - its best to use latest petsc-3.7 > release. Does MOOSE require 3.6? > > I think MOOSE works fine with petsc-3.7 as long as you do not use superlu_dist. The superlu_dist has bugs in the latest petsc-3.7. Many applications just run fine with the old superlu_dist in the petsc-3.5.x. For the later version, most of the applications which use superlu_dist fails. Fande, > Satish > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hsahasra at purdue.edu Tue Sep 13 13:01:40 2016 From: hsahasra at purdue.edu (Harshad Sahasrabudhe) Date: Tue, 13 Sep 2016 14:01:40 -0400 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> Message-ID: Hi Barry, I compiled with mpich configured using --enable-g=meminit to get rid of MPI errors in Valgrind. Doing this reduced the number of errors to 2. I have attached the Valgrind output. I'm using GAMG+GMRES for in each linear iteration of SNES. The linear solver converges with CONVERGED_RTOL for the first 6 iterations and with CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is happening. Any thoughts/ideas? Thanks, Harshad On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith wrote: > > Install your MPI with --download-mpich as a PETSc ./configure option, > this will eliminate all the MPICH valgrind errors. Then send as an > attachment the resulting valgrind file. > > I do not 100 % trust any code that produces such valgrind errors. > > Barry > > > > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe > wrote: > > > > Hi Barry, > > > > Thanks for the reply. My code is in C. I ran with Valgrind and found > many "Conditional jump or move depends on uninitialized value(s)", "Invalid > read" and "Use of uninitialized value" errors. I think all of them are from > the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure > what I'm looking for in the Valgrind output. At the end of the file, I get: > > > > ==40223== More than 10000000 total errors detected. I'm not reporting > any more. > > ==40223== Final error counts will be inaccurate. Go fix your program! > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note > > ==40223== that errors may occur in your program without prior warning > from > > ==40223== Valgrind, because errors are no longer being displayed. > > > > Can you give some suggestions on how I should proceed? > > > > Thanks, > > Harshad > > > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith wrote: > > > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very specialized > and should never occur with GMRES. > > > > Can you run with valgrind to make sure there is no memory corruption? 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > Is your code fortran or C? > > > > Barry > > > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe > wrote: > > > > > > Hi, > > > > > > I'm using GAMG + GMRES for my Poisson problem. The solver converges > with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which > is much higher than what I need (I need a tolerance of at least 1E-8). I am > not able to figure out which tolerance I need to set to avoid convergence > due to CONVERGED_STEP_LENGTH. > > > > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: > > > > > > 0 KSP Residual norm 3.121347818142e+00 > > > 1 KSP Residual norm 9.773346857844e-02 > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > > > KSP Object: 1 MPI processes > > > type: gmres > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > > GMRES: happy breakdown tolerance 1e-30 > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using PRECONDITIONED norm type for convergence test > > > PC Object: 1 MPI processes > > > type: gamg > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v > > > Cycles per PCApply=1 > > > Using Galerkin computed coarse grid matrices > > > Coarse grid solver -- level ------------------------------- > > > KSP Object: (mg_coarse_) 1 MPI processes > > > type: preonly > > > maximum iterations=1, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (mg_coarse_) 1 MPI processes > > > type: bjacobi > > > block Jacobi: number of blocks = 1 > > > Local solve is same for all blocks, in the following KSP and > PC objects: > > > KSP Object: (mg_coarse_sub_) 1 MPI processes > > > type: preonly > > > maximum iterations=1, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (mg_coarse_sub_) 1 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > > > matrix ordering: nd > > > factor fill ratio given 5, needed 1.91048 > > > Factored matrix follows: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=284, cols=284 > > > package used to perform factorization: petsc > > > total: nonzeros=7726, allocated nonzeros=7726 > > > total number of mallocs used during MatSetValues > calls =0 > > > using I-node routines: found 133 nodes, limit used > is 5 > > > linear system matrix = precond matrix: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=284, cols=284 > > > total: nonzeros=4044, allocated nonzeros=4044 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node routines > > > linear system matrix = precond matrix: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=284, cols=284 > > > total: nonzeros=4044, allocated nonzeros=4044 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node routines > > > Down solver (pre-smoother) on level 1 ------------------------------ > - > > > KSP Object: (mg_levels_1_) 1 MPI processes > > > type: chebyshev > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = 4.10212 > > 
> maximum iterations=2 > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using nonzero initial guess > > > using NONE norm type for convergence test > > > PC Object: (mg_levels_1_) 1 MPI processes > > > type: sor > > > SOR: type = local_symmetric, iterations = 1, local iterations > = 1, omega = 1 > > > linear system matrix = precond matrix: > > > Mat Object: () 1 MPI processes > > > type: seqaij > > > rows=9036, cols=9036 > > > total: nonzeros=192256, allocated nonzeros=192256 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node routines > > > Up solver (post-smoother) same as down solver (pre-smoother) > > > linear system matrix = precond matrix: > > > Mat Object: () 1 MPI processes > > > type: seqaij > > > rows=9036, cols=9036 > > > total: nonzeros=192256, allocated nonzeros=192256 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node routines > > > > > > Thanks, > > > Harshad > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: valgrind.log.33199 Type: application/octet-stream Size: 2922 bytes Desc: not available URL: From balay at mcs.anl.gov Tue Sep 13 13:05:23 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Sep 2016 13:05:23 -0500 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: On Tue, 13 Sep 2016, Kong (Non-US), Fande wrote: > > However - as Matt refered to - its best to use latest petsc-3.7 > > release. Does MOOSE require 3.6? > > > > > I think MOOSE works fine with petsc-3.7 as long as you do not use > superlu_dist. The superlu_dist has bugs in the latest petsc-3.7. Hm - I don't see any on reports on the superlu_dist bug tracker [or perhaps I missed some of the e-mail tarffic on petsc lists on this issue] https://github.com/xiaoyeli/superlu_dist/issues Satish From fande.kong at inl.gov Tue Sep 13 13:13:02 2016 From: fande.kong at inl.gov (Kong (Non-US), Fande) Date: Tue, 13 Sep 2016 12:13:02 -0600 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: On Tue, Sep 13, 2016 at 12:05 PM, Satish Balay wrote: > On Tue, 13 Sep 2016, Kong (Non-US), Fande wrote: > > > > However - as Matt refered to - its best to use latest petsc-3.7 > > > release. Does MOOSE require 3.6? > > > > > > > > I think MOOSE works fine with petsc-3.7 as long as you do not use > > superlu_dist. The superlu_dist has bugs in the latest petsc-3.7. > > Thanks, Satish, > Hm - I don't see any on reports on the superlu_dist bug tracker [or > perhaps I missed some of the e-mail tarffic on petsc lists on this > issue] > > https://urldefense.proofpoint.com/v2/url?u=https-3A__github. > com_xiaoyeli_superlu-5Fdist_issues&d=CwIBAg&c= > 54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00&r=DUUt3SRGI0_ > JgtNaS3udV68GRkgV4ts7XKfj2opmiCY&m=sbGb6NjNAkTFB_rboPxuOUZE_ > e0JXSaKebcodwE2J_s&s=p-u5ET_flkwmIQlbzL2ahWQkYGmzrxNtQ3MfbaI63rg&e= > > The bug is related with the algorithm in superlu_dist. It is a runtime bug not an compiling issue. I am thinking how to reproduce this issue on the petsc side. Fande, > Satish > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Tue Sep 13 13:23:50 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Sep 2016 13:23:50 -0500 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: On Tue, 13 Sep 2016, Kong (Non-US), Fande wrote: > On Tue, Sep 13, 2016 at 12:05 PM, Satish Balay wrote: > > > On Tue, 13 Sep 2016, Kong (Non-US), Fande wrote: > > > > > > However - as Matt refered to - its best to use latest petsc-3.7 > > > > release. Does MOOSE require 3.6? > > > > > > > > > > > I think MOOSE works fine with petsc-3.7 as long as you do not use > > > superlu_dist. The superlu_dist has bugs in the latest petsc-3.7. > > > > > Thanks, Satish, > > > > Hm - I don't see any on reports on the superlu_dist bug tracker [or > > perhaps I missed some of the e-mail tarffic on petsc lists on this > > issue] > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__github. > > com_xiaoyeli_superlu-5Fdist_issues&d=CwIBAg&c= > > 54IZrppPQZKX9mLzcGdPfFD1hxrcB__aEkJFOKJFd00&r=DUUt3SRGI0_ > > JgtNaS3udV68GRkgV4ts7XKfj2opmiCY&m=sbGb6NjNAkTFB_rboPxuOUZE_ > > e0JXSaKebcodwE2J_s&s=p-u5ET_flkwmIQlbzL2ahWQkYGmzrxNtQ3MfbaI63rg&e= > > > > > The bug is related with the algorithm in superlu_dist. It is a runtime bug > not an compiling issue. I am thinking how to reproduce this issue on the > petsc side. superlu_dist has git history from apporximately version 4.1 [the version thats used in petsc-3.6] - so you you can get a working and buggy build with the oldest and latest git snapshots - then you might be able to do 'git bisect' to narrow down to the change thats causing this issue. Satish From bsmith at mcs.anl.gov Tue Sep 13 13:47:46 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 Sep 2016 13:47:46 -0500 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> Message-ID: <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe wrote: > > Hi Barry, > > I compiled with mpich configured using --enable-g=meminit to get rid of MPI errors in Valgrind. Doing this reduced the number of errors to 2. I have attached the Valgrind output. This isn't helpful but it seems not to be a memory corruption issue :-( > > I'm using GAMG+GMRES for in each linear iteration of SNES. The linear solver converges with CONVERGED_RTOL for the first 6 iterations and with CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is happening. Any thoughts/ideas? Does this happen on one process? If so I would run in the debugger and track the variable to see everyplace the variable is changed, this would point to exactly what piece of code is changing the variable to this unexpected value. For example with lldb one can use watch http://lldb.llvm.org/tutorial.html to see each time a variable gets changed. Similar thing with gdb. The variable to watch is ksp->reason Once you get the hang of this it can take just a few minutes to track down the code that is making this unexpected value, though I understand if you haven't done it before it can be intimidating. Barry You can do the same thing in parallel (like on two processes) if you need to but it is more cumbersome since you need run multiple debuggers. 
You can have PETSc start up multiple debuggers with mpiexec -n 2 ./ex -start_in_debugger > > Thanks, > Harshad > > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith wrote: > > Install your MPI with --download-mpich as a PETSc ./configure option, this will eliminate all the MPICH valgrind errors. Then send as an attachment the resulting valgrind file. > > I do not 100 % trust any code that produces such valgrind errors. > > Barry > > > > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe wrote: > > > > Hi Barry, > > > > Thanks for the reply. My code is in C. I ran with Valgrind and found many "Conditional jump or move depends on uninitialized value(s)", "Invalid read" and "Use of uninitialized value" errors. I think all of them are from the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure what I'm looking for in the Valgrind output. At the end of the file, I get: > > > > ==40223== More than 10000000 total errors detected. I'm not reporting any more. > > ==40223== Final error counts will be inaccurate. Go fix your program! > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note > > ==40223== that errors may occur in your program without prior warning from > > ==40223== Valgrind, because errors are no longer being displayed. > > > > Can you give some suggestions on how I should proceed? > > > > Thanks, > > Harshad > > > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith wrote: > > > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very specialized and should never occur with GMRES. > > > > Can you run with valgrind to make sure there is no memory corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > Is your code fortran or C? > > > > Barry > > > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe wrote: > > > > > > Hi, > > > > > > I'm using GAMG + GMRES for my Poisson problem. The solver converges with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which is much higher than what I need (I need a tolerance of at least 1E-8). I am not able to figure out which tolerance I need to set to avoid convergence due to CONVERGED_STEP_LENGTH. > > > > > > Any help is appreciated! 
Output of -ksp_view and -ksp_monitor: > > > > > > 0 KSP Residual norm 3.121347818142e+00 > > > 1 KSP Residual norm 9.773346857844e-02 > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > > > KSP Object: 1 MPI processes > > > type: gmres > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > > GMRES: happy breakdown tolerance 1e-30 > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using PRECONDITIONED norm type for convergence test > > > PC Object: 1 MPI processes > > > type: gamg > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v > > > Cycles per PCApply=1 > > > Using Galerkin computed coarse grid matrices > > > Coarse grid solver -- level ------------------------------- > > > KSP Object: (mg_coarse_) 1 MPI processes > > > type: preonly > > > maximum iterations=1, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (mg_coarse_) 1 MPI processes > > > type: bjacobi > > > block Jacobi: number of blocks = 1 > > > Local solve is same for all blocks, in the following KSP and PC objects: > > > KSP Object: (mg_coarse_sub_) 1 MPI processes > > > type: preonly > > > maximum iterations=1, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (mg_coarse_sub_) 1 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > > > matrix ordering: nd > > > factor fill ratio given 5, needed 1.91048 > > > Factored matrix follows: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=284, cols=284 > > > package used to perform factorization: petsc > > > total: nonzeros=7726, allocated nonzeros=7726 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node routines: found 133 nodes, limit used is 5 > > > linear system matrix = precond matrix: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=284, cols=284 > > > total: nonzeros=4044, allocated nonzeros=4044 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node routines > > > linear system matrix = precond matrix: > > > Mat Object: 1 MPI processes > > > type: seqaij > > > rows=284, cols=284 > > > total: nonzeros=4044, allocated nonzeros=4044 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node routines > > > Down solver (pre-smoother) on level 1 ------------------------------- > > > KSP Object: (mg_levels_1_) 1 MPI processes > > > type: chebyshev > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = 4.10212 > > > maximum iterations=2 > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using nonzero initial guess > > > using NONE norm type for convergence test > > > PC Object: (mg_levels_1_) 1 MPI processes > > > type: sor > > > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 > > > linear system matrix = precond matrix: > > > Mat Object: () 1 MPI processes > > > type: seqaij > > > rows=9036, cols=9036 > > > total: nonzeros=192256, allocated nonzeros=192256 > > > total number of mallocs used during MatSetValues 
calls =0 > > > not using I-node routines > > > Up solver (post-smoother) same as down solver (pre-smoother) > > > linear system matrix = precond matrix: > > > Mat Object: () 1 MPI processes > > > type: seqaij > > > rows=9036, cols=9036 > > > total: nonzeros=192256, allocated nonzeros=192256 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node routines > > > > > > Thanks, > > > Harshad > > > > > > > From ferreroandrea13 at yahoo.it Tue Sep 13 14:31:01 2016 From: ferreroandrea13 at yahoo.it (Andrea Ferrero) Date: Tue, 13 Sep 2016 21:31:01 +0200 Subject: [petsc-users] Periodic BCs in DMPlex Message-ID: <57D853F5.1090103@yahoo.it> Hello everyone, I have a question about the implementation of periodic boundary conditions with the DMPlex class. I used this class to implement a discontinuous Galerkin discretization: I read and distribute the mesh starting from a GMSH file by means of the DMPlexCreateGmshFromFile function. How do you suggest managing periodic boundary conditions in DMPlex? Periodic BCs can be trivially activated in DMDA, but it seems to me that there is no way to activate them directly in DMPlex. Should I read the mesh with DMPlexCreateGmshFromFile and then manually change the Hasse diagram in order to take the periodicity into account? Many thanks Andrea From balay at mcs.anl.gov Tue Sep 13 14:51:20 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Sep 2016 14:51:20 -0500 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: On Tue, 13 Sep 2016, Satish Balay wrote: > scalapack is getting compiled with this flag '-Df77IsF2C'. This mode > was primarily supported by 'g77' previously - which we hardly ever use > anymore - so this mode is not really tested? Looks like no one ever tested scalapack with the open64 compiler [as it's internally inconsistent with the double-underscore mode: it uses -Df77IsF2C for some parts of the code and -DFortranIsF2C for the others]. The following patch works for me. I've added it to petsc-3.7. You can try patching 3.6 with it and see if it works. 
https://bitbucket.org/petsc/petsc/commits/3087a8ae6474620b012259e2918d8cbd6a1fd369 Satish From jason.hou at ncsu.edu Tue Sep 13 16:30:15 2016 From: jason.hou at ncsu.edu (Jason Hou) Date: Tue, 13 Sep 2016 17:30:15 -0400 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: Hi Satish, The output is the following: [jhou8 at rdfmg lib]$ nm -Ao libscalapack.a |grep -i blacs_gridinfo libscalapack.a:blacs_abort_.o: U Cblacs_gridinfo libscalapack.a:blacs_info_.o:0000000000000000 T blacs_gridinfo_ libscalapack.a:blacs_abort_.oo: U Cblacs_gridinfo libscalapack.a:blacs_info_.oo:0000000000000000 T Cblacs_gridinfo libscalapack.a:chk1mat.o: U blacs_gridinfo__ libscalapack.a:pchkxmat.o: U blacs_gridinfo__ libscalapack.a:desc_convert.o: U blacs_gridinfo__ libscalapack.a:descinit.o: U blacs_gridinfo__ libscalapack.a:reshape.o: U Cblacs_gridinfo libscalapack.a:SL_gridreshape.o: U Cblacs_gridinfo libscalapack.a:picol2row.o: U blacs_gridinfo__ libscalapack.a:pirow2col.o: U blacs_gridinfo__ libscalapack.a:pilaprnt.o: U blacs_gridinfo__ libscalapack.a:pitreecomb.o: U blacs_gridinfo__ libscalapack.a:pichekpad.o: U blacs_gridinfo__ libscalapack.a:pielset.o: U blacs_gridinfo__ libscalapack.a:pielset2.o: U blacs_gridinfo__ libscalapack.a:pielget.o: U blacs_gridinfo__ libscalapack.a:psmatadd.o: U blacs_gridinfo__ libscalapack.a:pscol2row.o: U blacs_gridinfo__ libscalapack.a:psrow2col.o: U blacs_gridinfo__ libscalapack.a:pslaprnt.o: U blacs_gridinfo__ libscalapack.a:pstreecomb.o: U blacs_gridinfo__ libscalapack.a:pschekpad.o: U blacs_gridinfo__ libscalapack.a:pselset.o: U blacs_gridinfo__ libscalapack.a:pselset2.o: U blacs_gridinfo__ libscalapack.a:pselget.o: U blacs_gridinfo__ libscalapack.a:pslaread.o: U blacs_gridinfo__ libscalapack.a:pslawrite.o: U blacs_gridinfo__ libscalapack.a:pdmatadd.o: U blacs_gridinfo__ libscalapack.a:pdcol2row.o: U blacs_gridinfo__ libscalapack.a:pdrow2col.o: U blacs_gridinfo__ libscalapack.a:pdlaprnt.o: U blacs_gridinfo__ libscalapack.a:pdtreecomb.o: U blacs_gridinfo__ libscalapack.a:pdchekpad.o: U blacs_gridinfo__ libscalapack.a:pdelset.o: U blacs_gridinfo__ libscalapack.a:pdelset2.o: U blacs_gridinfo__ libscalapack.a:pdelget.o: U blacs_gridinfo__ libscalapack.a:pdlaread.o: U blacs_gridinfo__ libscalapack.a:pdlawrite.o: U blacs_gridinfo__ libscalapack.a:pcmatadd.o: U blacs_gridinfo__ libscalapack.a:pccol2row.o: U blacs_gridinfo__ libscalapack.a:pcrow2col.o: U blacs_gridinfo__ libscalapack.a:pclaprnt.o: U blacs_gridinfo__ libscalapack.a:pctreecomb.o: U blacs_gridinfo__ libscalapack.a:pcchekpad.o: U blacs_gridinfo__ libscalapack.a:pcelset.o: U blacs_gridinfo__ libscalapack.a:pcelset2.o: U blacs_gridinfo__ libscalapack.a:pcelget.o: U blacs_gridinfo__ libscalapack.a:pclaread.o: U blacs_gridinfo__ libscalapack.a:pclawrite.o: U blacs_gridinfo__ libscalapack.a:pzmatadd.o: U blacs_gridinfo__ libscalapack.a:pzcol2row.o: U blacs_gridinfo__ libscalapack.a:pzrow2col.o: U blacs_gridinfo__ libscalapack.a:pzlaprnt.o: U blacs_gridinfo__ libscalapack.a:pztreecomb.o: U blacs_gridinfo__ libscalapack.a:pzchekpad.o: U blacs_gridinfo__ libscalapack.a:pzelset.o: U blacs_gridinfo__ libscalapack.a:pzelset2.o: U blacs_gridinfo__ libscalapack.a:pzelget.o: U blacs_gridinfo__ libscalapack.a:pzlaread.o: U blacs_gridinfo__ libscalapack.a:pzlawrite.o: U blacs_gridinfo__ libscalapack.a:picopy_.o: U Cblacs_gridinfo libscalapack.a:pbstran.o: U blacs_gridinfo__ libscalapack.a:pbstrnv.o: U blacs_gridinfo__ libscalapack.a:pxerbla.o: U blacs_gridinfo__ libscalapack.a:PB_CGatherV.o: U 
Cblacs_gridinfo libscalapack.a:PB_CInV.o: U Cblacs_gridinfo libscalapack.a:PB_CInV2.o: U Cblacs_gridinfo libscalapack.a:PB_CInOutV.o: U Cblacs_gridinfo libscalapack.a:PB_CInOutV2.o: U Cblacs_gridinfo libscalapack.a:PB_COutV.o: U Cblacs_gridinfo libscalapack.a:PB_CScatterV.o: U Cblacs_gridinfo libscalapack.a:PB_Cabort.o: U Cblacs_gridinfo libscalapack.a:PB_Cchkmat.o: U Cblacs_gridinfo libscalapack.a:PB_Cchkvec.o: U Cblacs_gridinfo libscalapack.a:PB_CpswapNN.o: U Cblacs_gridinfo libscalapack.a:PB_CpswapND.o: U Cblacs_gridinfo libscalapack.a:PB_Cpdot11.o: U Cblacs_gridinfo libscalapack.a:PB_CpdotNN.o: U Cblacs_gridinfo libscalapack.a:PB_CpdotND.o: U Cblacs_gridinfo libscalapack.a:PB_CpaxpbyNN.o: U Cblacs_gridinfo libscalapack.a:PB_CpaxpbyND.o: U Cblacs_gridinfo libscalapack.a:PB_CpaxpbyDN.o: U Cblacs_gridinfo libscalapack.a:PB_Cpaxpby.o: U Cblacs_gridinfo libscalapack.a:PB_CpgemmBC.o: U Cblacs_gridinfo libscalapack.a:PB_CpgemmAC.o: U Cblacs_gridinfo libscalapack.a:PB_CpgemmAB.o: U Cblacs_gridinfo libscalapack.a:PB_Cplaprnt.o: U Cblacs_gridinfo libscalapack.a:PB_Cplapad.o: U Cblacs_gridinfo libscalapack.a:PB_Cplapd2.o: U Cblacs_gridinfo libscalapack.a:PB_Cplascal.o: U Cblacs_gridinfo libscalapack.a:PB_Cplasca2.o: U Cblacs_gridinfo libscalapack.a:PB_Cplacnjg.o: U Cblacs_gridinfo libscalapack.a:PB_Cpsym.o: U Cblacs_gridinfo libscalapack.a:PB_CpsymmAB.o: U Cblacs_gridinfo libscalapack.a:PB_CpsymmBC.o: U Cblacs_gridinfo libscalapack.a:PB_Cpsyr.o: U Cblacs_gridinfo libscalapack.a:PB_CpsyrkA.o: U Cblacs_gridinfo libscalapack.a:PB_CpsyrkAC.o: U Cblacs_gridinfo libscalapack.a:PB_Cpsyr2.o: U Cblacs_gridinfo libscalapack.a:PB_Cpsyr2kA.o: U Cblacs_gridinfo libscalapack.a:PB_Cpsyr2kAC.o: U Cblacs_gridinfo libscalapack.a:PB_Cptrm.o: U Cblacs_gridinfo libscalapack.a:PB_Cpgeadd.o: U Cblacs_gridinfo libscalapack.a:PB_Cptran.o: U Cblacs_gridinfo libscalapack.a:PB_CptrmmAB.o: U Cblacs_gridinfo libscalapack.a:PB_CptrmmB.o: U Cblacs_gridinfo libscalapack.a:PB_Cptrsm.o: U Cblacs_gridinfo libscalapack.a:PB_CptrsmAB.o: U Cblacs_gridinfo libscalapack.a:PB_CptrsmAB0.o: U Cblacs_gridinfo libscalapack.a:PB_CptrsmAB1.o: U Cblacs_gridinfo libscalapack.a:PB_CptrsmB.o: U Cblacs_gridinfo libscalapack.a:PB_Cptrsv.o: U Cblacs_gridinfo libscalapack.a:PB_Cwarn.o: U Cblacs_gridinfo libscalapack.a:psswap_.o: U Cblacs_gridinfo libscalapack.a:psscal_.o: U Cblacs_gridinfo libscalapack.a:pscopy_.o: U Cblacs_gridinfo libscalapack.a:psaxpy_.o: U Cblacs_gridinfo libscalapack.a:psdot_.o: U Cblacs_gridinfo libscalapack.a:psnrm2_.o: U Cblacs_gridinfo libscalapack.a:psasum_.o: U Cblacs_gridinfo libscalapack.a:psamax_.o: U Cblacs_gridinfo libscalapack.a:psgemv_.o: U Cblacs_gridinfo libscalapack.a:psger_.o: U Cblacs_gridinfo libscalapack.a:pssymv_.o: U Cblacs_gridinfo libscalapack.a:pssyr_.o: U Cblacs_gridinfo libscalapack.a:pssyr2_.o: U Cblacs_gridinfo libscalapack.a:pstrmv_.o: U Cblacs_gridinfo libscalapack.a:pstrsv_.o: U Cblacs_gridinfo libscalapack.a:psagemv_.o: U Cblacs_gridinfo libscalapack.a:psasymv_.o: U Cblacs_gridinfo libscalapack.a:psatrmv_.o: U Cblacs_gridinfo libscalapack.a:psgeadd_.o: U Cblacs_gridinfo libscalapack.a:psgemm_.o: U Cblacs_gridinfo libscalapack.a:pssymm_.o: U Cblacs_gridinfo libscalapack.a:pssyr2k_.o: U Cblacs_gridinfo libscalapack.a:pssyrk_.o: U Cblacs_gridinfo libscalapack.a:pstradd_.o: U Cblacs_gridinfo libscalapack.a:pstran_.o: U Cblacs_gridinfo libscalapack.a:pstrmm_.o: U Cblacs_gridinfo libscalapack.a:pstrsm_.o: U Cblacs_gridinfo libscalapack.a:pbdtran.o: U blacs_gridinfo__ libscalapack.a:pbdtrnv.o: U 
blacs_gridinfo__ libscalapack.a:pdswap_.o: U Cblacs_gridinfo libscalapack.a:pdscal_.o: U Cblacs_gridinfo libscalapack.a:pdcopy_.o: U Cblacs_gridinfo libscalapack.a:pdaxpy_.o: U Cblacs_gridinfo libscalapack.a:pddot_.o: U Cblacs_gridinfo libscalapack.a:pdnrm2_.o: U Cblacs_gridinfo libscalapack.a:pdasum_.o: U Cblacs_gridinfo libscalapack.a:pdamax_.o: U Cblacs_gridinfo libscalapack.a:pdgemv_.o: U Cblacs_gridinfo libscalapack.a:pdger_.o: U Cblacs_gridinfo libscalapack.a:pdsymv_.o: U Cblacs_gridinfo libscalapack.a:pdsyr_.o: U Cblacs_gridinfo libscalapack.a:pdsyr2_.o: U Cblacs_gridinfo libscalapack.a:pdtrmv_.o: U Cblacs_gridinfo libscalapack.a:pdtrsv_.o: U Cblacs_gridinfo libscalapack.a:pdagemv_.o: U Cblacs_gridinfo libscalapack.a:pdasymv_.o: U Cblacs_gridinfo libscalapack.a:pdatrmv_.o: U Cblacs_gridinfo libscalapack.a:pdgeadd_.o: U Cblacs_gridinfo libscalapack.a:pdgemm_.o: U Cblacs_gridinfo libscalapack.a:pdsymm_.o: U Cblacs_gridinfo libscalapack.a:pdsyr2k_.o: U Cblacs_gridinfo libscalapack.a:pdsyrk_.o: U Cblacs_gridinfo libscalapack.a:pdtradd_.o: U Cblacs_gridinfo libscalapack.a:pdtran_.o: U Cblacs_gridinfo libscalapack.a:pdtrmm_.o: U Cblacs_gridinfo libscalapack.a:pdtrsm_.o: U Cblacs_gridinfo libscalapack.a:pbctran.o: U blacs_gridinfo__ libscalapack.a:pbctrnv.o: U blacs_gridinfo__ libscalapack.a:pcswap_.o: U Cblacs_gridinfo libscalapack.a:pcscal_.o: U Cblacs_gridinfo libscalapack.a:pcsscal_.o: U Cblacs_gridinfo libscalapack.a:pccopy_.o: U Cblacs_gridinfo libscalapack.a:pcaxpy_.o: U Cblacs_gridinfo libscalapack.a:pcdotu_.o: U Cblacs_gridinfo libscalapack.a:pcdotc_.o: U Cblacs_gridinfo libscalapack.a:pscnrm2_.o: U Cblacs_gridinfo libscalapack.a:pscasum_.o: U Cblacs_gridinfo libscalapack.a:pcamax_.o: U Cblacs_gridinfo libscalapack.a:pcgemv_.o: U Cblacs_gridinfo libscalapack.a:pcgerc_.o: U Cblacs_gridinfo libscalapack.a:pcgeru_.o: U Cblacs_gridinfo libscalapack.a:pchemv_.o: U Cblacs_gridinfo libscalapack.a:pcher_.o: U Cblacs_gridinfo libscalapack.a:pcher2_.o: U Cblacs_gridinfo libscalapack.a:pctrmv_.o: U Cblacs_gridinfo libscalapack.a:pctrsv_.o: U Cblacs_gridinfo libscalapack.a:pcagemv_.o: U Cblacs_gridinfo libscalapack.a:pcahemv_.o: U Cblacs_gridinfo libscalapack.a:pcatrmv_.o: U Cblacs_gridinfo libscalapack.a:pcgeadd_.o: U Cblacs_gridinfo libscalapack.a:pcgemm_.o: U Cblacs_gridinfo libscalapack.a:pchemm_.o: U Cblacs_gridinfo libscalapack.a:pcher2k_.o: U Cblacs_gridinfo libscalapack.a:pcherk_.o: U Cblacs_gridinfo libscalapack.a:pcsymm_.o: U Cblacs_gridinfo libscalapack.a:pcsyr2k_.o: U Cblacs_gridinfo libscalapack.a:pcsyrk_.o: U Cblacs_gridinfo libscalapack.a:pctradd_.o: U Cblacs_gridinfo libscalapack.a:pctranc_.o: U Cblacs_gridinfo libscalapack.a:pctranu_.o: U Cblacs_gridinfo libscalapack.a:pctrmm_.o: U Cblacs_gridinfo libscalapack.a:pctrsm_.o: U Cblacs_gridinfo libscalapack.a:pbztran.o: U blacs_gridinfo__ libscalapack.a:pbztrnv.o: U blacs_gridinfo__ libscalapack.a:pzswap_.o: U Cblacs_gridinfo libscalapack.a:pzscal_.o: U Cblacs_gridinfo libscalapack.a:pzdscal_.o: U Cblacs_gridinfo libscalapack.a:pzcopy_.o: U Cblacs_gridinfo libscalapack.a:pzaxpy_.o: U Cblacs_gridinfo libscalapack.a:pzdotu_.o: U Cblacs_gridinfo libscalapack.a:pzdotc_.o: U Cblacs_gridinfo libscalapack.a:pdznrm2_.o: U Cblacs_gridinfo libscalapack.a:pdzasum_.o: U Cblacs_gridinfo libscalapack.a:pzamax_.o: U Cblacs_gridinfo libscalapack.a:pzgemv_.o: U Cblacs_gridinfo libscalapack.a:pzgerc_.o: U Cblacs_gridinfo libscalapack.a:pzgeru_.o: U Cblacs_gridinfo libscalapack.a:pzhemv_.o: U Cblacs_gridinfo libscalapack.a:pzher_.o: U 
Cblacs_gridinfo libscalapack.a:pzher2_.o: U Cblacs_gridinfo libscalapack.a:pztrmv_.o: U Cblacs_gridinfo libscalapack.a:pztrsv_.o: U Cblacs_gridinfo libscalapack.a:pzagemv_.o: U Cblacs_gridinfo libscalapack.a:pzahemv_.o: U Cblacs_gridinfo libscalapack.a:pzatrmv_.o: U Cblacs_gridinfo libscalapack.a:pzgeadd_.o: U Cblacs_gridinfo libscalapack.a:pzgemm_.o: U Cblacs_gridinfo libscalapack.a:pzhemm_.o: U Cblacs_gridinfo libscalapack.a:pzher2k_.o: U Cblacs_gridinfo libscalapack.a:pzherk_.o: U Cblacs_gridinfo libscalapack.a:pzsymm_.o: U Cblacs_gridinfo libscalapack.a:pzsyr2k_.o: U Cblacs_gridinfo libscalapack.a:pzsyrk_.o: U Cblacs_gridinfo libscalapack.a:pztradd_.o: U Cblacs_gridinfo libscalapack.a:pztranc_.o: U Cblacs_gridinfo libscalapack.a:pztranu_.o: U Cblacs_gridinfo libscalapack.a:pztrmm_.o: U Cblacs_gridinfo libscalapack.a:pztrsm_.o: U Cblacs_gridinfo libscalapack.a:pigemr.o: U Cblacs_gridinfo libscalapack.a:pitrmr.o: U Cblacs_gridinfo libscalapack.a:pgemraux.o: U Cblacs_gridinfo libscalapack.a:psgemr.o: U Cblacs_gridinfo libscalapack.a:pstrmr.o: U Cblacs_gridinfo libscalapack.a:pdgemr.o: U Cblacs_gridinfo libscalapack.a:pdtrmr.o: U Cblacs_gridinfo libscalapack.a:pcgemr.o: U Cblacs_gridinfo libscalapack.a:pctrmr.o: U Cblacs_gridinfo libscalapack.a:pzgemr.o: U Cblacs_gridinfo libscalapack.a:pztrmr.o: U Cblacs_gridinfo libscalapack.a:psdbsv.o: U blacs_gridinfo__ libscalapack.a:psdbtrf.o: U blacs_gridinfo__ libscalapack.a:psdbtrs.o: U blacs_gridinfo__ libscalapack.a:psdbtrsv.o: U blacs_gridinfo__ libscalapack.a:psdtsv.o: U blacs_gridinfo__ libscalapack.a:psdttrf.o: U blacs_gridinfo__ libscalapack.a:psdttrs.o: U blacs_gridinfo__ libscalapack.a:psdttrsv.o: U blacs_gridinfo__ libscalapack.a:psgbsv.o: U blacs_gridinfo__ libscalapack.a:psgbtrf.o: U blacs_gridinfo__ libscalapack.a:psgbtrs.o: U blacs_gridinfo__ libscalapack.a:psgebd2.o: U blacs_gridinfo__ libscalapack.a:psgebrd.o: U blacs_gridinfo__ libscalapack.a:psgecon.o: U blacs_gridinfo__ libscalapack.a:psgeequ.o: U blacs_gridinfo__ libscalapack.a:psgehd2.o: U blacs_gridinfo__ libscalapack.a:psgehrd.o: U blacs_gridinfo__ libscalapack.a:psgelq2.o: U blacs_gridinfo__ libscalapack.a:psgelqf.o: U blacs_gridinfo__ libscalapack.a:psgels.o: U blacs_gridinfo__ libscalapack.a:psgeql2.o: U blacs_gridinfo__ libscalapack.a:psgeqlf.o: U blacs_gridinfo__ libscalapack.a:psgeqpf.o: U blacs_gridinfo__ libscalapack.a:psgeqr2.o: U blacs_gridinfo__ libscalapack.a:psgeqrf.o: U blacs_gridinfo__ libscalapack.a:psgerfs.o: U blacs_gridinfo__ libscalapack.a:psgerq2.o: U blacs_gridinfo__ libscalapack.a:psgerqf.o: U blacs_gridinfo__ libscalapack.a:psgesv.o: U blacs_gridinfo__ libscalapack.a:psgesvd.o: U blacs_gridinfo__ libscalapack.a:psgesvx.o: U blacs_gridinfo__ libscalapack.a:psgetf2.o: U blacs_gridinfo__ libscalapack.a:psgetrf.o: U blacs_gridinfo__ libscalapack.a:psgetri.o: U blacs_gridinfo__ libscalapack.a:psgetrs.o: U blacs_gridinfo__ libscalapack.a:psggqrf.o: U blacs_gridinfo__ libscalapack.a:psggrqf.o: U blacs_gridinfo__ libscalapack.a:pslabrd.o: U blacs_gridinfo__ libscalapack.a:pslacon.o: U blacs_gridinfo__ libscalapack.a:pslacp2.o: U blacs_gridinfo__ libscalapack.a:pslahrd.o: U blacs_gridinfo__ libscalapack.a:pslange.o: U blacs_gridinfo__ libscalapack.a:pslanhs.o: U blacs_gridinfo__ libscalapack.a:pslansy.o: U blacs_gridinfo__ libscalapack.a:pslantr.o: U blacs_gridinfo__ libscalapack.a:pslapiv.o: U blacs_gridinfo__ libscalapack.a:pslapv2.o: U blacs_gridinfo__ libscalapack.a:pslaqge.o: U blacs_gridinfo__ libscalapack.a:pslaqsy.o: U blacs_gridinfo__ 
libscalapack.a:pslarf.o: U blacs_gridinfo__ libscalapack.a:pslarfb.o: U blacs_gridinfo__ libscalapack.a:pslarfg.o: U blacs_gridinfo__ libscalapack.a:pslarft.o: U blacs_gridinfo__ libscalapack.a:pslase2.o: U blacs_gridinfo__ libscalapack.a:pslascl.o: U blacs_gridinfo__ libscalapack.a:pslassq.o: U blacs_gridinfo__ libscalapack.a:pslaswp.o: U blacs_gridinfo__ libscalapack.a:pslatra.o: U blacs_gridinfo__ libscalapack.a:pslatrd.o: U blacs_gridinfo__ libscalapack.a:pslatrs.o: U blacs_gridinfo__ libscalapack.a:pslauu2.o: U blacs_gridinfo__ libscalapack.a:psorg2l.o: U blacs_gridinfo__ libscalapack.a:psorg2r.o: U blacs_gridinfo__ libscalapack.a:psorgl2.o: U blacs_gridinfo__ libscalapack.a:psorglq.o: U blacs_gridinfo__ libscalapack.a:psorgql.o: U blacs_gridinfo__ libscalapack.a:psorgqr.o: U blacs_gridinfo__ libscalapack.a:psorgr2.o: U blacs_gridinfo__ libscalapack.a:psorgrq.o: U blacs_gridinfo__ libscalapack.a:psorm2l.o: U blacs_gridinfo__ libscalapack.a:psorm2r.o: U blacs_gridinfo__ libscalapack.a:psormbr.o: U blacs_gridinfo__ libscalapack.a:psormhr.o: U blacs_gridinfo__ libscalapack.a:psorml2.o: U blacs_gridinfo__ libscalapack.a:psormlq.o: U blacs_gridinfo__ libscalapack.a:psormql.o: U blacs_gridinfo__ libscalapack.a:psormqr.o: U blacs_gridinfo__ libscalapack.a:psormr2.o: U blacs_gridinfo__ libscalapack.a:psormrq.o: U blacs_gridinfo__ libscalapack.a:psormtr.o: U blacs_gridinfo__ libscalapack.a:pspocon.o: U blacs_gridinfo__ libscalapack.a:pspbsv.o: U blacs_gridinfo__ libscalapack.a:pspbtrf.o: U blacs_gridinfo__ libscalapack.a:pspbtrs.o: U blacs_gridinfo__ libscalapack.a:pspbtrsv.o: U blacs_gridinfo__ libscalapack.a:psptsv.o: U blacs_gridinfo__ libscalapack.a:pspttrf.o: U blacs_gridinfo__ libscalapack.a:pspttrs.o: U blacs_gridinfo__ libscalapack.a:pspttrsv.o: U blacs_gridinfo__ libscalapack.a:pspoequ.o: U blacs_gridinfo__ libscalapack.a:psporfs.o: U blacs_gridinfo__ libscalapack.a:psposv.o: U blacs_gridinfo__ libscalapack.a:psposvx.o: U blacs_gridinfo__ libscalapack.a:pspotf2.o: U blacs_gridinfo__ libscalapack.a:pspotrf.o: U blacs_gridinfo__ libscalapack.a:pspotri.o: U blacs_gridinfo__ libscalapack.a:pspotrs.o: U blacs_gridinfo__ libscalapack.a:psrscl.o: U blacs_gridinfo__ libscalapack.a:psstein.o: U blacs_gridinfo__ libscalapack.a:pssyev.o: U blacs_gridinfo__ libscalapack.a:pssyevd.o: U blacs_gridinfo__ libscalapack.a:pssyevx.o: U blacs_gridinfo__ libscalapack.a:pssygs2.o: U blacs_gridinfo__ libscalapack.a:pssygst.o: U blacs_gridinfo__ libscalapack.a:pssygvx.o: U blacs_gridinfo__ libscalapack.a:pssyngst.o: U blacs_gridinfo__ libscalapack.a:pssyntrd.o: U blacs_gridinfo__ libscalapack.a:pssyttrd.o: U blacs_gridinfo__ libscalapack.a:pssytd2.o: U blacs_gridinfo__ libscalapack.a:pssytrd.o: U blacs_gridinfo__ libscalapack.a:pstrti2.o: U blacs_gridinfo__ libscalapack.a:pstrtri.o: U blacs_gridinfo__ libscalapack.a:pstrtrs.o: U blacs_gridinfo__ libscalapack.a:pslaevswp.o: U blacs_gridinfo__ libscalapack.a:pslarzb.o: U blacs_gridinfo__ libscalapack.a:pslarzt.o: U blacs_gridinfo__ libscalapack.a:pslarz.o: U blacs_gridinfo__ libscalapack.a:pslatrz.o: U blacs_gridinfo__ libscalapack.a:pstzrzf.o: U blacs_gridinfo__ libscalapack.a:psormr3.o: U blacs_gridinfo__ libscalapack.a:psormrz.o: U blacs_gridinfo__ libscalapack.a:pslahqr.o: U blacs_gridinfo__ libscalapack.a:pslaconsb.o: U blacs_gridinfo__ libscalapack.a:pslacp3.o: U blacs_gridinfo__ libscalapack.a:pslawil.o: U blacs_gridinfo__ libscalapack.a:pslasmsub.o: U blacs_gridinfo__ libscalapack.a:pslared2d.o: U blacs_gridinfo__ libscalapack.a:pslamr1d.o: U 
blacs_gridinfo__ libscalapack.a:pssyevr.o: U blacs_gridinfo__ libscalapack.a:pstrord.o: U blacs_gridinfo__ libscalapack.a:pstrsen.o: U blacs_gridinfo__ libscalapack.a:psgebal.o: U blacs_gridinfo__ libscalapack.a:pshseqr.o: U blacs_gridinfo__ libscalapack.a:pslamve.o: U blacs_gridinfo__ libscalapack.a:pslaqr0.o: U blacs_gridinfo__ libscalapack.a:pslaqr1.o: U blacs_gridinfo__ libscalapack.a:pslaqr2.o: U blacs_gridinfo__ libscalapack.a:pslaqr3.o: U blacs_gridinfo__ libscalapack.a:pslaqr4.o: U blacs_gridinfo__ libscalapack.a:pslaqr5.o: U blacs_gridinfo__ libscalapack.a:psrot.o: U blacs_gridinfo__ libscalapack.a:pslaed0.o: U blacs_gridinfo__ libscalapack.a:pslaed1.o: U blacs_gridinfo__ libscalapack.a:pslaed2.o: U blacs_gridinfo__ libscalapack.a:pslaed3.o: U blacs_gridinfo__ libscalapack.a:pslaedz.o: U blacs_gridinfo__ libscalapack.a:pslared1d.o: U blacs_gridinfo__ libscalapack.a:pslasrt.o: U blacs_gridinfo__ libscalapack.a:psstebz.o: U blacs_gridinfo__ libscalapack.a:psstedc.o: U blacs_gridinfo__ libscalapack.a:pilaenvx.o: U blacs_gridinfo__ libscalapack.a:piparmq.o: U blacs_gridinfo__ libscalapack.a:pddbsv.o: U blacs_gridinfo__ libscalapack.a:pddbtrf.o: U blacs_gridinfo__ libscalapack.a:pddbtrs.o: U blacs_gridinfo__ libscalapack.a:pddbtrsv.o: U blacs_gridinfo__ libscalapack.a:pddtsv.o: U blacs_gridinfo__ libscalapack.a:pddttrf.o: U blacs_gridinfo__ libscalapack.a:pddttrs.o: U blacs_gridinfo__ libscalapack.a:pddttrsv.o: U blacs_gridinfo__ libscalapack.a:pdgbsv.o: U blacs_gridinfo__ libscalapack.a:pdgbtrf.o: U blacs_gridinfo__ libscalapack.a:pdgbtrs.o: U blacs_gridinfo__ libscalapack.a:pdgebd2.o: U blacs_gridinfo__ libscalapack.a:pdgebrd.o: U blacs_gridinfo__ libscalapack.a:pdgecon.o: U blacs_gridinfo__ libscalapack.a:pdgeequ.o: U blacs_gridinfo__ libscalapack.a:pdgehd2.o: U blacs_gridinfo__ libscalapack.a:pdgehrd.o: U blacs_gridinfo__ libscalapack.a:pdgelq2.o: U blacs_gridinfo__ libscalapack.a:pdgelqf.o: U blacs_gridinfo__ libscalapack.a:pdgels.o: U blacs_gridinfo__ libscalapack.a:pdgeql2.o: U blacs_gridinfo__ libscalapack.a:pdgeqlf.o: U blacs_gridinfo__ libscalapack.a:pdgeqpf.o: U blacs_gridinfo__ libscalapack.a:pdgeqr2.o: U blacs_gridinfo__ libscalapack.a:pdgeqrf.o: U blacs_gridinfo__ libscalapack.a:pdgerfs.o: U blacs_gridinfo__ libscalapack.a:pdgerq2.o: U blacs_gridinfo__ libscalapack.a:pdgerqf.o: U blacs_gridinfo__ libscalapack.a:pdgesv.o: U blacs_gridinfo__ libscalapack.a:pdgesvd.o: U blacs_gridinfo__ libscalapack.a:pdgesvx.o: U blacs_gridinfo__ libscalapack.a:pdgetf2.o: U blacs_gridinfo__ libscalapack.a:pdgetrf.o: U blacs_gridinfo__ libscalapack.a:pdgetri.o: U blacs_gridinfo__ libscalapack.a:pdgetrs.o: U blacs_gridinfo__ libscalapack.a:pdggqrf.o: U blacs_gridinfo__ libscalapack.a:pdggrqf.o: U blacs_gridinfo__ libscalapack.a:pdlabrd.o: U blacs_gridinfo__ libscalapack.a:pdlacon.o: U blacs_gridinfo__ libscalapack.a:pdlacp2.o: U blacs_gridinfo__ libscalapack.a:pdlahrd.o: U blacs_gridinfo__ libscalapack.a:pdlange.o: U blacs_gridinfo__ libscalapack.a:pdlanhs.o: U blacs_gridinfo__ libscalapack.a:pdlansy.o: U blacs_gridinfo__ libscalapack.a:pdlantr.o: U blacs_gridinfo__ libscalapack.a:pdlapiv.o: U blacs_gridinfo__ libscalapack.a:pdlapv2.o: U blacs_gridinfo__ libscalapack.a:pdlaqge.o: U blacs_gridinfo__ libscalapack.a:pdlaqsy.o: U blacs_gridinfo__ libscalapack.a:pdlarf.o: U blacs_gridinfo__ libscalapack.a:pdlarfb.o: U blacs_gridinfo__ libscalapack.a:pdlarfg.o: U blacs_gridinfo__ libscalapack.a:pdlarft.o: U blacs_gridinfo__ libscalapack.a:pdlase2.o: U blacs_gridinfo__ libscalapack.a:pdlascl.o: U 
blacs_gridinfo__ libscalapack.a:pdlassq.o: U blacs_gridinfo__ libscalapack.a:pdlaswp.o: U blacs_gridinfo__ libscalapack.a:pdlatra.o: U blacs_gridinfo__ libscalapack.a:pdlatrd.o: U blacs_gridinfo__ libscalapack.a:pdlatrs.o: U blacs_gridinfo__ libscalapack.a:pdlauu2.o: U blacs_gridinfo__ libscalapack.a:pdorg2l.o: U blacs_gridinfo__ libscalapack.a:pdorg2r.o: U blacs_gridinfo__ libscalapack.a:pdorgl2.o: U blacs_gridinfo__ libscalapack.a:pdorglq.o: U blacs_gridinfo__ libscalapack.a:pdorgql.o: U blacs_gridinfo__ libscalapack.a:pdorgqr.o: U blacs_gridinfo__ libscalapack.a:pdorgr2.o: U blacs_gridinfo__ libscalapack.a:pdorgrq.o: U blacs_gridinfo__ libscalapack.a:pdorm2l.o: U blacs_gridinfo__ libscalapack.a:pdorm2r.o: U blacs_gridinfo__ libscalapack.a:pdormbr.o: U blacs_gridinfo__ libscalapack.a:pdormhr.o: U blacs_gridinfo__ libscalapack.a:pdorml2.o: U blacs_gridinfo__ libscalapack.a:pdormlq.o: U blacs_gridinfo__ libscalapack.a:pdormql.o: U blacs_gridinfo__ libscalapack.a:pdormqr.o: U blacs_gridinfo__ libscalapack.a:pdormr2.o: U blacs_gridinfo__ libscalapack.a:pdormrq.o: U blacs_gridinfo__ libscalapack.a:pdormtr.o: U blacs_gridinfo__ libscalapack.a:pdpocon.o: U blacs_gridinfo__ libscalapack.a:pdpbsv.o: U blacs_gridinfo__ libscalapack.a:pdpbtrf.o: U blacs_gridinfo__ libscalapack.a:pdpbtrs.o: U blacs_gridinfo__ libscalapack.a:pdpbtrsv.o: U blacs_gridinfo__ libscalapack.a:pdptsv.o: U blacs_gridinfo__ libscalapack.a:pdpttrf.o: U blacs_gridinfo__ libscalapack.a:pdpttrs.o: U blacs_gridinfo__ libscalapack.a:pdpttrsv.o: U blacs_gridinfo__ libscalapack.a:pdpoequ.o: U blacs_gridinfo__ libscalapack.a:pdporfs.o: U blacs_gridinfo__ libscalapack.a:pdposv.o: U blacs_gridinfo__ libscalapack.a:pdposvx.o: U blacs_gridinfo__ libscalapack.a:pdpotf2.o: U blacs_gridinfo__ libscalapack.a:pdpotrf.o: U blacs_gridinfo__ libscalapack.a:pdpotri.o: U blacs_gridinfo__ libscalapack.a:pdpotrs.o: U blacs_gridinfo__ libscalapack.a:pdrscl.o: U blacs_gridinfo__ libscalapack.a:pdstein.o: U blacs_gridinfo__ libscalapack.a:pdsyev.o: U blacs_gridinfo__ libscalapack.a:pdsyevd.o: U blacs_gridinfo__ libscalapack.a:pdsyevx.o: U blacs_gridinfo__ libscalapack.a:pdsygs2.o: U blacs_gridinfo__ libscalapack.a:pdsygst.o: U blacs_gridinfo__ libscalapack.a:pdsygvx.o: U blacs_gridinfo__ libscalapack.a:pdsyngst.o: U blacs_gridinfo__ libscalapack.a:pdsyntrd.o: U blacs_gridinfo__ libscalapack.a:pdsyttrd.o: U blacs_gridinfo__ libscalapack.a:pdsytd2.o: U blacs_gridinfo__ libscalapack.a:pdsytrd.o: U blacs_gridinfo__ libscalapack.a:pdtrti2.o: U blacs_gridinfo__ libscalapack.a:pdtrtri.o: U blacs_gridinfo__ libscalapack.a:pdtrtrs.o: U blacs_gridinfo__ libscalapack.a:pdlaevswp.o: U blacs_gridinfo__ libscalapack.a:pdlarzb.o: U blacs_gridinfo__ libscalapack.a:pdlarzt.o: U blacs_gridinfo__ libscalapack.a:pdlarz.o: U blacs_gridinfo__ libscalapack.a:pdlatrz.o: U blacs_gridinfo__ libscalapack.a:pdtzrzf.o: U blacs_gridinfo__ libscalapack.a:pdormr3.o: U blacs_gridinfo__ libscalapack.a:pdormrz.o: U blacs_gridinfo__ libscalapack.a:pdlahqr.o: U blacs_gridinfo__ libscalapack.a:pdlaconsb.o: U blacs_gridinfo__ libscalapack.a:pdlacp3.o: U blacs_gridinfo__ libscalapack.a:pdlawil.o: U blacs_gridinfo__ libscalapack.a:pdlasmsub.o: U blacs_gridinfo__ libscalapack.a:pdlared2d.o: U blacs_gridinfo__ libscalapack.a:pdlamr1d.o: U blacs_gridinfo__ libscalapack.a:pdsyevr.o: U blacs_gridinfo__ libscalapack.a:pdtrord.o: U blacs_gridinfo__ libscalapack.a:pdtrsen.o: U blacs_gridinfo__ libscalapack.a:pdgebal.o: U blacs_gridinfo__ libscalapack.a:pdhseqr.o: U blacs_gridinfo__ 
libscalapack.a:pdlamve.o: U blacs_gridinfo__ libscalapack.a:pdlaqr0.o: U blacs_gridinfo__ libscalapack.a:pdlaqr1.o: U blacs_gridinfo__ libscalapack.a:pdlaqr2.o: U blacs_gridinfo__ libscalapack.a:pdlaqr3.o: U blacs_gridinfo__ libscalapack.a:pdlaqr4.o: U blacs_gridinfo__ libscalapack.a:pdlaqr5.o: U blacs_gridinfo__ libscalapack.a:pdrot.o: U blacs_gridinfo__ libscalapack.a:pdlaed0.o: U blacs_gridinfo__ libscalapack.a:pdlaed1.o: U blacs_gridinfo__ libscalapack.a:pdlaed2.o: U blacs_gridinfo__ libscalapack.a:pdlaed3.o: U blacs_gridinfo__ libscalapack.a:pdlaedz.o: U blacs_gridinfo__ libscalapack.a:pdlared1d.o: U blacs_gridinfo__ libscalapack.a:pdlasrt.o: U blacs_gridinfo__ libscalapack.a:pdstebz.o: U blacs_gridinfo__ libscalapack.a:pdstedc.o: U blacs_gridinfo__ libscalapack.a:pcdbsv.o: U blacs_gridinfo__ libscalapack.a:pcdbtrf.o: U blacs_gridinfo__ libscalapack.a:pcdbtrs.o: U blacs_gridinfo__ libscalapack.a:pcdbtrsv.o: U blacs_gridinfo__ libscalapack.a:pcdtsv.o: U blacs_gridinfo__ libscalapack.a:pcdttrf.o: U blacs_gridinfo__ libscalapack.a:pcdttrs.o: U blacs_gridinfo__ libscalapack.a:pcdttrsv.o: U blacs_gridinfo__ libscalapack.a:pcgbsv.o: U blacs_gridinfo__ libscalapack.a:pcgbtrf.o: U blacs_gridinfo__ libscalapack.a:pcgbtrs.o: U blacs_gridinfo__ libscalapack.a:pcgebd2.o: U blacs_gridinfo__ libscalapack.a:pcgebrd.o: U blacs_gridinfo__ libscalapack.a:pcgecon.o: U blacs_gridinfo__ libscalapack.a:pcgeequ.o: U blacs_gridinfo__ libscalapack.a:pcgehd2.o: U blacs_gridinfo__ libscalapack.a:pcgehrd.o: U blacs_gridinfo__ libscalapack.a:pcgelq2.o: U blacs_gridinfo__ libscalapack.a:pcgelqf.o: U blacs_gridinfo__ libscalapack.a:pcgels.o: U blacs_gridinfo__ libscalapack.a:pcgeql2.o: U blacs_gridinfo__ libscalapack.a:pcgeqlf.o: U blacs_gridinfo__ libscalapack.a:pcgeqpf.o: U blacs_gridinfo__ libscalapack.a:pcgeqr2.o: U blacs_gridinfo__ libscalapack.a:pcgeqrf.o: U blacs_gridinfo__ libscalapack.a:pcgerfs.o: U blacs_gridinfo__ libscalapack.a:pcgerq2.o: U blacs_gridinfo__ libscalapack.a:pcgerqf.o: U blacs_gridinfo__ libscalapack.a:pcgesv.o: U blacs_gridinfo__ libscalapack.a:pcgesvd.o: U blacs_gridinfo__ libscalapack.a:pcgesvx.o: U blacs_gridinfo__ libscalapack.a:pcgetf2.o: U blacs_gridinfo__ libscalapack.a:pcgetrf.o: U blacs_gridinfo__ libscalapack.a:pcgetri.o: U blacs_gridinfo__ libscalapack.a:pcgetrs.o: U blacs_gridinfo__ libscalapack.a:pcggqrf.o: U blacs_gridinfo__ libscalapack.a:pcggrqf.o: U blacs_gridinfo__ libscalapack.a:pcheev.o: U blacs_gridinfo__ libscalapack.a:pcheevd.o: U blacs_gridinfo__ libscalapack.a:pcheevx.o: U blacs_gridinfo__ libscalapack.a:pchegs2.o: U blacs_gridinfo__ libscalapack.a:pchegst.o: U blacs_gridinfo__ libscalapack.a:pchegvx.o: U blacs_gridinfo__ libscalapack.a:pchengst.o: U blacs_gridinfo__ libscalapack.a:pchentrd.o: U blacs_gridinfo__ libscalapack.a:pchettrd.o: U blacs_gridinfo__ libscalapack.a:pchetd2.o: U blacs_gridinfo__ libscalapack.a:pchetrd.o: U blacs_gridinfo__ libscalapack.a:pclabrd.o: U blacs_gridinfo__ libscalapack.a:pclacon.o: U blacs_gridinfo__ libscalapack.a:pclacgv.o: U blacs_gridinfo__ libscalapack.a:pclacp2.o: U blacs_gridinfo__ libscalapack.a:pclahrd.o: U blacs_gridinfo__ libscalapack.a:pclahqr.o: U blacs_gridinfo__ libscalapack.a:pclaconsb.o: U blacs_gridinfo__ libscalapack.a:pclasmsub.o: U blacs_gridinfo__ libscalapack.a:pclacp3.o: U blacs_gridinfo__ libscalapack.a:pclawil.o: U blacs_gridinfo__ libscalapack.a:pcrot.o: U blacs_gridinfo__ libscalapack.a:pclange.o: U blacs_gridinfo__ libscalapack.a:pclanhe.o: U blacs_gridinfo__ libscalapack.a:pclanhs.o: U 
blacs_gridinfo__ libscalapack.a:pclansy.o: U blacs_gridinfo__ libscalapack.a:pclantr.o: U blacs_gridinfo__ libscalapack.a:pclapiv.o: U blacs_gridinfo__ libscalapack.a:pclapv2.o: U blacs_gridinfo__ libscalapack.a:pclaqge.o: U blacs_gridinfo__ libscalapack.a:pclaqsy.o: U blacs_gridinfo__ libscalapack.a:pclarf.o: U blacs_gridinfo__ libscalapack.a:pclarfb.o: U blacs_gridinfo__ libscalapack.a:pclarfc.o: U blacs_gridinfo__ libscalapack.a:pclarfg.o: U blacs_gridinfo__ libscalapack.a:pclarft.o: U blacs_gridinfo__ libscalapack.a:pclascl.o: U blacs_gridinfo__ libscalapack.a:pclase2.o: U blacs_gridinfo__ libscalapack.a:pclassq.o: U blacs_gridinfo__ libscalapack.a:pclaswp.o: U blacs_gridinfo__ libscalapack.a:pclatra.o: U blacs_gridinfo__ libscalapack.a:pclatrd.o: U blacs_gridinfo__ libscalapack.a:pclatrs.o: U blacs_gridinfo__ libscalapack.a:pclauu2.o: U blacs_gridinfo__ libscalapack.a:pcpocon.o: U blacs_gridinfo__ libscalapack.a:pcpoequ.o: U blacs_gridinfo__ libscalapack.a:pcporfs.o: U blacs_gridinfo__ libscalapack.a:pcposv.o: U blacs_gridinfo__ libscalapack.a:pcpbsv.o: U blacs_gridinfo__ libscalapack.a:pcpbtrf.o: U blacs_gridinfo__ libscalapack.a:pcpbtrs.o: U blacs_gridinfo__ libscalapack.a:pcpbtrsv.o: U blacs_gridinfo__ libscalapack.a:pcptsv.o: U blacs_gridinfo__ libscalapack.a:pcpttrf.o: U blacs_gridinfo__ libscalapack.a:pcpttrs.o: U blacs_gridinfo__ libscalapack.a:pcpttrsv.o: U blacs_gridinfo__ libscalapack.a:pcposvx.o: U blacs_gridinfo__ libscalapack.a:pcpotf2.o: U blacs_gridinfo__ libscalapack.a:pcpotrf.o: U blacs_gridinfo__ libscalapack.a:pcpotri.o: U blacs_gridinfo__ libscalapack.a:pcpotrs.o: U blacs_gridinfo__ libscalapack.a:pcsrscl.o: U blacs_gridinfo__ libscalapack.a:pcstein.o: U blacs_gridinfo__ libscalapack.a:pctrevc.o: U blacs_gridinfo__ libscalapack.a:pctrti2.o: U blacs_gridinfo__ libscalapack.a:pctrtri.o: U blacs_gridinfo__ libscalapack.a:pctrtrs.o: U blacs_gridinfo__ libscalapack.a:pcung2l.o: U blacs_gridinfo__ libscalapack.a:pcung2r.o: U blacs_gridinfo__ libscalapack.a:pcungl2.o: U blacs_gridinfo__ libscalapack.a:pcunglq.o: U blacs_gridinfo__ libscalapack.a:pcungql.o: U blacs_gridinfo__ libscalapack.a:pcungqr.o: U blacs_gridinfo__ libscalapack.a:pcungr2.o: U blacs_gridinfo__ libscalapack.a:pcungrq.o: U blacs_gridinfo__ libscalapack.a:pcunm2l.o: U blacs_gridinfo__ libscalapack.a:pcunm2r.o: U blacs_gridinfo__ libscalapack.a:pcunmbr.o: U blacs_gridinfo__ libscalapack.a:pcunmhr.o: U blacs_gridinfo__ libscalapack.a:pcunml2.o: U blacs_gridinfo__ libscalapack.a:pcunmlq.o: U blacs_gridinfo__ libscalapack.a:pcunmql.o: U blacs_gridinfo__ libscalapack.a:pcunmqr.o: U blacs_gridinfo__ libscalapack.a:pcunmr2.o: U blacs_gridinfo__ libscalapack.a:pcunmrq.o: U blacs_gridinfo__ libscalapack.a:pcunmtr.o: U blacs_gridinfo__ libscalapack.a:pclaevswp.o: U blacs_gridinfo__ libscalapack.a:pclarzb.o: U blacs_gridinfo__ libscalapack.a:pclarzt.o: U blacs_gridinfo__ libscalapack.a:pclarz.o: U blacs_gridinfo__ libscalapack.a:pclarzc.o: U blacs_gridinfo__ libscalapack.a:pclatrz.o: U blacs_gridinfo__ libscalapack.a:pctzrzf.o: U blacs_gridinfo__ libscalapack.a:pclattrs.o: U blacs_gridinfo__ libscalapack.a:pcunmr3.o: U blacs_gridinfo__ libscalapack.a:pcunmrz.o: U blacs_gridinfo__ libscalapack.a:pcmax1.o: U blacs_gridinfo__ libscalapack.a:pscsum1.o: U blacs_gridinfo__ libscalapack.a:pclamr1d.o: U blacs_gridinfo__ libscalapack.a:pcheevr.o: U blacs_gridinfo__ libscalapack.a:pzdbsv.o: U blacs_gridinfo__ libscalapack.a:pzdbtrf.o: U blacs_gridinfo__ libscalapack.a:pzdbtrs.o: U blacs_gridinfo__ libscalapack.a:pzdbtrsv.o: U 
blacs_gridinfo__ libscalapack.a:pzdtsv.o: U blacs_gridinfo__ libscalapack.a:pzdttrf.o: U blacs_gridinfo__ libscalapack.a:pzdttrs.o: U blacs_gridinfo__ libscalapack.a:pzdttrsv.o: U blacs_gridinfo__ libscalapack.a:pzgbsv.o: U blacs_gridinfo__ libscalapack.a:pzgbtrf.o: U blacs_gridinfo__ libscalapack.a:pzgbtrs.o: U blacs_gridinfo__ libscalapack.a:pzgebd2.o: U blacs_gridinfo__ libscalapack.a:pzgebrd.o: U blacs_gridinfo__ libscalapack.a:pzgecon.o: U blacs_gridinfo__ libscalapack.a:pzgeequ.o: U blacs_gridinfo__ libscalapack.a:pzgehd2.o: U blacs_gridinfo__ libscalapack.a:pzgehrd.o: U blacs_gridinfo__ libscalapack.a:pzgelq2.o: U blacs_gridinfo__ libscalapack.a:pzgelqf.o: U blacs_gridinfo__ libscalapack.a:pzgels.o: U blacs_gridinfo__ libscalapack.a:pzgeql2.o: U blacs_gridinfo__ libscalapack.a:pzgeqlf.o: U blacs_gridinfo__ libscalapack.a:pzgeqpf.o: U blacs_gridinfo__ libscalapack.a:pzgeqr2.o: U blacs_gridinfo__ libscalapack.a:pzgeqrf.o: U blacs_gridinfo__ libscalapack.a:pzgerfs.o: U blacs_gridinfo__ libscalapack.a:pzgerq2.o: U blacs_gridinfo__ libscalapack.a:pzgerqf.o: U blacs_gridinfo__ libscalapack.a:pzgesv.o: U blacs_gridinfo__ libscalapack.a:pzgesvd.o: U blacs_gridinfo__ libscalapack.a:pzgesvx.o: U blacs_gridinfo__ libscalapack.a:pzgetf2.o: U blacs_gridinfo__ libscalapack.a:pzgetrf.o: U blacs_gridinfo__ libscalapack.a:pzgetri.o: U blacs_gridinfo__ libscalapack.a:pzgetrs.o: U blacs_gridinfo__ libscalapack.a:pzggqrf.o: U blacs_gridinfo__ libscalapack.a:pzggrqf.o: U blacs_gridinfo__ libscalapack.a:pzheev.o: U blacs_gridinfo__ libscalapack.a:pzheevd.o: U blacs_gridinfo__ libscalapack.a:pzheevx.o: U blacs_gridinfo__ libscalapack.a:pzhegs2.o: U blacs_gridinfo__ libscalapack.a:pzhegst.o: U blacs_gridinfo__ libscalapack.a:pzhegvx.o: U blacs_gridinfo__ libscalapack.a:pzhengst.o: U blacs_gridinfo__ libscalapack.a:pzhentrd.o: U blacs_gridinfo__ libscalapack.a:pzhettrd.o: U blacs_gridinfo__ libscalapack.a:pzhetd2.o: U blacs_gridinfo__ libscalapack.a:pzhetrd.o: U blacs_gridinfo__ libscalapack.a:pzlabrd.o: U blacs_gridinfo__ libscalapack.a:pzlacon.o: U blacs_gridinfo__ libscalapack.a:pzlacgv.o: U blacs_gridinfo__ libscalapack.a:pzlacp2.o: U blacs_gridinfo__ libscalapack.a:pzlahrd.o: U blacs_gridinfo__ libscalapack.a:pzlahqr.o: U blacs_gridinfo__ libscalapack.a:pzlaconsb.o: U blacs_gridinfo__ libscalapack.a:pzlasmsub.o: U blacs_gridinfo__ libscalapack.a:pzlacp3.o: U blacs_gridinfo__ libscalapack.a:pzlawil.o: U blacs_gridinfo__ libscalapack.a:pzrot.o: U blacs_gridinfo__ libscalapack.a:pzlange.o: U blacs_gridinfo__ libscalapack.a:pzlanhe.o: U blacs_gridinfo__ libscalapack.a:pzlanhs.o: U blacs_gridinfo__ libscalapack.a:pzlansy.o: U blacs_gridinfo__ libscalapack.a:pzlantr.o: U blacs_gridinfo__ libscalapack.a:pzlapiv.o: U blacs_gridinfo__ libscalapack.a:pzlapv2.o: U blacs_gridinfo__ libscalapack.a:pzlaqge.o: U blacs_gridinfo__ libscalapack.a:pzlaqsy.o: U blacs_gridinfo__ libscalapack.a:pzlarf.o: U blacs_gridinfo__ libscalapack.a:pzlarfb.o: U blacs_gridinfo__ libscalapack.a:pzlarfc.o: U blacs_gridinfo__ libscalapack.a:pzlarfg.o: U blacs_gridinfo__ libscalapack.a:pzlarft.o: U blacs_gridinfo__ libscalapack.a:pzlascl.o: U blacs_gridinfo__ libscalapack.a:pzlase2.o: U blacs_gridinfo__ libscalapack.a:pzlassq.o: U blacs_gridinfo__ libscalapack.a:pzlaswp.o: U blacs_gridinfo__ libscalapack.a:pzlatra.o: U blacs_gridinfo__ libscalapack.a:pzlatrd.o: U blacs_gridinfo__ libscalapack.a:pzlattrs.o: U blacs_gridinfo__ libscalapack.a:pzlatrs.o: U blacs_gridinfo__ libscalapack.a:pzlauu2.o: U blacs_gridinfo__ libscalapack.a:pzpocon.o: 
U blacs_gridinfo__ libscalapack.a:pzpoequ.o: U blacs_gridinfo__ libscalapack.a:pzporfs.o: U blacs_gridinfo__ libscalapack.a:pzposv.o: U blacs_gridinfo__ libscalapack.a:pzpbsv.o: U blacs_gridinfo__ libscalapack.a:pzpbtrf.o: U blacs_gridinfo__ libscalapack.a:pzpbtrs.o: U blacs_gridinfo__ libscalapack.a:pzpbtrsv.o: U blacs_gridinfo__ libscalapack.a:pzptsv.o: U blacs_gridinfo__ libscalapack.a:pzpttrf.o: U blacs_gridinfo__ libscalapack.a:pzpttrs.o: U blacs_gridinfo__ libscalapack.a:pzpttrsv.o: U blacs_gridinfo__ libscalapack.a:pzposvx.o: U blacs_gridinfo__ libscalapack.a:pzpotf2.o: U blacs_gridinfo__ libscalapack.a:pzpotrf.o: U blacs_gridinfo__ libscalapack.a:pzpotri.o: U blacs_gridinfo__ libscalapack.a:pzpotrs.o: U blacs_gridinfo__ libscalapack.a:pzdrscl.o: U blacs_gridinfo__ libscalapack.a:pzstein.o: U blacs_gridinfo__ libscalapack.a:pztrevc.o: U blacs_gridinfo__ libscalapack.a:pztrti2.o: U blacs_gridinfo__ libscalapack.a:pztrtri.o: U blacs_gridinfo__ libscalapack.a:pztrtrs.o: U blacs_gridinfo__ libscalapack.a:pzung2l.o: U blacs_gridinfo__ libscalapack.a:pzung2r.o: U blacs_gridinfo__ libscalapack.a:pzungl2.o: U blacs_gridinfo__ libscalapack.a:pzunglq.o: U blacs_gridinfo__ libscalapack.a:pzungql.o: U blacs_gridinfo__ libscalapack.a:pzungqr.o: U blacs_gridinfo__ libscalapack.a:pzungr2.o: U blacs_gridinfo__ libscalapack.a:pzungrq.o: U blacs_gridinfo__ libscalapack.a:pzunm2l.o: U blacs_gridinfo__ libscalapack.a:pzunm2r.o: U blacs_gridinfo__ libscalapack.a:pzunmbr.o: U blacs_gridinfo__ libscalapack.a:pzunmhr.o: U blacs_gridinfo__ libscalapack.a:pzunml2.o: U blacs_gridinfo__ libscalapack.a:pzunmlq.o: U blacs_gridinfo__ libscalapack.a:pzunmql.o: U blacs_gridinfo__ libscalapack.a:pzunmqr.o: U blacs_gridinfo__ libscalapack.a:pzunmr2.o: U blacs_gridinfo__ libscalapack.a:pzunmrq.o: U blacs_gridinfo__ libscalapack.a:pzunmtr.o: U blacs_gridinfo__ libscalapack.a:pzlaevswp.o: U blacs_gridinfo__ libscalapack.a:pzlarzb.o: U blacs_gridinfo__ libscalapack.a:pzlarzt.o: U blacs_gridinfo__ libscalapack.a:pzlarz.o: U blacs_gridinfo__ libscalapack.a:pzlarzc.o: U blacs_gridinfo__ libscalapack.a:pzlatrz.o: U blacs_gridinfo__ libscalapack.a:pztzrzf.o: U blacs_gridinfo__ libscalapack.a:pzunmr3.o: U blacs_gridinfo__ libscalapack.a:pzunmrz.o: U blacs_gridinfo__ libscalapack.a:pzmax1.o: U blacs_gridinfo__ libscalapack.a:pdzsum1.o: U blacs_gridinfo__ libscalapack.a:pzlamr1d.o: U blacs_gridinfo__ libscalapack.a:pzheevr.o: U blacs_gridinfo__ On Tue, Sep 13, 2016 at 1:33 PM, Satish Balay wrote: > On Tue, 13 Sep 2016, Matthew Knepley wrote: > > > I believe your problem is that this is old PETSc. In the latest release, > > BLACS is part of SCALAPACK. > > BLACS had been a part of scalapack for a few releases - so thats not the > issue. > > >>>>>>>> > stderr: > /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/ > gcc-opt/lib/libscalapack.a(pssytrd.o): In function `pssytrd': > /tmp/cluster_temp.FFxzAF/petsc-3.6.4/arch-linux2-c-debug/externalpackages/ > scalapack-2.0.2/SRC/pssytrd.f:259: undefined reference to > `blacs_gridinfo__' > /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/ > gcc-opt/lib/libscalapack.a(chk1mat.o): In function `chk1mat': > <<<<<< > > Double underscore? > > >>> > mpicc -c -Df77IsF2C -fPIC -fopenmp -fPIC -g -I/cm/shared/apps/openmpi/open64/64/1.10.1/include > pzrot.c > <<< > > scalapack is getting compiled with this flag '-Df77IsF2C'. This mode > was primarily supported by 'g77' previously - which we hardly ever use > anymore - so this mode is not really tested? 
> > >>>>> > Executing: mpif90 -show > stdout: openf90 -I/cm/shared/apps/openmpi/open64/64/1.10.1/include > -pthread -I/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 -L/usr/lib64/ > -Wl,-rpath -Wl,/usr/lib64/ -Wl,-rpath -Wl,/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 > -Wl,--enable-new-dtags -L/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 > -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > > compilers: Fortran appends an extra underscore to names > containing underscores > Defined "HAVE_FORTRAN_UNDERSCORE_UNDERSCORE" to "1" > > <<<< > > What do you have for: > > cd /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/ > nm -Ao libscalapack.a |grep -i blacs_gridinfo > > However - as Matt referred to - it's best to use the latest petsc-3.7 > release. Does MOOSE require 3.6? > > Satish > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.hou at ncsu.edu Tue Sep 13 16:36:28 2016 From: jason.hou at ncsu.edu (Jason Hou) Date: Tue, 13 Sep 2016 17:36:28 -0400 Subject: [petsc-users] Errors in installing PETSc In-Reply-To: References: Message-ID: Hi everyone, Thanks for the help. I figured out that the problem was that mpicc wasn't properly defined; however, the BLAS/LAPACK libraries still have issues: ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- --download-fblaslapack libraries cannot be used ******************************************************************************* As pointed out by Scott, it should have nothing to do with the loaded modules, since I would download the libraries anyway. I've attached the log file and any suggestion is welcome. 
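(A note on the configure failure above: when the MPI wrappers are not picked up correctly, one common approach is to point configure at them explicitly and let PETSc download and build its own BLAS/LAPACK and scalapack. A minimal sketch, assuming the OpenMPI wrappers under the /cm/shared/apps/openmpi/open64/64/1.10.1 prefix quoted earlier in this thread; the PETSC_ARCH name is a placeholder and the exact paths should be checked against configure.log:

    ./configure PETSC_ARCH=arch-linux-opt \
        --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 \
        --download-fblaslapack --download-scalapack

If the wrappers are not first on PATH, spelling out the full path, e.g. --with-cc=/cm/shared/apps/openmpi/open64/64/1.10.1/bin/mpicc, avoids picking up a different compiler by accident; the bin/ location is an assumption based on the include/ and lib64/ paths shown above.)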
Thanks, Jason On Tue, Sep 13, 2016 at 5:30 PM, Jason Hou wrote: > Hi Satish, > > The output is the following: > > [jhou8 at rdfmg lib]$ nm -Ao libscalapack.a |grep -i blacs_gridinfo > libscalapack.a:blacs_abort_.o: U Cblacs_gridinfo > libscalapack.a:blacs_info_.o:0000000000000000 T blacs_gridinfo_ > libscalapack.a:blacs_abort_.oo: U Cblacs_gridinfo > libscalapack.a:blacs_info_.oo:0000000000000000 T Cblacs_gridinfo > libscalapack.a:chk1mat.o: U blacs_gridinfo__ > libscalapack.a:pchkxmat.o: U blacs_gridinfo__ > libscalapack.a:desc_convert.o: U blacs_gridinfo__ > libscalapack.a:descinit.o: U blacs_gridinfo__ > libscalapack.a:reshape.o: U Cblacs_gridinfo > libscalapack.a:SL_gridreshape.o: U Cblacs_gridinfo > libscalapack.a:picol2row.o: U blacs_gridinfo__ > libscalapack.a:pirow2col.o: U blacs_gridinfo__ > libscalapack.a:pilaprnt.o: U blacs_gridinfo__ > libscalapack.a:pitreecomb.o: U blacs_gridinfo__ > libscalapack.a:pichekpad.o: U blacs_gridinfo__ > libscalapack.a:pielset.o: U blacs_gridinfo__ > libscalapack.a:pielset2.o: U blacs_gridinfo__ > libscalapack.a:pielget.o: U blacs_gridinfo__ > libscalapack.a:psmatadd.o: U blacs_gridinfo__ > libscalapack.a:pscol2row.o: U blacs_gridinfo__ > libscalapack.a:psrow2col.o: U blacs_gridinfo__ > libscalapack.a:pslaprnt.o: U blacs_gridinfo__ > libscalapack.a:pstreecomb.o: U blacs_gridinfo__ > libscalapack.a:pschekpad.o: U blacs_gridinfo__ > libscalapack.a:pselset.o: U blacs_gridinfo__ > libscalapack.a:pselset2.o: U blacs_gridinfo__ > libscalapack.a:pselget.o: U blacs_gridinfo__ > libscalapack.a:pslaread.o: U blacs_gridinfo__ > libscalapack.a:pslawrite.o: U blacs_gridinfo__ > libscalapack.a:pdmatadd.o: U blacs_gridinfo__ > libscalapack.a:pdcol2row.o: U blacs_gridinfo__ > libscalapack.a:pdrow2col.o: U blacs_gridinfo__ > libscalapack.a:pdlaprnt.o: U blacs_gridinfo__ > libscalapack.a:pdtreecomb.o: U blacs_gridinfo__ > libscalapack.a:pdchekpad.o: U blacs_gridinfo__ > libscalapack.a:pdelset.o: U blacs_gridinfo__ > libscalapack.a:pdelset2.o: U blacs_gridinfo__ > libscalapack.a:pdelget.o: U blacs_gridinfo__ > libscalapack.a:pdlaread.o: U blacs_gridinfo__ > libscalapack.a:pdlawrite.o: U blacs_gridinfo__ > libscalapack.a:pcmatadd.o: U blacs_gridinfo__ > libscalapack.a:pccol2row.o: U blacs_gridinfo__ > libscalapack.a:pcrow2col.o: U blacs_gridinfo__ > libscalapack.a:pclaprnt.o: U blacs_gridinfo__ > libscalapack.a:pctreecomb.o: U blacs_gridinfo__ > libscalapack.a:pcchekpad.o: U blacs_gridinfo__ > libscalapack.a:pcelset.o: U blacs_gridinfo__ > libscalapack.a:pcelset2.o: U blacs_gridinfo__ > libscalapack.a:pcelget.o: U blacs_gridinfo__ > libscalapack.a:pclaread.o: U blacs_gridinfo__ > libscalapack.a:pclawrite.o: U blacs_gridinfo__ > libscalapack.a:pzmatadd.o: U blacs_gridinfo__ > libscalapack.a:pzcol2row.o: U blacs_gridinfo__ > libscalapack.a:pzrow2col.o: U blacs_gridinfo__ > libscalapack.a:pzlaprnt.o: U blacs_gridinfo__ > libscalapack.a:pztreecomb.o: U blacs_gridinfo__ > libscalapack.a:pzchekpad.o: U blacs_gridinfo__ > libscalapack.a:pzelset.o: U blacs_gridinfo__ > libscalapack.a:pzelset2.o: U blacs_gridinfo__ > libscalapack.a:pzelget.o: U blacs_gridinfo__ > libscalapack.a:pzlaread.o: U blacs_gridinfo__ > libscalapack.a:pzlawrite.o: U blacs_gridinfo__ > libscalapack.a:picopy_.o: U Cblacs_gridinfo > libscalapack.a:pbstran.o: U blacs_gridinfo__ > libscalapack.a:pbstrnv.o: U blacs_gridinfo__ > libscalapack.a:pxerbla.o: U blacs_gridinfo__ > libscalapack.a:PB_CGatherV.o: U Cblacs_gridinfo > libscalapack.a:PB_CInV.o: U Cblacs_gridinfo > libscalapack.a:PB_CInV2.o: U 
Cblacs_gridinfo > libscalapack.a:PB_CInOutV.o: U Cblacs_gridinfo > libscalapack.a:PB_CInOutV2.o: U Cblacs_gridinfo > libscalapack.a:PB_COutV.o: U Cblacs_gridinfo > libscalapack.a:PB_CScatterV.o: U Cblacs_gridinfo > libscalapack.a:PB_Cabort.o: U Cblacs_gridinfo > libscalapack.a:PB_Cchkmat.o: U Cblacs_gridinfo > libscalapack.a:PB_Cchkvec.o: U Cblacs_gridinfo > libscalapack.a:PB_CpswapNN.o: U Cblacs_gridinfo > libscalapack.a:PB_CpswapND.o: U Cblacs_gridinfo > libscalapack.a:PB_Cpdot11.o: U Cblacs_gridinfo > libscalapack.a:PB_CpdotNN.o: U Cblacs_gridinfo > libscalapack.a:PB_CpdotND.o: U Cblacs_gridinfo > libscalapack.a:PB_CpaxpbyNN.o: U Cblacs_gridinfo > libscalapack.a:PB_CpaxpbyND.o: U Cblacs_gridinfo > libscalapack.a:PB_CpaxpbyDN.o: U Cblacs_gridinfo > libscalapack.a:PB_Cpaxpby.o: U Cblacs_gridinfo > libscalapack.a:PB_CpgemmBC.o: U Cblacs_gridinfo > libscalapack.a:PB_CpgemmAC.o: U Cblacs_gridinfo > libscalapack.a:PB_CpgemmAB.o: U Cblacs_gridinfo > libscalapack.a:PB_Cplaprnt.o: U Cblacs_gridinfo > libscalapack.a:PB_Cplapad.o: U Cblacs_gridinfo > libscalapack.a:PB_Cplapd2.o: U Cblacs_gridinfo > libscalapack.a:PB_Cplascal.o: U Cblacs_gridinfo > libscalapack.a:PB_Cplasca2.o: U Cblacs_gridinfo > libscalapack.a:PB_Cplacnjg.o: U Cblacs_gridinfo > libscalapack.a:PB_Cpsym.o: U Cblacs_gridinfo > libscalapack.a:PB_CpsymmAB.o: U Cblacs_gridinfo > libscalapack.a:PB_CpsymmBC.o: U Cblacs_gridinfo > libscalapack.a:PB_Cpsyr.o: U Cblacs_gridinfo > libscalapack.a:PB_CpsyrkA.o: U Cblacs_gridinfo > libscalapack.a:PB_CpsyrkAC.o: U Cblacs_gridinfo > libscalapack.a:PB_Cpsyr2.o: U Cblacs_gridinfo > libscalapack.a:PB_Cpsyr2kA.o: U Cblacs_gridinfo > libscalapack.a:PB_Cpsyr2kAC.o: U Cblacs_gridinfo > libscalapack.a:PB_Cptrm.o: U Cblacs_gridinfo > libscalapack.a:PB_Cpgeadd.o: U Cblacs_gridinfo > libscalapack.a:PB_Cptran.o: U Cblacs_gridinfo > libscalapack.a:PB_CptrmmAB.o: U Cblacs_gridinfo > libscalapack.a:PB_CptrmmB.o: U Cblacs_gridinfo > libscalapack.a:PB_Cptrsm.o: U Cblacs_gridinfo > libscalapack.a:PB_CptrsmAB.o: U Cblacs_gridinfo > libscalapack.a:PB_CptrsmAB0.o: U Cblacs_gridinfo > libscalapack.a:PB_CptrsmAB1.o: U Cblacs_gridinfo > libscalapack.a:PB_CptrsmB.o: U Cblacs_gridinfo > libscalapack.a:PB_Cptrsv.o: U Cblacs_gridinfo > libscalapack.a:PB_Cwarn.o: U Cblacs_gridinfo > libscalapack.a:psswap_.o: U Cblacs_gridinfo > libscalapack.a:psscal_.o: U Cblacs_gridinfo > libscalapack.a:pscopy_.o: U Cblacs_gridinfo > libscalapack.a:psaxpy_.o: U Cblacs_gridinfo > libscalapack.a:psdot_.o: U Cblacs_gridinfo > libscalapack.a:psnrm2_.o: U Cblacs_gridinfo > libscalapack.a:psasum_.o: U Cblacs_gridinfo > libscalapack.a:psamax_.o: U Cblacs_gridinfo > libscalapack.a:psgemv_.o: U Cblacs_gridinfo > libscalapack.a:psger_.o: U Cblacs_gridinfo > libscalapack.a:pssymv_.o: U Cblacs_gridinfo > libscalapack.a:pssyr_.o: U Cblacs_gridinfo > libscalapack.a:pssyr2_.o: U Cblacs_gridinfo > libscalapack.a:pstrmv_.o: U Cblacs_gridinfo > libscalapack.a:pstrsv_.o: U Cblacs_gridinfo > libscalapack.a:psagemv_.o: U Cblacs_gridinfo > libscalapack.a:psasymv_.o: U Cblacs_gridinfo > libscalapack.a:psatrmv_.o: U Cblacs_gridinfo > libscalapack.a:psgeadd_.o: U Cblacs_gridinfo > libscalapack.a:psgemm_.o: U Cblacs_gridinfo > libscalapack.a:pssymm_.o: U Cblacs_gridinfo > libscalapack.a:pssyr2k_.o: U Cblacs_gridinfo > libscalapack.a:pssyrk_.o: U Cblacs_gridinfo > libscalapack.a:pstradd_.o: U Cblacs_gridinfo > libscalapack.a:pstran_.o: U Cblacs_gridinfo > libscalapack.a:pstrmm_.o: U Cblacs_gridinfo > libscalapack.a:pstrsm_.o: U Cblacs_gridinfo > 
libscalapack.a:pbdtran.o: U blacs_gridinfo__ > libscalapack.a:pbdtrnv.o: U blacs_gridinfo__ > libscalapack.a:pdswap_.o: U Cblacs_gridinfo > libscalapack.a:pdscal_.o: U Cblacs_gridinfo > libscalapack.a:pdcopy_.o: U Cblacs_gridinfo > libscalapack.a:pdaxpy_.o: U Cblacs_gridinfo > libscalapack.a:pddot_.o: U Cblacs_gridinfo > libscalapack.a:pdnrm2_.o: U Cblacs_gridinfo > libscalapack.a:pdasum_.o: U Cblacs_gridinfo > libscalapack.a:pdamax_.o: U Cblacs_gridinfo > libscalapack.a:pdgemv_.o: U Cblacs_gridinfo > libscalapack.a:pdger_.o: U Cblacs_gridinfo > libscalapack.a:pdsymv_.o: U Cblacs_gridinfo > libscalapack.a:pdsyr_.o: U Cblacs_gridinfo > libscalapack.a:pdsyr2_.o: U Cblacs_gridinfo > libscalapack.a:pdtrmv_.o: U Cblacs_gridinfo > libscalapack.a:pdtrsv_.o: U Cblacs_gridinfo > libscalapack.a:pdagemv_.o: U Cblacs_gridinfo > libscalapack.a:pdasymv_.o: U Cblacs_gridinfo > libscalapack.a:pdatrmv_.o: U Cblacs_gridinfo > libscalapack.a:pdgeadd_.o: U Cblacs_gridinfo > libscalapack.a:pdgemm_.o: U Cblacs_gridinfo > libscalapack.a:pdsymm_.o: U Cblacs_gridinfo > libscalapack.a:pdsyr2k_.o: U Cblacs_gridinfo > libscalapack.a:pdsyrk_.o: U Cblacs_gridinfo > libscalapack.a:pdtradd_.o: U Cblacs_gridinfo > libscalapack.a:pdtran_.o: U Cblacs_gridinfo > libscalapack.a:pdtrmm_.o: U Cblacs_gridinfo > libscalapack.a:pdtrsm_.o: U Cblacs_gridinfo > libscalapack.a:pbctran.o: U blacs_gridinfo__ > libscalapack.a:pbctrnv.o: U blacs_gridinfo__ > libscalapack.a:pcswap_.o: U Cblacs_gridinfo > libscalapack.a:pcscal_.o: U Cblacs_gridinfo > libscalapack.a:pcsscal_.o: U Cblacs_gridinfo > libscalapack.a:pccopy_.o: U Cblacs_gridinfo > libscalapack.a:pcaxpy_.o: U Cblacs_gridinfo > libscalapack.a:pcdotu_.o: U Cblacs_gridinfo > libscalapack.a:pcdotc_.o: U Cblacs_gridinfo > libscalapack.a:pscnrm2_.o: U Cblacs_gridinfo > libscalapack.a:pscasum_.o: U Cblacs_gridinfo > libscalapack.a:pcamax_.o: U Cblacs_gridinfo > libscalapack.a:pcgemv_.o: U Cblacs_gridinfo > libscalapack.a:pcgerc_.o: U Cblacs_gridinfo > libscalapack.a:pcgeru_.o: U Cblacs_gridinfo > libscalapack.a:pchemv_.o: U Cblacs_gridinfo > libscalapack.a:pcher_.o: U Cblacs_gridinfo > libscalapack.a:pcher2_.o: U Cblacs_gridinfo > libscalapack.a:pctrmv_.o: U Cblacs_gridinfo > libscalapack.a:pctrsv_.o: U Cblacs_gridinfo > libscalapack.a:pcagemv_.o: U Cblacs_gridinfo > libscalapack.a:pcahemv_.o: U Cblacs_gridinfo > libscalapack.a:pcatrmv_.o: U Cblacs_gridinfo > libscalapack.a:pcgeadd_.o: U Cblacs_gridinfo > libscalapack.a:pcgemm_.o: U Cblacs_gridinfo > libscalapack.a:pchemm_.o: U Cblacs_gridinfo > libscalapack.a:pcher2k_.o: U Cblacs_gridinfo > libscalapack.a:pcherk_.o: U Cblacs_gridinfo > libscalapack.a:pcsymm_.o: U Cblacs_gridinfo > libscalapack.a:pcsyr2k_.o: U Cblacs_gridinfo > libscalapack.a:pcsyrk_.o: U Cblacs_gridinfo > libscalapack.a:pctradd_.o: U Cblacs_gridinfo > libscalapack.a:pctranc_.o: U Cblacs_gridinfo > libscalapack.a:pctranu_.o: U Cblacs_gridinfo > libscalapack.a:pctrmm_.o: U Cblacs_gridinfo > libscalapack.a:pctrsm_.o: U Cblacs_gridinfo > libscalapack.a:pbztran.o: U blacs_gridinfo__ > libscalapack.a:pbztrnv.o: U blacs_gridinfo__ > libscalapack.a:pzswap_.o: U Cblacs_gridinfo > libscalapack.a:pzscal_.o: U Cblacs_gridinfo > libscalapack.a:pzdscal_.o: U Cblacs_gridinfo > libscalapack.a:pzcopy_.o: U Cblacs_gridinfo > libscalapack.a:pzaxpy_.o: U Cblacs_gridinfo > libscalapack.a:pzdotu_.o: U Cblacs_gridinfo > libscalapack.a:pzdotc_.o: U Cblacs_gridinfo > libscalapack.a:pdznrm2_.o: U Cblacs_gridinfo > libscalapack.a:pdzasum_.o: U Cblacs_gridinfo > libscalapack.a:pzamax_.o: U 
Cblacs_gridinfo > libscalapack.a:pzgemv_.o: U Cblacs_gridinfo > libscalapack.a:pzgerc_.o: U Cblacs_gridinfo > libscalapack.a:pzgeru_.o: U Cblacs_gridinfo > libscalapack.a:pzhemv_.o: U Cblacs_gridinfo > libscalapack.a:pzher_.o: U Cblacs_gridinfo > libscalapack.a:pzher2_.o: U Cblacs_gridinfo > libscalapack.a:pztrmv_.o: U Cblacs_gridinfo > libscalapack.a:pztrsv_.o: U Cblacs_gridinfo > libscalapack.a:pzagemv_.o: U Cblacs_gridinfo > libscalapack.a:pzahemv_.o: U Cblacs_gridinfo > libscalapack.a:pzatrmv_.o: U Cblacs_gridinfo > libscalapack.a:pzgeadd_.o: U Cblacs_gridinfo > libscalapack.a:pzgemm_.o: U Cblacs_gridinfo > libscalapack.a:pzhemm_.o: U Cblacs_gridinfo > libscalapack.a:pzher2k_.o: U Cblacs_gridinfo > libscalapack.a:pzherk_.o: U Cblacs_gridinfo > libscalapack.a:pzsymm_.o: U Cblacs_gridinfo > libscalapack.a:pzsyr2k_.o: U Cblacs_gridinfo > libscalapack.a:pzsyrk_.o: U Cblacs_gridinfo > libscalapack.a:pztradd_.o: U Cblacs_gridinfo > libscalapack.a:pztranc_.o: U Cblacs_gridinfo > libscalapack.a:pztranu_.o: U Cblacs_gridinfo > libscalapack.a:pztrmm_.o: U Cblacs_gridinfo > libscalapack.a:pztrsm_.o: U Cblacs_gridinfo > libscalapack.a:pigemr.o: U Cblacs_gridinfo > libscalapack.a:pitrmr.o: U Cblacs_gridinfo > libscalapack.a:pgemraux.o: U Cblacs_gridinfo > libscalapack.a:psgemr.o: U Cblacs_gridinfo > libscalapack.a:pstrmr.o: U Cblacs_gridinfo > libscalapack.a:pdgemr.o: U Cblacs_gridinfo > libscalapack.a:pdtrmr.o: U Cblacs_gridinfo > libscalapack.a:pcgemr.o: U Cblacs_gridinfo > libscalapack.a:pctrmr.o: U Cblacs_gridinfo > libscalapack.a:pzgemr.o: U Cblacs_gridinfo > libscalapack.a:pztrmr.o: U Cblacs_gridinfo > libscalapack.a:psdbsv.o: U blacs_gridinfo__ > libscalapack.a:psdbtrf.o: U blacs_gridinfo__ > libscalapack.a:psdbtrs.o: U blacs_gridinfo__ > libscalapack.a:psdbtrsv.o: U blacs_gridinfo__ > libscalapack.a:psdtsv.o: U blacs_gridinfo__ > libscalapack.a:psdttrf.o: U blacs_gridinfo__ > libscalapack.a:psdttrs.o: U blacs_gridinfo__ > libscalapack.a:psdttrsv.o: U blacs_gridinfo__ > libscalapack.a:psgbsv.o: U blacs_gridinfo__ > libscalapack.a:psgbtrf.o: U blacs_gridinfo__ > libscalapack.a:psgbtrs.o: U blacs_gridinfo__ > libscalapack.a:psgebd2.o: U blacs_gridinfo__ > libscalapack.a:psgebrd.o: U blacs_gridinfo__ > libscalapack.a:psgecon.o: U blacs_gridinfo__ > libscalapack.a:psgeequ.o: U blacs_gridinfo__ > libscalapack.a:psgehd2.o: U blacs_gridinfo__ > libscalapack.a:psgehrd.o: U blacs_gridinfo__ > libscalapack.a:psgelq2.o: U blacs_gridinfo__ > libscalapack.a:psgelqf.o: U blacs_gridinfo__ > libscalapack.a:psgels.o: U blacs_gridinfo__ > libscalapack.a:psgeql2.o: U blacs_gridinfo__ > libscalapack.a:psgeqlf.o: U blacs_gridinfo__ > libscalapack.a:psgeqpf.o: U blacs_gridinfo__ > libscalapack.a:psgeqr2.o: U blacs_gridinfo__ > libscalapack.a:psgeqrf.o: U blacs_gridinfo__ > libscalapack.a:psgerfs.o: U blacs_gridinfo__ > libscalapack.a:psgerq2.o: U blacs_gridinfo__ > libscalapack.a:psgerqf.o: U blacs_gridinfo__ > libscalapack.a:psgesv.o: U blacs_gridinfo__ > libscalapack.a:psgesvd.o: U blacs_gridinfo__ > libscalapack.a:psgesvx.o: U blacs_gridinfo__ > libscalapack.a:psgetf2.o: U blacs_gridinfo__ > libscalapack.a:psgetrf.o: U blacs_gridinfo__ > libscalapack.a:psgetri.o: U blacs_gridinfo__ > libscalapack.a:psgetrs.o: U blacs_gridinfo__ > libscalapack.a:psggqrf.o: U blacs_gridinfo__ > libscalapack.a:psggrqf.o: U blacs_gridinfo__ > libscalapack.a:pslabrd.o: U blacs_gridinfo__ > libscalapack.a:pslacon.o: U blacs_gridinfo__ > libscalapack.a:pslacp2.o: U blacs_gridinfo__ > libscalapack.a:pslahrd.o: U blacs_gridinfo__ > 
libscalapack.a:pslange.o: U blacs_gridinfo__ > libscalapack.a:pslanhs.o: U blacs_gridinfo__ > libscalapack.a:pslansy.o: U blacs_gridinfo__ > libscalapack.a:pslantr.o: U blacs_gridinfo__ > libscalapack.a:pslapiv.o: U blacs_gridinfo__ > libscalapack.a:pslapv2.o: U blacs_gridinfo__ > libscalapack.a:pslaqge.o: U blacs_gridinfo__ > libscalapack.a:pslaqsy.o: U blacs_gridinfo__ > libscalapack.a:pslarf.o: U blacs_gridinfo__ > libscalapack.a:pslarfb.o: U blacs_gridinfo__ > libscalapack.a:pslarfg.o: U blacs_gridinfo__ > libscalapack.a:pslarft.o: U blacs_gridinfo__ > libscalapack.a:pslase2.o: U blacs_gridinfo__ > libscalapack.a:pslascl.o: U blacs_gridinfo__ > libscalapack.a:pslassq.o: U blacs_gridinfo__ > libscalapack.a:pslaswp.o: U blacs_gridinfo__ > libscalapack.a:pslatra.o: U blacs_gridinfo__ > libscalapack.a:pslatrd.o: U blacs_gridinfo__ > libscalapack.a:pslatrs.o: U blacs_gridinfo__ > libscalapack.a:pslauu2.o: U blacs_gridinfo__ > libscalapack.a:psorg2l.o: U blacs_gridinfo__ > libscalapack.a:psorg2r.o: U blacs_gridinfo__ > libscalapack.a:psorgl2.o: U blacs_gridinfo__ > libscalapack.a:psorglq.o: U blacs_gridinfo__ > libscalapack.a:psorgql.o: U blacs_gridinfo__ > libscalapack.a:psorgqr.o: U blacs_gridinfo__ > libscalapack.a:psorgr2.o: U blacs_gridinfo__ > libscalapack.a:psorgrq.o: U blacs_gridinfo__ > libscalapack.a:psorm2l.o: U blacs_gridinfo__ > libscalapack.a:psorm2r.o: U blacs_gridinfo__ > libscalapack.a:psormbr.o: U blacs_gridinfo__ > libscalapack.a:psormhr.o: U blacs_gridinfo__ > libscalapack.a:psorml2.o: U blacs_gridinfo__ > libscalapack.a:psormlq.o: U blacs_gridinfo__ > libscalapack.a:psormql.o: U blacs_gridinfo__ > libscalapack.a:psormqr.o: U blacs_gridinfo__ > libscalapack.a:psormr2.o: U blacs_gridinfo__ > libscalapack.a:psormrq.o: U blacs_gridinfo__ > libscalapack.a:psormtr.o: U blacs_gridinfo__ > libscalapack.a:pspocon.o: U blacs_gridinfo__ > libscalapack.a:pspbsv.o: U blacs_gridinfo__ > libscalapack.a:pspbtrf.o: U blacs_gridinfo__ > libscalapack.a:pspbtrs.o: U blacs_gridinfo__ > libscalapack.a:pspbtrsv.o: U blacs_gridinfo__ > libscalapack.a:psptsv.o: U blacs_gridinfo__ > libscalapack.a:pspttrf.o: U blacs_gridinfo__ > libscalapack.a:pspttrs.o: U blacs_gridinfo__ > libscalapack.a:pspttrsv.o: U blacs_gridinfo__ > libscalapack.a:pspoequ.o: U blacs_gridinfo__ > libscalapack.a:psporfs.o: U blacs_gridinfo__ > libscalapack.a:psposv.o: U blacs_gridinfo__ > libscalapack.a:psposvx.o: U blacs_gridinfo__ > libscalapack.a:pspotf2.o: U blacs_gridinfo__ > libscalapack.a:pspotrf.o: U blacs_gridinfo__ > libscalapack.a:pspotri.o: U blacs_gridinfo__ > libscalapack.a:pspotrs.o: U blacs_gridinfo__ > libscalapack.a:psrscl.o: U blacs_gridinfo__ > libscalapack.a:psstein.o: U blacs_gridinfo__ > libscalapack.a:pssyev.o: U blacs_gridinfo__ > libscalapack.a:pssyevd.o: U blacs_gridinfo__ > libscalapack.a:pssyevx.o: U blacs_gridinfo__ > libscalapack.a:pssygs2.o: U blacs_gridinfo__ > libscalapack.a:pssygst.o: U blacs_gridinfo__ > libscalapack.a:pssygvx.o: U blacs_gridinfo__ > libscalapack.a:pssyngst.o: U blacs_gridinfo__ > libscalapack.a:pssyntrd.o: U blacs_gridinfo__ > libscalapack.a:pssyttrd.o: U blacs_gridinfo__ > libscalapack.a:pssytd2.o: U blacs_gridinfo__ > libscalapack.a:pssytrd.o: U blacs_gridinfo__ > libscalapack.a:pstrti2.o: U blacs_gridinfo__ > libscalapack.a:pstrtri.o: U blacs_gridinfo__ > libscalapack.a:pstrtrs.o: U blacs_gridinfo__ > libscalapack.a:pslaevswp.o: U blacs_gridinfo__ > libscalapack.a:pslarzb.o: U blacs_gridinfo__ > libscalapack.a:pslarzt.o: U blacs_gridinfo__ > libscalapack.a:pslarz.o: U 
blacs_gridinfo__ > libscalapack.a:pslatrz.o: U blacs_gridinfo__ > libscalapack.a:pstzrzf.o: U blacs_gridinfo__ > libscalapack.a:psormr3.o: U blacs_gridinfo__ > libscalapack.a:psormrz.o: U blacs_gridinfo__ > libscalapack.a:pslahqr.o: U blacs_gridinfo__ > libscalapack.a:pslaconsb.o: U blacs_gridinfo__ > libscalapack.a:pslacp3.o: U blacs_gridinfo__ > libscalapack.a:pslawil.o: U blacs_gridinfo__ > libscalapack.a:pslasmsub.o: U blacs_gridinfo__ > libscalapack.a:pslared2d.o: U blacs_gridinfo__ > libscalapack.a:pslamr1d.o: U blacs_gridinfo__ > libscalapack.a:pssyevr.o: U blacs_gridinfo__ > libscalapack.a:pstrord.o: U blacs_gridinfo__ > libscalapack.a:pstrsen.o: U blacs_gridinfo__ > libscalapack.a:psgebal.o: U blacs_gridinfo__ > libscalapack.a:pshseqr.o: U blacs_gridinfo__ > libscalapack.a:pslamve.o: U blacs_gridinfo__ > libscalapack.a:pslaqr0.o: U blacs_gridinfo__ > libscalapack.a:pslaqr1.o: U blacs_gridinfo__ > libscalapack.a:pslaqr2.o: U blacs_gridinfo__ > libscalapack.a:pslaqr3.o: U blacs_gridinfo__ > libscalapack.a:pslaqr4.o: U blacs_gridinfo__ > libscalapack.a:pslaqr5.o: U blacs_gridinfo__ > libscalapack.a:psrot.o: U blacs_gridinfo__ > libscalapack.a:pslaed0.o: U blacs_gridinfo__ > libscalapack.a:pslaed1.o: U blacs_gridinfo__ > libscalapack.a:pslaed2.o: U blacs_gridinfo__ > libscalapack.a:pslaed3.o: U blacs_gridinfo__ > libscalapack.a:pslaedz.o: U blacs_gridinfo__ > libscalapack.a:pslared1d.o: U blacs_gridinfo__ > libscalapack.a:pslasrt.o: U blacs_gridinfo__ > libscalapack.a:psstebz.o: U blacs_gridinfo__ > libscalapack.a:psstedc.o: U blacs_gridinfo__ > libscalapack.a:pilaenvx.o: U blacs_gridinfo__ > libscalapack.a:piparmq.o: U blacs_gridinfo__ > libscalapack.a:pddbsv.o: U blacs_gridinfo__ > libscalapack.a:pddbtrf.o: U blacs_gridinfo__ > libscalapack.a:pddbtrs.o: U blacs_gridinfo__ > libscalapack.a:pddbtrsv.o: U blacs_gridinfo__ > libscalapack.a:pddtsv.o: U blacs_gridinfo__ > libscalapack.a:pddttrf.o: U blacs_gridinfo__ > libscalapack.a:pddttrs.o: U blacs_gridinfo__ > libscalapack.a:pddttrsv.o: U blacs_gridinfo__ > libscalapack.a:pdgbsv.o: U blacs_gridinfo__ > libscalapack.a:pdgbtrf.o: U blacs_gridinfo__ > libscalapack.a:pdgbtrs.o: U blacs_gridinfo__ > libscalapack.a:pdgebd2.o: U blacs_gridinfo__ > libscalapack.a:pdgebrd.o: U blacs_gridinfo__ > libscalapack.a:pdgecon.o: U blacs_gridinfo__ > libscalapack.a:pdgeequ.o: U blacs_gridinfo__ > libscalapack.a:pdgehd2.o: U blacs_gridinfo__ > libscalapack.a:pdgehrd.o: U blacs_gridinfo__ > libscalapack.a:pdgelq2.o: U blacs_gridinfo__ > libscalapack.a:pdgelqf.o: U blacs_gridinfo__ > libscalapack.a:pdgels.o: U blacs_gridinfo__ > libscalapack.a:pdgeql2.o: U blacs_gridinfo__ > libscalapack.a:pdgeqlf.o: U blacs_gridinfo__ > libscalapack.a:pdgeqpf.o: U blacs_gridinfo__ > libscalapack.a:pdgeqr2.o: U blacs_gridinfo__ > libscalapack.a:pdgeqrf.o: U blacs_gridinfo__ > libscalapack.a:pdgerfs.o: U blacs_gridinfo__ > libscalapack.a:pdgerq2.o: U blacs_gridinfo__ > libscalapack.a:pdgerqf.o: U blacs_gridinfo__ > libscalapack.a:pdgesv.o: U blacs_gridinfo__ > libscalapack.a:pdgesvd.o: U blacs_gridinfo__ > libscalapack.a:pdgesvx.o: U blacs_gridinfo__ > libscalapack.a:pdgetf2.o: U blacs_gridinfo__ > libscalapack.a:pdgetrf.o: U blacs_gridinfo__ > libscalapack.a:pdgetri.o: U blacs_gridinfo__ > libscalapack.a:pdgetrs.o: U blacs_gridinfo__ > libscalapack.a:pdggqrf.o: U blacs_gridinfo__ > libscalapack.a:pdggrqf.o: U blacs_gridinfo__ > libscalapack.a:pdlabrd.o: U blacs_gridinfo__ > libscalapack.a:pdlacon.o: U blacs_gridinfo__ > libscalapack.a:pdlacp2.o: U blacs_gridinfo__ > 
libscalapack.a:pdlahrd.o: U blacs_gridinfo__ > libscalapack.a:pdlange.o: U blacs_gridinfo__ > libscalapack.a:pdlanhs.o: U blacs_gridinfo__ > libscalapack.a:pdlansy.o: U blacs_gridinfo__ > libscalapack.a:pdlantr.o: U blacs_gridinfo__ > libscalapack.a:pdlapiv.o: U blacs_gridinfo__ > libscalapack.a:pdlapv2.o: U blacs_gridinfo__ > libscalapack.a:pdlaqge.o: U blacs_gridinfo__ > libscalapack.a:pdlaqsy.o: U blacs_gridinfo__ > libscalapack.a:pdlarf.o: U blacs_gridinfo__ > libscalapack.a:pdlarfb.o: U blacs_gridinfo__ > libscalapack.a:pdlarfg.o: U blacs_gridinfo__ > libscalapack.a:pdlarft.o: U blacs_gridinfo__ > libscalapack.a:pdlase2.o: U blacs_gridinfo__ > libscalapack.a:pdlascl.o: U blacs_gridinfo__ > libscalapack.a:pdlassq.o: U blacs_gridinfo__ > libscalapack.a:pdlaswp.o: U blacs_gridinfo__ > libscalapack.a:pdlatra.o: U blacs_gridinfo__ > libscalapack.a:pdlatrd.o: U blacs_gridinfo__ > libscalapack.a:pdlatrs.o: U blacs_gridinfo__ > libscalapack.a:pdlauu2.o: U blacs_gridinfo__ > libscalapack.a:pdorg2l.o: U blacs_gridinfo__ > libscalapack.a:pdorg2r.o: U blacs_gridinfo__ > libscalapack.a:pdorgl2.o: U blacs_gridinfo__ > libscalapack.a:pdorglq.o: U blacs_gridinfo__ > libscalapack.a:pdorgql.o: U blacs_gridinfo__ > libscalapack.a:pdorgqr.o: U blacs_gridinfo__ > libscalapack.a:pdorgr2.o: U blacs_gridinfo__ > libscalapack.a:pdorgrq.o: U blacs_gridinfo__ > libscalapack.a:pdorm2l.o: U blacs_gridinfo__ > libscalapack.a:pdorm2r.o: U blacs_gridinfo__ > libscalapack.a:pdormbr.o: U blacs_gridinfo__ > libscalapack.a:pdormhr.o: U blacs_gridinfo__ > libscalapack.a:pdorml2.o: U blacs_gridinfo__ > libscalapack.a:pdormlq.o: U blacs_gridinfo__ > libscalapack.a:pdormql.o: U blacs_gridinfo__ > libscalapack.a:pdormqr.o: U blacs_gridinfo__ > libscalapack.a:pdormr2.o: U blacs_gridinfo__ > libscalapack.a:pdormrq.o: U blacs_gridinfo__ > libscalapack.a:pdormtr.o: U blacs_gridinfo__ > libscalapack.a:pdpocon.o: U blacs_gridinfo__ > libscalapack.a:pdpbsv.o: U blacs_gridinfo__ > libscalapack.a:pdpbtrf.o: U blacs_gridinfo__ > libscalapack.a:pdpbtrs.o: U blacs_gridinfo__ > libscalapack.a:pdpbtrsv.o: U blacs_gridinfo__ > libscalapack.a:pdptsv.o: U blacs_gridinfo__ > libscalapack.a:pdpttrf.o: U blacs_gridinfo__ > libscalapack.a:pdpttrs.o: U blacs_gridinfo__ > libscalapack.a:pdpttrsv.o: U blacs_gridinfo__ > libscalapack.a:pdpoequ.o: U blacs_gridinfo__ > libscalapack.a:pdporfs.o: U blacs_gridinfo__ > libscalapack.a:pdposv.o: U blacs_gridinfo__ > libscalapack.a:pdposvx.o: U blacs_gridinfo__ > libscalapack.a:pdpotf2.o: U blacs_gridinfo__ > libscalapack.a:pdpotrf.o: U blacs_gridinfo__ > libscalapack.a:pdpotri.o: U blacs_gridinfo__ > libscalapack.a:pdpotrs.o: U blacs_gridinfo__ > libscalapack.a:pdrscl.o: U blacs_gridinfo__ > libscalapack.a:pdstein.o: U blacs_gridinfo__ > libscalapack.a:pdsyev.o: U blacs_gridinfo__ > libscalapack.a:pdsyevd.o: U blacs_gridinfo__ > libscalapack.a:pdsyevx.o: U blacs_gridinfo__ > libscalapack.a:pdsygs2.o: U blacs_gridinfo__ > libscalapack.a:pdsygst.o: U blacs_gridinfo__ > libscalapack.a:pdsygvx.o: U blacs_gridinfo__ > libscalapack.a:pdsyngst.o: U blacs_gridinfo__ > libscalapack.a:pdsyntrd.o: U blacs_gridinfo__ > libscalapack.a:pdsyttrd.o: U blacs_gridinfo__ > libscalapack.a:pdsytd2.o: U blacs_gridinfo__ > libscalapack.a:pdsytrd.o: U blacs_gridinfo__ > libscalapack.a:pdtrti2.o: U blacs_gridinfo__ > libscalapack.a:pdtrtri.o: U blacs_gridinfo__ > libscalapack.a:pdtrtrs.o: U blacs_gridinfo__ > libscalapack.a:pdlaevswp.o: U blacs_gridinfo__ > libscalapack.a:pdlarzb.o: U blacs_gridinfo__ > libscalapack.a:pdlarzt.o: U 
blacs_gridinfo__ > libscalapack.a:pdlarz.o: U blacs_gridinfo__ > libscalapack.a:pdlatrz.o: U blacs_gridinfo__ > libscalapack.a:pdtzrzf.o: U blacs_gridinfo__ > libscalapack.a:pdormr3.o: U blacs_gridinfo__ > libscalapack.a:pdormrz.o: U blacs_gridinfo__ > libscalapack.a:pdlahqr.o: U blacs_gridinfo__ > libscalapack.a:pdlaconsb.o: U blacs_gridinfo__ > libscalapack.a:pdlacp3.o: U blacs_gridinfo__ > libscalapack.a:pdlawil.o: U blacs_gridinfo__ > libscalapack.a:pdlasmsub.o: U blacs_gridinfo__ > libscalapack.a:pdlared2d.o: U blacs_gridinfo__ > libscalapack.a:pdlamr1d.o: U blacs_gridinfo__ > libscalapack.a:pdsyevr.o: U blacs_gridinfo__ > libscalapack.a:pdtrord.o: U blacs_gridinfo__ > libscalapack.a:pdtrsen.o: U blacs_gridinfo__ > libscalapack.a:pdgebal.o: U blacs_gridinfo__ > libscalapack.a:pdhseqr.o: U blacs_gridinfo__ > libscalapack.a:pdlamve.o: U blacs_gridinfo__ > libscalapack.a:pdlaqr0.o: U blacs_gridinfo__ > libscalapack.a:pdlaqr1.o: U blacs_gridinfo__ > libscalapack.a:pdlaqr2.o: U blacs_gridinfo__ > libscalapack.a:pdlaqr3.o: U blacs_gridinfo__ > libscalapack.a:pdlaqr4.o: U blacs_gridinfo__ > libscalapack.a:pdlaqr5.o: U blacs_gridinfo__ > libscalapack.a:pdrot.o: U blacs_gridinfo__ > libscalapack.a:pdlaed0.o: U blacs_gridinfo__ > libscalapack.a:pdlaed1.o: U blacs_gridinfo__ > libscalapack.a:pdlaed2.o: U blacs_gridinfo__ > libscalapack.a:pdlaed3.o: U blacs_gridinfo__ > libscalapack.a:pdlaedz.o: U blacs_gridinfo__ > libscalapack.a:pdlared1d.o: U blacs_gridinfo__ > libscalapack.a:pdlasrt.o: U blacs_gridinfo__ > libscalapack.a:pdstebz.o: U blacs_gridinfo__ > libscalapack.a:pdstedc.o: U blacs_gridinfo__ > libscalapack.a:pcdbsv.o: U blacs_gridinfo__ > libscalapack.a:pcdbtrf.o: U blacs_gridinfo__ > libscalapack.a:pcdbtrs.o: U blacs_gridinfo__ > libscalapack.a:pcdbtrsv.o: U blacs_gridinfo__ > libscalapack.a:pcdtsv.o: U blacs_gridinfo__ > libscalapack.a:pcdttrf.o: U blacs_gridinfo__ > libscalapack.a:pcdttrs.o: U blacs_gridinfo__ > libscalapack.a:pcdttrsv.o: U blacs_gridinfo__ > libscalapack.a:pcgbsv.o: U blacs_gridinfo__ > libscalapack.a:pcgbtrf.o: U blacs_gridinfo__ > libscalapack.a:pcgbtrs.o: U blacs_gridinfo__ > libscalapack.a:pcgebd2.o: U blacs_gridinfo__ > libscalapack.a:pcgebrd.o: U blacs_gridinfo__ > libscalapack.a:pcgecon.o: U blacs_gridinfo__ > libscalapack.a:pcgeequ.o: U blacs_gridinfo__ > libscalapack.a:pcgehd2.o: U blacs_gridinfo__ > libscalapack.a:pcgehrd.o: U blacs_gridinfo__ > libscalapack.a:pcgelq2.o: U blacs_gridinfo__ > libscalapack.a:pcgelqf.o: U blacs_gridinfo__ > libscalapack.a:pcgels.o: U blacs_gridinfo__ > libscalapack.a:pcgeql2.o: U blacs_gridinfo__ > libscalapack.a:pcgeqlf.o: U blacs_gridinfo__ > libscalapack.a:pcgeqpf.o: U blacs_gridinfo__ > libscalapack.a:pcgeqr2.o: U blacs_gridinfo__ > libscalapack.a:pcgeqrf.o: U blacs_gridinfo__ > libscalapack.a:pcgerfs.o: U blacs_gridinfo__ > libscalapack.a:pcgerq2.o: U blacs_gridinfo__ > libscalapack.a:pcgerqf.o: U blacs_gridinfo__ > libscalapack.a:pcgesv.o: U blacs_gridinfo__ > libscalapack.a:pcgesvd.o: U blacs_gridinfo__ > libscalapack.a:pcgesvx.o: U blacs_gridinfo__ > libscalapack.a:pcgetf2.o: U blacs_gridinfo__ > libscalapack.a:pcgetrf.o: U blacs_gridinfo__ > libscalapack.a:pcgetri.o: U blacs_gridinfo__ > libscalapack.a:pcgetrs.o: U blacs_gridinfo__ > libscalapack.a:pcggqrf.o: U blacs_gridinfo__ > libscalapack.a:pcggrqf.o: U blacs_gridinfo__ > libscalapack.a:pcheev.o: U blacs_gridinfo__ > libscalapack.a:pcheevd.o: U blacs_gridinfo__ > libscalapack.a:pcheevx.o: U blacs_gridinfo__ > libscalapack.a:pchegs2.o: U blacs_gridinfo__ > 
libscalapack.a:pchegst.o: U blacs_gridinfo__ > libscalapack.a:pchegvx.o: U blacs_gridinfo__ > libscalapack.a:pchengst.o: U blacs_gridinfo__ > libscalapack.a:pchentrd.o: U blacs_gridinfo__ > libscalapack.a:pchettrd.o: U blacs_gridinfo__ > libscalapack.a:pchetd2.o: U blacs_gridinfo__ > libscalapack.a:pchetrd.o: U blacs_gridinfo__ > libscalapack.a:pclabrd.o: U blacs_gridinfo__ > libscalapack.a:pclacon.o: U blacs_gridinfo__ > libscalapack.a:pclacgv.o: U blacs_gridinfo__ > libscalapack.a:pclacp2.o: U blacs_gridinfo__ > libscalapack.a:pclahrd.o: U blacs_gridinfo__ > libscalapack.a:pclahqr.o: U blacs_gridinfo__ > libscalapack.a:pclaconsb.o: U blacs_gridinfo__ > libscalapack.a:pclasmsub.o: U blacs_gridinfo__ > libscalapack.a:pclacp3.o: U blacs_gridinfo__ > libscalapack.a:pclawil.o: U blacs_gridinfo__ > libscalapack.a:pcrot.o: U blacs_gridinfo__ > libscalapack.a:pclange.o: U blacs_gridinfo__ > libscalapack.a:pclanhe.o: U blacs_gridinfo__ > libscalapack.a:pclanhs.o: U blacs_gridinfo__ > libscalapack.a:pclansy.o: U blacs_gridinfo__ > libscalapack.a:pclantr.o: U blacs_gridinfo__ > libscalapack.a:pclapiv.o: U blacs_gridinfo__ > libscalapack.a:pclapv2.o: U blacs_gridinfo__ > libscalapack.a:pclaqge.o: U blacs_gridinfo__ > libscalapack.a:pclaqsy.o: U blacs_gridinfo__ > libscalapack.a:pclarf.o: U blacs_gridinfo__ > libscalapack.a:pclarfb.o: U blacs_gridinfo__ > libscalapack.a:pclarfc.o: U blacs_gridinfo__ > libscalapack.a:pclarfg.o: U blacs_gridinfo__ > libscalapack.a:pclarft.o: U blacs_gridinfo__ > libscalapack.a:pclascl.o: U blacs_gridinfo__ > libscalapack.a:pclase2.o: U blacs_gridinfo__ > libscalapack.a:pclassq.o: U blacs_gridinfo__ > libscalapack.a:pclaswp.o: U blacs_gridinfo__ > libscalapack.a:pclatra.o: U blacs_gridinfo__ > libscalapack.a:pclatrd.o: U blacs_gridinfo__ > libscalapack.a:pclatrs.o: U blacs_gridinfo__ > libscalapack.a:pclauu2.o: U blacs_gridinfo__ > libscalapack.a:pcpocon.o: U blacs_gridinfo__ > libscalapack.a:pcpoequ.o: U blacs_gridinfo__ > libscalapack.a:pcporfs.o: U blacs_gridinfo__ > libscalapack.a:pcposv.o: U blacs_gridinfo__ > libscalapack.a:pcpbsv.o: U blacs_gridinfo__ > libscalapack.a:pcpbtrf.o: U blacs_gridinfo__ > libscalapack.a:pcpbtrs.o: U blacs_gridinfo__ > libscalapack.a:pcpbtrsv.o: U blacs_gridinfo__ > libscalapack.a:pcptsv.o: U blacs_gridinfo__ > libscalapack.a:pcpttrf.o: U blacs_gridinfo__ > libscalapack.a:pcpttrs.o: U blacs_gridinfo__ > libscalapack.a:pcpttrsv.o: U blacs_gridinfo__ > libscalapack.a:pcposvx.o: U blacs_gridinfo__ > libscalapack.a:pcpotf2.o: U blacs_gridinfo__ > libscalapack.a:pcpotrf.o: U blacs_gridinfo__ > libscalapack.a:pcpotri.o: U blacs_gridinfo__ > libscalapack.a:pcpotrs.o: U blacs_gridinfo__ > libscalapack.a:pcsrscl.o: U blacs_gridinfo__ > libscalapack.a:pcstein.o: U blacs_gridinfo__ > libscalapack.a:pctrevc.o: U blacs_gridinfo__ > libscalapack.a:pctrti2.o: U blacs_gridinfo__ > libscalapack.a:pctrtri.o: U blacs_gridinfo__ > libscalapack.a:pctrtrs.o: U blacs_gridinfo__ > libscalapack.a:pcung2l.o: U blacs_gridinfo__ > libscalapack.a:pcung2r.o: U blacs_gridinfo__ > libscalapack.a:pcungl2.o: U blacs_gridinfo__ > libscalapack.a:pcunglq.o: U blacs_gridinfo__ > libscalapack.a:pcungql.o: U blacs_gridinfo__ > libscalapack.a:pcungqr.o: U blacs_gridinfo__ > libscalapack.a:pcungr2.o: U blacs_gridinfo__ > libscalapack.a:pcungrq.o: U blacs_gridinfo__ > libscalapack.a:pcunm2l.o: U blacs_gridinfo__ > libscalapack.a:pcunm2r.o: U blacs_gridinfo__ > libscalapack.a:pcunmbr.o: U blacs_gridinfo__ > libscalapack.a:pcunmhr.o: U blacs_gridinfo__ > libscalapack.a:pcunml2.o: 
U blacs_gridinfo__ > libscalapack.a:pcunmlq.o: U blacs_gridinfo__ > libscalapack.a:pcunmql.o: U blacs_gridinfo__ > libscalapack.a:pcunmqr.o: U blacs_gridinfo__ > libscalapack.a:pcunmr2.o: U blacs_gridinfo__ > libscalapack.a:pcunmrq.o: U blacs_gridinfo__ > libscalapack.a:pcunmtr.o: U blacs_gridinfo__ > libscalapack.a:pclaevswp.o: U blacs_gridinfo__ > libscalapack.a:pclarzb.o: U blacs_gridinfo__ > libscalapack.a:pclarzt.o: U blacs_gridinfo__ > libscalapack.a:pclarz.o: U blacs_gridinfo__ > libscalapack.a:pclarzc.o: U blacs_gridinfo__ > libscalapack.a:pclatrz.o: U blacs_gridinfo__ > libscalapack.a:pctzrzf.o: U blacs_gridinfo__ > libscalapack.a:pclattrs.o: U blacs_gridinfo__ > libscalapack.a:pcunmr3.o: U blacs_gridinfo__ > libscalapack.a:pcunmrz.o: U blacs_gridinfo__ > libscalapack.a:pcmax1.o: U blacs_gridinfo__ > libscalapack.a:pscsum1.o: U blacs_gridinfo__ > libscalapack.a:pclamr1d.o: U blacs_gridinfo__ > libscalapack.a:pcheevr.o: U blacs_gridinfo__ > libscalapack.a:pzdbsv.o: U blacs_gridinfo__ > libscalapack.a:pzdbtrf.o: U blacs_gridinfo__ > libscalapack.a:pzdbtrs.o: U blacs_gridinfo__ > libscalapack.a:pzdbtrsv.o: U blacs_gridinfo__ > libscalapack.a:pzdtsv.o: U blacs_gridinfo__ > libscalapack.a:pzdttrf.o: U blacs_gridinfo__ > libscalapack.a:pzdttrs.o: U blacs_gridinfo__ > libscalapack.a:pzdttrsv.o: U blacs_gridinfo__ > libscalapack.a:pzgbsv.o: U blacs_gridinfo__ > libscalapack.a:pzgbtrf.o: U blacs_gridinfo__ > libscalapack.a:pzgbtrs.o: U blacs_gridinfo__ > libscalapack.a:pzgebd2.o: U blacs_gridinfo__ > libscalapack.a:pzgebrd.o: U blacs_gridinfo__ > libscalapack.a:pzgecon.o: U blacs_gridinfo__ > libscalapack.a:pzgeequ.o: U blacs_gridinfo__ > libscalapack.a:pzgehd2.o: U blacs_gridinfo__ > libscalapack.a:pzgehrd.o: U blacs_gridinfo__ > libscalapack.a:pzgelq2.o: U blacs_gridinfo__ > libscalapack.a:pzgelqf.o: U blacs_gridinfo__ > libscalapack.a:pzgels.o: U blacs_gridinfo__ > libscalapack.a:pzgeql2.o: U blacs_gridinfo__ > libscalapack.a:pzgeqlf.o: U blacs_gridinfo__ > libscalapack.a:pzgeqpf.o: U blacs_gridinfo__ > libscalapack.a:pzgeqr2.o: U blacs_gridinfo__ > libscalapack.a:pzgeqrf.o: U blacs_gridinfo__ > libscalapack.a:pzgerfs.o: U blacs_gridinfo__ > libscalapack.a:pzgerq2.o: U blacs_gridinfo__ > libscalapack.a:pzgerqf.o: U blacs_gridinfo__ > libscalapack.a:pzgesv.o: U blacs_gridinfo__ > libscalapack.a:pzgesvd.o: U blacs_gridinfo__ > libscalapack.a:pzgesvx.o: U blacs_gridinfo__ > libscalapack.a:pzgetf2.o: U blacs_gridinfo__ > libscalapack.a:pzgetrf.o: U blacs_gridinfo__ > libscalapack.a:pzgetri.o: U blacs_gridinfo__ > libscalapack.a:pzgetrs.o: U blacs_gridinfo__ > libscalapack.a:pzggqrf.o: U blacs_gridinfo__ > libscalapack.a:pzggrqf.o: U blacs_gridinfo__ > libscalapack.a:pzheev.o: U blacs_gridinfo__ > libscalapack.a:pzheevd.o: U blacs_gridinfo__ > libscalapack.a:pzheevx.o: U blacs_gridinfo__ > libscalapack.a:pzhegs2.o: U blacs_gridinfo__ > libscalapack.a:pzhegst.o: U blacs_gridinfo__ > libscalapack.a:pzhegvx.o: U blacs_gridinfo__ > libscalapack.a:pzhengst.o: U blacs_gridinfo__ > libscalapack.a:pzhentrd.o: U blacs_gridinfo__ > libscalapack.a:pzhettrd.o: U blacs_gridinfo__ > libscalapack.a:pzhetd2.o: U blacs_gridinfo__ > libscalapack.a:pzhetrd.o: U blacs_gridinfo__ > libscalapack.a:pzlabrd.o: U blacs_gridinfo__ > libscalapack.a:pzlacon.o: U blacs_gridinfo__ > libscalapack.a:pzlacgv.o: U blacs_gridinfo__ > libscalapack.a:pzlacp2.o: U blacs_gridinfo__ > libscalapack.a:pzlahrd.o: U blacs_gridinfo__ > libscalapack.a:pzlahqr.o: U blacs_gridinfo__ > libscalapack.a:pzlaconsb.o: U blacs_gridinfo__ > 
libscalapack.a:pzlasmsub.o: U blacs_gridinfo__ > libscalapack.a:pzlacp3.o: U blacs_gridinfo__ > libscalapack.a:pzlawil.o: U blacs_gridinfo__ > libscalapack.a:pzrot.o: U blacs_gridinfo__ > libscalapack.a:pzlange.o: U blacs_gridinfo__ > libscalapack.a:pzlanhe.o: U blacs_gridinfo__ > libscalapack.a:pzlanhs.o: U blacs_gridinfo__ > libscalapack.a:pzlansy.o: U blacs_gridinfo__ > libscalapack.a:pzlantr.o: U blacs_gridinfo__ > libscalapack.a:pzlapiv.o: U blacs_gridinfo__ > libscalapack.a:pzlapv2.o: U blacs_gridinfo__ > libscalapack.a:pzlaqge.o: U blacs_gridinfo__ > libscalapack.a:pzlaqsy.o: U blacs_gridinfo__ > libscalapack.a:pzlarf.o: U blacs_gridinfo__ > libscalapack.a:pzlarfb.o: U blacs_gridinfo__ > libscalapack.a:pzlarfc.o: U blacs_gridinfo__ > libscalapack.a:pzlarfg.o: U blacs_gridinfo__ > libscalapack.a:pzlarft.o: U blacs_gridinfo__ > libscalapack.a:pzlascl.o: U blacs_gridinfo__ > libscalapack.a:pzlase2.o: U blacs_gridinfo__ > libscalapack.a:pzlassq.o: U blacs_gridinfo__ > libscalapack.a:pzlaswp.o: U blacs_gridinfo__ > libscalapack.a:pzlatra.o: U blacs_gridinfo__ > libscalapack.a:pzlatrd.o: U blacs_gridinfo__ > libscalapack.a:pzlattrs.o: U blacs_gridinfo__ > libscalapack.a:pzlatrs.o: U blacs_gridinfo__ > libscalapack.a:pzlauu2.o: U blacs_gridinfo__ > libscalapack.a:pzpocon.o: U blacs_gridinfo__ > libscalapack.a:pzpoequ.o: U blacs_gridinfo__ > libscalapack.a:pzporfs.o: U blacs_gridinfo__ > libscalapack.a:pzposv.o: U blacs_gridinfo__ > libscalapack.a:pzpbsv.o: U blacs_gridinfo__ > libscalapack.a:pzpbtrf.o: U blacs_gridinfo__ > libscalapack.a:pzpbtrs.o: U blacs_gridinfo__ > libscalapack.a:pzpbtrsv.o: U blacs_gridinfo__ > libscalapack.a:pzptsv.o: U blacs_gridinfo__ > libscalapack.a:pzpttrf.o: U blacs_gridinfo__ > libscalapack.a:pzpttrs.o: U blacs_gridinfo__ > libscalapack.a:pzpttrsv.o: U blacs_gridinfo__ > libscalapack.a:pzposvx.o: U blacs_gridinfo__ > libscalapack.a:pzpotf2.o: U blacs_gridinfo__ > libscalapack.a:pzpotrf.o: U blacs_gridinfo__ > libscalapack.a:pzpotri.o: U blacs_gridinfo__ > libscalapack.a:pzpotrs.o: U blacs_gridinfo__ > libscalapack.a:pzdrscl.o: U blacs_gridinfo__ > libscalapack.a:pzstein.o: U blacs_gridinfo__ > libscalapack.a:pztrevc.o: U blacs_gridinfo__ > libscalapack.a:pztrti2.o: U blacs_gridinfo__ > libscalapack.a:pztrtri.o: U blacs_gridinfo__ > libscalapack.a:pztrtrs.o: U blacs_gridinfo__ > libscalapack.a:pzung2l.o: U blacs_gridinfo__ > libscalapack.a:pzung2r.o: U blacs_gridinfo__ > libscalapack.a:pzungl2.o: U blacs_gridinfo__ > libscalapack.a:pzunglq.o: U blacs_gridinfo__ > libscalapack.a:pzungql.o: U blacs_gridinfo__ > libscalapack.a:pzungqr.o: U blacs_gridinfo__ > libscalapack.a:pzungr2.o: U blacs_gridinfo__ > libscalapack.a:pzungrq.o: U blacs_gridinfo__ > libscalapack.a:pzunm2l.o: U blacs_gridinfo__ > libscalapack.a:pzunm2r.o: U blacs_gridinfo__ > libscalapack.a:pzunmbr.o: U blacs_gridinfo__ > libscalapack.a:pzunmhr.o: U blacs_gridinfo__ > libscalapack.a:pzunml2.o: U blacs_gridinfo__ > libscalapack.a:pzunmlq.o: U blacs_gridinfo__ > libscalapack.a:pzunmql.o: U blacs_gridinfo__ > libscalapack.a:pzunmqr.o: U blacs_gridinfo__ > libscalapack.a:pzunmr2.o: U blacs_gridinfo__ > libscalapack.a:pzunmrq.o: U blacs_gridinfo__ > libscalapack.a:pzunmtr.o: U blacs_gridinfo__ > libscalapack.a:pzlaevswp.o: U blacs_gridinfo__ > libscalapack.a:pzlarzb.o: U blacs_gridinfo__ > libscalapack.a:pzlarzt.o: U blacs_gridinfo__ > libscalapack.a:pzlarz.o: U blacs_gridinfo__ > libscalapack.a:pzlarzc.o: U blacs_gridinfo__ > libscalapack.a:pzlatrz.o: U blacs_gridinfo__ > libscalapack.a:pztzrzf.o: U 
blacs_gridinfo__ > libscalapack.a:pzunmr3.o: U blacs_gridinfo__ > libscalapack.a:pzunmrz.o: U blacs_gridinfo__ > libscalapack.a:pzmax1.o: U blacs_gridinfo__ > libscalapack.a:pdzsum1.o: U blacs_gridinfo__ > libscalapack.a:pzlamr1d.o: U blacs_gridinfo__ > libscalapack.a:pzheevr.o: U blacs_gridinfo__
>
> On Tue, Sep 13, 2016 at 1:33 PM, Satish Balay wrote:
>
>> On Tue, 13 Sep 2016, Matthew Knepley wrote:
>>
>> > I believe your problem is that this is old PETSc. In the latest release,
>> > BLACS is part of SCALAPACK.
>>
>> BLACS had been a part of scalapack for a few releases - so that's not the
>> issue.
>>
>> >>>>>>>>
>> stderr:
>> /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/libscalapack.a(pssytrd.o): In function `pssytrd':
>> /tmp/cluster_temp.FFxzAF/petsc-3.6.4/arch-linux2-c-debug/externalpackages/scalapack-2.0.2/SRC/pssytrd.f:259: undefined reference to `blacs_gridinfo__'
>> /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/libscalapack.a(chk1mat.o): In function `chk1mat':
>> <<<<<<
>>
>> Double underscore?
>>
>> >>>
>> mpicc -c -Df77IsF2C -fPIC -fopenmp -fPIC -g -I/cm/shared/apps/openmpi/open64/64/1.10.1/include pzrot.c
>> <<<
>>
>> scalapack is getting compiled with this flag '-Df77IsF2C'. This mode
>> was primarily supported by 'g77' previously - which we hardly ever use
>> anymore - so this mode is not really tested?
>>
>> >>>>>
>> Executing: mpif90 -show
>> stdout: openf90 -I/cm/shared/apps/openmpi/open64/64/1.10.1/include
>> -pthread -I/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 -L/usr/lib64/
>> -Wl,-rpath -Wl,/usr/lib64/ -Wl,-rpath -Wl,/cm/shared/apps/openmpi/open64/64/1.10.1/lib64
>> -Wl,--enable-new-dtags -L/cm/shared/apps/openmpi/open64/64/1.10.1/lib64
>> -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
>>
>>   compilers: Fortran appends an extra underscore to names containing underscores
>>   Defined "HAVE_FORTRAN_UNDERSCORE_UNDERSCORE" to "1"
>> <<<<
>>
>> What do you have for:
>>
>> cd /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/
>> nm -Ao libscalapack.a |grep -i blacs_gridinfo
>>
>> However - as Matt referred to - it's best to use the latest petsc-3.7
>> release. Does MOOSE require 3.6?
>>
>> Satish
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: configure.log
Type: application/octet-stream
Size: 1946821 bytes
Desc: not available
URL: 

From balay at mcs.anl.gov  Tue Sep 13 16:43:21 2016
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 13 Sep 2016 16:43:21 -0500
Subject: [petsc-users] Errors in installing PETSc
In-Reply-To: 
References: 
Message-ID: 

>>>>
Possible ERROR while running linker: exit code 256
stderr:
/usr/bin/ld: cannot find -lfblas
collect2: error: ld returned 1 exit status
<<<<

Do not use the '--prefix=' option with an empty value - it is not clear
what configure does with it. Either do an 'inplace' build [the default -
when you do not specify a prefix] - or a proper prefix build with a
valid path.

And do a 'rm -rf /tmp/cluster_temp.FFxzAF/petsc-3.6.4/arch-linux2-c-opt'
before you restart.

And make sure you apply the scalapack patch I referred to in an earlier
e-mail:
https://bitbucket.org/petsc/petsc/commits/3087a8ae6474620b012259e2918d8cbd6a1fd369

Satish
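
(For reference, a minimal sketch of the clean restart Satish describes above;
the mpicc/mpif90 wrappers are assumed to be the Open MPI ones already in use on
this cluster, and the install path below is only a placeholder:)

    # run from the petsc-3.6.4 source directory; wipe the partially-built
    # arch directory before re-running configure
    rm -rf /tmp/cluster_temp.FFxzAF/petsc-3.6.4/arch-linux2-c-opt

    # in-place build: simply omit --prefix
    ./configure --with-cc=mpicc --with-fc=mpif90 \
                --download-fblaslapack --download-scalapack

    # or a prefix build: give --prefix a real, writable path (never leave it empty)
    ./configure --with-cc=mpicc --with-fc=mpif90 \
                --download-fblaslapack --download-scalapack \
                --prefix=$HOME/petsc-3.6.4-opt
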
On Tue, 13 Sep 2016, Jason Hou wrote:

> Hi everyone,
>
> Thanks for the help. I figured the problem was because mpicc wasn't
> properly defined; however, the blas-lapack libraries still have issues:
>
> *******************************************************************************
>          UNABLE to CONFIGURE with GIVEN OPTIONS    (see configure.log for details):
> -------------------------------------------------------------------------------
> --download-fblaslapack libraries cannot be used
> *******************************************************************************
>
> As pointed out by Scott, it should have nothing to do with the loaded modules,
> since I would download the libraries anyway. I've attached the log file and
> any suggestion is welcome.
>
> Thanks,
>
> Jason
>
> On Tue, Sep 13, 2016 at 5:30 PM, Jason Hou wrote:
>
> > Hi Satish,
> >
> > The output is the following:
> >
> > [jhou8 at rdfmg lib]$ nm -Ao libscalapack.a |grep -i blacs_gridinfo > > libscalapack.a:blacs_abort_.o:   U Cblacs_gridinfo > > libscalapack.a:blacs_info_.o:0000000000000000 T blacs_gridinfo_ > > libscalapack.a:blacs_abort_.oo:   U Cblacs_gridinfo > > libscalapack.a:blacs_info_.oo:0000000000000000 T Cblacs_gridinfo > > libscalapack.a:chk1mat.o:   U blacs_gridinfo__ > > libscalapack.a:pchkxmat.o:   U blacs_gridinfo__ > > libscalapack.a:desc_convert.o:   U blacs_gridinfo__ > > libscalapack.a:descinit.o:   U blacs_gridinfo__ > > libscalapack.a:reshape.o:   U Cblacs_gridinfo > > libscalapack.a:SL_gridreshape.o:   U Cblacs_gridinfo > > libscalapack.a:picol2row.o:   U blacs_gridinfo__ > > libscalapack.a:pirow2col.o:   U blacs_gridinfo__ > > libscalapack.a:pilaprnt.o:   U blacs_gridinfo__ > > libscalapack.a:pitreecomb.o:   U blacs_gridinfo__ > > libscalapack.a:pichekpad.o:   U blacs_gridinfo__ > > libscalapack.a:pielset.o:   U blacs_gridinfo__ > > libscalapack.a:pielset2.o:   U blacs_gridinfo__ > > libscalapack.a:pielget.o:   U blacs_gridinfo__ > > libscalapack.a:psmatadd.o:   U blacs_gridinfo__ > > libscalapack.a:pscol2row.o:   U blacs_gridinfo__ > > libscalapack.a:psrow2col.o:   U blacs_gridinfo__ > > libscalapack.a:pslaprnt.o:   U blacs_gridinfo__ > > libscalapack.a:pstreecomb.o:   U blacs_gridinfo__ > > libscalapack.a:pschekpad.o:   U blacs_gridinfo__ > > libscalapack.a:pselset.o:   U blacs_gridinfo__ > > libscalapack.a:pselset2.o:   U blacs_gridinfo__ > > libscalapack.a:pselget.o:   U blacs_gridinfo__ > > libscalapack.a:pslaread.o:   U blacs_gridinfo__ > > libscalapack.a:pslawrite.o:   U blacs_gridinfo__ > > libscalapack.a:pdmatadd.o:   U blacs_gridinfo__ > > libscalapack.a:pdcol2row.o:   U blacs_gridinfo__ > > libscalapack.a:pdrow2col.o:   U blacs_gridinfo__ > > libscalapack.a:pdlaprnt.o:   U blacs_gridinfo__ > > libscalapack.a:pdtreecomb.o:   U blacs_gridinfo__ > > libscalapack.a:pdchekpad.o:   U blacs_gridinfo__ > > libscalapack.a:pdelset.o:   U blacs_gridinfo__ > > libscalapack.a:pdelset2.o:   U blacs_gridinfo__ > > libscalapack.a:pdelget.o:   U blacs_gridinfo__ > > libscalapack.a:pdlaread.o:   U blacs_gridinfo__ > > libscalapack.a:pdlawrite.o:   U blacs_gridinfo__ > > libscalapack.a:pcmatadd.o:   U blacs_gridinfo__ > > libscalapack.a:pccol2row.o:   U blacs_gridinfo__ > > libscalapack.a:pcrow2col.o:   U blacs_gridinfo__ > > libscalapack.a:pclaprnt.o:   U blacs_gridinfo__ > > libscalapack.a:pctreecomb.o:   U blacs_gridinfo__ > > libscalapack.a:pcchekpad.o:   U blacs_gridinfo__ > > libscalapack.a:pcelset.o:   U blacs_gridinfo__ > > libscalapack.a:pcelset2.o:   U blacs_gridinfo__ > > libscalapack.a:pcelget.o:   U blacs_gridinfo__ > > libscalapack.a:pclaread.o:   U blacs_gridinfo__ > > libscalapack.a:pclawrite.o:   U blacs_gridinfo__ > > libscalapack.a:pzmatadd.o:   U blacs_gridinfo__ > >
libscalapack.a:pzcol2row.o: U blacs_gridinfo__ > > libscalapack.a:pzrow2col.o: U blacs_gridinfo__ > > libscalapack.a:pzlaprnt.o: U blacs_gridinfo__ > > libscalapack.a:pztreecomb.o: U blacs_gridinfo__ > > libscalapack.a:pzchekpad.o: U blacs_gridinfo__ > > libscalapack.a:pzelset.o: U blacs_gridinfo__ > > libscalapack.a:pzelset2.o: U blacs_gridinfo__ > > libscalapack.a:pzelget.o: U blacs_gridinfo__ > > libscalapack.a:pzlaread.o: U blacs_gridinfo__ > > libscalapack.a:pzlawrite.o: U blacs_gridinfo__ > > libscalapack.a:picopy_.o: U Cblacs_gridinfo > > libscalapack.a:pbstran.o: U blacs_gridinfo__ > > libscalapack.a:pbstrnv.o: U blacs_gridinfo__ > > libscalapack.a:pxerbla.o: U blacs_gridinfo__ > > libscalapack.a:PB_CGatherV.o: U Cblacs_gridinfo > > libscalapack.a:PB_CInV.o: U Cblacs_gridinfo > > libscalapack.a:PB_CInV2.o: U Cblacs_gridinfo > > libscalapack.a:PB_CInOutV.o: U Cblacs_gridinfo > > libscalapack.a:PB_CInOutV2.o: U Cblacs_gridinfo > > libscalapack.a:PB_COutV.o: U Cblacs_gridinfo > > libscalapack.a:PB_CScatterV.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cabort.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cchkmat.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cchkvec.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpswapNN.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpswapND.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cpdot11.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpdotNN.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpdotND.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpaxpbyNN.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpaxpbyND.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpaxpbyDN.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cpaxpby.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpgemmBC.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpgemmAC.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpgemmAB.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cplaprnt.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cplapad.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cplapd2.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cplascal.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cplasca2.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cplacnjg.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cpsym.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpsymmAB.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpsymmBC.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cpsyr.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpsyrkA.o: U Cblacs_gridinfo > > libscalapack.a:PB_CpsyrkAC.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cpsyr2.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cpsyr2kA.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cpsyr2kAC.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cptrm.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cpgeadd.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cptran.o: U Cblacs_gridinfo > > libscalapack.a:PB_CptrmmAB.o: U Cblacs_gridinfo > > libscalapack.a:PB_CptrmmB.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cptrsm.o: U Cblacs_gridinfo > > libscalapack.a:PB_CptrsmAB.o: U Cblacs_gridinfo > > libscalapack.a:PB_CptrsmAB0.o: U Cblacs_gridinfo > > libscalapack.a:PB_CptrsmAB1.o: U Cblacs_gridinfo > > libscalapack.a:PB_CptrsmB.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cptrsv.o: U Cblacs_gridinfo > > libscalapack.a:PB_Cwarn.o: U Cblacs_gridinfo > > libscalapack.a:psswap_.o: U Cblacs_gridinfo > > libscalapack.a:psscal_.o: U Cblacs_gridinfo > > libscalapack.a:pscopy_.o: U Cblacs_gridinfo > > libscalapack.a:psaxpy_.o: U Cblacs_gridinfo > > libscalapack.a:psdot_.o: U Cblacs_gridinfo > > libscalapack.a:psnrm2_.o: U Cblacs_gridinfo > > libscalapack.a:psasum_.o: U Cblacs_gridinfo > > 
libscalapack.a:psamax_.o: U Cblacs_gridinfo > > libscalapack.a:psgemv_.o: U Cblacs_gridinfo > > libscalapack.a:psger_.o: U Cblacs_gridinfo > > libscalapack.a:pssymv_.o: U Cblacs_gridinfo > > libscalapack.a:pssyr_.o: U Cblacs_gridinfo > > libscalapack.a:pssyr2_.o: U Cblacs_gridinfo > > libscalapack.a:pstrmv_.o: U Cblacs_gridinfo > > libscalapack.a:pstrsv_.o: U Cblacs_gridinfo > > libscalapack.a:psagemv_.o: U Cblacs_gridinfo > > libscalapack.a:psasymv_.o: U Cblacs_gridinfo > > libscalapack.a:psatrmv_.o: U Cblacs_gridinfo > > libscalapack.a:psgeadd_.o: U Cblacs_gridinfo > > libscalapack.a:psgemm_.o: U Cblacs_gridinfo > > libscalapack.a:pssymm_.o: U Cblacs_gridinfo > > libscalapack.a:pssyr2k_.o: U Cblacs_gridinfo > > libscalapack.a:pssyrk_.o: U Cblacs_gridinfo > > libscalapack.a:pstradd_.o: U Cblacs_gridinfo > > libscalapack.a:pstran_.o: U Cblacs_gridinfo > > libscalapack.a:pstrmm_.o: U Cblacs_gridinfo > > libscalapack.a:pstrsm_.o: U Cblacs_gridinfo > > libscalapack.a:pbdtran.o: U blacs_gridinfo__ > > libscalapack.a:pbdtrnv.o: U blacs_gridinfo__ > > libscalapack.a:pdswap_.o: U Cblacs_gridinfo > > libscalapack.a:pdscal_.o: U Cblacs_gridinfo > > libscalapack.a:pdcopy_.o: U Cblacs_gridinfo > > libscalapack.a:pdaxpy_.o: U Cblacs_gridinfo > > libscalapack.a:pddot_.o: U Cblacs_gridinfo > > libscalapack.a:pdnrm2_.o: U Cblacs_gridinfo > > libscalapack.a:pdasum_.o: U Cblacs_gridinfo > > libscalapack.a:pdamax_.o: U Cblacs_gridinfo > > libscalapack.a:pdgemv_.o: U Cblacs_gridinfo > > libscalapack.a:pdger_.o: U Cblacs_gridinfo > > libscalapack.a:pdsymv_.o: U Cblacs_gridinfo > > libscalapack.a:pdsyr_.o: U Cblacs_gridinfo > > libscalapack.a:pdsyr2_.o: U Cblacs_gridinfo > > libscalapack.a:pdtrmv_.o: U Cblacs_gridinfo > > libscalapack.a:pdtrsv_.o: U Cblacs_gridinfo > > libscalapack.a:pdagemv_.o: U Cblacs_gridinfo > > libscalapack.a:pdasymv_.o: U Cblacs_gridinfo > > libscalapack.a:pdatrmv_.o: U Cblacs_gridinfo > > libscalapack.a:pdgeadd_.o: U Cblacs_gridinfo > > libscalapack.a:pdgemm_.o: U Cblacs_gridinfo > > libscalapack.a:pdsymm_.o: U Cblacs_gridinfo > > libscalapack.a:pdsyr2k_.o: U Cblacs_gridinfo > > libscalapack.a:pdsyrk_.o: U Cblacs_gridinfo > > libscalapack.a:pdtradd_.o: U Cblacs_gridinfo > > libscalapack.a:pdtran_.o: U Cblacs_gridinfo > > libscalapack.a:pdtrmm_.o: U Cblacs_gridinfo > > libscalapack.a:pdtrsm_.o: U Cblacs_gridinfo > > libscalapack.a:pbctran.o: U blacs_gridinfo__ > > libscalapack.a:pbctrnv.o: U blacs_gridinfo__ > > libscalapack.a:pcswap_.o: U Cblacs_gridinfo > > libscalapack.a:pcscal_.o: U Cblacs_gridinfo > > libscalapack.a:pcsscal_.o: U Cblacs_gridinfo > > libscalapack.a:pccopy_.o: U Cblacs_gridinfo > > libscalapack.a:pcaxpy_.o: U Cblacs_gridinfo > > libscalapack.a:pcdotu_.o: U Cblacs_gridinfo > > libscalapack.a:pcdotc_.o: U Cblacs_gridinfo > > libscalapack.a:pscnrm2_.o: U Cblacs_gridinfo > > libscalapack.a:pscasum_.o: U Cblacs_gridinfo > > libscalapack.a:pcamax_.o: U Cblacs_gridinfo > > libscalapack.a:pcgemv_.o: U Cblacs_gridinfo > > libscalapack.a:pcgerc_.o: U Cblacs_gridinfo > > libscalapack.a:pcgeru_.o: U Cblacs_gridinfo > > libscalapack.a:pchemv_.o: U Cblacs_gridinfo > > libscalapack.a:pcher_.o: U Cblacs_gridinfo > > libscalapack.a:pcher2_.o: U Cblacs_gridinfo > > libscalapack.a:pctrmv_.o: U Cblacs_gridinfo > > libscalapack.a:pctrsv_.o: U Cblacs_gridinfo > > libscalapack.a:pcagemv_.o: U Cblacs_gridinfo > > libscalapack.a:pcahemv_.o: U Cblacs_gridinfo > > libscalapack.a:pcatrmv_.o: U Cblacs_gridinfo > > libscalapack.a:pcgeadd_.o: U Cblacs_gridinfo > > libscalapack.a:pcgemm_.o: U 
Cblacs_gridinfo > > libscalapack.a:pchemm_.o: U Cblacs_gridinfo > > libscalapack.a:pcher2k_.o: U Cblacs_gridinfo > > libscalapack.a:pcherk_.o: U Cblacs_gridinfo > > libscalapack.a:pcsymm_.o: U Cblacs_gridinfo > > libscalapack.a:pcsyr2k_.o: U Cblacs_gridinfo > > libscalapack.a:pcsyrk_.o: U Cblacs_gridinfo > > libscalapack.a:pctradd_.o: U Cblacs_gridinfo > > libscalapack.a:pctranc_.o: U Cblacs_gridinfo > > libscalapack.a:pctranu_.o: U Cblacs_gridinfo > > libscalapack.a:pctrmm_.o: U Cblacs_gridinfo > > libscalapack.a:pctrsm_.o: U Cblacs_gridinfo > > libscalapack.a:pbztran.o: U blacs_gridinfo__ > > libscalapack.a:pbztrnv.o: U blacs_gridinfo__ > > libscalapack.a:pzswap_.o: U Cblacs_gridinfo > > libscalapack.a:pzscal_.o: U Cblacs_gridinfo > > libscalapack.a:pzdscal_.o: U Cblacs_gridinfo > > libscalapack.a:pzcopy_.o: U Cblacs_gridinfo > > libscalapack.a:pzaxpy_.o: U Cblacs_gridinfo > > libscalapack.a:pzdotu_.o: U Cblacs_gridinfo > > libscalapack.a:pzdotc_.o: U Cblacs_gridinfo > > libscalapack.a:pdznrm2_.o: U Cblacs_gridinfo > > libscalapack.a:pdzasum_.o: U Cblacs_gridinfo > > libscalapack.a:pzamax_.o: U Cblacs_gridinfo > > libscalapack.a:pzgemv_.o: U Cblacs_gridinfo > > libscalapack.a:pzgerc_.o: U Cblacs_gridinfo > > libscalapack.a:pzgeru_.o: U Cblacs_gridinfo > > libscalapack.a:pzhemv_.o: U Cblacs_gridinfo > > libscalapack.a:pzher_.o: U Cblacs_gridinfo > > libscalapack.a:pzher2_.o: U Cblacs_gridinfo > > libscalapack.a:pztrmv_.o: U Cblacs_gridinfo > > libscalapack.a:pztrsv_.o: U Cblacs_gridinfo > > libscalapack.a:pzagemv_.o: U Cblacs_gridinfo > > libscalapack.a:pzahemv_.o: U Cblacs_gridinfo > > libscalapack.a:pzatrmv_.o: U Cblacs_gridinfo > > libscalapack.a:pzgeadd_.o: U Cblacs_gridinfo > > libscalapack.a:pzgemm_.o: U Cblacs_gridinfo > > libscalapack.a:pzhemm_.o: U Cblacs_gridinfo > > libscalapack.a:pzher2k_.o: U Cblacs_gridinfo > > libscalapack.a:pzherk_.o: U Cblacs_gridinfo > > libscalapack.a:pzsymm_.o: U Cblacs_gridinfo > > libscalapack.a:pzsyr2k_.o: U Cblacs_gridinfo > > libscalapack.a:pzsyrk_.o: U Cblacs_gridinfo > > libscalapack.a:pztradd_.o: U Cblacs_gridinfo > > libscalapack.a:pztranc_.o: U Cblacs_gridinfo > > libscalapack.a:pztranu_.o: U Cblacs_gridinfo > > libscalapack.a:pztrmm_.o: U Cblacs_gridinfo > > libscalapack.a:pztrsm_.o: U Cblacs_gridinfo > > libscalapack.a:pigemr.o: U Cblacs_gridinfo > > libscalapack.a:pitrmr.o: U Cblacs_gridinfo > > libscalapack.a:pgemraux.o: U Cblacs_gridinfo > > libscalapack.a:psgemr.o: U Cblacs_gridinfo > > libscalapack.a:pstrmr.o: U Cblacs_gridinfo > > libscalapack.a:pdgemr.o: U Cblacs_gridinfo > > libscalapack.a:pdtrmr.o: U Cblacs_gridinfo > > libscalapack.a:pcgemr.o: U Cblacs_gridinfo > > libscalapack.a:pctrmr.o: U Cblacs_gridinfo > > libscalapack.a:pzgemr.o: U Cblacs_gridinfo > > libscalapack.a:pztrmr.o: U Cblacs_gridinfo > > libscalapack.a:psdbsv.o: U blacs_gridinfo__ > > libscalapack.a:psdbtrf.o: U blacs_gridinfo__ > > libscalapack.a:psdbtrs.o: U blacs_gridinfo__ > > libscalapack.a:psdbtrsv.o: U blacs_gridinfo__ > > libscalapack.a:psdtsv.o: U blacs_gridinfo__ > > libscalapack.a:psdttrf.o: U blacs_gridinfo__ > > libscalapack.a:psdttrs.o: U blacs_gridinfo__ > > libscalapack.a:psdttrsv.o: U blacs_gridinfo__ > > libscalapack.a:psgbsv.o: U blacs_gridinfo__ > > libscalapack.a:psgbtrf.o: U blacs_gridinfo__ > > libscalapack.a:psgbtrs.o: U blacs_gridinfo__ > > libscalapack.a:psgebd2.o: U blacs_gridinfo__ > > libscalapack.a:psgebrd.o: U blacs_gridinfo__ > > libscalapack.a:psgecon.o: U blacs_gridinfo__ > > libscalapack.a:psgeequ.o: U blacs_gridinfo__ > > 
libscalapack.a:psgehd2.o: U blacs_gridinfo__ > > libscalapack.a:psgehrd.o: U blacs_gridinfo__ > > libscalapack.a:psgelq2.o: U blacs_gridinfo__ > > libscalapack.a:psgelqf.o: U blacs_gridinfo__ > > libscalapack.a:psgels.o: U blacs_gridinfo__ > > libscalapack.a:psgeql2.o: U blacs_gridinfo__ > > libscalapack.a:psgeqlf.o: U blacs_gridinfo__ > > libscalapack.a:psgeqpf.o: U blacs_gridinfo__ > > libscalapack.a:psgeqr2.o: U blacs_gridinfo__ > > libscalapack.a:psgeqrf.o: U blacs_gridinfo__ > > libscalapack.a:psgerfs.o: U blacs_gridinfo__ > > libscalapack.a:psgerq2.o: U blacs_gridinfo__ > > libscalapack.a:psgerqf.o: U blacs_gridinfo__ > > libscalapack.a:psgesv.o: U blacs_gridinfo__ > > libscalapack.a:psgesvd.o: U blacs_gridinfo__ > > libscalapack.a:psgesvx.o: U blacs_gridinfo__ > > libscalapack.a:psgetf2.o: U blacs_gridinfo__ > > libscalapack.a:psgetrf.o: U blacs_gridinfo__ > > libscalapack.a:psgetri.o: U blacs_gridinfo__ > > libscalapack.a:psgetrs.o: U blacs_gridinfo__ > > libscalapack.a:psggqrf.o: U blacs_gridinfo__ > > libscalapack.a:psggrqf.o: U blacs_gridinfo__ > > libscalapack.a:pslabrd.o: U blacs_gridinfo__ > > libscalapack.a:pslacon.o: U blacs_gridinfo__ > > libscalapack.a:pslacp2.o: U blacs_gridinfo__ > > libscalapack.a:pslahrd.o: U blacs_gridinfo__ > > libscalapack.a:pslange.o: U blacs_gridinfo__ > > libscalapack.a:pslanhs.o: U blacs_gridinfo__ > > libscalapack.a:pslansy.o: U blacs_gridinfo__ > > libscalapack.a:pslantr.o: U blacs_gridinfo__ > > libscalapack.a:pslapiv.o: U blacs_gridinfo__ > > libscalapack.a:pslapv2.o: U blacs_gridinfo__ > > libscalapack.a:pslaqge.o: U blacs_gridinfo__ > > libscalapack.a:pslaqsy.o: U blacs_gridinfo__ > > libscalapack.a:pslarf.o: U blacs_gridinfo__ > > libscalapack.a:pslarfb.o: U blacs_gridinfo__ > > libscalapack.a:pslarfg.o: U blacs_gridinfo__ > > libscalapack.a:pslarft.o: U blacs_gridinfo__ > > libscalapack.a:pslase2.o: U blacs_gridinfo__ > > libscalapack.a:pslascl.o: U blacs_gridinfo__ > > libscalapack.a:pslassq.o: U blacs_gridinfo__ > > libscalapack.a:pslaswp.o: U blacs_gridinfo__ > > libscalapack.a:pslatra.o: U blacs_gridinfo__ > > libscalapack.a:pslatrd.o: U blacs_gridinfo__ > > libscalapack.a:pslatrs.o: U blacs_gridinfo__ > > libscalapack.a:pslauu2.o: U blacs_gridinfo__ > > libscalapack.a:psorg2l.o: U blacs_gridinfo__ > > libscalapack.a:psorg2r.o: U blacs_gridinfo__ > > libscalapack.a:psorgl2.o: U blacs_gridinfo__ > > libscalapack.a:psorglq.o: U blacs_gridinfo__ > > libscalapack.a:psorgql.o: U blacs_gridinfo__ > > libscalapack.a:psorgqr.o: U blacs_gridinfo__ > > libscalapack.a:psorgr2.o: U blacs_gridinfo__ > > libscalapack.a:psorgrq.o: U blacs_gridinfo__ > > libscalapack.a:psorm2l.o: U blacs_gridinfo__ > > libscalapack.a:psorm2r.o: U blacs_gridinfo__ > > libscalapack.a:psormbr.o: U blacs_gridinfo__ > > libscalapack.a:psormhr.o: U blacs_gridinfo__ > > libscalapack.a:psorml2.o: U blacs_gridinfo__ > > libscalapack.a:psormlq.o: U blacs_gridinfo__ > > libscalapack.a:psormql.o: U blacs_gridinfo__ > > libscalapack.a:psormqr.o: U blacs_gridinfo__ > > libscalapack.a:psormr2.o: U blacs_gridinfo__ > > libscalapack.a:psormrq.o: U blacs_gridinfo__ > > libscalapack.a:psormtr.o: U blacs_gridinfo__ > > libscalapack.a:pspocon.o: U blacs_gridinfo__ > > libscalapack.a:pspbsv.o: U blacs_gridinfo__ > > libscalapack.a:pspbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pspbtrs.o: U blacs_gridinfo__ > > libscalapack.a:pspbtrsv.o: U blacs_gridinfo__ > > libscalapack.a:psptsv.o: U blacs_gridinfo__ > > libscalapack.a:pspttrf.o: U blacs_gridinfo__ > > libscalapack.a:pspttrs.o: U 
blacs_gridinfo__ > > libscalapack.a:pspttrsv.o: U blacs_gridinfo__ > > libscalapack.a:pspoequ.o: U blacs_gridinfo__ > > libscalapack.a:psporfs.o: U blacs_gridinfo__ > > libscalapack.a:psposv.o: U blacs_gridinfo__ > > libscalapack.a:psposvx.o: U blacs_gridinfo__ > > libscalapack.a:pspotf2.o: U blacs_gridinfo__ > > libscalapack.a:pspotrf.o: U blacs_gridinfo__ > > libscalapack.a:pspotri.o: U blacs_gridinfo__ > > libscalapack.a:pspotrs.o: U blacs_gridinfo__ > > libscalapack.a:psrscl.o: U blacs_gridinfo__ > > libscalapack.a:psstein.o: U blacs_gridinfo__ > > libscalapack.a:pssyev.o: U blacs_gridinfo__ > > libscalapack.a:pssyevd.o: U blacs_gridinfo__ > > libscalapack.a:pssyevx.o: U blacs_gridinfo__ > > libscalapack.a:pssygs2.o: U blacs_gridinfo__ > > libscalapack.a:pssygst.o: U blacs_gridinfo__ > > libscalapack.a:pssygvx.o: U blacs_gridinfo__ > > libscalapack.a:pssyngst.o: U blacs_gridinfo__ > > libscalapack.a:pssyntrd.o: U blacs_gridinfo__ > > libscalapack.a:pssyttrd.o: U blacs_gridinfo__ > > libscalapack.a:pssytd2.o: U blacs_gridinfo__ > > libscalapack.a:pssytrd.o: U blacs_gridinfo__ > > libscalapack.a:pstrti2.o: U blacs_gridinfo__ > > libscalapack.a:pstrtri.o: U blacs_gridinfo__ > > libscalapack.a:pstrtrs.o: U blacs_gridinfo__ > > libscalapack.a:pslaevswp.o: U blacs_gridinfo__ > > libscalapack.a:pslarzb.o: U blacs_gridinfo__ > > libscalapack.a:pslarzt.o: U blacs_gridinfo__ > > libscalapack.a:pslarz.o: U blacs_gridinfo__ > > libscalapack.a:pslatrz.o: U blacs_gridinfo__ > > libscalapack.a:pstzrzf.o: U blacs_gridinfo__ > > libscalapack.a:psormr3.o: U blacs_gridinfo__ > > libscalapack.a:psormrz.o: U blacs_gridinfo__ > > libscalapack.a:pslahqr.o: U blacs_gridinfo__ > > libscalapack.a:pslaconsb.o: U blacs_gridinfo__ > > libscalapack.a:pslacp3.o: U blacs_gridinfo__ > > libscalapack.a:pslawil.o: U blacs_gridinfo__ > > libscalapack.a:pslasmsub.o: U blacs_gridinfo__ > > libscalapack.a:pslared2d.o: U blacs_gridinfo__ > > libscalapack.a:pslamr1d.o: U blacs_gridinfo__ > > libscalapack.a:pssyevr.o: U blacs_gridinfo__ > > libscalapack.a:pstrord.o: U blacs_gridinfo__ > > libscalapack.a:pstrsen.o: U blacs_gridinfo__ > > libscalapack.a:psgebal.o: U blacs_gridinfo__ > > libscalapack.a:pshseqr.o: U blacs_gridinfo__ > > libscalapack.a:pslamve.o: U blacs_gridinfo__ > > libscalapack.a:pslaqr0.o: U blacs_gridinfo__ > > libscalapack.a:pslaqr1.o: U blacs_gridinfo__ > > libscalapack.a:pslaqr2.o: U blacs_gridinfo__ > > libscalapack.a:pslaqr3.o: U blacs_gridinfo__ > > libscalapack.a:pslaqr4.o: U blacs_gridinfo__ > > libscalapack.a:pslaqr5.o: U blacs_gridinfo__ > > libscalapack.a:psrot.o: U blacs_gridinfo__ > > libscalapack.a:pslaed0.o: U blacs_gridinfo__ > > libscalapack.a:pslaed1.o: U blacs_gridinfo__ > > libscalapack.a:pslaed2.o: U blacs_gridinfo__ > > libscalapack.a:pslaed3.o: U blacs_gridinfo__ > > libscalapack.a:pslaedz.o: U blacs_gridinfo__ > > libscalapack.a:pslared1d.o: U blacs_gridinfo__ > > libscalapack.a:pslasrt.o: U blacs_gridinfo__ > > libscalapack.a:psstebz.o: U blacs_gridinfo__ > > libscalapack.a:psstedc.o: U blacs_gridinfo__ > > libscalapack.a:pilaenvx.o: U blacs_gridinfo__ > > libscalapack.a:piparmq.o: U blacs_gridinfo__ > > libscalapack.a:pddbsv.o: U blacs_gridinfo__ > > libscalapack.a:pddbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pddbtrs.o: U blacs_gridinfo__ > > libscalapack.a:pddbtrsv.o: U blacs_gridinfo__ > > libscalapack.a:pddtsv.o: U blacs_gridinfo__ > > libscalapack.a:pddttrf.o: U blacs_gridinfo__ > > libscalapack.a:pddttrs.o: U blacs_gridinfo__ > > libscalapack.a:pddttrsv.o: U blacs_gridinfo__ 
> > libscalapack.a:pdgbsv.o: U blacs_gridinfo__ > > libscalapack.a:pdgbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pdgbtrs.o: U blacs_gridinfo__ > > libscalapack.a:pdgebd2.o: U blacs_gridinfo__ > > libscalapack.a:pdgebrd.o: U blacs_gridinfo__ > > libscalapack.a:pdgecon.o: U blacs_gridinfo__ > > libscalapack.a:pdgeequ.o: U blacs_gridinfo__ > > libscalapack.a:pdgehd2.o: U blacs_gridinfo__ > > libscalapack.a:pdgehrd.o: U blacs_gridinfo__ > > libscalapack.a:pdgelq2.o: U blacs_gridinfo__ > > libscalapack.a:pdgelqf.o: U blacs_gridinfo__ > > libscalapack.a:pdgels.o: U blacs_gridinfo__ > > libscalapack.a:pdgeql2.o: U blacs_gridinfo__ > > libscalapack.a:pdgeqlf.o: U blacs_gridinfo__ > > libscalapack.a:pdgeqpf.o: U blacs_gridinfo__ > > libscalapack.a:pdgeqr2.o: U blacs_gridinfo__ > > libscalapack.a:pdgeqrf.o: U blacs_gridinfo__ > > libscalapack.a:pdgerfs.o: U blacs_gridinfo__ > > libscalapack.a:pdgerq2.o: U blacs_gridinfo__ > > libscalapack.a:pdgerqf.o: U blacs_gridinfo__ > > libscalapack.a:pdgesv.o: U blacs_gridinfo__ > > libscalapack.a:pdgesvd.o: U blacs_gridinfo__ > > libscalapack.a:pdgesvx.o: U blacs_gridinfo__ > > libscalapack.a:pdgetf2.o: U blacs_gridinfo__ > > libscalapack.a:pdgetrf.o: U blacs_gridinfo__ > > libscalapack.a:pdgetri.o: U blacs_gridinfo__ > > libscalapack.a:pdgetrs.o: U blacs_gridinfo__ > > libscalapack.a:pdggqrf.o: U blacs_gridinfo__ > > libscalapack.a:pdggrqf.o: U blacs_gridinfo__ > > libscalapack.a:pdlabrd.o: U blacs_gridinfo__ > > libscalapack.a:pdlacon.o: U blacs_gridinfo__ > > libscalapack.a:pdlacp2.o: U blacs_gridinfo__ > > libscalapack.a:pdlahrd.o: U blacs_gridinfo__ > > libscalapack.a:pdlange.o: U blacs_gridinfo__ > > libscalapack.a:pdlanhs.o: U blacs_gridinfo__ > > libscalapack.a:pdlansy.o: U blacs_gridinfo__ > > libscalapack.a:pdlantr.o: U blacs_gridinfo__ > > libscalapack.a:pdlapiv.o: U blacs_gridinfo__ > > libscalapack.a:pdlapv2.o: U blacs_gridinfo__ > > libscalapack.a:pdlaqge.o: U blacs_gridinfo__ > > libscalapack.a:pdlaqsy.o: U blacs_gridinfo__ > > libscalapack.a:pdlarf.o: U blacs_gridinfo__ > > libscalapack.a:pdlarfb.o: U blacs_gridinfo__ > > libscalapack.a:pdlarfg.o: U blacs_gridinfo__ > > libscalapack.a:pdlarft.o: U blacs_gridinfo__ > > libscalapack.a:pdlase2.o: U blacs_gridinfo__ > > libscalapack.a:pdlascl.o: U blacs_gridinfo__ > > libscalapack.a:pdlassq.o: U blacs_gridinfo__ > > libscalapack.a:pdlaswp.o: U blacs_gridinfo__ > > libscalapack.a:pdlatra.o: U blacs_gridinfo__ > > libscalapack.a:pdlatrd.o: U blacs_gridinfo__ > > libscalapack.a:pdlatrs.o: U blacs_gridinfo__ > > libscalapack.a:pdlauu2.o: U blacs_gridinfo__ > > libscalapack.a:pdorg2l.o: U blacs_gridinfo__ > > libscalapack.a:pdorg2r.o: U blacs_gridinfo__ > > libscalapack.a:pdorgl2.o: U blacs_gridinfo__ > > libscalapack.a:pdorglq.o: U blacs_gridinfo__ > > libscalapack.a:pdorgql.o: U blacs_gridinfo__ > > libscalapack.a:pdorgqr.o: U blacs_gridinfo__ > > libscalapack.a:pdorgr2.o: U blacs_gridinfo__ > > libscalapack.a:pdorgrq.o: U blacs_gridinfo__ > > libscalapack.a:pdorm2l.o: U blacs_gridinfo__ > > libscalapack.a:pdorm2r.o: U blacs_gridinfo__ > > libscalapack.a:pdormbr.o: U blacs_gridinfo__ > > libscalapack.a:pdormhr.o: U blacs_gridinfo__ > > libscalapack.a:pdorml2.o: U blacs_gridinfo__ > > libscalapack.a:pdormlq.o: U blacs_gridinfo__ > > libscalapack.a:pdormql.o: U blacs_gridinfo__ > > libscalapack.a:pdormqr.o: U blacs_gridinfo__ > > libscalapack.a:pdormr2.o: U blacs_gridinfo__ > > libscalapack.a:pdormrq.o: U blacs_gridinfo__ > > libscalapack.a:pdormtr.o: U blacs_gridinfo__ > > libscalapack.a:pdpocon.o: 
U blacs_gridinfo__ > > libscalapack.a:pdpbsv.o: U blacs_gridinfo__ > > libscalapack.a:pdpbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pdpbtrs.o: U blacs_gridinfo__ > > libscalapack.a:pdpbtrsv.o: U blacs_gridinfo__ > > libscalapack.a:pdptsv.o: U blacs_gridinfo__ > > libscalapack.a:pdpttrf.o: U blacs_gridinfo__ > > libscalapack.a:pdpttrs.o: U blacs_gridinfo__ > > libscalapack.a:pdpttrsv.o: U blacs_gridinfo__ > > libscalapack.a:pdpoequ.o: U blacs_gridinfo__ > > libscalapack.a:pdporfs.o: U blacs_gridinfo__ > > libscalapack.a:pdposv.o: U blacs_gridinfo__ > > libscalapack.a:pdposvx.o: U blacs_gridinfo__ > > libscalapack.a:pdpotf2.o: U blacs_gridinfo__ > > libscalapack.a:pdpotrf.o: U blacs_gridinfo__ > > libscalapack.a:pdpotri.o: U blacs_gridinfo__ > > libscalapack.a:pdpotrs.o: U blacs_gridinfo__ > > libscalapack.a:pdrscl.o: U blacs_gridinfo__ > > libscalapack.a:pdstein.o: U blacs_gridinfo__ > > libscalapack.a:pdsyev.o: U blacs_gridinfo__ > > libscalapack.a:pdsyevd.o: U blacs_gridinfo__ > > libscalapack.a:pdsyevx.o: U blacs_gridinfo__ > > libscalapack.a:pdsygs2.o: U blacs_gridinfo__ > > libscalapack.a:pdsygst.o: U blacs_gridinfo__ > > libscalapack.a:pdsygvx.o: U blacs_gridinfo__ > > libscalapack.a:pdsyngst.o: U blacs_gridinfo__ > > libscalapack.a:pdsyntrd.o: U blacs_gridinfo__ > > libscalapack.a:pdsyttrd.o: U blacs_gridinfo__ > > libscalapack.a:pdsytd2.o: U blacs_gridinfo__ > > libscalapack.a:pdsytrd.o: U blacs_gridinfo__ > > libscalapack.a:pdtrti2.o: U blacs_gridinfo__ > > libscalapack.a:pdtrtri.o: U blacs_gridinfo__ > > libscalapack.a:pdtrtrs.o: U blacs_gridinfo__ > > libscalapack.a:pdlaevswp.o: U blacs_gridinfo__ > > libscalapack.a:pdlarzb.o: U blacs_gridinfo__ > > libscalapack.a:pdlarzt.o: U blacs_gridinfo__ > > libscalapack.a:pdlarz.o: U blacs_gridinfo__ > > libscalapack.a:pdlatrz.o: U blacs_gridinfo__ > > libscalapack.a:pdtzrzf.o: U blacs_gridinfo__ > > libscalapack.a:pdormr3.o: U blacs_gridinfo__ > > libscalapack.a:pdormrz.o: U blacs_gridinfo__ > > libscalapack.a:pdlahqr.o: U blacs_gridinfo__ > > libscalapack.a:pdlaconsb.o: U blacs_gridinfo__ > > libscalapack.a:pdlacp3.o: U blacs_gridinfo__ > > libscalapack.a:pdlawil.o: U blacs_gridinfo__ > > libscalapack.a:pdlasmsub.o: U blacs_gridinfo__ > > libscalapack.a:pdlared2d.o: U blacs_gridinfo__ > > libscalapack.a:pdlamr1d.o: U blacs_gridinfo__ > > libscalapack.a:pdsyevr.o: U blacs_gridinfo__ > > libscalapack.a:pdtrord.o: U blacs_gridinfo__ > > libscalapack.a:pdtrsen.o: U blacs_gridinfo__ > > libscalapack.a:pdgebal.o: U blacs_gridinfo__ > > libscalapack.a:pdhseqr.o: U blacs_gridinfo__ > > libscalapack.a:pdlamve.o: U blacs_gridinfo__ > > libscalapack.a:pdlaqr0.o: U blacs_gridinfo__ > > libscalapack.a:pdlaqr1.o: U blacs_gridinfo__ > > libscalapack.a:pdlaqr2.o: U blacs_gridinfo__ > > libscalapack.a:pdlaqr3.o: U blacs_gridinfo__ > > libscalapack.a:pdlaqr4.o: U blacs_gridinfo__ > > libscalapack.a:pdlaqr5.o: U blacs_gridinfo__ > > libscalapack.a:pdrot.o: U blacs_gridinfo__ > > libscalapack.a:pdlaed0.o: U blacs_gridinfo__ > > libscalapack.a:pdlaed1.o: U blacs_gridinfo__ > > libscalapack.a:pdlaed2.o: U blacs_gridinfo__ > > libscalapack.a:pdlaed3.o: U blacs_gridinfo__ > > libscalapack.a:pdlaedz.o: U blacs_gridinfo__ > > libscalapack.a:pdlared1d.o: U blacs_gridinfo__ > > libscalapack.a:pdlasrt.o: U blacs_gridinfo__ > > libscalapack.a:pdstebz.o: U blacs_gridinfo__ > > libscalapack.a:pdstedc.o: U blacs_gridinfo__ > > libscalapack.a:pcdbsv.o: U blacs_gridinfo__ > > libscalapack.a:pcdbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pcdbtrs.o: U blacs_gridinfo__ 
> > libscalapack.a:pcdbtrsv.o: U blacs_gridinfo__ > > libscalapack.a:pcdtsv.o: U blacs_gridinfo__ > > libscalapack.a:pcdttrf.o: U blacs_gridinfo__ > > libscalapack.a:pcdttrs.o: U blacs_gridinfo__ > > libscalapack.a:pcdttrsv.o: U blacs_gridinfo__ > > libscalapack.a:pcgbsv.o: U blacs_gridinfo__ > > libscalapack.a:pcgbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pcgbtrs.o: U blacs_gridinfo__ > > libscalapack.a:pcgebd2.o: U blacs_gridinfo__ > > libscalapack.a:pcgebrd.o: U blacs_gridinfo__ > > libscalapack.a:pcgecon.o: U blacs_gridinfo__ > > libscalapack.a:pcgeequ.o: U blacs_gridinfo__ > > libscalapack.a:pcgehd2.o: U blacs_gridinfo__ > > libscalapack.a:pcgehrd.o: U blacs_gridinfo__ > > libscalapack.a:pcgelq2.o: U blacs_gridinfo__ > > libscalapack.a:pcgelqf.o: U blacs_gridinfo__ > > libscalapack.a:pcgels.o: U blacs_gridinfo__ > > libscalapack.a:pcgeql2.o: U blacs_gridinfo__ > > libscalapack.a:pcgeqlf.o: U blacs_gridinfo__ > > libscalapack.a:pcgeqpf.o: U blacs_gridinfo__ > > libscalapack.a:pcgeqr2.o: U blacs_gridinfo__ > > libscalapack.a:pcgeqrf.o: U blacs_gridinfo__ > > libscalapack.a:pcgerfs.o: U blacs_gridinfo__ > > libscalapack.a:pcgerq2.o: U blacs_gridinfo__ > > libscalapack.a:pcgerqf.o: U blacs_gridinfo__ > > libscalapack.a:pcgesv.o: U blacs_gridinfo__ > > libscalapack.a:pcgesvd.o: U blacs_gridinfo__ > > libscalapack.a:pcgesvx.o: U blacs_gridinfo__ > > libscalapack.a:pcgetf2.o: U blacs_gridinfo__ > > libscalapack.a:pcgetrf.o: U blacs_gridinfo__ > > libscalapack.a:pcgetri.o: U blacs_gridinfo__ > > libscalapack.a:pcgetrs.o: U blacs_gridinfo__ > > libscalapack.a:pcggqrf.o: U blacs_gridinfo__ > > libscalapack.a:pcggrqf.o: U blacs_gridinfo__ > > libscalapack.a:pcheev.o: U blacs_gridinfo__ > > libscalapack.a:pcheevd.o: U blacs_gridinfo__ > > libscalapack.a:pcheevx.o: U blacs_gridinfo__ > > libscalapack.a:pchegs2.o: U blacs_gridinfo__ > > libscalapack.a:pchegst.o: U blacs_gridinfo__ > > libscalapack.a:pchegvx.o: U blacs_gridinfo__ > > libscalapack.a:pchengst.o: U blacs_gridinfo__ > > libscalapack.a:pchentrd.o: U blacs_gridinfo__ > > libscalapack.a:pchettrd.o: U blacs_gridinfo__ > > libscalapack.a:pchetd2.o: U blacs_gridinfo__ > > libscalapack.a:pchetrd.o: U blacs_gridinfo__ > > libscalapack.a:pclabrd.o: U blacs_gridinfo__ > > libscalapack.a:pclacon.o: U blacs_gridinfo__ > > libscalapack.a:pclacgv.o: U blacs_gridinfo__ > > libscalapack.a:pclacp2.o: U blacs_gridinfo__ > > libscalapack.a:pclahrd.o: U blacs_gridinfo__ > > libscalapack.a:pclahqr.o: U blacs_gridinfo__ > > libscalapack.a:pclaconsb.o: U blacs_gridinfo__ > > libscalapack.a:pclasmsub.o: U blacs_gridinfo__ > > libscalapack.a:pclacp3.o: U blacs_gridinfo__ > > libscalapack.a:pclawil.o: U blacs_gridinfo__ > > libscalapack.a:pcrot.o: U blacs_gridinfo__ > > libscalapack.a:pclange.o: U blacs_gridinfo__ > > libscalapack.a:pclanhe.o: U blacs_gridinfo__ > > libscalapack.a:pclanhs.o: U blacs_gridinfo__ > > libscalapack.a:pclansy.o: U blacs_gridinfo__ > > libscalapack.a:pclantr.o: U blacs_gridinfo__ > > libscalapack.a:pclapiv.o: U blacs_gridinfo__ > > libscalapack.a:pclapv2.o: U blacs_gridinfo__ > > libscalapack.a:pclaqge.o: U blacs_gridinfo__ > > libscalapack.a:pclaqsy.o: U blacs_gridinfo__ > > libscalapack.a:pclarf.o: U blacs_gridinfo__ > > libscalapack.a:pclarfb.o: U blacs_gridinfo__ > > libscalapack.a:pclarfc.o: U blacs_gridinfo__ > > libscalapack.a:pclarfg.o: U blacs_gridinfo__ > > libscalapack.a:pclarft.o: U blacs_gridinfo__ > > libscalapack.a:pclascl.o: U blacs_gridinfo__ > > libscalapack.a:pclase2.o: U blacs_gridinfo__ > > 
libscalapack.a:pclassq.o: U blacs_gridinfo__ > > libscalapack.a:pclaswp.o: U blacs_gridinfo__ > > libscalapack.a:pclatra.o: U blacs_gridinfo__ > > libscalapack.a:pclatrd.o: U blacs_gridinfo__ > > libscalapack.a:pclatrs.o: U blacs_gridinfo__ > > libscalapack.a:pclauu2.o: U blacs_gridinfo__ > > libscalapack.a:pcpocon.o: U blacs_gridinfo__ > > libscalapack.a:pcpoequ.o: U blacs_gridinfo__ > > libscalapack.a:pcporfs.o: U blacs_gridinfo__ > > libscalapack.a:pcposv.o: U blacs_gridinfo__ > > libscalapack.a:pcpbsv.o: U blacs_gridinfo__ > > libscalapack.a:pcpbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pcpbtrs.o: U blacs_gridinfo__ > > libscalapack.a:pcpbtrsv.o: U blacs_gridinfo__ > > libscalapack.a:pcptsv.o: U blacs_gridinfo__ > > libscalapack.a:pcpttrf.o: U blacs_gridinfo__ > > libscalapack.a:pcpttrs.o: U blacs_gridinfo__ > > libscalapack.a:pcpttrsv.o: U blacs_gridinfo__ > > libscalapack.a:pcposvx.o: U blacs_gridinfo__ > > libscalapack.a:pcpotf2.o: U blacs_gridinfo__ > > libscalapack.a:pcpotrf.o: U blacs_gridinfo__ > > libscalapack.a:pcpotri.o: U blacs_gridinfo__ > > libscalapack.a:pcpotrs.o: U blacs_gridinfo__ > > libscalapack.a:pcsrscl.o: U blacs_gridinfo__ > > libscalapack.a:pcstein.o: U blacs_gridinfo__ > > libscalapack.a:pctrevc.o: U blacs_gridinfo__ > > libscalapack.a:pctrti2.o: U blacs_gridinfo__ > > libscalapack.a:pctrtri.o: U blacs_gridinfo__ > > libscalapack.a:pctrtrs.o: U blacs_gridinfo__ > > libscalapack.a:pcung2l.o: U blacs_gridinfo__ > > libscalapack.a:pcung2r.o: U blacs_gridinfo__ > > libscalapack.a:pcungl2.o: U blacs_gridinfo__ > > libscalapack.a:pcunglq.o: U blacs_gridinfo__ > > libscalapack.a:pcungql.o: U blacs_gridinfo__ > > libscalapack.a:pcungqr.o: U blacs_gridinfo__ > > libscalapack.a:pcungr2.o: U blacs_gridinfo__ > > libscalapack.a:pcungrq.o: U blacs_gridinfo__ > > libscalapack.a:pcunm2l.o: U blacs_gridinfo__ > > libscalapack.a:pcunm2r.o: U blacs_gridinfo__ > > libscalapack.a:pcunmbr.o: U blacs_gridinfo__ > > libscalapack.a:pcunmhr.o: U blacs_gridinfo__ > > libscalapack.a:pcunml2.o: U blacs_gridinfo__ > > libscalapack.a:pcunmlq.o: U blacs_gridinfo__ > > libscalapack.a:pcunmql.o: U blacs_gridinfo__ > > libscalapack.a:pcunmqr.o: U blacs_gridinfo__ > > libscalapack.a:pcunmr2.o: U blacs_gridinfo__ > > libscalapack.a:pcunmrq.o: U blacs_gridinfo__ > > libscalapack.a:pcunmtr.o: U blacs_gridinfo__ > > libscalapack.a:pclaevswp.o: U blacs_gridinfo__ > > libscalapack.a:pclarzb.o: U blacs_gridinfo__ > > libscalapack.a:pclarzt.o: U blacs_gridinfo__ > > libscalapack.a:pclarz.o: U blacs_gridinfo__ > > libscalapack.a:pclarzc.o: U blacs_gridinfo__ > > libscalapack.a:pclatrz.o: U blacs_gridinfo__ > > libscalapack.a:pctzrzf.o: U blacs_gridinfo__ > > libscalapack.a:pclattrs.o: U blacs_gridinfo__ > > libscalapack.a:pcunmr3.o: U blacs_gridinfo__ > > libscalapack.a:pcunmrz.o: U blacs_gridinfo__ > > libscalapack.a:pcmax1.o: U blacs_gridinfo__ > > libscalapack.a:pscsum1.o: U blacs_gridinfo__ > > libscalapack.a:pclamr1d.o: U blacs_gridinfo__ > > libscalapack.a:pcheevr.o: U blacs_gridinfo__ > > libscalapack.a:pzdbsv.o: U blacs_gridinfo__ > > libscalapack.a:pzdbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pzdbtrs.o: U blacs_gridinfo__ > > libscalapack.a:pzdbtrsv.o: U blacs_gridinfo__ > > libscalapack.a:pzdtsv.o: U blacs_gridinfo__ > > libscalapack.a:pzdttrf.o: U blacs_gridinfo__ > > libscalapack.a:pzdttrs.o: U blacs_gridinfo__ > > libscalapack.a:pzdttrsv.o: U blacs_gridinfo__ > > libscalapack.a:pzgbsv.o: U blacs_gridinfo__ > > libscalapack.a:pzgbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pzgbtrs.o: 
U blacs_gridinfo__ > > libscalapack.a:pzgebd2.o: U blacs_gridinfo__ > > libscalapack.a:pzgebrd.o: U blacs_gridinfo__ > > libscalapack.a:pzgecon.o: U blacs_gridinfo__ > > libscalapack.a:pzgeequ.o: U blacs_gridinfo__ > > libscalapack.a:pzgehd2.o: U blacs_gridinfo__ > > libscalapack.a:pzgehrd.o: U blacs_gridinfo__ > > libscalapack.a:pzgelq2.o: U blacs_gridinfo__ > > libscalapack.a:pzgelqf.o: U blacs_gridinfo__ > > libscalapack.a:pzgels.o: U blacs_gridinfo__ > > libscalapack.a:pzgeql2.o: U blacs_gridinfo__ > > libscalapack.a:pzgeqlf.o: U blacs_gridinfo__ > > libscalapack.a:pzgeqpf.o: U blacs_gridinfo__ > > libscalapack.a:pzgeqr2.o: U blacs_gridinfo__ > > libscalapack.a:pzgeqrf.o: U blacs_gridinfo__ > > libscalapack.a:pzgerfs.o: U blacs_gridinfo__ > > libscalapack.a:pzgerq2.o: U blacs_gridinfo__ > > libscalapack.a:pzgerqf.o: U blacs_gridinfo__ > > libscalapack.a:pzgesv.o: U blacs_gridinfo__ > > libscalapack.a:pzgesvd.o: U blacs_gridinfo__ > > libscalapack.a:pzgesvx.o: U blacs_gridinfo__ > > libscalapack.a:pzgetf2.o: U blacs_gridinfo__ > > libscalapack.a:pzgetrf.o: U blacs_gridinfo__ > > libscalapack.a:pzgetri.o: U blacs_gridinfo__ > > libscalapack.a:pzgetrs.o: U blacs_gridinfo__ > > libscalapack.a:pzggqrf.o: U blacs_gridinfo__ > > libscalapack.a:pzggrqf.o: U blacs_gridinfo__ > > libscalapack.a:pzheev.o: U blacs_gridinfo__ > > libscalapack.a:pzheevd.o: U blacs_gridinfo__ > > libscalapack.a:pzheevx.o: U blacs_gridinfo__ > > libscalapack.a:pzhegs2.o: U blacs_gridinfo__ > > libscalapack.a:pzhegst.o: U blacs_gridinfo__ > > libscalapack.a:pzhegvx.o: U blacs_gridinfo__ > > libscalapack.a:pzhengst.o: U blacs_gridinfo__ > > libscalapack.a:pzhentrd.o: U blacs_gridinfo__ > > libscalapack.a:pzhettrd.o: U blacs_gridinfo__ > > libscalapack.a:pzhetd2.o: U blacs_gridinfo__ > > libscalapack.a:pzhetrd.o: U blacs_gridinfo__ > > libscalapack.a:pzlabrd.o: U blacs_gridinfo__ > > libscalapack.a:pzlacon.o: U blacs_gridinfo__ > > libscalapack.a:pzlacgv.o: U blacs_gridinfo__ > > libscalapack.a:pzlacp2.o: U blacs_gridinfo__ > > libscalapack.a:pzlahrd.o: U blacs_gridinfo__ > > libscalapack.a:pzlahqr.o: U blacs_gridinfo__ > > libscalapack.a:pzlaconsb.o: U blacs_gridinfo__ > > libscalapack.a:pzlasmsub.o: U blacs_gridinfo__ > > libscalapack.a:pzlacp3.o: U blacs_gridinfo__ > > libscalapack.a:pzlawil.o: U blacs_gridinfo__ > > libscalapack.a:pzrot.o: U blacs_gridinfo__ > > libscalapack.a:pzlange.o: U blacs_gridinfo__ > > libscalapack.a:pzlanhe.o: U blacs_gridinfo__ > > libscalapack.a:pzlanhs.o: U blacs_gridinfo__ > > libscalapack.a:pzlansy.o: U blacs_gridinfo__ > > libscalapack.a:pzlantr.o: U blacs_gridinfo__ > > libscalapack.a:pzlapiv.o: U blacs_gridinfo__ > > libscalapack.a:pzlapv2.o: U blacs_gridinfo__ > > libscalapack.a:pzlaqge.o: U blacs_gridinfo__ > > libscalapack.a:pzlaqsy.o: U blacs_gridinfo__ > > libscalapack.a:pzlarf.o: U blacs_gridinfo__ > > libscalapack.a:pzlarfb.o: U blacs_gridinfo__ > > libscalapack.a:pzlarfc.o: U blacs_gridinfo__ > > libscalapack.a:pzlarfg.o: U blacs_gridinfo__ > > libscalapack.a:pzlarft.o: U blacs_gridinfo__ > > libscalapack.a:pzlascl.o: U blacs_gridinfo__ > > libscalapack.a:pzlase2.o: U blacs_gridinfo__ > > libscalapack.a:pzlassq.o: U blacs_gridinfo__ > > libscalapack.a:pzlaswp.o: U blacs_gridinfo__ > > libscalapack.a:pzlatra.o: U blacs_gridinfo__ > > libscalapack.a:pzlatrd.o: U blacs_gridinfo__ > > libscalapack.a:pzlattrs.o: U blacs_gridinfo__ > > libscalapack.a:pzlatrs.o: U blacs_gridinfo__ > > libscalapack.a:pzlauu2.o: U blacs_gridinfo__ > > libscalapack.a:pzpocon.o: U blacs_gridinfo__ > > 
libscalapack.a:pzpoequ.o: U blacs_gridinfo__ > > libscalapack.a:pzporfs.o: U blacs_gridinfo__ > > libscalapack.a:pzposv.o: U blacs_gridinfo__ > > libscalapack.a:pzpbsv.o: U blacs_gridinfo__ > > libscalapack.a:pzpbtrf.o: U blacs_gridinfo__ > > libscalapack.a:pzpbtrs.o: U blacs_gridinfo__ > > libscalapack.a:pzpbtrsv.o: U blacs_gridinfo__ > > libscalapack.a:pzptsv.o: U blacs_gridinfo__ > > libscalapack.a:pzpttrf.o: U blacs_gridinfo__ > > libscalapack.a:pzpttrs.o: U blacs_gridinfo__ > > libscalapack.a:pzpttrsv.o: U blacs_gridinfo__ > > libscalapack.a:pzposvx.o: U blacs_gridinfo__ > > libscalapack.a:pzpotf2.o: U blacs_gridinfo__ > > libscalapack.a:pzpotrf.o: U blacs_gridinfo__ > > libscalapack.a:pzpotri.o: U blacs_gridinfo__ > > libscalapack.a:pzpotrs.o: U blacs_gridinfo__ > > libscalapack.a:pzdrscl.o: U blacs_gridinfo__ > > libscalapack.a:pzstein.o: U blacs_gridinfo__ > > libscalapack.a:pztrevc.o: U blacs_gridinfo__ > > libscalapack.a:pztrti2.o: U blacs_gridinfo__ > > libscalapack.a:pztrtri.o: U blacs_gridinfo__ > > libscalapack.a:pztrtrs.o: U blacs_gridinfo__ > > libscalapack.a:pzung2l.o: U blacs_gridinfo__ > > libscalapack.a:pzung2r.o: U blacs_gridinfo__ > > libscalapack.a:pzungl2.o: U blacs_gridinfo__ > > libscalapack.a:pzunglq.o: U blacs_gridinfo__ > > libscalapack.a:pzungql.o: U blacs_gridinfo__ > > libscalapack.a:pzungqr.o: U blacs_gridinfo__ > > libscalapack.a:pzungr2.o: U blacs_gridinfo__ > > libscalapack.a:pzungrq.o: U blacs_gridinfo__ > > libscalapack.a:pzunm2l.o: U blacs_gridinfo__ > > libscalapack.a:pzunm2r.o: U blacs_gridinfo__ > > libscalapack.a:pzunmbr.o: U blacs_gridinfo__ > > libscalapack.a:pzunmhr.o: U blacs_gridinfo__ > > libscalapack.a:pzunml2.o: U blacs_gridinfo__ > > libscalapack.a:pzunmlq.o: U blacs_gridinfo__ > > libscalapack.a:pzunmql.o: U blacs_gridinfo__ > > libscalapack.a:pzunmqr.o: U blacs_gridinfo__ > > libscalapack.a:pzunmr2.o: U blacs_gridinfo__ > > libscalapack.a:pzunmrq.o: U blacs_gridinfo__ > > libscalapack.a:pzunmtr.o: U blacs_gridinfo__ > > libscalapack.a:pzlaevswp.o: U blacs_gridinfo__ > > libscalapack.a:pzlarzb.o: U blacs_gridinfo__ > > libscalapack.a:pzlarzt.o: U blacs_gridinfo__ > > libscalapack.a:pzlarz.o: U blacs_gridinfo__ > > libscalapack.a:pzlarzc.o: U blacs_gridinfo__ > > libscalapack.a:pzlatrz.o: U blacs_gridinfo__ > > libscalapack.a:pztzrzf.o: U blacs_gridinfo__ > > libscalapack.a:pzunmr3.o: U blacs_gridinfo__ > > libscalapack.a:pzunmrz.o: U blacs_gridinfo__ > > libscalapack.a:pzmax1.o: U blacs_gridinfo__ > > libscalapack.a:pdzsum1.o: U blacs_gridinfo__ > > libscalapack.a:pzlamr1d.o: U blacs_gridinfo__ > > libscalapack.a:pzheevr.o: U blacs_gridinfo__ > > > > > > On Tue, Sep 13, 2016 at 1:33 PM, Satish Balay wrote: > > > >> On Tue, 13 Sep 2016, Matthew Knepley wrote: > >> > >> > I believe your problem is that this is old PETSc. In the latest release, > >> > BLACS is part of SCALAPACK. > >> > >> BLACS had been a part of scalapack for a few releases - so thats not the > >> issue. > >> > >> >>>>>>>> > >> stderr: > >> /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc > >> -opt/lib/libscalapack.a(pssytrd.o): In function `pssytrd': > >> /tmp/cluster_temp.FFxzAF/petsc-3.6.4/arch-linux2-c-debug/ > >> externalpackages/scalapack-2.0.2/SRC/pssytrd.f:259: undefined reference > >> to `blacs_gridinfo__' > >> /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc > >> -opt/lib/libscalapack.a(chk1mat.o): In function `chk1mat': > >> <<<<<< > >> > >> Double underscore? 
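A compact way to answer the "double underscore?" question is to compare how the symbol is defined versus how it is referenced inside the same archive. This is only a sketch: the library path is the one from the log above, and the awk/sort post-processing is illustrative rather than part of the original report.

    cd /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib
    # 'U' lines are references, 'T'/'D' lines are definitions; if the defined name
    # carries a different number of trailing underscores than the referenced name,
    # the two halves of libscalapack.a were built with mismatched name mangling.
    nm -Ao libscalapack.a | grep -i 'blacs_gridinfo' | awk '{print $(NF-1), $NF}' | sort | uniq -c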
> >> > >> >>> > >> mpicc -c -Df77IsF2C -fPIC -fopenmp -fPIC -g > >> -I/cm/shared/apps/openmpi/open64/64/1.10.1/include pzrot.c > >> <<< > >> > >> scalapack is getting compiled with this flag '-Df77IsF2C'. This mode > >> was primarily supported by 'g77' previously - which we hardly ever use > >> anymore - so this mode is not really tested? > >> > >> >>>>> > >> Executing: mpif90 -show > >> stdout: openf90 -I/cm/shared/apps/openmpi/open64/64/1.10.1/include > >> -pthread -I/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 -L/usr/lib64/ > >> -Wl,-rpath -Wl,/usr/lib64/ -Wl,-rpath -Wl,/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 > >> -Wl,--enable-new-dtags -L/cm/shared/apps/openmpi/open64/64/1.10.1/lib64 > >> -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > >> > >> compilers: Fortran appends an extra underscore to names > >> containing underscores > >> Defined "HAVE_FORTRAN_UNDERSCORE_UNDERSCORE" to "1" > >> > >> <<<< > >> > >> What do you have for: > >> > >> cd /cm/shared/modulefiles/moose-compilers/petsc/petsc-3.6.4/gcc-opt/lib/ > >> nm -Ao libscalapack.a |grep -i blacs_gridinfo > >> > >> However - as Matt refered to - its best to use latest petsc-3.7 > >> release. Does MOOSE require 3.6? > >> > >> Satish > >> > >> > > > From timothee.nicolas at gmail.com Tue Sep 13 17:28:33 2016 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Wed, 14 Sep 2016 00:28:33 +0200 Subject: [petsc-users] FLAGS in makefile In-Reply-To: References: Message-ID: OK, thank you. Regarding point 3, no I didn't realize this was dangerous. But reconfiguring petsc everytime is quite lengthy. How to do if you just want to test a few compilation options, before knowing what you need? Timoth?e 2016-09-13 19:56 GMT+02:00 Satish Balay : > On Tue, 13 Sep 2016, Matthew Knepley wrote: > > > On Tue, Sep 13, 2016 at 12:16 PM, Timoth?e Nicolas < > > timothee.nicolas at gmail.com> wrote: > > > > > Hi all, > > > > > > I can't seem to figure out how to specify my compilation options with > > > PETSc. For my makefiles, I've always been using Petsc examples inspired > > > makefiles, just tuning them to my needs, and I have never played with > > > compilation options so far. Now, I am trying to add some compilation > > > options, but they are not taken into account by the compiler. My > makefile > > > looks like this > > > > > > all: energy > > > > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > > > > > > #FLAGS = -g -O0 -fbounds-check > > > > > > > > > > > > > > > MYFLAGS = -mcmodel=medium -shared-intel > > > > > > > > > OBJS = main.o \ > > > > > > modules.o \ > > > > > > diags.o \ > > > > > > functions.o \ > > > > > > conservation.o \ > > > > > > > > > EXEC = energy > > > > > > > > > main.o: modules.o \ > > > > > > functions.o \ > > > > > > conservation.o \ > > > > > > diags.o \ > > > > > > > > > energy: $(OBJS) chkopts > > > > > > -$(FLINKER) -o $(EXEC) $(MYFLAGS) $(FLAGS) $(OBJS) $( > > > PETSC_SNES_LIB) > > > > > > > > > clean_all: > > > > > > $(RM) $(OBJS) $(EXEC) > > > > > > > > > The compiler then executes things like > > > > > > /opt/mpi/bullxmpi/1.2.8.4/bin/mpif90 -c -fPIC -g -O3 > > > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/petsc-3.7.3/include > > > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/ > > > petsc-3.7.3/arch-linux2-c-debug/include -I/opt/mpi/bullxmpi/1.2.8.4/ > > > include -o modules.o modules.F90 > > > > > > > > > without taking my variable MYFLAGS into account. What may be the > reason? 
> > > Also, what does "chkopts" mean ? > > > > > > > 1) You want to change CFLAGS or FFLAGS > > > > 2) 'chkopts' is an internal check for PETSc > > > > 3) You realize that it is very dangerous to compile with options not > > configure with. > > And generally - you have to declare FFLAGS - before the 'include' > statement You can set: > > FPPFLAGS - for preprocessing [or compile only flags] > FFLAGS - for compile & link flags > > For link only flags - you can add them to the link command - as you've > done with MYFLAGS.. > > Satish -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 13 17:32:57 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Sep 2016 17:32:57 -0500 Subject: [petsc-users] FLAGS in makefile In-Reply-To: References: Message-ID: On Tue, Sep 13, 2016 at 5:28 PM, Timoth?e Nicolas < timothee.nicolas at gmail.com> wrote: > OK, thank you. > > Regarding point 3, no I didn't realize this was dangerous. But > reconfiguring petsc everytime is quite lengthy. How to do if you just want > to test a few compilation options, before knowing what you need? > It dangerous because you can make source incompatible with the previously compiled libraries. You can insert other headers in front of those used by the libraries, change type size, name mangling, etc. Configuring should not take that long. A complete configure and build takes about 8min at the Bitbucket regression tests. Are you running on a slow file system? Thanks, Matt > Timoth?e > > 2016-09-13 19:56 GMT+02:00 Satish Balay : > >> On Tue, 13 Sep 2016, Matthew Knepley wrote: >> >> > On Tue, Sep 13, 2016 at 12:16 PM, Timoth?e Nicolas < >> > timothee.nicolas at gmail.com> wrote: >> > >> > > Hi all, >> > > >> > > I can't seem to figure out how to specify my compilation options with >> > > PETSc. For my makefiles, I've always been using Petsc examples >> inspired >> > > makefiles, just tuning them to my needs, and I have never played with >> > > compilation options so far. Now, I am trying to add some compilation >> > > options, but they are not taken into account by the compiler. My >> makefile >> > > looks like this >> > > >> > > all: energy >> > > >> > > >> > > include ${PETSC_DIR}/lib/petsc/conf/variables >> > > >> > > include ${PETSC_DIR}/lib/petsc/conf/rules >> > > >> > > >> > > #FLAGS = -g -O0 -fbounds-check >> > > >> > > >> > > >> > > >> > > MYFLAGS = -mcmodel=medium -shared-intel >> > > >> > > >> > > OBJS = main.o \ >> > > >> > > modules.o \ >> > > >> > > diags.o \ >> > > >> > > functions.o \ >> > > >> > > conservation.o \ >> > > >> > > >> > > EXEC = energy >> > > >> > > >> > > main.o: modules.o \ >> > > >> > > functions.o \ >> > > >> > > conservation.o \ >> > > >> > > diags.o \ >> > > >> > > >> > > energy: $(OBJS) chkopts >> > > >> > > -$(FLINKER) -o $(EXEC) $(MYFLAGS) $(FLAGS) $(OBJS) $( >> > > PETSC_SNES_LIB) >> > > >> > > >> > > clean_all: >> > > >> > > $(RM) $(OBJS) $(EXEC) >> > > >> > > >> > > The compiler then executes things like >> > > >> > > /opt/mpi/bullxmpi/1.2.8.4/bin/mpif90 -c -fPIC -g -O3 >> > > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/petsc-3.7.3/include >> > > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/ >> > > petsc-3.7.3/arch-linux2-c-debug/include -I/opt/mpi/bullxmpi/1.2.8.4/ >> > > include -o modules.o modules.F90 >> > > >> > > >> > > without taking my variable MYFLAGS into account. What may be the >> reason? >> > > Also, what does "chkopts" mean ? 
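As a concrete illustration of Satish's point about declaring the flags before the PETSc include files, the makefile from the question could be laid out roughly as below. The flag values are placeholders only, and (per Matt's warning above) compile flags should normally match what PETSc was configured with.

    FPPFLAGS = -DMY_OPTION                # preprocessing / compile-only flags (placeholder)
    FFLAGS   = -g -O0 -fbounds-check      # compile and link flags (placeholder)

    include ${PETSC_DIR}/lib/petsc/conf/variables
    include ${PETSC_DIR}/lib/petsc/conf/rules

    MYFLAGS  = -mcmodel=medium -shared-intel   # link-only flags, added by hand on the link line

    energy: $(OBJS) chkopts
            -$(FLINKER) -o $(EXEC) $(MYFLAGS) $(OBJS) $(PETSC_SNES_LIB)
    # (the recipe line must start with a tab, as in the original makefile)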
>> > > >> > >> > 1) You want to change CFLAGS or FFLAGS >> > >> > 2) 'chkopts' is an internal check for PETSc >> > >> > 3) You realize that it is very dangerous to compile with options not >> > configure with. >> >> And generally - you have to declare FFLAGS - before the 'include' >> statement You can set: >> >> FPPFLAGS - for preprocessing [or compile only flags] >> FFLAGS - for compile & link flags >> >> For link only flags - you can add them to the link command - as you've >> done with MYFLAGS.. >> >> Satish > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Tue Sep 13 17:48:45 2016 From: timothee.nicolas at gmail.com (timothee.nicolas) Date: Wed, 14 Sep 2016 00:48:45 +0200 Subject: [petsc-users] =?iso-8859-1?q?R=E9p=3A_Re=3A__FLAGS_in_makefile?= Message-ID: 8 min should reasonable. I was considering this lengthy actually but OK I got the point. Thank you Timoth?e Matthew Knepley a ?crit?: On Tue, Sep 13, 2016 at 5:28 PM, Timoth?e Nicolas wrote: OK, thank you. Regarding point 3, no I didn't realize this was dangerous. But reconfiguring petsc everytime is quite lengthy. How to do if you just want to test a few compilation options, before knowing what you need? It dangerous because you can make source incompatible with the previously compiled libraries. You can insert other headers in front of those used by the libraries, change type size, name mangling, etc. Configuring should not take that long. A complete configure and build takes about 8min at the Bitbucket regression tests. Are you running on a slow file system? Thanks, Matt Timoth?e 2016-09-13 19:56 GMT+02:00 Satish Balay : On Tue, 13 Sep 2016, Matthew Knepley wrote: > On Tue, Sep 13, 2016 at 12:16 PM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > Hi all, > > > > I can't seem to figure out how to specify my compilation options with > > PETSc. For my makefiles, I've always been using Petsc examples inspired > > makefiles, just tuning them to my needs, and I have never played with > > compilation options so far. Now, I am trying to add some compilation > > options, but they are not taken into account by the compiler. My makefile > > looks like this > > > > all: energy > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > > > #FLAGS = -g -O0 -fbounds-check > > > > > > > > > > MYFLAGS = -mcmodel=medium -shared-intel > > > > > > OBJS = main.o \ > > > > modules.o \ > > > > diags.o \ > > > > functions.o \ > > > > conservation.o \ > > > > > > EXEC = energy > > > > > > main.o: modules.o \ > > > > functions.o \ > > > > conservation.o \ > > > > diags.o \ > > > > > > energy: $(OBJS) chkopts > > > > -$(FLINKER) -o $(EXEC) $(MYFLAGS) $(FLAGS) $(OBJS) $( > > PETSC_SNES_LIB) > > > > > > clean_all: > > > > $(RM) $(OBJS) $(EXEC) > > > > > > The compiler then executes things like > > > > /opt/mpi/bullxmpi/1.2.8.4/bin/mpif90 -c -fPIC -g -O3 > > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/petsc-3.7.3/include > > -I/ccc/scratch/cont003/gen0198/lutjensh/Timothee/ > > petsc-3.7.3/arch-linux2-c-debug/include -I/opt/mpi/bullxmpi/1.2.8.4/ > > include -o modules.o modules.F90 > > > > > > without taking my variable MYFLAGS into account. What may be the reason? > > Also, what does "chkopts" mean ? 
> > > > 1) You want to change CFLAGS or FFLAGS > > 2) 'chkopts' is an internal check for PETSc > > 3) You realize that it is very dangerous to compile with options not From gideon.simpson at gmail.com Tue Sep 13 21:47:56 2016 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 13 Sep 2016 22:47:56 -0400 Subject: [petsc-users] Time Stepping Terminology Message-ID: <3F1C1787-7A22-4E97-8D0A-234DCC2C5668@gmail.com> I was looking around to see if there was any built in routine for implicit midpoint time stepping in petsc. By this I mean equation (1i) of https://en.wikipedia.org/wiki/Midpoint_method. I noticed the TSTHETA solver, and I see that the documentation says: -ts_type theta -ts_theta_theta 0.5 corresponds to the implicit midpoint rule But I wanted to verify that it agrees with what I interpret to be implicit midpoint. Also, is the distinction with the ts_theta_endpoint flag as to whether it is the weighting of the argument to the f(t,y) function or the weighting of the f(t,y) evaluated at the different points? -gideon From C.Klaij at marin.nl Wed Sep 14 04:25:16 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 14 Sep 2016 09:25:16 +0000 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: References: Message-ID: <1473845116361.56153@marin.nl> Jed, Just a reminder, you haven't responded to this thread, notably Matt's question below whether you can fix the L2G mapping. Chris > > From: Matthew Knepley > > Sent: Tuesday, August 02, 2016 12:28 AM > > To: Klaij, Christiaan > > Cc: petsc-users at mcs.anl.gov; Jed Brown > > Subject: Re: [petsc-users] block matrix without MatCreateNest > > > > On Mon, Aug 1, 2016 at 9:36 AM, Klaij, Christiaan > wrote: > > > > Matt, > > > > > > 1) great! > > > > > > 2) ??? that's precisely why I paste the output of "cat mattry.F90" > in the emails, so you have a small example that produces the errors I > mention. Now I'm also attaching it to this email. > > > > Okay, I have gone through it. You are correct that it is completely > broken. > > > > The way that MatNest currently works is that it trys to use L2G mappings > from individual blocks > > and then builds a composite L2G map for the whole matrix. This is > obviously incompatible with > > the primary use case, and should be changed to break up the full L2G > into one for each block. > > > > Jed, can you fix this? I am not sure I know enough about how Nest works. > > > > Matt > > > > Thanks, > > > > Chris dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/MARIN-at-Monaco-Yacht-Show-September-28October-1.htm From hsahasra at purdue.edu Wed Sep 14 09:06:25 2016 From: hsahasra at purdue.edu (Harshad Sahasrabudhe) Date: Wed, 14 Sep 2016 10:06:25 -0400 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> Message-ID: Hi Barry, Thanks for your inputs. I tried to set a watchpoint on ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. Basically, GDB isn't able to find the PETSc source code. I built PETSc with --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get doesn't contain debugging symbols (checked using objdump -g). How do I get PETSc library to have debugging info? 
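On the missing debug symbols: with --with-debugging=1, PETSc's configure normally adds -g itself, but explicitly passing optimization flags (COPTFLAGS and friends) replaces those defaults, so -g then has to be kept in them by hand. A sketch of a debug-friendly configure line, with option values as examples only:

    ./configure --with-debugging=1 COPTFLAGS='-g -O0' CXXOPTFLAGS='-g -O0' FOPTFLAGS='-g -O0'
    make all
    # objdump -g $PETSC_ARCH/lib/libpetsc.a should then show debug sections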
Thanks, Harshad On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith wrote: > > > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe > wrote: > > > > Hi Barry, > > > > I compiled with mpich configured using --enable-g=meminit to get rid of > MPI errors in Valgrind. Doing this reduced the number of errors to 2. I > have attached the Valgrind output. > > This isn't helpful but it seems not to be a memory corruption issue :-( > > > > I'm using GAMG+GMRES for in each linear iteration of SNES. The linear > solver converges with CONVERGED_RTOL for the first 6 iterations and with > CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is > happening. Any thoughts/ideas? > > Does this happen on one process? If so I would run in the debugger and > track the variable to see everyplace the variable is changed, this would > point to exactly what piece of code is changing the variable to this > unexpected value. > > For example with lldb one can use watch http://lldb.llvm.org/tutorial. > html to see each time a variable gets changed. Similar thing with gdb. > > The variable to watch is ksp->reason Once you get the hang of this it > can take just a few minutes to track down the code that is making this > unexpected value, though I understand if you haven't done it before it can > be intimidating. > > Barry > > You can do the same thing in parallel (like on two processes) if you need > to but it is more cumbersome since you need run multiple debuggers. You can > have PETSc start up multiple debuggers with mpiexec -n 2 ./ex > -start_in_debugger > > > > > > > > Thanks, > > Harshad > > > > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith wrote: > > > > Install your MPI with --download-mpich as a PETSc ./configure option, > this will eliminate all the MPICH valgrind errors. Then send as an > attachment the resulting valgrind file. > > > > I do not 100 % trust any code that produces such valgrind errors. > > > > Barry > > > > > > > > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe > wrote: > > > > > > Hi Barry, > > > > > > Thanks for the reply. My code is in C. I ran with Valgrind and found > many "Conditional jump or move depends on uninitialized value(s)", "Invalid > read" and "Use of uninitialized value" errors. I think all of them are from > the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure > what I'm looking for in the Valgrind output. At the end of the file, I get: > > > > > > ==40223== More than 10000000 total errors detected. I'm not reporting > any more. > > > ==40223== Final error counts will be inaccurate. Go fix your program! > > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note > > > ==40223== that errors may occur in your program without prior warning > from > > > ==40223== Valgrind, because errors are no longer being displayed. > > > > > > Can you give some suggestions on how I should proceed? > > > > > > Thanks, > > > Harshad > > > > > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith > wrote: > > > > > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very specialized > and should never occur with GMRES. > > > > > > Can you run with valgrind to make sure there is no memory > corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > > Is your code fortran or C? > > > > > > Barry > > > > > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > > > > > > Hi, > > > > > > > > I'm using GAMG + GMRES for my Poisson problem. 
The solver converges > with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which > is much higher than what I need (I need a tolerance of at least 1E-8). I am > not able to figure out which tolerance I need to set to avoid convergence > due to CONVERGED_STEP_LENGTH. > > > > > > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: > > > > > > > > 0 KSP Residual norm 3.121347818142e+00 > > > > 1 KSP Residual norm 9.773346857844e-02 > > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > > > > KSP Object: 1 MPI processes > > > > type: gmres > > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > > > GMRES: happy breakdown tolerance 1e-30 > > > > maximum iterations=10000, initial guess is zero > > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using PRECONDITIONED norm type for convergence test > > > > PC Object: 1 MPI processes > > > > type: gamg > > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v > > > > Cycles per PCApply=1 > > > > Using Galerkin computed coarse grid matrices > > > > Coarse grid solver -- level ------------------------------- > > > > KSP Object: (mg_coarse_) 1 MPI processes > > > > type: preonly > > > > maximum iterations=1, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (mg_coarse_) 1 MPI processes > > > > type: bjacobi > > > > block Jacobi: number of blocks = 1 > > > > Local solve is same for all blocks, in the following KSP and > PC objects: > > > > KSP Object: (mg_coarse_sub_) 1 MPI processes > > > > type: preonly > > > > maximum iterations=1, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (mg_coarse_sub_) 1 MPI processes > > > > type: lu > > > > LU: out-of-place factorization > > > > tolerance for zero pivot 2.22045e-14 > > > > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > > > > matrix ordering: nd > > > > factor fill ratio given 5, needed 1.91048 > > > > Factored matrix follows: > > > > Mat Object: 1 MPI processes > > > > type: seqaij > > > > rows=284, cols=284 > > > > package used to perform factorization: petsc > > > > total: nonzeros=7726, allocated nonzeros=7726 > > > > total number of mallocs used during MatSetValues > calls =0 > > > > using I-node routines: found 133 nodes, limit > used is 5 > > > > linear system matrix = precond matrix: > > > > Mat Object: 1 MPI processes > > > > type: seqaij > > > > rows=284, cols=284 > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node routines > > > > linear system matrix = precond matrix: > > > > Mat Object: 1 MPI processes > > > > type: seqaij > > > > rows=284, cols=284 > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node routines > > > > Down solver (pre-smoother) on level 1 > ------------------------------- > > > > KSP Object: (mg_levels_1_) 1 MPI processes > > > > type: chebyshev > > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = > 4.10212 > > > > maximum iterations=2 > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left 
preconditioning > > > > using nonzero initial guess > > > > using NONE norm type for convergence test > > > > PC Object: (mg_levels_1_) 1 MPI processes > > > > type: sor > > > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1 > > > > linear system matrix = precond matrix: > > > > Mat Object: () 1 MPI processes > > > > type: seqaij > > > > rows=9036, cols=9036 > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node routines > > > > Up solver (post-smoother) same as down solver (pre-smoother) > > > > linear system matrix = precond matrix: > > > > Mat Object: () 1 MPI processes > > > > type: seqaij > > > > rows=9036, cols=9036 > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node routines > > > > > > > > Thanks, > > > > Harshad > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hsahasra at purdue.edu Wed Sep 14 09:10:36 2016 From: hsahasra at purdue.edu (Harshad Sahasrabudhe) Date: Wed, 14 Sep 2016 10:10:36 -0400 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> Message-ID: I think I found the problem. I configured PETSc with COPTFLAGS=-O3. I'll remove that option and try again. Thanks! Harshad On Wed, Sep 14, 2016 at 10:06 AM, Harshad Sahasrabudhe wrote: > Hi Barry, > > Thanks for your inputs. I tried to set a watchpoint on > ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. > Basically, GDB isn't able to find the PETSc source code. I built PETSc with > --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get > doesn't contain debugging symbols (checked using objdump -g). How do I get > PETSc library to have debugging info? > > Thanks, > Harshad > > On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith wrote: > >> >> > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe >> wrote: >> > >> > Hi Barry, >> > >> > I compiled with mpich configured using --enable-g=meminit to get rid of >> MPI errors in Valgrind. Doing this reduced the number of errors to 2. I >> have attached the Valgrind output. >> >> This isn't helpful but it seems not to be a memory corruption issue :-( >> > >> > I'm using GAMG+GMRES for in each linear iteration of SNES. The linear >> solver converges with CONVERGED_RTOL for the first 6 iterations and with >> CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is >> happening. Any thoughts/ideas? >> >> Does this happen on one process? If so I would run in the debugger and >> track the variable to see everyplace the variable is changed, this would >> point to exactly what piece of code is changing the variable to this >> unexpected value. >> >> For example with lldb one can use watch http://lldb.llvm.org/tutorial. >> html to see each time a variable gets changed. Similar thing with gdb. >> >> The variable to watch is ksp->reason Once you get the hang of this it >> can take just a few minutes to track down the code that is making this >> unexpected value, though I understand if you haven't done it before it can >> be intimidating. 
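A possible gdb session corresponding to the watchpoint advice quoted above (the executable name is a placeholder, and the PETSc library must have been built with -g for ksp->reason to be visible):

    gdb --args ./ex -ksp_monitor
    (gdb) break KSPSolve
    (gdb) run
    (gdb) watch -l ksp->reason     # -l sets a location watchpoint, so it survives leaving the frame
    (gdb) continue                 # gdb stops, with a backtrace, every time the value changes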
>> >> Barry >> >> You can do the same thing in parallel (like on two processes) if you need >> to but it is more cumbersome since you need run multiple debuggers. You can >> have PETSc start up multiple debuggers with mpiexec -n 2 ./ex >> -start_in_debugger >> >> >> >> >> > >> > Thanks, >> > Harshad >> > >> > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith >> wrote: >> > >> > Install your MPI with --download-mpich as a PETSc ./configure option, >> this will eliminate all the MPICH valgrind errors. Then send as an >> attachment the resulting valgrind file. >> > >> > I do not 100 % trust any code that produces such valgrind errors. >> > >> > Barry >> > >> > >> > >> > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> > > >> > > Hi Barry, >> > > >> > > Thanks for the reply. My code is in C. I ran with Valgrind and found >> many "Conditional jump or move depends on uninitialized value(s)", "Invalid >> read" and "Use of uninitialized value" errors. I think all of them are from >> the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure >> what I'm looking for in the Valgrind output. At the end of the file, I get: >> > > >> > > ==40223== More than 10000000 total errors detected. I'm not >> reporting any more. >> > > ==40223== Final error counts will be inaccurate. Go fix your program! >> > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note >> > > ==40223== that errors may occur in your program without prior warning >> from >> > > ==40223== Valgrind, because errors are no longer being displayed. >> > > >> > > Can you give some suggestions on how I should proceed? >> > > >> > > Thanks, >> > > Harshad >> > > >> > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith >> wrote: >> > > >> > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very >> specialized and should never occur with GMRES. >> > > >> > > Can you run with valgrind to make sure there is no memory >> corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> > > >> > > Is your code fortran or C? >> > > >> > > Barry >> > > >> > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> > > > >> > > > Hi, >> > > > >> > > > I'm using GAMG + GMRES for my Poisson problem. The solver converges >> with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which >> is much higher than what I need (I need a tolerance of at least 1E-8). I am >> not able to figure out which tolerance I need to set to avoid convergence >> due to CONVERGED_STEP_LENGTH. >> > > > >> > > > Any help is appreciated! 
Output of -ksp_view and -ksp_monitor: >> > > > >> > > > 0 KSP Residual norm 3.121347818142e+00 >> > > > 1 KSP Residual norm 9.773346857844e-02 >> > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 >> > > > KSP Object: 1 MPI processes >> > > > type: gmres >> > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> > > > GMRES: happy breakdown tolerance 1e-30 >> > > > maximum iterations=10000, initial guess is zero >> > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 >> > > > left preconditioning >> > > > using PRECONDITIONED norm type for convergence test >> > > > PC Object: 1 MPI processes >> > > > type: gamg >> > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v >> > > > Cycles per PCApply=1 >> > > > Using Galerkin computed coarse grid matrices >> > > > Coarse grid solver -- level ------------------------------- >> > > > KSP Object: (mg_coarse_) 1 MPI processes >> > > > type: preonly >> > > > maximum iterations=1, initial guess is zero >> > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> > > > left preconditioning >> > > > using NONE norm type for convergence test >> > > > PC Object: (mg_coarse_) 1 MPI processes >> > > > type: bjacobi >> > > > block Jacobi: number of blocks = 1 >> > > > Local solve is same for all blocks, in the following KSP >> and PC objects: >> > > > KSP Object: (mg_coarse_sub_) 1 MPI processes >> > > > type: preonly >> > > > maximum iterations=1, initial guess is zero >> > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > left preconditioning >> > > > using NONE norm type for convergence test >> > > > PC Object: (mg_coarse_sub_) 1 MPI processes >> > > > type: lu >> > > > LU: out-of-place factorization >> > > > tolerance for zero pivot 2.22045e-14 >> > > > using diagonal shift on blocks to prevent zero pivot >> [INBLOCKS] >> > > > matrix ordering: nd >> > > > factor fill ratio given 5, needed 1.91048 >> > > > Factored matrix follows: >> > > > Mat Object: 1 MPI processes >> > > > type: seqaij >> > > > rows=284, cols=284 >> > > > package used to perform factorization: petsc >> > > > total: nonzeros=7726, allocated nonzeros=7726 >> > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > using I-node routines: found 133 nodes, limit >> used is 5 >> > > > linear system matrix = precond matrix: >> > > > Mat Object: 1 MPI processes >> > > > type: seqaij >> > > > rows=284, cols=284 >> > > > total: nonzeros=4044, allocated nonzeros=4044 >> > > > total number of mallocs used during MatSetValues calls >> =0 >> > > > not using I-node routines >> > > > linear system matrix = precond matrix: >> > > > Mat Object: 1 MPI processes >> > > > type: seqaij >> > > > rows=284, cols=284 >> > > > total: nonzeros=4044, allocated nonzeros=4044 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > not using I-node routines >> > > > Down solver (pre-smoother) on level 1 >> ------------------------------- >> > > > KSP Object: (mg_levels_1_) 1 MPI processes >> > > > type: chebyshev >> > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = >> 4.10212 >> > > > maximum iterations=2 >> > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> > > > left preconditioning >> > > > using nonzero initial guess >> > > > using NONE norm type for convergence test >> > > > PC Object: (mg_levels_1_) 1 MPI processes >> > > > type: sor >> > > > SOR: type = local_symmetric, iterations = 1, local >> iterations = 
1, omega = 1 >> > > > linear system matrix = precond matrix: >> > > > Mat Object: () 1 MPI processes >> > > > type: seqaij >> > > > rows=9036, cols=9036 >> > > > total: nonzeros=192256, allocated nonzeros=192256 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > not using I-node routines >> > > > Up solver (post-smoother) same as down solver (pre-smoother) >> > > > linear system matrix = precond matrix: >> > > > Mat Object: () 1 MPI processes >> > > > type: seqaij >> > > > rows=9036, cols=9036 >> > > > total: nonzeros=192256, allocated nonzeros=192256 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > not using I-node routines >> > > > >> > > > Thanks, >> > > > Harshad >> > > >> > > >> > >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Wed Sep 14 09:19:23 2016 From: hongzhang at anl.gov (Hong Zhang) Date: Wed, 14 Sep 2016 09:19:23 -0500 Subject: [petsc-users] Time Stepping Terminology In-Reply-To: <3F1C1787-7A22-4E97-8D0A-234DCC2C5668@gmail.com> References: <3F1C1787-7A22-4E97-8D0A-234DCC2C5668@gmail.com> Message-ID: > On Sep 13, 2016, at 9:47 PM, Gideon Simpson wrote: > > I was looking around to see if there was any built in routine for implicit midpoint time stepping in petsc. By this I mean equation (1i) of https://en.wikipedia.org/wiki/Midpoint_method. > > I noticed the TSTHETA solver, and I see that the documentation says: > > -ts_type theta -ts_theta_theta 0.5 corresponds to the implicit midpoint rule > > But I wanted to verify that it agrees with what I interpret to be implicit midpoint. > Yes, it does. > Also, is the distinction with the ts_theta_endpoint flag as to whether it is the weighting of the argument to the f(t,y) function or the weighting of the f(t,y) evaluated at the different points? > The ts_theta_endpoint flag allows one to switch between two different types of theta methods; for ODEs they can be represented in the general forms y_{i+1} = y_i + h ( \theta f(t_i,y_i) + (1-\theta) f(t_{i+1},y_{i+1}) ) (w the endpoint flag) and y_s = y_i + h \theta f(t_i+\theta h, y_s) y_{i+1} = y_i + (y_s-y_i)/\theta (w/o the endpoint flag) Hong > -gideon > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hsahasra at purdue.edu Wed Sep 14 12:39:53 2016 From: hsahasra at purdue.edu (Harshad Sahasrabudhe) Date: Wed, 14 Sep 2016 13:39:53 -0400 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> Message-ID: Hi Barry, I put a watchpoint on *((KSP_CONVERGED_REASON*) &( ((_p_KSP*)ksp)->reason )) in gdb.
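For the implicit-midpoint question in the Time Stepping Terminology thread above, the runtime options -ts_type theta -ts_theta_theta 0.5 have a programmatic equivalent. The fragment below is only a sketch: problem setup, error checking and the RHS function are omitted.

    #include <petscts.h>
    int main(int argc, char **argv)
    {
      TS ts;
      PetscInitialize(&argc, &argv, NULL, NULL);
      TSCreate(PETSC_COMM_WORLD, &ts);
      TSSetType(ts, TSTHETA);
      TSThetaSetTheta(ts, 0.5);            /* implicit midpoint, eq. (1i) of the cited page */
      TSThetaSetEndpoint(ts, PETSC_FALSE); /* PETSC_TRUE selects the endpoint (trapezoidal) variant */
      /* ... TSSetRHSFunction(), TSSetTimeStep(), TSSolve(), ... */
      TSDestroy(&ts);
      PetscFinalize();
      return 0;
    }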
The ksp->reason switched between: Old value = KSP_CONVERGED_ITERATING New value = KSP_CONVERGED_RTOL 0x00002b143054bef2 in KSPConvergedDefault (ksp=0x23c3090, n=12, rnorm=5.3617149831259514e-08, reason=0x23c3310, ctx=0x2446210) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/iterativ.c:764 764 *reason = KSP_CONVERGED_RTOL; *and* Old value = KSP_CONVERGED_RTOL New value = KSP_CONVERGED_ITERATING KSPSetUp (ksp=0x23c3090) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:226 226 if (!((PetscObject)ksp)->type_name) { However, after iteration 6, it changed to KSP_CONVERGED_STEP_LENGTH Old value = KSP_CONVERGED_ITERATING New value = KSP_CONVERGED_STEP_LENGTH SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:36 36 PetscFunctionReturn(0); Any ideas why that function was executed? Backtrace when the program stopped here: #0 SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:36 #1 0x00002b14305d3fda in KSPGMRESCycle (itcount=0x7ffdcf2d4ffc, ksp=0x23c3090) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/impls/gmres/gmres.c:182 #2 0x00002b14305d4711 in KSPSolve_GMRES (ksp=0x23c3090) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/impls/gmres/gmres.c:235 #3 0x00002b1430526a8a in KSPSolve (ksp=0x23c3090, b=0x1a916c0, x=0x1d89dc0) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:460 #4 0x00002b1430bb3905 in SNESSolve_NEWTONTR (snes=0x1ea2490) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:160 #5 0x00002b1430655b57 in SNESSolve (snes=0x1ea2490, b=0x0, x=0x1a27420) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:3743 #6 0x00002b142f606780 in libMesh::PetscNonlinearSolver::solve (this=0x1a1d8c0, jac_in=..., x_in=..., r_in=...) at src/solvers/petsc_nonlinear_solver.C:714 #7 0x00002b142f67561d in libMesh::NonlinearImplicitSystem::solve (this=0x1a14fe0) at src/systems/nonlinear_implicit_system.C:183 #8 0x00002b1429548ceb in NonlinearPoisson::execute_solver (this=0x1110500) at NonlinearPoisson.cpp:1191 #9 0x00002b142952233c in NonlinearPoisson::do_solve (this=0x1110500) at NonlinearPoisson.cpp:948 #10 0x00002b1429b9e785 in Simulation::solve (this=0x1110500) at Simulation.cpp:781 #11 0x00002b1429ac326e in Nemo::run_simulations (this=0x63b020) at Nemo.cpp:1313 #12 0x0000000000426d0d in main (argc=6, argv=0x7ffdcf2d7908) at main.cpp:447 Thanks! Harshad On Wed, Sep 14, 2016 at 10:10 AM, Harshad Sahasrabudhe wrote: > I think I found the problem. I configured PETSc with COPTFLAGS=-O3. I'll > remove that option and try again. > > Thanks! > Harshad > > On Wed, Sep 14, 2016 at 10:06 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > >> Hi Barry, >> >> Thanks for your inputs. I tried to set a watchpoint on >> ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. >> Basically, GDB isn't able to find the PETSc source code. I built PETSc with >> --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get >> doesn't contain debugging symbols (checked using objdump -g). 
How do I get >> PETSc library to have debugging info? >> >> Thanks, >> Harshad >> >> On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith wrote: >> >>> >>> > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe >>> wrote: >>> > >>> > Hi Barry, >>> > >>> > I compiled with mpich configured using --enable-g=meminit to get rid >>> of MPI errors in Valgrind. Doing this reduced the number of errors to 2. I >>> have attached the Valgrind output. >>> >>> This isn't helpful but it seems not to be a memory corruption issue >>> :-( >>> > >>> > I'm using GAMG+GMRES for in each linear iteration of SNES. The linear >>> solver converges with CONVERGED_RTOL for the first 6 iterations and with >>> CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is >>> happening. Any thoughts/ideas? >>> >>> Does this happen on one process? If so I would run in the debugger >>> and track the variable to see everyplace the variable is changed, this >>> would point to exactly what piece of code is changing the variable to this >>> unexpected value. >>> >>> For example with lldb one can use watch >>> http://lldb.llvm.org/tutorial.html to see each time a variable gets >>> changed. Similar thing with gdb. >>> >>> The variable to watch is ksp->reason Once you get the hang of this >>> it can take just a few minutes to track down the code that is making this >>> unexpected value, though I understand if you haven't done it before it can >>> be intimidating. >>> >>> Barry >>> >>> You can do the same thing in parallel (like on two processes) if you >>> need to but it is more cumbersome since you need run multiple debuggers. >>> You can have PETSc start up multiple debuggers with mpiexec -n 2 ./ex >>> -start_in_debugger >>> >>> >>> >>> >>> > >>> > Thanks, >>> > Harshad >>> > >>> > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith >>> wrote: >>> > >>> > Install your MPI with --download-mpich as a PETSc ./configure >>> option, this will eliminate all the MPICH valgrind errors. Then send as an >>> attachment the resulting valgrind file. >>> > >>> > I do not 100 % trust any code that produces such valgrind errors. >>> > >>> > Barry >>> > >>> > >>> > >>> > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe < >>> hsahasra at purdue.edu> wrote: >>> > > >>> > > Hi Barry, >>> > > >>> > > Thanks for the reply. My code is in C. I ran with Valgrind and found >>> many "Conditional jump or move depends on uninitialized value(s)", "Invalid >>> read" and "Use of uninitialized value" errors. I think all of them are from >>> the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure >>> what I'm looking for in the Valgrind output. At the end of the file, I get: >>> > > >>> > > ==40223== More than 10000000 total errors detected. I'm not >>> reporting any more. >>> > > ==40223== Final error counts will be inaccurate. Go fix your >>> program! >>> > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note >>> > > ==40223== that errors may occur in your program without prior >>> warning from >>> > > ==40223== Valgrind, because errors are no longer being displayed. >>> > > >>> > > Can you give some suggestions on how I should proceed? >>> > > >>> > > Thanks, >>> > > Harshad >>> > > >>> > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith >>> wrote: >>> > > >>> > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very >>> specialized and should never occur with GMRES. >>> > > >>> > > Can you run with valgrind to make sure there is no memory >>> corruption? 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> > > >>> > > Is your code fortran or C? >>> > > >>> > > Barry >>> > > >>> > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe < >>> hsahasra at purdue.edu> wrote: >>> > > > >>> > > > Hi, >>> > > > >>> > > > I'm using GAMG + GMRES for my Poisson problem. The solver >>> converges with KSP_CONVERGED_STEP_LENGTH at a residual of >>> 9.773346857844e-02, which is much higher than what I need (I need a >>> tolerance of at least 1E-8). I am not able to figure out which tolerance I >>> need to set to avoid convergence due to CONVERGED_STEP_LENGTH. >>> > > > >>> > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: >>> > > > >>> > > > 0 KSP Residual norm 3.121347818142e+00 >>> > > > 1 KSP Residual norm 9.773346857844e-02 >>> > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 >>> > > > KSP Object: 1 MPI processes >>> > > > type: gmres >>> > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> > > > GMRES: happy breakdown tolerance 1e-30 >>> > > > maximum iterations=10000, initial guess is zero >>> > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 >>> > > > left preconditioning >>> > > > using PRECONDITIONED norm type for convergence test >>> > > > PC Object: 1 MPI processes >>> > > > type: gamg >>> > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v >>> > > > Cycles per PCApply=1 >>> > > > Using Galerkin computed coarse grid matrices >>> > > > Coarse grid solver -- level ------------------------------- >>> > > > KSP Object: (mg_coarse_) 1 MPI processes >>> > > > type: preonly >>> > > > maximum iterations=1, initial guess is zero >>> > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> > > > left preconditioning >>> > > > using NONE norm type for convergence test >>> > > > PC Object: (mg_coarse_) 1 MPI processes >>> > > > type: bjacobi >>> > > > block Jacobi: number of blocks = 1 >>> > > > Local solve is same for all blocks, in the following KSP >>> and PC objects: >>> > > > KSP Object: (mg_coarse_sub_) 1 MPI processes >>> > > > type: preonly >>> > > > maximum iterations=1, initial guess is zero >>> > > > tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000 >>> > > > left preconditioning >>> > > > using NONE norm type for convergence test >>> > > > PC Object: (mg_coarse_sub_) 1 MPI processes >>> > > > type: lu >>> > > > LU: out-of-place factorization >>> > > > tolerance for zero pivot 2.22045e-14 >>> > > > using diagonal shift on blocks to prevent zero pivot >>> [INBLOCKS] >>> > > > matrix ordering: nd >>> > > > factor fill ratio given 5, needed 1.91048 >>> > > > Factored matrix follows: >>> > > > Mat Object: 1 MPI processes >>> > > > type: seqaij >>> > > > rows=284, cols=284 >>> > > > package used to perform factorization: petsc >>> > > > total: nonzeros=7726, allocated nonzeros=7726 >>> > > > total number of mallocs used during MatSetValues >>> calls =0 >>> > > > using I-node routines: found 133 nodes, limit >>> used is 5 >>> > > > linear system matrix = precond matrix: >>> > > > Mat Object: 1 MPI processes >>> > > > type: seqaij >>> > > > rows=284, cols=284 >>> > > > total: nonzeros=4044, allocated nonzeros=4044 >>> > > > total number of mallocs used during MatSetValues calls >>> =0 >>> > > > not using I-node routines >>> > > > linear system matrix = precond matrix: >>> > > > Mat Object: 1 MPI processes >>> > > > type: seqaij >>> > > > rows=284, cols=284 >>> > > > total: nonzeros=4044, 
allocated nonzeros=4044 >>> > > > total number of mallocs used during MatSetValues calls =0 >>> > > > not using I-node routines >>> > > > Down solver (pre-smoother) on level 1 >>> ------------------------------- >>> > > > KSP Object: (mg_levels_1_) 1 MPI processes >>> > > > type: chebyshev >>> > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = >>> 4.10212 >>> > > > maximum iterations=2 >>> > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> > > > left preconditioning >>> > > > using nonzero initial guess >>> > > > using NONE norm type for convergence test >>> > > > PC Object: (mg_levels_1_) 1 MPI processes >>> > > > type: sor >>> > > > SOR: type = local_symmetric, iterations = 1, local >>> iterations = 1, omega = 1 >>> > > > linear system matrix = precond matrix: >>> > > > Mat Object: () 1 MPI processes >>> > > > type: seqaij >>> > > > rows=9036, cols=9036 >>> > > > total: nonzeros=192256, allocated nonzeros=192256 >>> > > > total number of mallocs used during MatSetValues calls =0 >>> > > > not using I-node routines >>> > > > Up solver (post-smoother) same as down solver (pre-smoother) >>> > > > linear system matrix = precond matrix: >>> > > > Mat Object: () 1 MPI processes >>> > > > type: seqaij >>> > > > rows=9036, cols=9036 >>> > > > total: nonzeros=192256, allocated nonzeros=192256 >>> > > > total number of mallocs used during MatSetValues calls =0 >>> > > > not using I-node routines >>> > > > >>> > > > Thanks, >>> > > > Harshad >>> > > >>> > > >>> > >>> > >>> > >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Wed Sep 14 12:52:38 2016 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Wed, 14 Sep 2016 13:52:38 -0400 Subject: [petsc-users] link error and symbol error Message-ID: <5A5CAEE1-3B29-4D84-8B56-4BC10C2C6B57@gmail.com> I was trying to compile a new version of 3.7.3, and I noticed the following. 
First, when building the library, I see, at the end: CLINKER /opt/petsc-3.7.3/arch-darwin-c-debug/lib/libpetsc.3.7.3.dylib ld: warning: could not create compact unwind for _dhseqr_: stack subq instruction is too different from dwarf stack size DSYMUTIL /opt/petsc-3.7.3/arch-darwin-c-debug/lib/libpetsc.3.7.3.dylib Next, when I try to run the tests, I see: Running test examples to verify correct installation Using PETSC_DIR=/opt/petsc-3.7.3 and PETSC_ARCH=arch-darwin-c-debug *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /opt/petsc-3.7.3/src/snes/examples/tutorials ex19 ********************************************************************************* /opt/petsc-3.7.3/arch-darwin-c-debug/bin/mpicc -o ex19.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 -I/opt/petsc-3.7.3/include -I/opt/petsc-3.7.3/arch-darwin-c-debug/include -I/opt/local/include -I/opt/X11/include `pwd`/ex19.c /opt/petsc-3.7.3/arch-darwin-c-debug/bin/mpicc -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 -o ex19 ex19.o -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -L/opt/petsc-3.7.3/arch-darwin-c-debug/lib -lpetsc -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lsuperlu -lscalapack -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -llapack -lblas -Wl,-rpath,/opt/local/lib -L/opt/local/lib -lhwloc -lptesmumps -lptscotch -lptscotcherr -lscotch -lscotcherr -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/8.0.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/8.0.0/lib/darwin -lmpifort -lgfortran -Wl,-rpath,/opt/local/lib/gcc5/gcc/x86_64-apple-darwin15/5.4.0 -L/opt/local/lib/gcc5/gcc/x86_64-apple-darwin15/5.4.0 -Wl,-rpath,/opt/local/lib/gcc5 -L/opt/local/lib/gcc5 -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpicxx -lc++ -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -lclang_rt.osx -lm -lz -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -L/opt/petsc-3.7.3/arch-darwin-c-debug/lib -ldl -lmpi -lpmpi -lSystem -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -lclang_rt.osx -ldl Undefined symbols for architecture x86_64: "_ATL_cGetNB", referenced from: import-atom in libpetsc.dylib "_ATL_dGetNB", referenced from: import-atom in libpetsc.dylib "_ATL_dgemoveT", referenced from: import-atom in libpetsc.dylib "_ATL_dger", referenced from: import-atom in libpetsc.dylib "_ATL_dsqtrans", referenced from: import-atom in libpetsc.dylib "_ATL_sGetNB", referenced from: import-atom in libpetsc.dylib "_ATL_xerbla", referenced 
from: import-atom in libpetsc.dylib "_ATL_zGetNB", referenced from: import-atom in libpetsc.dylib ld: symbol(s) not found for architecture x86_64 clang: error: linker command failed with exit code 1 (use -v to see invocation) makefile:108: recipe for target 'ex19' failed gmake[3]: [ex19] Error 1 (ignored) /bin/rm -f ex19.o -gideon From bsmith at mcs.anl.gov Wed Sep 14 13:00:58 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 Sep 2016 13:00:58 -0500 Subject: [petsc-users] link error and symbol error In-Reply-To: <5A5CAEE1-3B29-4D84-8B56-4BC10C2C6B57@gmail.com> References: <5A5CAEE1-3B29-4D84-8B56-4BC10C2C6B57@gmail.com> Message-ID: <8774FC75-0161-454A-823D-23836FE61C07@mcs.anl.gov> Send configure.log to petsc-maint at mcs.anl.gov Looks like a troubled BLAS/LAPACK install. Barry > On Sep 14, 2016, at 12:52 PM, Gideon Simpson wrote: > > I was trying to compile a new version of 3.7.3, and I noticed the following. First, when building the library, I see, at the end: > > CLINKER /opt/petsc-3.7.3/arch-darwin-c-debug/lib/libpetsc.3.7.3.dylib > ld: warning: could not create compact unwind for _dhseqr_: stack subq instruction is too different from dwarf stack size > DSYMUTIL /opt/petsc-3.7.3/arch-darwin-c-debug/lib/libpetsc.3.7.3.dylib > > Next, when I try to run the tests, I see: > > Running test examples to verify correct installation > Using PETSC_DIR=/opt/petsc-3.7.3 and PETSC_ARCH=arch-darwin-c-debug > *******************Error detected during compile or link!******************* > See http://www.mcs.anl.gov/petsc/documentation/faq.html > /opt/petsc-3.7.3/src/snes/examples/tutorials ex19 > ********************************************************************************* > /opt/petsc-3.7.3/arch-darwin-c-debug/bin/mpicc -o ex19.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 -I/opt/petsc-3.7.3/include -I/opt/petsc-3.7.3/arch-darwin-c-debug/include -I/opt/local/include -I/opt/X11/include `pwd`/ex19.c > /opt/petsc-3.7.3/arch-darwin-c-debug/bin/mpicc -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 -o ex19 ex19.o -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -L/opt/petsc-3.7.3/arch-darwin-c-debug/lib -lpetsc -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lsuperlu -lscalapack -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -llapack -lblas -Wl,-rpath,/opt/local/lib -L/opt/local/lib -lhwloc -lptesmumps -lptscotch -lptscotcherr -lscotch -lscotcherr -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/8.0.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/8.0.0/lib/darwin -lmpifort -lgfortran -Wl,-rpath,/opt/local/lib/gcc5/gcc/x86_64-apple-darwin15/5.4.0 -L/opt/local/lib/gcc5/gcc/x86_64-apple-darwin15/5.4.0 -Wl,-rpath,/opt/local/lib/gcc5 -L/opt/local/lib/gcc5 -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpicxx -lc++ -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin 
-L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -lclang_rt.osx -lm -lz -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -L/opt/petsc-3.7.3/arch-darwin-c-debug/lib -ldl -lmpi -lpmpi -lSystem -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -lclang_rt.osx -ldl > Undefined symbols for architecture x86_64: > "_ATL_cGetNB", referenced from: > import-atom in libpetsc.dylib > "_ATL_dGetNB", referenced from: > import-atom in libpetsc.dylib > "_ATL_dgemoveT", referenced from: > import-atom in libpetsc.dylib > "_ATL_dger", referenced from: > import-atom in libpetsc.dylib > "_ATL_dsqtrans", referenced from: > import-atom in libpetsc.dylib > "_ATL_sGetNB", referenced from: > import-atom in libpetsc.dylib > "_ATL_xerbla", referenced from: > import-atom in libpetsc.dylib > "_ATL_zGetNB", referenced from: > import-atom in libpetsc.dylib > ld: symbol(s) not found for architecture x86_64 > clang: error: linker command failed with exit code 1 (use -v to see invocation) > makefile:108: recipe for target 'ex19' failed > gmake[3]: [ex19] Error 1 (ignored) > /bin/rm -f ex19.o > > > -gideon > From balay at mcs.anl.gov Wed Sep 14 13:03:24 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 14 Sep 2016 13:03:24 -0500 Subject: [petsc-users] link error and symbol error In-Reply-To: <8774FC75-0161-454A-823D-23836FE61C07@mcs.anl.gov> References: <5A5CAEE1-3B29-4D84-8B56-4BC10C2C6B57@gmail.com> <8774FC75-0161-454A-823D-23836FE61C07@mcs.anl.gov> Message-ID: Do you have MacPorts installed? With Blas? Perhaps thats the source of conflict.. Satish On Wed, 14 Sep 2016, Barry Smith wrote: > > Send configure.log to petsc-maint at mcs.anl.gov Looks like a troubled BLAS/LAPACK install. > > Barry > > > On Sep 14, 2016, at 12:52 PM, Gideon Simpson wrote: > > > > I was trying to compile a new version of 3.7.3, and I noticed the following. 
First, when building the library, I see, at the end: > > > > CLINKER /opt/petsc-3.7.3/arch-darwin-c-debug/lib/libpetsc.3.7.3.dylib > > ld: warning: could not create compact unwind for _dhseqr_: stack subq instruction is too different from dwarf stack size > > DSYMUTIL /opt/petsc-3.7.3/arch-darwin-c-debug/lib/libpetsc.3.7.3.dylib > > > > Next, when I try to run the tests, I see: > > > > Running test examples to verify correct installation > > Using PETSC_DIR=/opt/petsc-3.7.3 and PETSC_ARCH=arch-darwin-c-debug > > *******************Error detected during compile or link!******************* > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > /opt/petsc-3.7.3/src/snes/examples/tutorials ex19 > > ********************************************************************************* > > /opt/petsc-3.7.3/arch-darwin-c-debug/bin/mpicc -o ex19.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 -I/opt/petsc-3.7.3/include -I/opt/petsc-3.7.3/arch-darwin-c-debug/include -I/opt/local/include -I/opt/X11/include `pwd`/ex19.c > > /opt/petsc-3.7.3/arch-darwin-c-debug/bin/mpicc -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 -o ex19 ex19.o -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -L/opt/petsc-3.7.3/arch-darwin-c-debug/lib -lpetsc -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lsuperlu -lscalapack -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -llapack -lblas -Wl,-rpath,/opt/local/lib -L/opt/local/lib -lhwloc -lptesmumps -lptscotch -lptscotcherr -lscotch -lscotcherr -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/8.0.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/8.0.0/lib/darwin -lmpifort -lgfortran -Wl,-rpath,/opt/local/lib/gcc5/gcc/x86_64-apple-darwin15/5.4.0 -L/opt/local/lib/gcc5/gcc/x86_64-apple-darwin15/5.4.0 -Wl,-rpath,/opt/local/lib/gcc5 -L/opt/local/lib/gcc5 -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpicxx -lc++ -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -lclang_rt.osx -lm -lz -Wl,-rpath,/opt/petsc-3.7.3/arch-darwin-c-debug/lib -L/opt/petsc-3.7.3/arch-darwin-c-debug/lib -ldl -lmpi -lpmpi -lSystem -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8..0.0/lib/darwin -lclang_rt.osx -ldl > > Undefined symbols for architecture x86_64: > > "_ATL_cGetNB", referenced from: > > import-atom in libpetsc.dylib > > "_ATL_dGetNB", referenced from: > > import-atom in libpetsc.dylib > > "_ATL_dgemoveT", referenced from: > > import-atom in libpetsc.dylib > > "_ATL_dger", referenced from: > > import-atom in libpetsc.dylib > > "_ATL_dsqtrans", referenced from: > > import-atom 
in libpetsc.dylib > > "_ATL_sGetNB", referenced from: > > import-atom in libpetsc.dylib > > "_ATL_xerbla", referenced from: > > import-atom in libpetsc.dylib > > "_ATL_zGetNB", referenced from: > > import-atom in libpetsc.dylib > > ld: symbol(s) not found for architecture x86_64 > > clang: error: linker command failed with exit code 1 (use -v to see invocation) > > makefile:108: recipe for target 'ex19' failed > > gmake[3]: [ex19] Error 1 (ignored) > > /bin/rm -f ex19.o > > > > > > -gideon > > > > From bsmith at mcs.anl.gov Wed Sep 14 13:07:59 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 Sep 2016 13:07:59 -0500 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> Message-ID: Super strange, it should never have switched to the function SNES_TR_KSPConverged_Private Fortunately you can use the same technique to track down where the function pointer changes. Just watch ksp->converged to see when the function pointer gets changed and send back the new stack trace. Barry > On Sep 14, 2016, at 12:39 PM, Harshad Sahasrabudhe wrote: > > Hi Barry, > > I put a watchpoint on *((KSP_CONVERGED_REASON*) &( ((_p_KSP*)ksp)->reason )) in gdb. The ksp->reason switched between: > > Old value = KSP_CONVERGED_ITERATING > New value = KSP_CONVERGED_RTOL > 0x00002b143054bef2 in KSPConvergedDefault (ksp=0x23c3090, n=12, rnorm=5.3617149831259514e-08, reason=0x23c3310, ctx=0x2446210) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/iterativ.c:764 > 764 *reason = KSP_CONVERGED_RTOL; > > and > > Old value = KSP_CONVERGED_RTOL > New value = KSP_CONVERGED_ITERATING > KSPSetUp (ksp=0x23c3090) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:226 > 226 if (!((PetscObject)ksp)->type_name) { > > However, after iteration 6, it changed to KSP_CONVERGED_STEP_LENGTH > > Old value = KSP_CONVERGED_ITERATING > New value = KSP_CONVERGED_STEP_LENGTH > SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:36 > 36 PetscFunctionReturn(0); > > Any ideas why that function was executed? 
Backtrace when the program stopped here: > > #0 SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:36 > #1 0x00002b14305d3fda in KSPGMRESCycle (itcount=0x7ffdcf2d4ffc, ksp=0x23c3090) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/impls/gmres/gmres.c:182 > #2 0x00002b14305d4711 in KSPSolve_GMRES (ksp=0x23c3090) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/impls/gmres/gmres.c:235 > #3 0x00002b1430526a8a in KSPSolve (ksp=0x23c3090, b=0x1a916c0, x=0x1d89dc0) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:460 > #4 0x00002b1430bb3905 in SNESSolve_NEWTONTR (snes=0x1ea2490) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:160 > #5 0x00002b1430655b57 in SNESSolve (snes=0x1ea2490, b=0x0, x=0x1a27420) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:3743 > #6 0x00002b142f606780 in libMesh::PetscNonlinearSolver::solve (this=0x1a1d8c0, jac_in=..., x_in=..., r_in=...) at src/solvers/petsc_nonlinear_solver.C:714 > #7 0x00002b142f67561d in libMesh::NonlinearImplicitSystem::solve (this=0x1a14fe0) at src/systems/nonlinear_implicit_system.C:183 > #8 0x00002b1429548ceb in NonlinearPoisson::execute_solver (this=0x1110500) at NonlinearPoisson.cpp:1191 > #9 0x00002b142952233c in NonlinearPoisson::do_solve (this=0x1110500) at NonlinearPoisson.cpp:948 > #10 0x00002b1429b9e785 in Simulation::solve (this=0x1110500) at Simulation.cpp:781 > #11 0x00002b1429ac326e in Nemo::run_simulations (this=0x63b020) at Nemo.cpp:1313 > #12 0x0000000000426d0d in main (argc=6, argv=0x7ffdcf2d7908) at main.cpp:447 > > > Thanks! > Harshad > > On Wed, Sep 14, 2016 at 10:10 AM, Harshad Sahasrabudhe wrote: > I think I found the problem. I configured PETSc with COPTFLAGS=-O3. I'll remove that option and try again. > > Thanks! > Harshad > > On Wed, Sep 14, 2016 at 10:06 AM, Harshad Sahasrabudhe wrote: > Hi Barry, > > Thanks for your inputs. I tried to set a watchpoint on ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. Basically, GDB isn't able to find the PETSc source code. I built PETSc with --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get doesn't contain debugging symbols (checked using objdump -g). How do I get PETSc library to have debugging info? > > Thanks, > Harshad > > On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith wrote: > > > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe wrote: > > > > Hi Barry, > > > > I compiled with mpich configured using --enable-g=meminit to get rid of MPI errors in Valgrind. Doing this reduced the number of errors to 2. I have attached the Valgrind output. > > This isn't helpful but it seems not to be a memory corruption issue :-( > > > > I'm using GAMG+GMRES for in each linear iteration of SNES. The linear solver converges with CONVERGED_RTOL for the first 6 iterations and with CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is happening. Any thoughts/ideas? > > Does this happen on one process? If so I would run in the debugger and track the variable to see everyplace the variable is changed, this would point to exactly what piece of code is changing the variable to this unexpected value. 
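
(A code-level complement to the debugger approach discussed here: the same information can also be read back through the public API right after each solve. The sketch below is illustrative only -- ReportKSPReason is a made-up helper name, not a PETSc routine, and error checking is abbreviated -- and it assumes access to the SNES object used for the solve. Combined with the runtime option -ksp_converged_reason, it shows which linear solve is the one that stops with KSP_CONVERGED_STEP_LENGTH.)

    #include <petscsnes.h>

    /* Illustrative helper only (ReportKSPReason is not a PETSc routine);
     * error checking is abbreviated to keep the sketch short.             */
    static PetscErrorCode ReportKSPReason(SNES snes)
    {
      KSP                ksp;
      KSPConvergedReason reason;
      PetscInt           its;

      SNESGetKSP(snes, &ksp);               /* the Krylov solver used inside SNESSolve */
      KSPGetConvergedReason(ksp, &reason);  /* e.g. KSP_CONVERGED_RTOL or KSP_CONVERGED_STEP_LENGTH */
      KSPGetIterationNumber(ksp, &its);
      PetscPrintf(PETSC_COMM_WORLD, "KSP stopped after %D iterations: %s\n",
                  its, KSPConvergedReasons[reason]);
      return 0;
    }
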
> > For example with lldb one can use watch http://lldb.llvm.org/tutorial.html to see each time a variable gets changed. Similar thing with gdb. > > The variable to watch is ksp->reason Once you get the hang of this it can take just a few minutes to track down the code that is making this unexpected value, though I understand if you haven't done it before it can be intimidating. > > Barry > > You can do the same thing in parallel (like on two processes) if you need to but it is more cumbersome since you need run multiple debuggers. You can have PETSc start up multiple debuggers with mpiexec -n 2 ./ex -start_in_debugger > > > > > > > > Thanks, > > Harshad > > > > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith wrote: > > > > Install your MPI with --download-mpich as a PETSc ./configure option, this will eliminate all the MPICH valgrind errors. Then send as an attachment the resulting valgrind file. > > > > I do not 100 % trust any code that produces such valgrind errors. > > > > Barry > > > > > > > > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe wrote: > > > > > > Hi Barry, > > > > > > Thanks for the reply. My code is in C. I ran with Valgrind and found many "Conditional jump or move depends on uninitialized value(s)", "Invalid read" and "Use of uninitialized value" errors. I think all of them are from the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure what I'm looking for in the Valgrind output. At the end of the file, I get: > > > > > > ==40223== More than 10000000 total errors detected. I'm not reporting any more. > > > ==40223== Final error counts will be inaccurate. Go fix your program! > > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note > > > ==40223== that errors may occur in your program without prior warning from > > > ==40223== Valgrind, because errors are no longer being displayed. > > > > > > Can you give some suggestions on how I should proceed? > > > > > > Thanks, > > > Harshad > > > > > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith wrote: > > > > > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very specialized and should never occur with GMRES. > > > > > > Can you run with valgrind to make sure there is no memory corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > > Is your code fortran or C? > > > > > > Barry > > > > > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe wrote: > > > > > > > > Hi, > > > > > > > > I'm using GAMG + GMRES for my Poisson problem. The solver converges with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which is much higher than what I need (I need a tolerance of at least 1E-8). I am not able to figure out which tolerance I need to set to avoid convergence due to CONVERGED_STEP_LENGTH. > > > > > > > > Any help is appreciated! 
Output of -ksp_view and -ksp_monitor: > > > > > > > > 0 KSP Residual norm 3.121347818142e+00 > > > > 1 KSP Residual norm 9.773346857844e-02 > > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > > > > KSP Object: 1 MPI processes > > > > type: gmres > > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > > > GMRES: happy breakdown tolerance 1e-30 > > > > maximum iterations=10000, initial guess is zero > > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using PRECONDITIONED norm type for convergence test > > > > PC Object: 1 MPI processes > > > > type: gamg > > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v > > > > Cycles per PCApply=1 > > > > Using Galerkin computed coarse grid matrices > > > > Coarse grid solver -- level ------------------------------- > > > > KSP Object: (mg_coarse_) 1 MPI processes > > > > type: preonly > > > > maximum iterations=1, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (mg_coarse_) 1 MPI processes > > > > type: bjacobi > > > > block Jacobi: number of blocks = 1 > > > > Local solve is same for all blocks, in the following KSP and PC objects: > > > > KSP Object: (mg_coarse_sub_) 1 MPI processes > > > > type: preonly > > > > maximum iterations=1, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (mg_coarse_sub_) 1 MPI processes > > > > type: lu > > > > LU: out-of-place factorization > > > > tolerance for zero pivot 2.22045e-14 > > > > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > > > > matrix ordering: nd > > > > factor fill ratio given 5, needed 1.91048 > > > > Factored matrix follows: > > > > Mat Object: 1 MPI processes > > > > type: seqaij > > > > rows=284, cols=284 > > > > package used to perform factorization: petsc > > > > total: nonzeros=7726, allocated nonzeros=7726 > > > > total number of mallocs used during MatSetValues calls =0 > > > > using I-node routines: found 133 nodes, limit used is 5 > > > > linear system matrix = precond matrix: > > > > Mat Object: 1 MPI processes > > > > type: seqaij > > > > rows=284, cols=284 > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node routines > > > > linear system matrix = precond matrix: > > > > Mat Object: 1 MPI processes > > > > type: seqaij > > > > rows=284, cols=284 > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node routines > > > > Down solver (pre-smoother) on level 1 ------------------------------- > > > > KSP Object: (mg_levels_1_) 1 MPI processes > > > > type: chebyshev > > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = 4.10212 > > > > maximum iterations=2 > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using nonzero initial guess > > > > using NONE norm type for convergence test > > > > PC Object: (mg_levels_1_) 1 MPI processes > > > > type: sor > > > > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 > > > > linear system matrix = precond matrix: > > > > Mat Object: () 1 MPI processes > > > > 
type: seqaij > > > > rows=9036, cols=9036 > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node routines > > > > Up solver (post-smoother) same as down solver (pre-smoother) > > > > linear system matrix = precond matrix: > > > > Mat Object: () 1 MPI processes > > > > type: seqaij > > > > rows=9036, cols=9036 > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node routines > > > > > > > > Thanks, > > > > Harshad > > > > > > > > > > > > > > > > From hsahasra at purdue.edu Wed Sep 14 13:31:13 2016 From: hsahasra at purdue.edu (Harshad Sahasrabudhe) Date: Wed, 14 Sep 2016 14:31:13 -0400 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> Message-ID: Thanks. I now put a watchpoint on watch *( (PetscErrorCode (**)(KSP, PetscInt, PetscReal, KSPConvergedReason *, void *)) &(ksp->converged) ) The function pointer changes in the first iteration of SNES. It changed at the following place: Old value = (PetscErrorCode (*)(KSP, PetscInt, PetscReal, KSPConvergedReason *, void *)) 0x2b54acdd00aa New value = (PetscErrorCode (*)(KSP, PetscInt, PetscReal, KSPConvergedReason *, void *)) 0x2b54ad436ce8 KSPSetConvergenceTest (ksp=0x22bf090, converge=0x2b54ad436ce8 , cctx=0x1c8b3e0, destroy=0x2b54ad437210 ) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:1768 1767 ksp->converged = converge; Here's the backtrace: #0 KSPSetConvergenceTest (ksp=0x22bf090, converge=0x2b54ad436ce8 , cctx=0x1c8b3e0, destroy=0x2b54ad437210 ) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:1768 #1 0x00002b54ad43865a in SNESSolve_NEWTONTR (snes=0x1d9e490) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:146 #2 0x00002b54acedab57 in SNESSolve (snes=0x1d9e490, b=0x0, x=0x1923420) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:3743 #3 0x00002b54abe8b780 in libMesh::PetscNonlinearSolver::solve (this=0x19198c0, jac_in=..., x_in=..., r_in=...) at src/solvers/petsc_nonlinear_solver.C:714 #4 0x00002b54abefa61d in libMesh::NonlinearImplicitSystem::solve (this=0x1910fe0) at src/systems/nonlinear_implicit_system.C:183 #5 0x00002b54a5dcdceb in NonlinearPoisson::execute_solver (this=0x100c500) at NonlinearPoisson.cpp:1191 #6 0x00002b54a5da733c in NonlinearPoisson::do_solve (this=0x100c500) at NonlinearPoisson.cpp:948 #7 0x00002b54a6423785 in Simulation::solve (this=0x100c500) at Simulation.cpp:781 #8 0x00002b54a634826e in Nemo::run_simulations (this=0x63b020) at Nemo.cpp:1313 #9 0x0000000000426d0d in main (argc=6, argv=0x7ffcdb910768) at main.cpp:447 Thanks, Harshad On Wed, Sep 14, 2016 at 2:07 PM, Barry Smith wrote: > > Super strange, it should never have switched to the function > SNES_TR_KSPConverged_Private > > Fortunately you can use the same technique to track down where the > function pointer changes. Just watch ksp->converged to see when the > function pointer gets changed and send back the new stack trace. 
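
(In code form, what the backtrace above shows: SNESSolve_NEWTONTR calls KSPSetConvergenceTest() at tr.c:146, installing SNES_TR_KSPConverged_Private in place of the default test -- exactly the change the watchpoint on ksp->converged caught at itfunc.c:1768. The sketch below only mirrors that registration pattern; MyTRLikeTest, InstallMyTest and the 1e-8 tolerance are placeholder names and values, not PETSc source. As Barry notes later in the thread, this substitution is specific to the trust-region solver (-snes_type newtontr); the default line-search Newton (-snes_type newtonls) keeps the ordinary KSP convergence test.)

    #include <petscksp.h>

    /* Schematic only -- not code from PETSc or from this thread, just the
     * shape of the KSPSetConvergenceTest() call seen in the backtrace.    */
    static PetscErrorCode MyTRLikeTest(KSP ksp, PetscInt it, PetscReal rnorm,
                                       KSPConvergedReason *reason, void *cctx)
    {
      *reason = KSP_CONVERGED_ITERATING;
      /* A real trust-region test would additionally stop once the step
       * leaves the trust region and report KSP_CONVERGED_STEP_LENGTH even
       * at a large residual; that part is omitted in this sketch.         */
      if (rnorm < 1.0e-8) *reason = KSP_CONVERGED_RTOL;  /* illustrative tolerance */
      return 0;
    }

    static PetscErrorCode InstallMyTest(KSP ksp, void *cctx)
    {
      /* Overwrites ksp->converged; the backtrace shows the same call with
       * converge = SNES_TR_KSPConverged_Private.                          */
      return KSPSetConvergenceTest(ksp, MyTRLikeTest, cctx, NULL);
    }
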
> > Barry > > > > > > On Sep 14, 2016, at 12:39 PM, Harshad Sahasrabudhe > wrote: > > > > Hi Barry, > > > > I put a watchpoint on *((KSP_CONVERGED_REASON*) &( > ((_p_KSP*)ksp)->reason )) in gdb. The ksp->reason switched between: > > > > Old value = KSP_CONVERGED_ITERATING > > New value = KSP_CONVERGED_RTOL > > 0x00002b143054bef2 in KSPConvergedDefault (ksp=0x23c3090, n=12, > rnorm=5.3617149831259514e-08, reason=0x23c3310, ctx=0x2446210) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/interface/iterativ.c:764 > > 764 *reason = KSP_CONVERGED_RTOL; > > > > and > > > > Old value = KSP_CONVERGED_RTOL > > New value = KSP_CONVERGED_ITERATING > > KSPSetUp (ksp=0x23c3090) at /depot/ncn/apps/conte/conte- > gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:226 > > 226 if (!((PetscObject)ksp)->type_name) { > > > > However, after iteration 6, it changed to KSP_CONVERGED_STEP_LENGTH > > > > Old value = KSP_CONVERGED_ITERATING > > New value = KSP_CONVERGED_STEP_LENGTH > > SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, > rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:36 > > 36 PetscFunctionReturn(0); > > > > Any ideas why that function was executed? Backtrace when the program > stopped here: > > > > #0 SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, > rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:36 > > #1 0x00002b14305d3fda in KSPGMRESCycle (itcount=0x7ffdcf2d4ffc, > ksp=0x23c3090) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/impls/gmres/gmres.c:182 > > #2 0x00002b14305d4711 in KSPSolve_GMRES (ksp=0x23c3090) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/impls/gmres/gmres.c:235 > > #3 0x00002b1430526a8a in KSPSolve (ksp=0x23c3090, b=0x1a916c0, > x=0x1d89dc0) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/interface/itfunc.c:460 > > #4 0x00002b1430bb3905 in SNESSolve_NEWTONTR (snes=0x1ea2490) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:160 > > #5 0x00002b1430655b57 in SNESSolve (snes=0x1ea2490, b=0x0, x=0x1a27420) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/interface/snes.c:3743 > > #6 0x00002b142f606780 in libMesh::PetscNonlinearSolver::solve > (this=0x1a1d8c0, jac_in=..., x_in=..., r_in=...) at > src/solvers/petsc_nonlinear_solver.C:714 > > #7 0x00002b142f67561d in libMesh::NonlinearImplicitSystem::solve > (this=0x1a14fe0) at src/systems/nonlinear_implicit_system.C:183 > > #8 0x00002b1429548ceb in NonlinearPoisson::execute_solver > (this=0x1110500) at NonlinearPoisson.cpp:1191 > > #9 0x00002b142952233c in NonlinearPoisson::do_solve (this=0x1110500) at > NonlinearPoisson.cpp:948 > > #10 0x00002b1429b9e785 in Simulation::solve (this=0x1110500) at > Simulation.cpp:781 > > #11 0x00002b1429ac326e in Nemo::run_simulations (this=0x63b020) at > Nemo.cpp:1313 > > #12 0x0000000000426d0d in main (argc=6, argv=0x7ffdcf2d7908) at > main.cpp:447 > > > > > > Thanks! > > Harshad > > > > On Wed, Sep 14, 2016 at 10:10 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > I think I found the problem. I configured PETSc with COPTFLAGS=-O3. I'll > remove that option and try again. > > > > Thanks! 
> > Harshad > > > > On Wed, Sep 14, 2016 at 10:06 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > Hi Barry, > > > > Thanks for your inputs. I tried to set a watchpoint on > ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. > Basically, GDB isn't able to find the PETSc source code. I built PETSc with > --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get > doesn't contain debugging symbols (checked using objdump -g). How do I get > PETSc library to have debugging info? > > > > Thanks, > > Harshad > > > > On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith wrote: > > > > > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe > wrote: > > > > > > Hi Barry, > > > > > > I compiled with mpich configured using --enable-g=meminit to get rid > of MPI errors in Valgrind. Doing this reduced the number of errors to 2. I > have attached the Valgrind output. > > > > This isn't helpful but it seems not to be a memory corruption issue > :-( > > > > > > I'm using GAMG+GMRES for in each linear iteration of SNES. The linear > solver converges with CONVERGED_RTOL for the first 6 iterations and with > CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is > happening. Any thoughts/ideas? > > > > Does this happen on one process? If so I would run in the debugger > and track the variable to see everyplace the variable is changed, this > would point to exactly what piece of code is changing the variable to this > unexpected value. > > > > For example with lldb one can use watch > http://lldb.llvm.org/tutorial.html to see each time a variable gets > changed. Similar thing with gdb. > > > > The variable to watch is ksp->reason Once you get the hang of this > it can take just a few minutes to track down the code that is making this > unexpected value, though I understand if you haven't done it before it can > be intimidating. > > > > Barry > > > > You can do the same thing in parallel (like on two processes) if you > need to but it is more cumbersome since you need run multiple debuggers. > You can have PETSc start up multiple debuggers with mpiexec -n 2 ./ex > -start_in_debugger > > > > > > > > > > > > > > Thanks, > > > Harshad > > > > > > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith > wrote: > > > > > > Install your MPI with --download-mpich as a PETSc ./configure > option, this will eliminate all the MPICH valgrind errors. Then send as an > attachment the resulting valgrind file. > > > > > > I do not 100 % trust any code that produces such valgrind errors. > > > > > > Barry > > > > > > > > > > > > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > > > > > > Hi Barry, > > > > > > > > Thanks for the reply. My code is in C. I ran with Valgrind and found > many "Conditional jump or move depends on uninitialized value(s)", "Invalid > read" and "Use of uninitialized value" errors. I think all of them are from > the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure > what I'm looking for in the Valgrind output. At the end of the file, I get: > > > > > > > > ==40223== More than 10000000 total errors detected. I'm not > reporting any more. > > > > ==40223== Final error counts will be inaccurate. Go fix your > program! > > > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note > > > > ==40223== that errors may occur in your program without prior > warning from > > > > ==40223== Valgrind, because errors are no longer being displayed. 
> > > > > > > > Can you give some suggestions on how I should proceed? > > > > > > > > Thanks, > > > > Harshad > > > > > > > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith > wrote: > > > > > > > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very > specialized and should never occur with GMRES. > > > > > > > > Can you run with valgrind to make sure there is no memory > corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > > > > Is your code fortran or C? > > > > > > > > Barry > > > > > > > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > > > > > > > > Hi, > > > > > > > > > > I'm using GAMG + GMRES for my Poisson problem. The solver > converges with KSP_CONVERGED_STEP_LENGTH at a residual of > 9.773346857844e-02, which is much higher than what I need (I need a > tolerance of at least 1E-8). I am not able to figure out which tolerance I > need to set to avoid convergence due to CONVERGED_STEP_LENGTH. > > > > > > > > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: > > > > > > > > > > 0 KSP Residual norm 3.121347818142e+00 > > > > > 1 KSP Residual norm 9.773346857844e-02 > > > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > > > > > KSP Object: 1 MPI processes > > > > > type: gmres > > > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > > > > GMRES: happy breakdown tolerance 1e-30 > > > > > maximum iterations=10000, initial guess is zero > > > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > > > > > left preconditioning > > > > > using PRECONDITIONED norm type for convergence test > > > > > PC Object: 1 MPI processes > > > > > type: gamg > > > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v > > > > > Cycles per PCApply=1 > > > > > Using Galerkin computed coarse grid matrices > > > > > Coarse grid solver -- level ------------------------------- > > > > > KSP Object: (mg_coarse_) 1 MPI processes > > > > > type: preonly > > > > > maximum iterations=1, initial guess is zero > > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > > left preconditioning > > > > > using NONE norm type for convergence test > > > > > PC Object: (mg_coarse_) 1 MPI processes > > > > > type: bjacobi > > > > > block Jacobi: number of blocks = 1 > > > > > Local solve is same for all blocks, in the following KSP > and PC objects: > > > > > KSP Object: (mg_coarse_sub_) 1 MPI processes > > > > > type: preonly > > > > > maximum iterations=1, initial guess is zero > > > > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > > > > > left preconditioning > > > > > using NONE norm type for convergence test > > > > > PC Object: (mg_coarse_sub_) 1 MPI processes > > > > > type: lu > > > > > LU: out-of-place factorization > > > > > tolerance for zero pivot 2.22045e-14 > > > > > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > > > > > matrix ordering: nd > > > > > factor fill ratio given 5, needed 1.91048 > > > > > Factored matrix follows: > > > > > Mat Object: 1 MPI processes > > > > > type: seqaij > > > > > rows=284, cols=284 > > > > > package used to perform factorization: petsc > > > > > total: nonzeros=7726, allocated nonzeros=7726 > > > > > total number of mallocs used during MatSetValues > calls =0 > > > > > using I-node routines: found 133 nodes, limit > used is 5 > > > > > linear system matrix = precond matrix: > > > > > Mat Object: 1 MPI processes > > > > > type: 
seqaij > > > > > rows=284, cols=284 > > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > > total number of mallocs used during MatSetValues calls > =0 > > > > > not using I-node routines > > > > > linear system matrix = precond matrix: > > > > > Mat Object: 1 MPI processes > > > > > type: seqaij > > > > > rows=284, cols=284 > > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > not using I-node routines > > > > > Down solver (pre-smoother) on level 1 > ------------------------------- > > > > > KSP Object: (mg_levels_1_) 1 MPI processes > > > > > type: chebyshev > > > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = > 4.10212 > > > > > maximum iterations=2 > > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > > left preconditioning > > > > > using nonzero initial guess > > > > > using NONE norm type for convergence test > > > > > PC Object: (mg_levels_1_) 1 MPI processes > > > > > type: sor > > > > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1 > > > > > linear system matrix = precond matrix: > > > > > Mat Object: () 1 MPI processes > > > > > type: seqaij > > > > > rows=9036, cols=9036 > > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > not using I-node routines > > > > > Up solver (post-smoother) same as down solver (pre-smoother) > > > > > linear system matrix = precond matrix: > > > > > Mat Object: () 1 MPI processes > > > > > type: seqaij > > > > > rows=9036, cols=9036 > > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > not using I-node routines > > > > > > > > > > Thanks, > > > > > Harshad > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fande.kong at inl.gov Wed Sep 14 13:35:58 2016 From: fande.kong at inl.gov (Kong (Non-US), Fande) Date: Wed, 14 Sep 2016 12:35:58 -0600 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> Message-ID: Hi Hsahara, I am not sure whether or not the dbg will also trace your libmesh code. We have a similar issue in the MOOSE with PETSc-3.7.x. No issues with the old PETSc. We finally get the problem fixed. In KSP, we could plugin any user converged test function, and there is a plugin-in function in MOOSE. PETSc-3.7.x. calls the PETSc default converged test first, and if the algorithm converges or diverges, PETSc just stop solving the linear system and NOT going to call the user converged test any more. A variable stores the converged reason in MOOSE. Try to solve another updated linear system using the same KSP, and KSP just stops solving because the old converged reason is reused again. I think the old version of PETSc always call the user converged test function first. This is possibly not related to your issue, but just give you one more thought. Fande Kong On Wed, Sep 14, 2016 at 11:39 AM, Harshad Sahasrabudhe wrote: > Hi Barry, > > I put a watchpoint on *((KSP_CONVERGED_REASON*) &( ((_p_KSP*)ksp)->reason )) > in gdb. 
The ksp->reason switched between: > > Old value = KSP_CONVERGED_ITERATING > New value = KSP_CONVERGED_RTOL > 0x00002b143054bef2 in KSPConvergedDefault (ksp=0x23c3090, n=12, > rnorm=5.3617149831259514e-08, reason=0x23c3310, ctx=0x2446210) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/interface/iterativ.c:764 > 764 *reason = KSP_CONVERGED_RTOL; > > *and* > > Old value = KSP_CONVERGED_RTOL > New value = KSP_CONVERGED_ITERATING > KSPSetUp (ksp=0x23c3090) at /depot/ncn/apps/conte/conte- > gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:226 > 226 if (!((PetscObject)ksp)->type_name) { > > However, after iteration 6, it changed to KSP_CONVERGED_STEP_LENGTH > > Old value = KSP_CONVERGED_ITERATING > New value = KSP_CONVERGED_STEP_LENGTH > SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, > rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:36 > 36 PetscFunctionReturn(0); > > Any ideas why that function was executed? Backtrace when the program > stopped here: > > #0 SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, > rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:36 > #1 0x00002b14305d3fda in KSPGMRESCycle (itcount=0x7ffdcf2d4ffc, > ksp=0x23c3090) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/impls/gmres/gmres.c:182 > #2 0x00002b14305d4711 in KSPSolve_GMRES (ksp=0x23c3090) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/impls/gmres/gmres.c:235 > #3 0x00002b1430526a8a in KSPSolve (ksp=0x23c3090, b=0x1a916c0, > x=0x1d89dc0) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/interface/itfunc.c:460 > #4 0x00002b1430bb3905 in SNESSolve_NEWTONTR (snes=0x1ea2490) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:160 > #5 0x00002b1430655b57 in SNESSolve (snes=0x1ea2490, b=0x0, x=0x1a27420) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/interface/snes.c:3743 > #6 0x00002b142f606780 in libMesh::PetscNonlinearSolver::solve > (this=0x1a1d8c0, jac_in=..., x_in=..., r_in=...) at > src/solvers/petsc_nonlinear_solver.C:714 > #7 0x00002b142f67561d in libMesh::NonlinearImplicitSystem::solve > (this=0x1a14fe0) at src/systems/nonlinear_implicit_system.C:183 > #8 0x00002b1429548ceb in NonlinearPoisson::execute_solver > (this=0x1110500) at NonlinearPoisson.cpp:1191 > #9 0x00002b142952233c in NonlinearPoisson::do_solve (this=0x1110500) at > NonlinearPoisson.cpp:948 > #10 0x00002b1429b9e785 in Simulation::solve (this=0x1110500) at > Simulation.cpp:781 > #11 0x00002b1429ac326e in Nemo::run_simulations (this=0x63b020) at > Nemo.cpp:1313 > #12 0x0000000000426d0d in main (argc=6, argv=0x7ffdcf2d7908) at > main.cpp:447 > > > Thanks! > Harshad > > On Wed, Sep 14, 2016 at 10:10 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > >> I think I found the problem. I configured PETSc with COPTFLAGS=-O3. I'll >> remove that option and try again. >> >> Thanks! >> Harshad >> >> On Wed, Sep 14, 2016 at 10:06 AM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> >>> Hi Barry, >>> >>> Thanks for your inputs. I tried to set a watchpoint on >>> ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. 
>>> Basically, GDB isn't able to find the PETSc source code. I built PETSc with >>> --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get >>> doesn't contain debugging symbols (checked using objdump -g). How do I get >>> PETSc library to have debugging info? >>> >>> Thanks, >>> Harshad >>> >>> On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith wrote: >>> >>>> >>>> > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe < >>>> hsahasra at purdue.edu> wrote: >>>> > >>>> > Hi Barry, >>>> > >>>> > I compiled with mpich configured using --enable-g=meminit to get rid >>>> of MPI errors in Valgrind. Doing this reduced the number of errors to 2. I >>>> have attached the Valgrind output. >>>> >>>> This isn't helpful but it seems not to be a memory corruption issue >>>> :-( >>>> > >>>> > I'm using GAMG+GMRES for in each linear iteration of SNES. The linear >>>> solver converges with CONVERGED_RTOL for the first 6 iterations and with >>>> CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is >>>> happening. Any thoughts/ideas? >>>> >>>> Does this happen on one process? If so I would run in the debugger >>>> and track the variable to see everyplace the variable is changed, this >>>> would point to exactly what piece of code is changing the variable to this >>>> unexpected value. >>>> >>>> For example with lldb one can use watch >>>> http://lldb.llvm.org/tutorial.html >>>> >>>> to see each time a variable gets changed. Similar thing with gdb. >>>> >>>> The variable to watch is ksp->reason Once you get the hang of this >>>> it can take just a few minutes to track down the code that is making this >>>> unexpected value, though I understand if you haven't done it before it can >>>> be intimidating. >>>> >>>> Barry >>>> >>>> You can do the same thing in parallel (like on two processes) if you >>>> need to but it is more cumbersome since you need run multiple debuggers. >>>> You can have PETSc start up multiple debuggers with mpiexec -n 2 ./ex >>>> -start_in_debugger >>>> >>>> >>>> >>>> >>>> > >>>> > Thanks, >>>> > Harshad >>>> > >>>> > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith >>>> wrote: >>>> > >>>> > Install your MPI with --download-mpich as a PETSc ./configure >>>> option, this will eliminate all the MPICH valgrind errors. Then send as an >>>> attachment the resulting valgrind file. >>>> > >>>> > I do not 100 % trust any code that produces such valgrind errors. >>>> > >>>> > Barry >>>> > >>>> > >>>> > >>>> > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe < >>>> hsahasra at purdue.edu> wrote: >>>> > > >>>> > > Hi Barry, >>>> > > >>>> > > Thanks for the reply. My code is in C. I ran with Valgrind and >>>> found many "Conditional jump or move depends on uninitialized value(s)", >>>> "Invalid read" and "Use of uninitialized value" errors. I think all of them >>>> are from the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not >>>> really sure what I'm looking for in the Valgrind output. At the end of the >>>> file, I get: >>>> > > >>>> > > ==40223== More than 10000000 total errors detected. I'm not >>>> reporting any more. >>>> > > ==40223== Final error counts will be inaccurate. Go fix your >>>> program! >>>> > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note >>>> > > ==40223== that errors may occur in your program without prior >>>> warning from >>>> > > ==40223== Valgrind, because errors are no longer being displayed. >>>> > > >>>> > > Can you give some suggestions on how I should proceed? 
>>>> > > >>>> > > Thanks, >>>> > > Harshad >>>> > > >>>> > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith >>>> wrote: >>>> > > >>>> > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very >>>> specialized and should never occur with GMRES. >>>> > > >>>> > > Can you run with valgrind to make sure there is no memory >>>> corruption? http://www.mcs.anl.gov/petsc/d >>>> ocumentation/faq.html#valgrind >>>> >>>> > > >>>> > > Is your code fortran or C? >>>> > > >>>> > > Barry >>>> > > >>>> > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe < >>>> hsahasra at purdue.edu> wrote: >>>> > > > >>>> > > > Hi, >>>> > > > >>>> > > > I'm using GAMG + GMRES for my Poisson problem. The solver >>>> converges with KSP_CONVERGED_STEP_LENGTH at a residual of >>>> 9.773346857844e-02, which is much higher than what I need (I need a >>>> tolerance of at least 1E-8). I am not able to figure out which tolerance I >>>> need to set to avoid convergence due to CONVERGED_STEP_LENGTH. >>>> > > > >>>> > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: >>>> > > > >>>> > > > 0 KSP Residual norm 3.121347818142e+00 >>>> > > > 1 KSP Residual norm 9.773346857844e-02 >>>> > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 >>>> > > > KSP Object: 1 MPI processes >>>> > > > type: gmres >>>> > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >>>> Orthogonalization with no iterative refinement >>>> > > > GMRES: happy breakdown tolerance 1e-30 >>>> > > > maximum iterations=10000, initial guess is zero >>>> > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 >>>> > > > left preconditioning >>>> > > > using PRECONDITIONED norm type for convergence test >>>> > > > PC Object: 1 MPI processes >>>> > > > type: gamg >>>> > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v >>>> > > > Cycles per PCApply=1 >>>> > > > Using Galerkin computed coarse grid matrices >>>> > > > Coarse grid solver -- level ------------------------------- >>>> > > > KSP Object: (mg_coarse_) 1 MPI processes >>>> > > > type: preonly >>>> > > > maximum iterations=1, initial guess is zero >>>> > > > tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000 >>>> > > > left preconditioning >>>> > > > using NONE norm type for convergence test >>>> > > > PC Object: (mg_coarse_) 1 MPI processes >>>> > > > type: bjacobi >>>> > > > block Jacobi: number of blocks = 1 >>>> > > > Local solve is same for all blocks, in the following KSP >>>> and PC objects: >>>> > > > KSP Object: (mg_coarse_sub_) 1 MPI >>>> processes >>>> > > > type: preonly >>>> > > > maximum iterations=1, initial guess is zero >>>> > > > tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000 >>>> > > > left preconditioning >>>> > > > using NONE norm type for convergence test >>>> > > > PC Object: (mg_coarse_sub_) 1 MPI processes >>>> > > > type: lu >>>> > > > LU: out-of-place factorization >>>> > > > tolerance for zero pivot 2.22045e-14 >>>> > > > using diagonal shift on blocks to prevent zero pivot >>>> [INBLOCKS] >>>> > > > matrix ordering: nd >>>> > > > factor fill ratio given 5, needed 1.91048 >>>> > > > Factored matrix follows: >>>> > > > Mat Object: 1 MPI processes >>>> > > > type: seqaij >>>> > > > rows=284, cols=284 >>>> > > > package used to perform factorization: petsc >>>> > > > total: nonzeros=7726, allocated nonzeros=7726 >>>> > > > total number of mallocs used during >>>> MatSetValues calls =0 >>>> > > > using I-node routines: found 133 nodes, limit >>>> used is 5 >>>> > > > linear system matrix = precond 
matrix: >>>> > > > Mat Object: 1 MPI processes >>>> > > > type: seqaij >>>> > > > rows=284, cols=284 >>>> > > > total: nonzeros=4044, allocated nonzeros=4044 >>>> > > > total number of mallocs used during MatSetValues >>>> calls =0 >>>> > > > not using I-node routines >>>> > > > linear system matrix = precond matrix: >>>> > > > Mat Object: 1 MPI processes >>>> > > > type: seqaij >>>> > > > rows=284, cols=284 >>>> > > > total: nonzeros=4044, allocated nonzeros=4044 >>>> > > > total number of mallocs used during MatSetValues calls =0 >>>> > > > not using I-node routines >>>> > > > Down solver (pre-smoother) on level 1 >>>> ------------------------------- >>>> > > > KSP Object: (mg_levels_1_) 1 MPI processes >>>> > > > type: chebyshev >>>> > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = >>>> 4.10212 >>>> > > > maximum iterations=2 >>>> > > > tolerances: relative=1e-05, absolute=1e-50, >>>> divergence=10000 >>>> > > > left preconditioning >>>> > > > using nonzero initial guess >>>> > > > using NONE norm type for convergence test >>>> > > > PC Object: (mg_levels_1_) 1 MPI processes >>>> > > > type: sor >>>> > > > SOR: type = local_symmetric, iterations = 1, local >>>> iterations = 1, omega = 1 >>>> > > > linear system matrix = precond matrix: >>>> > > > Mat Object: () 1 MPI processes >>>> > > > type: seqaij >>>> > > > rows=9036, cols=9036 >>>> > > > total: nonzeros=192256, allocated nonzeros=192256 >>>> > > > total number of mallocs used during MatSetValues calls =0 >>>> > > > not using I-node routines >>>> > > > Up solver (post-smoother) same as down solver (pre-smoother) >>>> > > > linear system matrix = precond matrix: >>>> > > > Mat Object: () 1 MPI processes >>>> > > > type: seqaij >>>> > > > rows=9036, cols=9036 >>>> > > > total: nonzeros=192256, allocated nonzeros=192256 >>>> > > > total number of mallocs used during MatSetValues calls =0 >>>> > > > not using I-node routines >>>> > > > >>>> > > > Thanks, >>>> > > > Harshad >>>> > > >>>> > > >>>> > >>>> > >>>> > >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 14 13:51:43 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 Sep 2016 13:51:43 -0500 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> Message-ID: <94E92F38-0D18-4400-8BB4-3F42FF051C72@mcs.anl.gov> Oh, you are using SNESSolve_NEWTONTR ! Now it all makes sense! The trust region methods impose other conditions on the linear solution so it needs to have their own special convergence test. In particular it requires that the solution from the linear solve be inside the trust region (i.e. not too big) So, any particular reason you use SNESSolve_NEWTONTR instead of the default line search NEWTONLS? In general unless you have a good reason I recommend just using the NEWTONLS. If you really want to use the TR then the "early" return from the linear solve is expected, it is controlled by the trust region size and is not under user control. I'm sorry I was so brain dead and did not realize you had be using TR or we could have resolved this much sooner. Barry > On Sep 14, 2016, at 1:31 PM, Harshad Sahasrabudhe wrote: > > Thanks. 
I now put a watchpoint on > > watch *( (PetscErrorCode (**)(KSP, PetscInt, PetscReal, KSPConvergedReason *, void *)) &(ksp->converged) ) > > The function pointer changes in the first iteration of SNES. It changed at the following place: > > Old value = > (PetscErrorCode (*)(KSP, PetscInt, PetscReal, KSPConvergedReason *, void *)) 0x2b54acdd00aa > New value = > (PetscErrorCode (*)(KSP, PetscInt, PetscReal, KSPConvergedReason *, void *)) 0x2b54ad436ce8 <.SNES_TR_KSPConverged_Private.SNES_TR_KSPConverged_Private.SNES_TR_KSPConverged_Private(_p_KSP*, int, double, KSPConvergedReason*, void*)> > KSPSetConvergenceTest (ksp=0x22bf090, converge=0x2b54ad436ce8 , cctx=0x1c8b3e0, > destroy=0x2b54ad437210 ) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:1768 > 1767 ksp->converged = converge; > > Here's the backtrace: > > #0 KSPSetConvergenceTest (ksp=0x22bf090, converge=0x2b54ad436ce8 , cctx=0x1c8b3e0, > destroy=0x2b54ad437210 ) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:1768 > #1 0x00002b54ad43865a in SNESSolve_NEWTONTR (snes=0x1d9e490) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:146 > #2 0x00002b54acedab57 in SNESSolve (snes=0x1d9e490, b=0x0, x=0x1923420) > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:3743 > #3 0x00002b54abe8b780 in libMesh::PetscNonlinearSolver::solve (this=0x19198c0, jac_in=..., x_in=..., r_in=...) at src/solvers/petsc_nonlinear_solver.C:714 > #4 0x00002b54abefa61d in libMesh::NonlinearImplicitSystem::solve (this=0x1910fe0) at src/systems/nonlinear_implicit_system.C:183 > #5 0x00002b54a5dcdceb in NonlinearPoisson::execute_solver (this=0x100c500) at NonlinearPoisson.cpp:1191 > #6 0x00002b54a5da733c in NonlinearPoisson::do_solve (this=0x100c500) at NonlinearPoisson.cpp:948 > #7 0x00002b54a6423785 in Simulation::solve (this=0x100c500) at Simulation.cpp:781 > #8 0x00002b54a634826e in Nemo::run_simulations (this=0x63b020) at Nemo.cpp:1313 > #9 0x0000000000426d0d in main (argc=6, argv=0x7ffcdb910768) at main.cpp:447 > > Thanks, > Harshad > > > On Wed, Sep 14, 2016 at 2:07 PM, Barry Smith wrote: > > Super strange, it should never have switched to the function SNES_TR_KSPConverged_Private > > Fortunately you can use the same technique to track down where the function pointer changes. Just watch ksp->converged to see when the function pointer gets changed and send back the new stack trace. > > Barry > > > > > > On Sep 14, 2016, at 12:39 PM, Harshad Sahasrabudhe wrote: > > > > Hi Barry, > > > > I put a watchpoint on *((KSP_CONVERGED_REASON*) &( ((_p_KSP*)ksp)->reason )) in gdb. 
The ksp->reason switched between: > > > > Old value = KSP_CONVERGED_ITERATING > > New value = KSP_CONVERGED_RTOL > > 0x00002b143054bef2 in KSPConvergedDefault (ksp=0x23c3090, n=12, rnorm=5.3617149831259514e-08, reason=0x23c3310, ctx=0x2446210) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/iterativ.c:764 > > 764 *reason = KSP_CONVERGED_RTOL; > > > > and > > > > Old value = KSP_CONVERGED_RTOL > > New value = KSP_CONVERGED_ITERATING > > KSPSetUp (ksp=0x23c3090) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:226 > > 226 if (!((PetscObject)ksp)->type_name) { > > > > However, after iteration 6, it changed to KSP_CONVERGED_STEP_LENGTH > > > > Old value = KSP_CONVERGED_ITERATING > > New value = KSP_CONVERGED_STEP_LENGTH > > SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:36 > > 36 PetscFunctionReturn(0); > > > > Any ideas why that function was executed? Backtrace when the program stopped here: > > > > #0 SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:36 > > #1 0x00002b14305d3fda in KSPGMRESCycle (itcount=0x7ffdcf2d4ffc, ksp=0x23c3090) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/impls/gmres/gmres.c:182 > > #2 0x00002b14305d4711 in KSPSolve_GMRES (ksp=0x23c3090) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/impls/gmres/gmres.c:235 > > #3 0x00002b1430526a8a in KSPSolve (ksp=0x23c3090, b=0x1a916c0, x=0x1d89dc0) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:460 > > #4 0x00002b1430bb3905 in SNESSolve_NEWTONTR (snes=0x1ea2490) at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/impls/tr/tr.c:160 > > #5 0x00002b1430655b57 in SNESSolve (snes=0x1ea2490, b=0x0, x=0x1a27420) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build-real/src/snes/interface/snes.c:3743 > > #6 0x00002b142f606780 in libMesh::PetscNonlinearSolver::solve (this=0x1a1d8c0, jac_in=..., x_in=..., r_in=...) at src/solvers/petsc_nonlinear_solver.C:714 > > #7 0x00002b142f67561d in libMesh::NonlinearImplicitSystem::solve (this=0x1a14fe0) at src/systems/nonlinear_implicit_system.C:183 > > #8 0x00002b1429548ceb in NonlinearPoisson::execute_solver (this=0x1110500) at NonlinearPoisson.cpp:1191 > > #9 0x00002b142952233c in NonlinearPoisson::do_solve (this=0x1110500) at NonlinearPoisson.cpp:948 > > #10 0x00002b1429b9e785 in Simulation::solve (this=0x1110500) at Simulation.cpp:781 > > #11 0x00002b1429ac326e in Nemo::run_simulations (this=0x63b020) at Nemo.cpp:1313 > > #12 0x0000000000426d0d in main (argc=6, argv=0x7ffdcf2d7908) at main.cpp:447 > > > > > > Thanks! > > Harshad > > > > On Wed, Sep 14, 2016 at 10:10 AM, Harshad Sahasrabudhe wrote: > > I think I found the problem. I configured PETSc with COPTFLAGS=-O3. I'll remove that option and try again. > > > > Thanks! > > Harshad > > > > On Wed, Sep 14, 2016 at 10:06 AM, Harshad Sahasrabudhe wrote: > > Hi Barry, > > > > Thanks for your inputs. I tried to set a watchpoint on ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. Basically, GDB isn't able to find the PETSc source code. 
I built PETSc with --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get doesn't contain debugging symbols (checked using objdump -g). How do I get PETSc library to have debugging info? > > > > Thanks, > > Harshad > > > > On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith wrote: > > > > > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe wrote: > > > > > > Hi Barry, > > > > > > I compiled with mpich configured using --enable-g=meminit to get rid of MPI errors in Valgrind. Doing this reduced the number of errors to 2. I have attached the Valgrind output. > > > > This isn't helpful but it seems not to be a memory corruption issue :-( > > > > > > I'm using GAMG+GMRES for in each linear iteration of SNES. The linear solver converges with CONVERGED_RTOL for the first 6 iterations and with CONVERGED_STEP_LENGTH after that. I'm still very confused about why this is happening. Any thoughts/ideas? > > > > Does this happen on one process? If so I would run in the debugger and track the variable to see everyplace the variable is changed, this would point to exactly what piece of code is changing the variable to this unexpected value. > > > > For example with lldb one can use watch http://lldb.llvm.org/tutorial.html to see each time a variable gets changed. Similar thing with gdb. > > > > The variable to watch is ksp->reason Once you get the hang of this it can take just a few minutes to track down the code that is making this unexpected value, though I understand if you haven't done it before it can be intimidating. > > > > Barry > > > > You can do the same thing in parallel (like on two processes) if you need to but it is more cumbersome since you need run multiple debuggers. You can have PETSc start up multiple debuggers with mpiexec -n 2 ./ex -start_in_debugger > > > > > > > > > > > > > > Thanks, > > > Harshad > > > > > > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith wrote: > > > > > > Install your MPI with --download-mpich as a PETSc ./configure option, this will eliminate all the MPICH valgrind errors. Then send as an attachment the resulting valgrind file. > > > > > > I do not 100 % trust any code that produces such valgrind errors. > > > > > > Barry > > > > > > > > > > > > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe wrote: > > > > > > > > Hi Barry, > > > > > > > > Thanks for the reply. My code is in C. I ran with Valgrind and found many "Conditional jump or move depends on uninitialized value(s)", "Invalid read" and "Use of uninitialized value" errors. I think all of them are from the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not really sure what I'm looking for in the Valgrind output. At the end of the file, I get: > > > > > > > > ==40223== More than 10000000 total errors detected. I'm not reporting any more. > > > > ==40223== Final error counts will be inaccurate. Go fix your program! > > > > ==40223== Rerun with --error-limit=no to disable this cutoff. Note > > > > ==40223== that errors may occur in your program without prior warning from > > > > ==40223== Valgrind, because errors are no longer being displayed. > > > > > > > > Can you give some suggestions on how I should proceed? > > > > > > > > Thanks, > > > > Harshad > > > > > > > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith wrote: > > > > > > > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very specialized and should never occur with GMRES. > > > > > > > > Can you run with valgrind to make sure there is no memory corruption? 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > > > > Is your code fortran or C? > > > > > > > > Barry > > > > > > > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe wrote: > > > > > > > > > > Hi, > > > > > > > > > > I'm using GAMG + GMRES for my Poisson problem. The solver converges with KSP_CONVERGED_STEP_LENGTH at a residual of 9.773346857844e-02, which is much higher than what I need (I need a tolerance of at least 1E-8). I am not able to figure out which tolerance I need to set to avoid convergence due to CONVERGED_STEP_LENGTH. > > > > > > > > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: > > > > > > > > > > 0 KSP Residual norm 3.121347818142e+00 > > > > > 1 KSP Residual norm 9.773346857844e-02 > > > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > > > > > KSP Object: 1 MPI processes > > > > > type: gmres > > > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > > > > GMRES: happy breakdown tolerance 1e-30 > > > > > maximum iterations=10000, initial guess is zero > > > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > > > > > left preconditioning > > > > > using PRECONDITIONED norm type for convergence test > > > > > PC Object: 1 MPI processes > > > > > type: gamg > > > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v > > > > > Cycles per PCApply=1 > > > > > Using Galerkin computed coarse grid matrices > > > > > Coarse grid solver -- level ------------------------------- > > > > > KSP Object: (mg_coarse_) 1 MPI processes > > > > > type: preonly > > > > > maximum iterations=1, initial guess is zero > > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > > left preconditioning > > > > > using NONE norm type for convergence test > > > > > PC Object: (mg_coarse_) 1 MPI processes > > > > > type: bjacobi > > > > > block Jacobi: number of blocks = 1 > > > > > Local solve is same for all blocks, in the following KSP and PC objects: > > > > > KSP Object: (mg_coarse_sub_) 1 MPI processes > > > > > type: preonly > > > > > maximum iterations=1, initial guess is zero > > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > > left preconditioning > > > > > using NONE norm type for convergence test > > > > > PC Object: (mg_coarse_sub_) 1 MPI processes > > > > > type: lu > > > > > LU: out-of-place factorization > > > > > tolerance for zero pivot 2.22045e-14 > > > > > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > > > > > matrix ordering: nd > > > > > factor fill ratio given 5, needed 1.91048 > > > > > Factored matrix follows: > > > > > Mat Object: 1 MPI processes > > > > > type: seqaij > > > > > rows=284, cols=284 > > > > > package used to perform factorization: petsc > > > > > total: nonzeros=7726, allocated nonzeros=7726 > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > using I-node routines: found 133 nodes, limit used is 5 > > > > > linear system matrix = precond matrix: > > > > > Mat Object: 1 MPI processes > > > > > type: seqaij > > > > > rows=284, cols=284 > > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > not using I-node routines > > > > > linear system matrix = precond matrix: > > > > > Mat Object: 1 MPI processes > > > > > type: seqaij > > > > > rows=284, cols=284 > > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > > total number of mallocs used during 
MatSetValues calls =0 > > > > > not using I-node routines > > > > > Down solver (pre-smoother) on level 1 ------------------------------- > > > > > KSP Object: (mg_levels_1_) 1 MPI processes > > > > > type: chebyshev > > > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = 4.10212 > > > > > maximum iterations=2 > > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > > left preconditioning > > > > > using nonzero initial guess > > > > > using NONE norm type for convergence test > > > > > PC Object: (mg_levels_1_) 1 MPI processes > > > > > type: sor > > > > > SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 > > > > > linear system matrix = precond matrix: > > > > > Mat Object: () 1 MPI processes > > > > > type: seqaij > > > > > rows=9036, cols=9036 > > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > not using I-node routines > > > > > Up solver (post-smoother) same as down solver (pre-smoother) > > > > > linear system matrix = precond matrix: > > > > > Mat Object: () 1 MPI processes > > > > > type: seqaij > > > > > rows=9036, cols=9036 > > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > not using I-node routines > > > > > > > > > > Thanks, > > > > > Harshad > > > > > > > > > > > > > > > > > > > > > > > > > > > From hsahasra at purdue.edu Wed Sep 14 14:09:00 2016 From: hsahasra at purdue.edu (Harshad Sahasrabudhe) Date: Wed, 14 Sep 2016 15:09:00 -0400 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: <94E92F38-0D18-4400-8BB4-3F42FF051C72@mcs.anl.gov> References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> <94E92F38-0D18-4400-8BB4-3F42FF051C72@mcs.anl.gov> Message-ID: Hi Barry, Sorry, I had no idea that Newton TR would have anything to do with the linear solver. I was using TR because it is trustworthy and converges every time. I tried LS just now with GAMG+GMRES and it converges, so I don't have the CONVERGED_STEP_LENGTH problem anymore. Thanks for you help! Harshad On Wed, Sep 14, 2016 at 2:51 PM, Barry Smith wrote: > > Oh, you are using SNESSolve_NEWTONTR ! > > Now it all makes sense! The trust region methods impose other > conditions on the linear solution so it needs to have their own special > convergence test. In particular it requires that the solution from the > linear solve be inside the trust region (i.e. not too big) > > So, any particular reason you use SNESSolve_NEWTONTR instead of the > default line search NEWTONLS? In general unless you have a good reason I > recommend just using the NEWTONLS. > > If you really want to use the TR then the "early" return from the > linear solve is expected, it is controlled by the trust region size and is > not under user control. > > I'm sorry I was so brain dead and did not realize you had be using TR > or we could have resolved this much sooner. > > Barry > > > > > On Sep 14, 2016, at 1:31 PM, Harshad Sahasrabudhe > wrote: > > > > Thanks. I now put a watchpoint on > > > > watch *( (PetscErrorCode (**)(KSP, PetscInt, PetscReal, > KSPConvergedReason *, void *)) &(ksp->converged) ) > > > > The function pointer changes in the first iteration of SNES. 
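Barry's recommendation above (use the default line-search Newton unless there is a specific reason for the trust-region variant) can be applied either with the runtime option -snes_type newtonls or in code. A minimal sketch, assuming a SNES named snes created elsewhere:

  #include <petscsnes.h>

  /* Switch to the default line-search Newton instead of the trust-region
     variant; equivalent runtime option: -snes_type newtonls. */
  static PetscErrorCode UseLineSearchNewton(SNES snes)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = SNESSetType(snes, SNESNEWTONLS);CHKERRQ(ierr);
    ierr = SNESSetFromOptions(snes);CHKERRQ(ierr); /* still honor -snes_type overrides */
    PetscFunctionReturn(0);
  }

With NEWTONLS the KSP keeps its usual convergence test, so an rtol of 1e-8 behaves as expected; with NEWTONTR the early KSP_CONVERGED_STEP_LENGTH return is by design, as explained above.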
It changed > at the following place: > > > > Old value = > > (PetscErrorCode (*)(KSP, PetscInt, PetscReal, KSPConvergedReason *, > void *)) 0x2b54acdd00aa KSPConvergedReason*, void*)> > > New value = > > (PetscErrorCode (*)(KSP, PetscInt, PetscReal, KSPConvergedReason *, > void *)) 0x2b54ad436ce8 <.SNES_TR_KSPConverged_ > Private.SNES_TR_KSPConverged_Private.SNES_TR_KSPConverged_Private(_p_KSP*, > int, double, KSPConvergedReason*, void*)> > > KSPSetConvergenceTest (ksp=0x22bf090, converge=0x2b54ad436ce8 > void*)>, cctx=0x1c8b3e0, > > destroy=0x2b54ad437210 ) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/interface/itfunc.c:1768 > > 1767 ksp->converged = converge; > > > > Here's the backtrace: > > > > #0 KSPSetConvergenceTest (ksp=0x22bf090, converge=0x2b54ad436ce8 > void*)>, cctx=0x1c8b3e0, > > destroy=0x2b54ad437210 ) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/interface/itfunc.c:1768 > > #1 0x00002b54ad43865a in SNESSolve_NEWTONTR (snes=0x1d9e490) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:146 > > #2 0x00002b54acedab57 in SNESSolve (snes=0x1d9e490, b=0x0, x=0x1923420) > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/interface/snes.c:3743 > > #3 0x00002b54abe8b780 in libMesh::PetscNonlinearSolver::solve > (this=0x19198c0, jac_in=..., x_in=..., r_in=...) at > src/solvers/petsc_nonlinear_solver.C:714 > > #4 0x00002b54abefa61d in libMesh::NonlinearImplicitSystem::solve > (this=0x1910fe0) at src/systems/nonlinear_implicit_system.C:183 > > #5 0x00002b54a5dcdceb in NonlinearPoisson::execute_solver > (this=0x100c500) at NonlinearPoisson.cpp:1191 > > #6 0x00002b54a5da733c in NonlinearPoisson::do_solve (this=0x100c500) at > NonlinearPoisson.cpp:948 > > #7 0x00002b54a6423785 in Simulation::solve (this=0x100c500) at > Simulation.cpp:781 > > #8 0x00002b54a634826e in Nemo::run_simulations (this=0x63b020) at > Nemo.cpp:1313 > > #9 0x0000000000426d0d in main (argc=6, argv=0x7ffcdb910768) at > main.cpp:447 > > > > Thanks, > > Harshad > > > > > > On Wed, Sep 14, 2016 at 2:07 PM, Barry Smith wrote: > > > > Super strange, it should never have switched to the function > SNES_TR_KSPConverged_Private > > > > Fortunately you can use the same technique to track down where the > function pointer changes. Just watch ksp->converged to see when the > function pointer gets changed and send back the new stack trace. > > > > Barry > > > > > > > > > > > On Sep 14, 2016, at 12:39 PM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > > > > Hi Barry, > > > > > > I put a watchpoint on *((KSP_CONVERGED_REASON*) &( > ((_p_KSP*)ksp)->reason )) in gdb. 
The ksp->reason switched between: > > > > > > Old value = KSP_CONVERGED_ITERATING > > > New value = KSP_CONVERGED_RTOL > > > 0x00002b143054bef2 in KSPConvergedDefault (ksp=0x23c3090, n=12, > rnorm=5.3617149831259514e-08, reason=0x23c3310, ctx=0x2446210) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/interface/iterativ.c:764 > > > 764 *reason = KSP_CONVERGED_RTOL; > > > > > > and > > > > > > Old value = KSP_CONVERGED_RTOL > > > New value = KSP_CONVERGED_ITERATING > > > KSPSetUp (ksp=0x23c3090) at /depot/ncn/apps/conte/conte- > gcc-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:226 > > > 226 if (!((PetscObject)ksp)->type_name) { > > > > > > However, after iteration 6, it changed to KSP_CONVERGED_STEP_LENGTH > > > > > > Old value = KSP_CONVERGED_ITERATING > > > New value = KSP_CONVERGED_STEP_LENGTH > > > SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, > rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:36 > > > 36 PetscFunctionReturn(0); > > > > > > Any ideas why that function was executed? Backtrace when the program > stopped here: > > > > > > #0 SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, > rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:36 > > > #1 0x00002b14305d3fda in KSPGMRESCycle (itcount=0x7ffdcf2d4ffc, > ksp=0x23c3090) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/impls/gmres/gmres.c:182 > > > #2 0x00002b14305d4711 in KSPSolve_GMRES (ksp=0x23c3090) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/impls/gmres/gmres.c:235 > > > #3 0x00002b1430526a8a in KSPSolve (ksp=0x23c3090, b=0x1a916c0, > x=0x1d89dc0) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/ksp/ksp/interface/itfunc.c:460 > > > #4 0x00002b1430bb3905 in SNESSolve_NEWTONTR (snes=0x1ea2490) at > /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/impls/tr/tr.c:160 > > > #5 0x00002b1430655b57 in SNESSolve (snes=0x1ea2490, b=0x0, > x=0x1a27420) > > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/ > build-real/src/snes/interface/snes.c:3743 > > > #6 0x00002b142f606780 in libMesh::PetscNonlinearSolver::solve > (this=0x1a1d8c0, jac_in=..., x_in=..., r_in=...) at > src/solvers/petsc_nonlinear_solver.C:714 > > > #7 0x00002b142f67561d in libMesh::NonlinearImplicitSystem::solve > (this=0x1a14fe0) at src/systems/nonlinear_implicit_system.C:183 > > > #8 0x00002b1429548ceb in NonlinearPoisson::execute_solver > (this=0x1110500) at NonlinearPoisson.cpp:1191 > > > #9 0x00002b142952233c in NonlinearPoisson::do_solve (this=0x1110500) > at NonlinearPoisson.cpp:948 > > > #10 0x00002b1429b9e785 in Simulation::solve (this=0x1110500) at > Simulation.cpp:781 > > > #11 0x00002b1429ac326e in Nemo::run_simulations (this=0x63b020) at > Nemo.cpp:1313 > > > #12 0x0000000000426d0d in main (argc=6, argv=0x7ffdcf2d7908) at > main.cpp:447 > > > > > > > > > Thanks! > > > Harshad > > > > > > On Wed, Sep 14, 2016 at 10:10 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > I think I found the problem. I configured PETSc with COPTFLAGS=-O3. > I'll remove that option and try again. > > > > > > Thanks! 
> > > Harshad > > > > > > On Wed, Sep 14, 2016 at 10:06 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > Hi Barry, > > > > > > Thanks for your inputs. I tried to set a watchpoint on > ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. > Basically, GDB isn't able to find the PETSc source code. I built PETSc with > --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get > doesn't contain debugging symbols (checked using objdump -g). How do I get > PETSc library to have debugging info? > > > > > > Thanks, > > > Harshad > > > > > > On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith > wrote: > > > > > > > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > > > > > > Hi Barry, > > > > > > > > I compiled with mpich configured using --enable-g=meminit to get rid > of MPI errors in Valgrind. Doing this reduced the number of errors to 2. I > have attached the Valgrind output. > > > > > > This isn't helpful but it seems not to be a memory corruption issue > :-( > > > > > > > > I'm using GAMG+GMRES for in each linear iteration of SNES. The > linear solver converges with CONVERGED_RTOL for the first 6 iterations and > with CONVERGED_STEP_LENGTH after that. I'm still very confused about why > this is happening. Any thoughts/ideas? > > > > > > Does this happen on one process? If so I would run in the debugger > and track the variable to see everyplace the variable is changed, this > would point to exactly what piece of code is changing the variable to this > unexpected value. > > > > > > For example with lldb one can use watch > http://lldb.llvm.org/tutorial.html to see each time a variable gets > changed. Similar thing with gdb. > > > > > > The variable to watch is ksp->reason Once you get the hang of this > it can take just a few minutes to track down the code that is making this > unexpected value, though I understand if you haven't done it before it can > be intimidating. > > > > > > Barry > > > > > > You can do the same thing in parallel (like on two processes) if you > need to but it is more cumbersome since you need run multiple debuggers. > You can have PETSc start up multiple debuggers with mpiexec -n 2 ./ex > -start_in_debugger > > > > > > > > > > > > > > > > > > > > Thanks, > > > > Harshad > > > > > > > > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith > wrote: > > > > > > > > Install your MPI with --download-mpich as a PETSc ./configure > option, this will eliminate all the MPICH valgrind errors. Then send as an > attachment the resulting valgrind file. > > > > > > > > I do not 100 % trust any code that produces such valgrind errors. > > > > > > > > Barry > > > > > > > > > > > > > > > > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > > > > > > > > Hi Barry, > > > > > > > > > > Thanks for the reply. My code is in C. I ran with Valgrind and > found many "Conditional jump or move depends on uninitialized value(s)", > "Invalid read" and "Use of uninitialized value" errors. I think all of them > are from the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not > really sure what I'm looking for in the Valgrind output. At the end of the > file, I get: > > > > > > > > > > ==40223== More than 10000000 total errors detected. I'm not > reporting any more. > > > > > ==40223== Final error counts will be inaccurate. Go fix your > program! > > > > > ==40223== Rerun with --error-limit=no to disable this cutoff. 
Note > > > > > ==40223== that errors may occur in your program without prior > warning from > > > > > ==40223== Valgrind, because errors are no longer being displayed. > > > > > > > > > > Can you give some suggestions on how I should proceed? > > > > > > > > > > Thanks, > > > > > Harshad > > > > > > > > > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith > wrote: > > > > > > > > > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very > specialized and should never occur with GMRES. > > > > > > > > > > Can you run with valgrind to make sure there is no memory > corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > > > > > > Is your code fortran or C? > > > > > > > > > > Barry > > > > > > > > > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe < > hsahasra at purdue.edu> wrote: > > > > > > > > > > > > Hi, > > > > > > > > > > > > I'm using GAMG + GMRES for my Poisson problem. The solver > converges with KSP_CONVERGED_STEP_LENGTH at a residual of > 9.773346857844e-02, which is much higher than what I need (I need a > tolerance of at least 1E-8). I am not able to figure out which tolerance I > need to set to avoid convergence due to CONVERGED_STEP_LENGTH. > > > > > > > > > > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: > > > > > > > > > > > > 0 KSP Residual norm 3.121347818142e+00 > > > > > > 1 KSP Residual norm 9.773346857844e-02 > > > > > > Linear solve converged due to CONVERGED_STEP_LENGTH iterations > 1 > > > > > > KSP Object: 1 MPI processes > > > > > > type: gmres > > > > > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > > > > > GMRES: happy breakdown tolerance 1e-30 > > > > > > maximum iterations=10000, initial guess is zero > > > > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 > > > > > > left preconditioning > > > > > > using PRECONDITIONED norm type for convergence test > > > > > > PC Object: 1 MPI processes > > > > > > type: gamg > > > > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v > > > > > > Cycles per PCApply=1 > > > > > > Using Galerkin computed coarse grid matrices > > > > > > Coarse grid solver -- level ------------------------------- > > > > > > KSP Object: (mg_coarse_) 1 MPI processes > > > > > > type: preonly > > > > > > maximum iterations=1, initial guess is zero > > > > > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > > > > > > left preconditioning > > > > > > using NONE norm type for convergence test > > > > > > PC Object: (mg_coarse_) 1 MPI processes > > > > > > type: bjacobi > > > > > > block Jacobi: number of blocks = 1 > > > > > > Local solve is same for all blocks, in the following KSP > and PC objects: > > > > > > KSP Object: (mg_coarse_sub_) 1 MPI > processes > > > > > > type: preonly > > > > > > maximum iterations=1, initial guess is zero > > > > > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > > > > > > left preconditioning > > > > > > using NONE norm type for convergence test > > > > > > PC Object: (mg_coarse_sub_) 1 MPI > processes > > > > > > type: lu > > > > > > LU: out-of-place factorization > > > > > > tolerance for zero pivot 2.22045e-14 > > > > > > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > > > > > > matrix ordering: nd > > > > > > factor fill ratio given 5, needed 1.91048 > > > > > > Factored matrix follows: > > > > > > Mat Object: 1 MPI processes > > > > > > type: seqaij > > > > > > rows=284, cols=284 > > > > > > package used to perform 
factorization: petsc > > > > > > total: nonzeros=7726, allocated nonzeros=7726 > > > > > > total number of mallocs used during > MatSetValues calls =0 > > > > > > using I-node routines: found 133 nodes, > limit used is 5 > > > > > > linear system matrix = precond matrix: > > > > > > Mat Object: 1 MPI processes > > > > > > type: seqaij > > > > > > rows=284, cols=284 > > > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > > > total number of mallocs used during MatSetValues > calls =0 > > > > > > not using I-node routines > > > > > > linear system matrix = precond matrix: > > > > > > Mat Object: 1 MPI processes > > > > > > type: seqaij > > > > > > rows=284, cols=284 > > > > > > total: nonzeros=4044, allocated nonzeros=4044 > > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > > not using I-node routines > > > > > > Down solver (pre-smoother) on level 1 > ------------------------------- > > > > > > KSP Object: (mg_levels_1_) 1 MPI processes > > > > > > type: chebyshev > > > > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = > 4.10212 > > > > > > maximum iterations=2 > > > > > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > > > > > > left preconditioning > > > > > > using nonzero initial guess > > > > > > using NONE norm type for convergence test > > > > > > PC Object: (mg_levels_1_) 1 MPI processes > > > > > > type: sor > > > > > > SOR: type = local_symmetric, iterations = 1, local > iterations = 1, omega = 1 > > > > > > linear system matrix = precond matrix: > > > > > > Mat Object: () 1 MPI processes > > > > > > type: seqaij > > > > > > rows=9036, cols=9036 > > > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > > not using I-node routines > > > > > > Up solver (post-smoother) same as down solver (pre-smoother) > > > > > > linear system matrix = precond matrix: > > > > > > Mat Object: () 1 MPI processes > > > > > > type: seqaij > > > > > > rows=9036, cols=9036 > > > > > > total: nonzeros=192256, allocated nonzeros=192256 > > > > > > total number of mallocs used during MatSetValues calls =0 > > > > > > not using I-node routines > > > > > > > > > > > > Thanks, > > > > > > Harshad > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hsahasra at purdue.edu Wed Sep 14 14:12:01 2016 From: hsahasra at purdue.edu (Harshad Sahasrabudhe) Date: Wed, 14 Sep 2016 15:12:01 -0400 Subject: [petsc-users] KSP_CONVERGED_STEP_LENGTH In-Reply-To: References: <9D4EAC51-1D35-44A5-B2A5-289D1B141E51@mcs.anl.gov> <6DE800B8-AA08-4EAE-B614-CEC9F92BC366@mcs.anl.gov> <38733AAB-9103-488B-A335-578E8644AD59@mcs.anl.gov> <94E92F38-0D18-4400-8BB4-3F42FF051C72@mcs.anl.gov> Message-ID: Actually, I was using TR with MUMPS before I tried GAMG and it was working pretty well. The reason I want to switch to GAMG is that I have to increase my system size, and the simulation just takes too long with MUMPS and doesn't scale well. On Wed, Sep 14, 2016 at 3:09 PM, Harshad Sahasrabudhe wrote: > Hi Barry, > > Sorry, I had no idea that Newton TR would have anything to do with the > linear solver. I was using TR because it is trustworthy and converges every > time. I tried LS just now with GAMG+GMRES and it converges, so I don't have > the CONVERGED_STEP_LENGTH problem anymore. > > Thanks for you help! 
> > Harshad > > On Wed, Sep 14, 2016 at 2:51 PM, Barry Smith wrote: > >> >> Oh, you are using SNESSolve_NEWTONTR ! >> >> Now it all makes sense! The trust region methods impose other >> conditions on the linear solution so it needs to have their own special >> convergence test. In particular it requires that the solution from the >> linear solve be inside the trust region (i.e. not too big) >> >> So, any particular reason you use SNESSolve_NEWTONTR instead of the >> default line search NEWTONLS? In general unless you have a good reason I >> recommend just using the NEWTONLS. >> >> If you really want to use the TR then the "early" return from the >> linear solve is expected, it is controlled by the trust region size and is >> not under user control. >> >> I'm sorry I was so brain dead and did not realize you had be using TR >> or we could have resolved this much sooner. >> >> Barry >> >> >> >> > On Sep 14, 2016, at 1:31 PM, Harshad Sahasrabudhe >> wrote: >> > >> > Thanks. I now put a watchpoint on >> > >> > watch *( (PetscErrorCode (**)(KSP, PetscInt, PetscReal, >> KSPConvergedReason *, void *)) &(ksp->converged) ) >> > >> > The function pointer changes in the first iteration of SNES. It changed >> at the following place: >> > >> > Old value = >> > (PetscErrorCode (*)(KSP, PetscInt, PetscReal, KSPConvergedReason *, >> void *)) 0x2b54acdd00aa > KSPConvergedReason*, void*)> >> > New value = >> > (PetscErrorCode (*)(KSP, PetscInt, PetscReal, KSPConvergedReason *, >> void *)) 0x2b54ad436ce8 <.SNES_TR_KSPConverged_Private >> .SNES_TR_KSPConverged_Private.SNES_TR_KSPConverged_Private(_p_KSP*, int, >> double, KSPConvergedReason*, void*)> >> > KSPSetConvergenceTest (ksp=0x22bf090, converge=0x2b54ad436ce8 >> > void*)>, cctx=0x1c8b3e0, >> > destroy=0x2b54ad437210 ) >> > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/ksp/ksp/interface/itfunc.c:1768 >> > 1767 ksp->converged = converge; >> > >> > Here's the backtrace: >> > >> > #0 KSPSetConvergenceTest (ksp=0x22bf090, converge=0x2b54ad436ce8 >> > void*)>, cctx=0x1c8b3e0, >> > destroy=0x2b54ad437210 ) >> > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/ksp/ksp/interface/itfunc.c:1768 >> > #1 0x00002b54ad43865a in SNESSolve_NEWTONTR (snes=0x1d9e490) at >> /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/snes/impls/tr/tr.c:146 >> > #2 0x00002b54acedab57 in SNESSolve (snes=0x1d9e490, b=0x0, x=0x1923420) >> > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/snes/interface/snes.c:3743 >> > #3 0x00002b54abe8b780 in libMesh::PetscNonlinearSolver::solve >> (this=0x19198c0, jac_in=..., x_in=..., r_in=...) 
at >> src/solvers/petsc_nonlinear_solver.C:714 >> > #4 0x00002b54abefa61d in libMesh::NonlinearImplicitSystem::solve >> (this=0x1910fe0) at src/systems/nonlinear_implicit_system.C:183 >> > #5 0x00002b54a5dcdceb in NonlinearPoisson::execute_solver >> (this=0x100c500) at NonlinearPoisson.cpp:1191 >> > #6 0x00002b54a5da733c in NonlinearPoisson::do_solve (this=0x100c500) >> at NonlinearPoisson.cpp:948 >> > #7 0x00002b54a6423785 in Simulation::solve (this=0x100c500) at >> Simulation.cpp:781 >> > #8 0x00002b54a634826e in Nemo::run_simulations (this=0x63b020) at >> Nemo.cpp:1313 >> > #9 0x0000000000426d0d in main (argc=6, argv=0x7ffcdb910768) at >> main.cpp:447 >> > >> > Thanks, >> > Harshad >> > >> > >> > On Wed, Sep 14, 2016 at 2:07 PM, Barry Smith >> wrote: >> > >> > Super strange, it should never have switched to the function >> SNES_TR_KSPConverged_Private >> > >> > Fortunately you can use the same technique to track down where the >> function pointer changes. Just watch ksp->converged to see when the >> function pointer gets changed and send back the new stack trace. >> > >> > Barry >> > >> > >> > >> > >> > > On Sep 14, 2016, at 12:39 PM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> > > >> > > Hi Barry, >> > > >> > > I put a watchpoint on *((KSP_CONVERGED_REASON*) &( >> ((_p_KSP*)ksp)->reason )) in gdb. The ksp->reason switched between: >> > > >> > > Old value = KSP_CONVERGED_ITERATING >> > > New value = KSP_CONVERGED_RTOL >> > > 0x00002b143054bef2 in KSPConvergedDefault (ksp=0x23c3090, n=12, >> rnorm=5.3617149831259514e-08, reason=0x23c3310, ctx=0x2446210) >> > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/ksp/ksp/interface/iterativ.c:764 >> > > 764 *reason = KSP_CONVERGED_RTOL; >> > > >> > > and >> > > >> > > Old value = KSP_CONVERGED_RTOL >> > > New value = KSP_CONVERGED_ITERATING >> > > KSPSetUp (ksp=0x23c3090) at /depot/ncn/apps/conte/conte-gc >> c-petsc35-dbg/libs/petsc/build-real/src/ksp/ksp/interface/itfunc.c:226 >> > > 226 if (!((PetscObject)ksp)->type_name) { >> > > >> > > However, after iteration 6, it changed to KSP_CONVERGED_STEP_LENGTH >> > > >> > > Old value = KSP_CONVERGED_ITERATING >> > > New value = KSP_CONVERGED_STEP_LENGTH >> > > SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, >> rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) >> > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/snes/impls/tr/tr.c:36 >> > > 36 PetscFunctionReturn(0); >> > > >> > > Any ideas why that function was executed? 
Backtrace when the program >> stopped here: >> > > >> > > #0 SNES_TR_KSPConverged_Private (ksp=0x23c3090, n=1, >> rnorm=0.097733468578376406, reason=0x23c3310, cctx=0x1d8f3e0) >> > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/snes/impls/tr/tr.c:36 >> > > #1 0x00002b14305d3fda in KSPGMRESCycle (itcount=0x7ffdcf2d4ffc, >> ksp=0x23c3090) >> > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/ksp/ksp/impls/gmres/gmres.c:182 >> > > #2 0x00002b14305d4711 in KSPSolve_GMRES (ksp=0x23c3090) at >> /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/ksp/ksp/impls/gmres/gmres.c:235 >> > > #3 0x00002b1430526a8a in KSPSolve (ksp=0x23c3090, b=0x1a916c0, >> x=0x1d89dc0) >> > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/ksp/ksp/interface/itfunc.c:460 >> > > #4 0x00002b1430bb3905 in SNESSolve_NEWTONTR (snes=0x1ea2490) at >> /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/snes/impls/tr/tr.c:160 >> > > #5 0x00002b1430655b57 in SNESSolve (snes=0x1ea2490, b=0x0, >> x=0x1a27420) >> > > at /depot/ncn/apps/conte/conte-gcc-petsc35-dbg/libs/petsc/build >> -real/src/snes/interface/snes.c:3743 >> > > #6 0x00002b142f606780 in libMesh::PetscNonlinearSolver::solve >> (this=0x1a1d8c0, jac_in=..., x_in=..., r_in=...) at >> src/solvers/petsc_nonlinear_solver.C:714 >> > > #7 0x00002b142f67561d in libMesh::NonlinearImplicitSystem::solve >> (this=0x1a14fe0) at src/systems/nonlinear_implicit_system.C:183 >> > > #8 0x00002b1429548ceb in NonlinearPoisson::execute_solver >> (this=0x1110500) at NonlinearPoisson.cpp:1191 >> > > #9 0x00002b142952233c in NonlinearPoisson::do_solve (this=0x1110500) >> at NonlinearPoisson.cpp:948 >> > > #10 0x00002b1429b9e785 in Simulation::solve (this=0x1110500) at >> Simulation.cpp:781 >> > > #11 0x00002b1429ac326e in Nemo::run_simulations (this=0x63b020) at >> Nemo.cpp:1313 >> > > #12 0x0000000000426d0d in main (argc=6, argv=0x7ffdcf2d7908) at >> main.cpp:447 >> > > >> > > >> > > Thanks! >> > > Harshad >> > > >> > > On Wed, Sep 14, 2016 at 10:10 AM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> > > I think I found the problem. I configured PETSc with COPTFLAGS=-O3. >> I'll remove that option and try again. >> > > >> > > Thanks! >> > > Harshad >> > > >> > > On Wed, Sep 14, 2016 at 10:06 AM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> > > Hi Barry, >> > > >> > > Thanks for your inputs. I tried to set a watchpoint on >> ((_p_KSP*)ksp)->reason, but gdb says no symbol _p_KSP in context. >> Basically, GDB isn't able to find the PETSc source code. I built PETSc with >> --with-debugging=1 statically and -fPIC, but it seems the libpetsc.a I get >> doesn't contain debugging symbols (checked using objdump -g). How do I get >> PETSc library to have debugging info? >> > > >> > > Thanks, >> > > Harshad >> > > >> > > On Tue, Sep 13, 2016 at 2:47 PM, Barry Smith >> wrote: >> > > >> > > > On Sep 13, 2016, at 1:01 PM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> > > > >> > > > Hi Barry, >> > > > >> > > > I compiled with mpich configured using --enable-g=meminit to get >> rid of MPI errors in Valgrind. Doing this reduced the number of errors to >> 2. I have attached the Valgrind output. >> > > >> > > This isn't helpful but it seems not to be a memory corruption >> issue :-( >> > > > >> > > > I'm using GAMG+GMRES for in each linear iteration of SNES. 
The >> linear solver converges with CONVERGED_RTOL for the first 6 iterations and >> with CONVERGED_STEP_LENGTH after that. I'm still very confused about why >> this is happening. Any thoughts/ideas? >> > > >> > > Does this happen on one process? If so I would run in the debugger >> and track the variable to see everyplace the variable is changed, this >> would point to exactly what piece of code is changing the variable to this >> unexpected value. >> > > >> > > For example with lldb one can use watch >> http://lldb.llvm.org/tutorial.html to see each time a variable gets >> changed. Similar thing with gdb. >> > > >> > > The variable to watch is ksp->reason Once you get the hang of >> this it can take just a few minutes to track down the code that is making >> this unexpected value, though I understand if you haven't done it before it >> can be intimidating. >> > > >> > > Barry >> > > >> > > You can do the same thing in parallel (like on two processes) if you >> need to but it is more cumbersome since you need run multiple debuggers. >> You can have PETSc start up multiple debuggers with mpiexec -n 2 ./ex >> -start_in_debugger >> > > >> > > >> > > >> > > >> > > > >> > > > Thanks, >> > > > Harshad >> > > > >> > > > On Thu, Sep 8, 2016 at 11:26 PM, Barry Smith >> wrote: >> > > > >> > > > Install your MPI with --download-mpich as a PETSc ./configure >> option, this will eliminate all the MPICH valgrind errors. Then send as an >> attachment the resulting valgrind file. >> > > > >> > > > I do not 100 % trust any code that produces such valgrind errors. >> > > > >> > > > Barry >> > > > >> > > > >> > > > >> > > > > On Sep 8, 2016, at 10:12 PM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> > > > > >> > > > > Hi Barry, >> > > > > >> > > > > Thanks for the reply. My code is in C. I ran with Valgrind and >> found many "Conditional jump or move depends on uninitialized value(s)", >> "Invalid read" and "Use of uninitialized value" errors. I think all of them >> are from the libraries I'm using (LibMesh, Boost, MPI, etc.). I'm not >> really sure what I'm looking for in the Valgrind output. At the end of the >> file, I get: >> > > > > >> > > > > ==40223== More than 10000000 total errors detected. I'm not >> reporting any more. >> > > > > ==40223== Final error counts will be inaccurate. Go fix your >> program! >> > > > > ==40223== Rerun with --error-limit=no to disable this cutoff. >> Note >> > > > > ==40223== that errors may occur in your program without prior >> warning from >> > > > > ==40223== Valgrind, because errors are no longer being displayed. >> > > > > >> > > > > Can you give some suggestions on how I should proceed? >> > > > > >> > > > > Thanks, >> > > > > Harshad >> > > > > >> > > > > On Thu, Sep 8, 2016 at 1:59 PM, Barry Smith >> wrote: >> > > > > >> > > > > This is very odd. CONVERGED_STEP_LENGTH for KSP is very >> specialized and should never occur with GMRES. >> > > > > >> > > > > Can you run with valgrind to make sure there is no memory >> corruption? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> > > > > >> > > > > Is your code fortran or C? >> > > > > >> > > > > Barry >> > > > > >> > > > > > On Sep 8, 2016, at 10:38 AM, Harshad Sahasrabudhe < >> hsahasra at purdue.edu> wrote: >> > > > > > >> > > > > > Hi, >> > > > > > >> > > > > > I'm using GAMG + GMRES for my Poisson problem. The solver >> converges with KSP_CONVERGED_STEP_LENGTH at a residual of >> 9.773346857844e-02, which is much higher than what I need (I need a >> tolerance of at least 1E-8). 
I am not able to figure out which tolerance I >> need to set to avoid convergence due to CONVERGED_STEP_LENGTH. >> > > > > > >> > > > > > Any help is appreciated! Output of -ksp_view and -ksp_monitor: >> > > > > > >> > > > > > 0 KSP Residual norm 3.121347818142e+00 >> > > > > > 1 KSP Residual norm 9.773346857844e-02 >> > > > > > Linear solve converged due to CONVERGED_STEP_LENGTH >> iterations 1 >> > > > > > KSP Object: 1 MPI processes >> > > > > > type: gmres >> > > > > > GMRES: restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> > > > > > GMRES: happy breakdown tolerance 1e-30 >> > > > > > maximum iterations=10000, initial guess is zero >> > > > > > tolerances: relative=1e-08, absolute=1e-50, divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: 1 MPI processes >> > > > > > type: gamg >> > > > > > MG: type is MULTIPLICATIVE, levels=2 cycles=v >> > > > > > Cycles per PCApply=1 >> > > > > > Using Galerkin computed coarse grid matrices >> > > > > > Coarse grid solver -- level ------------------------------- >> > > > > > KSP Object: (mg_coarse_) 1 MPI processes >> > > > > > type: preonly >> > > > > > maximum iterations=1, initial guess is zero >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using NONE norm type for convergence test >> > > > > > PC Object: (mg_coarse_) 1 MPI processes >> > > > > > type: bjacobi >> > > > > > block Jacobi: number of blocks = 1 >> > > > > > Local solve is same for all blocks, in the following >> KSP and PC objects: >> > > > > > KSP Object: (mg_coarse_sub_) 1 MPI >> processes >> > > > > > type: preonly >> > > > > > maximum iterations=1, initial guess is zero >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using NONE norm type for convergence test >> > > > > > PC Object: (mg_coarse_sub_) 1 MPI >> processes >> > > > > > type: lu >> > > > > > LU: out-of-place factorization >> > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > using diagonal shift on blocks to prevent zero >> pivot [INBLOCKS] >> > > > > > matrix ordering: nd >> > > > > > factor fill ratio given 5, needed 1.91048 >> > > > > > Factored matrix follows: >> > > > > > Mat Object: 1 MPI processes >> > > > > > type: seqaij >> > > > > > rows=284, cols=284 >> > > > > > package used to perform factorization: petsc >> > > > > > total: nonzeros=7726, allocated nonzeros=7726 >> > > > > > total number of mallocs used during >> MatSetValues calls =0 >> > > > > > using I-node routines: found 133 nodes, >> limit used is 5 >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: 1 MPI processes >> > > > > > type: seqaij >> > > > > > rows=284, cols=284 >> > > > > > total: nonzeros=4044, allocated nonzeros=4044 >> > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > not using I-node routines >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: 1 MPI processes >> > > > > > type: seqaij >> > > > > > rows=284, cols=284 >> > > > > > total: nonzeros=4044, allocated nonzeros=4044 >> > > > > > total number of mallocs used during MatSetValues calls >> =0 >> > > > > > not using I-node routines >> > > > > > Down solver (pre-smoother) on level 1 >> ------------------------------- >> > > > > > KSP Object: (mg_levels_1_) 1 MPI processes >> > > > > > type: 
chebyshev >> > > > > > Chebyshev: eigenvalue estimates: min = 0.195339, max = >> 4.10212 >> > > > > > maximum iterations=2 >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using nonzero initial guess >> > > > > > using NONE norm type for convergence test >> > > > > > PC Object: (mg_levels_1_) 1 MPI processes >> > > > > > type: sor >> > > > > > SOR: type = local_symmetric, iterations = 1, local >> iterations = 1, omega = 1 >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: () 1 MPI processes >> > > > > > type: seqaij >> > > > > > rows=9036, cols=9036 >> > > > > > total: nonzeros=192256, allocated nonzeros=192256 >> > > > > > total number of mallocs used during MatSetValues calls >> =0 >> > > > > > not using I-node routines >> > > > > > Up solver (post-smoother) same as down solver (pre-smoother) >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: () 1 MPI processes >> > > > > > type: seqaij >> > > > > > rows=9036, cols=9036 >> > > > > > total: nonzeros=192256, allocated nonzeros=192256 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > not using I-node routines >> > > > > > >> > > > > > Thanks, >> > > > > > Harshad >> > > > > >> > > > > >> > > > >> > > > >> > > > >> > > >> > > >> > > >> > > >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hengjiew at uci.edu Wed Sep 14 17:24:02 2016 From: hengjiew at uci.edu (frank) Date: Wed, 14 Sep 2016 15:24:02 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> Message-ID: Hi, I write a simple code to re-produce the error. I hope this can help to diagnose the problem. The code just solves a 3d poisson equation. I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. That's when I re-produce the OOM error. Each core has about 2G memory. I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The ksp solver works fine. I attached the code, ksp_view_pre's output and my petsc option file. Thank you. Frank On 09/09/2016 06:38 PM, Hengjie Wang wrote: > Hi Barry, > > I checked. On the supercomputer, I had the option "-ksp_view_pre" but > it is not in file I sent you. I am sorry for the confusion. > > Regards, > Frank > > On Friday, September 9, 2016, Barry Smith > wrote: > > > > On Sep 9, 2016, at 3:11 PM, frank > wrote: > > > > Hi Barry, > > > > I think the first KSP view output is from -ksp_view_pre. Before > I submitted the test, I was not sure whether there would be OOM > error or not. So I added both -ksp_view_pre and -ksp_view. > > But the options file you sent specifically does NOT list the > -ksp_view_pre so how could it be from that? > > Sorry to be pedantic but I've spent too much time in the past > trying to debug from incorrect information and want to make sure > that the information I have is correct before thinking. Please > recheck exactly what happened. Rerun with the exact input file you > emailed if that is needed. 
> > Barry > > > > > Frank > > > > > > On 09/09/2016 12:38 PM, Barry Smith wrote: > >> Why does ksp_view2.txt have two KSP views in it while > ksp_view1.txt has only one KSPView in it? Did you run two > different solves in the 2 case but not the one? > >> > >> Barry > >> > >> > >> > >>> On Sep 9, 2016, at 10:56 AM, frank > wrote: > >>> > >>> Hi, > >>> > >>> I want to continue digging into the memory problem here. > >>> I did find a work around in the past, which is to use less > cores per node so that each core has 8G memory. However this is > deficient and expensive. I hope to locate the place that uses the > most memory. > >>> > >>> Here is a brief summary of the tests I did in past: > >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 > >>> Maximum (over computational time) process memory: > total 7.0727e+08 > >>> Current process memory: total > 7.0727e+08 > >>> Maximum (over computational time) space PetscMalloc()ed: > total 6.3908e+11 > >>> Current space PetscMalloc()ed: > total 1.8275e+09 > >>> > >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 > >>> Maximum (over computational time) process memory: > total 5.9431e+09 > >>> Current process memory: total > 5.9431e+09 > >>> Maximum (over computational time) space PetscMalloc()ed: > total 5.3202e+12 > >>> Current space PetscMalloc()ed: > total 5.4844e+09 > >>> > >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 > >>> OOM( Out Of Memory ) killer of the supercomputer > terminated the job during "KSPSolve". > >>> > >>> I attached the output of ksp_view( the third test's output is > from ksp_view_pre ), memory_view and also the petsc options. > >>> > >>> In all the tests, each core can access about 2G memory. In > test3, there are 4223139840 non-zeros in the matrix. This will > consume about 1.74M, using double precision. Considering some > extra memory used to store integer index, 2G memory should still > be way enough. > >>> > >>> Is there a way to find out which part of KSPSolve uses the > most memory? > >>> Thank you so much. > >>> > >>> BTW, there are 4 options remains unused and I don't understand > why they are omitted: > >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly > >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi > >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 > >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson > >>> > >>> > >>> Regards, > >>> Frank > >>> > >>> On 07/13/2016 05:47 PM, Dave May wrote: > >>>> > >>>> On 14 July 2016 at 01:07, frank > wrote: > >>>> Hi Dave, > >>>> > >>>> Sorry for the late reply. > >>>> Thank you so much for your detailed reply. > >>>> > >>>> I have a question about the estimation of the memory usage. > There are 4223139840 allocated non-zeros and 18432 MPI processes. > Double precision is used. So the memory per process is: > >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? > >>>> Did I do sth wrong here? Because this seems too small. > >>>> > >>>> No - I totally f***ed it up. You are correct. That'll teach > me for fumbling around with my iphone calculator and not using my > brain. (Note that to convert to MB just divide by 1e6, not 1024^2 > - although I apparently cannot convert between units correctly....) > >>>> > >>>> From the PETSc objects associated with the solver, It looks > like it _should_ run with 2GB per MPI rank. Sorry for my mistake. > Possibilities are: somewhere in your usage of PETSc you've > introduced a memory leak; PETSc is doing a huge over allocation > (e.g. 
as per our discussion of MatPtAP); or in your application > code there are other objects you have forgotten to log the memory for. > >>>> > >>>> > >>>> > >>>> I am running this job on Bluewater > >>>> I am using the 7 points FD stencil in 3D. > >>>> > >>>> I thought so on both counts. > >>>> > >>>> I apologize that I made a stupid mistake in computing the > memory per core. My settings render each core can access only 2G > memory on average instead of 8G which I mentioned in previous > email. I re-run the job with 8G memory per core on average and > there is no "Out Of Memory" error. I would do more test to see if > there is still some memory issue. > >>>> > >>>> Ok. I'd still like to know where the memory was being used > since my estimates were off. > >>>> > >>>> > >>>> Thanks, > >>>> Dave > >>>> > >>>> Regards, > >>>> Frank > >>>> > >>>> > >>>> > >>>> On 07/11/2016 01:18 PM, Dave May wrote: > >>>>> Hi Frank, > >>>>> > >>>>> > >>>>> On 11 July 2016 at 19:14, frank > wrote: > >>>>> Hi Dave, > >>>>> > >>>>> I re-run the test using bjacobi as the preconditioner on the > coarse mesh of telescope. The Grid is 3072*256*768 and process > mesh is 96*8*24. The petsc option file is attached. > >>>>> I still got the "Out Of Memory" error. The error occurred > before the linear solver finished one step. So I don't have the > full info from ksp_view. The info from ksp_view_pre is attached. > >>>>> > >>>>> Okay - that is essentially useless (sorry) > >>>>> > >>>>> It seems to me that the error occurred when the > decomposition was going to be changed. > >>>>> > >>>>> Based on what information? > >>>>> Running with -info would give us more clues, but will create > a ton of output. > >>>>> Please try running the case which failed with -info > >>>>> I had another test with a grid of 1536*128*384 and the same > process mesh as above. There was no error. The ksp_view info is > attached for comparison. > >>>>> Thank you. > >>>>> > >>>>> > >>>>> [3] Here is my crude estimate of your memory usage. > >>>>> I'll target the biggest memory hogs only to get an order of > magnitude estimate > >>>>> > >>>>> * The Fine grid operator contains 4223139840 non-zeros --> > 1.8 GB per MPI rank assuming double precision. > >>>>> The indices for the AIJ could amount to another 0.3 GB > (assuming 32 bit integers) > >>>>> > >>>>> * You use 5 levels of coarsening, so the other operators > should represent (collectively) > >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank > on the communicator with 18432 ranks. > >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the > communicator with 18432 ranks. > >>>>> > >>>>> * You use a reduction factor of 64, making the new > communicator with 288 MPI ranks. > >>>>> PCTelescope will first gather a temporary matrix associated > with your coarse level operator assuming a comm size of 288 living > on the comm with size 18432. > >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per > core on the 288 ranks. > >>>>> This matrix is then used to form a new MPIAIJ matrix on the > subcomm, thus require another 32 MB per rank. > >>>>> The temporary matrix is now destroyed. > >>>>> > >>>>> * Because a DMDA is detected, a permutation matrix is assembled. > >>>>> This requires 2 doubles per point in the DMDA. > >>>>> Your coarse DMDA contains 92 x 16 x 48 points. > >>>>> Thus the permutation matrix will require < 1 MB per MPI rank > on the sub-comm. > >>>>> > >>>>> * Lastly, the matrix is permuted. 
This uses MatPtAP(), but > the resulting operator will have the same memory footprint as the > unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 > operators of size 32 MB are held in memory when the DMDA is provided. > >>>>> > >>>>> From my rough estimates, the worst case memory foot print > for any given core, given your options is approximately > >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > >>>>> This is way below 8 GB. > >>>>> > >>>>> Note this estimate completely ignores: > >>>>> (1) the memory required for the restriction operator, > >>>>> (2) the potential growth in the number of non-zeros per row > due to Galerkin coarsening (I wished -ksp_view_pre reported the > output from MatView so we could see the number of non-zeros > required by the coarse level operators) > >>>>> (3) all temporary vectors required by the CG solver, and > those required by the smoothers. > >>>>> (4) internal memory allocated by MatPtAP > >>>>> (5) memory associated with IS's used within PCTelescope > >>>>> > >>>>> So either I am completely off in my estimates, or you have > not carefully estimated the memory usage of your application code. > Hopefully others might examine/correct my rough estimates > >>>>> > >>>>> Since I don't have your code I cannot access the latter. > >>>>> Since I don't have access to the same machine you are > running on, I think we need to take a step back. > >>>>> > >>>>> [1] What machine are you running on? Send me a URL if its > available > >>>>> > >>>>> [2] What discretization are you using? (I am guessing a > scalar 7 point FD stencil) > >>>>> If it's a 7 point FD stencil, we should be able to examine > the memory usage of your solver configuration using a standard, > light weight existing PETSc example, run on your machine at the > same scale. > >>>>> This would hopefully enable us to correctly evaluate the > actual memory usage required by the solver configuration you are > using. > >>>>> > >>>>> Thanks, > >>>>> Dave > >>>>> > >>>>> > >>>>> Frank > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> On 07/08/2016 10:38 PM, Dave May wrote: > >>>>>> > >>>>>> On Saturday, 9 July 2016, frank > wrote: > >>>>>> Hi Barry and Dave, > >>>>>> > >>>>>> Thank both of you for the advice. > >>>>>> > >>>>>> @Barry > >>>>>> I made a mistake in the file names in last email. I > attached the correct files this time. > >>>>>> For all the three tests, 'Telescope' is used as the coarse > preconditioner. > >>>>>> > >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > >>>>>> Part of the memory usage: Vector 125 124 > 3971904 0. > >>>>>> Matrix 101 101 9462372 0 > >>>>>> > >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > >>>>>> Part of the memory usage: Vector 125 124 > 681672 0. > >>>>>> Matrix 101 101 1462180 0. > >>>>>> > >>>>>> In theory, the memory usage in Test1 should be 8 times of > Test2. In my case, it is about 6 times. > >>>>>> > >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. > Sub-domain per process: 32*32*32 > >>>>>> Here I get the out of memory error. > >>>>>> > >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need > to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > >>>>>> The linear solver didn't work in this case. Petsc output > some errors. > >>>>>> > >>>>>> @Dave > >>>>>> In test3, I use only one instance of 'Telescope'. On the > coarse mesh of 'Telescope', I used LU as the preconditioner > instead of SVD. 
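For reference, the corrected arithmetic from the estimate discussed earlier in this message, written out once (4223139840 nonzeros on 18432 ranks; the second term assumes 32-bit column indices for the AIJ format):

\[
\frac{4\,223\,139\,840 \times 8\ \text{bytes}}{18432\ \text{ranks}} \approx 1.83\ \text{MB/rank} \approx 1.74\ \text{MiB/rank},
\qquad
\frac{4\,223\,139\,840 \times 4\ \text{bytes}}{18432\ \text{ranks}} \approx 0.92\ \text{MB/rank}.
\]

So the fine-grid operator alone costs only a few MB per rank, consistent with the corrected conclusion in the thread that the solver objects should fit within 2 GB per rank.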
> >>>>>> If my set the levels correctly, then on the last coarse > mesh of MG where it calls 'Telescope', the sub-domain per process > is 2*2*2. > >>>>>> On the last coarse mesh of 'Telescope', there is only one > grid point per process. > >>>>>> I still got the OOM error. The detailed petsc option file > is attached. > >>>>>> > >>>>>> Do you understand the expected memory usage for the > particular parallel LU implementation you are using? I don't > (seriously). Replace LU with bjacobi and re-run this test. My > point about solver debugging is still valid. > >>>>>> > >>>>>> And please send the result of KSPView so we can see what is > actually used in the computations > >>>>>> > >>>>>> Thanks > >>>>>> Dave > >>>>>> > >>>>>> > >>>>>> Thank you so much. > >>>>>> > >>>>>> Frank > >>>>>> > >>>>>> > >>>>>> > >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: > >>>>>> On Jul 6, 2016, at 4:19 PM, frank > wrote: > >>>>>> > >>>>>> Hi Barry, > >>>>>> > >>>>>> Thank you for you advice. > >>>>>> I tried three test. In the 1st test, the grid is > 3072*256*768 and the process mesh is 96*8*24. > >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and > 'telescope' is used as the preconditioner at the coarse mesh. > >>>>>> The system gives me the "Out of Memory" error before the > linear system is completely solved. > >>>>>> The info from '-ksp_view_pre' is attached. I seems to me > that the error occurs when it reaches the coarse mesh. > >>>>>> > >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh > is 96*8*24. The 3rd test uses the > same grid but a different process mesh 48*4*12. > >>>>>> Are you sure this is right? The total matrix and vector > memory usage goes from 2nd test > >>>>>> Vector 384 383 8,193,712 0. > >>>>>> Matrix 103 103 11,508,688 0. > >>>>>> to 3rd test > >>>>>> Vector 384 383 1,590,520 0. > >>>>>> Matrix 103 103 3,508,664 0. > >>>>>> that is the memory usage got smaller but if you have only > 1/8th the processes and the same grid it should have gotten about > 8 times bigger. Did you maybe cut the grid by a factor of 8 also? > If so that still doesn't explain it because the memory usage > changed by a factor of 5 something for the vectors and 3 something > for the matrices. > >>>>>> > >>>>>> > >>>>>> The linear solver and petsc options in 2nd and 3rd tests > are the same in 1st test. The linear solver works fine in both test. > >>>>>> I attached the memory usage of the 2nd and 3rd tests. The > memory info is from the option '-log_summary'. I tried to use > '-momery_info' as you suggested, but in my case petsc treated it > as an unused option. It output nothing about the memory. Do I need > to add sth to my code so I can use '-memory_info'? > >>>>>> Sorry, my mistake the option is -memory_view > >>>>>> > >>>>>> Can you run the one case with -memory_view and > -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate > forever) to see how much memory is used without the telescope? > Also run case 2 the same way. > >>>>>> > >>>>>> Barry > >>>>>> > >>>>>> > >>>>>> > >>>>>> In both tests the memory usage is not large. > >>>>>> > >>>>>> It seems to me that it might be the 'telescope' > preconditioner that allocated a lot of memory and caused the error > in the 1st test. > >>>>>> Is there is a way to show how much memory it allocated? > >>>>>> > >>>>>> Frank > >>>>>> > >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: > >>>>>> Frank, > >>>>>> > >>>>>> You can run with -ksp_view_pre to have it "view" the > KSP before the solve so hopefully it gets that far. 
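Besides the -memory_view option mentioned above, similar information can be queried programmatically around the solve. A sketch using standard PETSc calls; the label argument and the placement in the code are illustrative:

  #include <petscsys.h>

  /* Report PETSc's view of memory use, similar in spirit to -memory_view. */
  static PetscErrorCode ReportMemory(const char *label)
  {
    PetscLogDouble rss, rss_max, mal, mal_max;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);     /* resident set size now */
    ierr = PetscMemoryGetMaximumUsage(&rss_max);CHKERRQ(ierr); /* high-water mark of process memory */
    ierr = PetscMallocGetCurrentUsage(&mal);CHKERRQ(ierr);     /* bytes currently PetscMalloc()ed */
    ierr = PetscMallocGetMaximumUsage(&mal_max);CHKERRQ(ierr); /* high-water mark of PetscMalloc() */
    ierr = PetscPrintf(PETSC_COMM_WORLD,
                       "%s: rss %g (max %g), petscmalloc %g (max %g) bytes\n",
                       label, rss, rss_max, mal, mal_max);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

PetscMemoryGetMaximumUsage() only reports a meaningful value if PetscMemorySetGetMaximumUsage() was called near the start of the run, and PetscPrintf() shows rank 0's numbers only.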
> >>>>>> > >>>>>> Please run the problem that does fit with > -memory_info when the problem completes it will show the "high > water mark" for PETSc allocated memory and total memory used. We > first want to look at these numbers to see if it is using more > memory than you expect. You could also run with say half the grid > spacing to see how the memory usage scaled with the increase in > grid points. Make the runs also with -log_view and send all the > output from these options. > >>>>>> > >>>>>> Barry > >>>>>> > >>>>>> On Jul 5, 2016, at 5:23 PM, frank > wrote: > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> I am using the CG ksp solver and Multigrid preconditioner > to solve a linear system in parallel. > >>>>>> I chose to use the 'Telescope' as the preconditioner on the > coarse mesh for its good performance. > >>>>>> The petsc options file is attached. > >>>>>> > >>>>>> The domain is a 3d box. > >>>>>> It works well when the grid is 1536*128*384 and the process > mesh is 96*8*24. When I double the size of grid and keep > the same process mesh and petsc options, I get an "out of memory" > error from the super-cluster I am using. > >>>>>> Each process has access to at least 8G memory, which should > be more than enough for my application. I am sure that all the > other parts of my code( except the linear solver ) do not use much > memory. So I doubt if there is something wrong with the linear solver. > >>>>>> The error occurs before the linear system is completely > solved so I don't have the info from ksp view. I am not able to > re-produce the error with a smaller problem either. > >>>>>> In addition, I tried to use the block jacobi as the > preconditioner with the same grid and same decomposition. The > linear solver runs extremely slow but there is no memory error. > >>>>>> > >>>>>> How can I diagnose what exactly cause the error? > >>>>>> Thank you so much. > >>>>>> > >>>>>> Frank > >>>>>> > >>>>>> > > >>>>>> > >>>>> > >>>> > >>> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_ksp.f90 Type: text/x-fortran Size: 5742 bytes Desc: not available URL: -------------- next part -------------- KSP Object: 32768 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 32768 MPI processes type: mg PC has not been set up so information may be incomplete MG: type is MULTIPLICATIVE, levels=5 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 32768 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_) 32768 MPI processes type: redundant PC has not been set up so information may be incomplete Redundant preconditioner: Not yet setup Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 32768 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 32768 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 32768 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 32768 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 32768 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 32768 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (mg_levels_4_) 32768 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_4_) 32768 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 32768 MPI processes type: mpiaij rows=1073741824, cols=1073741824 total: nonzeros=7516192768, allocated nonzeros=7516192768 total number of mallocs used during MatSetValues calls =0 has attached null space -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -pc_type mg -pc_mg_galerkin -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left 1 -log_summary -ksp_view_pre #setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 8 -mg_coarse_telescope_repart_da_processors_y 8 -mg_coarse_telescope_repart_da_processors_z 8 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 3 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type redundant -mg_coarse_telescope_mg_coarse_redundant_pc_type bjacobi From dave.mayhem23 at gmail.com Wed Sep 14 20:44:03 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 15 Sep 2016 03:44:03 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> Message-ID: Hi Frank, On Thursday, 15 September 2016, frank wrote: > Hi, > > I write a simple code to re-produce the error. I hope this can help to > diagnose the problem. > The code just solves a 3d poisson equation. > I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. > That's when I re-produce the OOM error. Each core has about 2G memory. > I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The ksp > solver works fine. > Perfect! That's very helpful, I can use this to track down here the issue is coming from. Give me some time to figure this out. Thanks, Dave > > I attached the code, ksp_view_pre's output and my petsc option file. > > Thank you. > Frank > > On 09/09/2016 06:38 PM, Hengjie Wang wrote: > > Hi Barry, > > I checked. On the supercomputer, I had the option "-ksp_view_pre" but it > is not in file I sent you. I am sorry for the confusion. > > Regards, > Frank > > On Friday, September 9, 2016, Barry Smith > wrote: > >> >> > On Sep 9, 2016, at 3:11 PM, frank wrote: >> > >> > Hi Barry, >> > >> > I think the first KSP view output is from -ksp_view_pre. Before I >> submitted the test, I was not sure whether there would be OOM error or not. >> So I added both -ksp_view_pre and -ksp_view. >> >> But the options file you sent specifically does NOT list the >> -ksp_view_pre so how could it be from that? >> >> Sorry to be pedantic but I've spent too much time in the past trying >> to debug from incorrect information and want to make sure that the >> information I have is correct before thinking. Please recheck exactly what >> happened. Rerun with the exact input file you emailed if that is needed. 
>> >> Barry >> >> > >> > Frank >> > >> > >> > On 09/09/2016 12:38 PM, Barry Smith wrote: >> >> Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt >> has only one KSPView in it? Did you run two different solves in the 2 case >> but not the one? >> >> >> >> Barry >> >> >> >> >> >> >> >>> On Sep 9, 2016, at 10:56 AM, frank wrote: >> >>> >> >>> Hi, >> >>> >> >>> I want to continue digging into the memory problem here. >> >>> I did find a work around in the past, which is to use less cores per >> node so that each core has 8G memory. However this is deficient and >> expensive. I hope to locate the place that uses the most memory. >> >>> >> >>> Here is a brief summary of the tests I did in past: >> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >> >>> Maximum (over computational time) process memory: total >> 7.0727e+08 >> >>> Current process memory: >> total 7.0727e+08 >> >>> Maximum (over computational time) space PetscMalloc()ed: total >> 6.3908e+11 >> >>> Current space PetscMalloc()ed: >> total 1.8275e+09 >> >>> >> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >> >>> Maximum (over computational time) process memory: total >> 5.9431e+09 >> >>> Current process memory: >> total 5.9431e+09 >> >>> Maximum (over computational time) space PetscMalloc()ed: total >> 5.3202e+12 >> >>> Current space PetscMalloc()ed: >> total 5.4844e+09 >> >>> >> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >> >>> OOM( Out Of Memory ) killer of the supercomputer terminated the >> job during "KSPSolve". >> >>> >> >>> I attached the output of ksp_view( the third test's output is from >> ksp_view_pre ), memory_view and also the petsc options. >> >>> >> >>> In all the tests, each core can access about 2G memory. In test3, >> there are 4223139840 non-zeros in the matrix. This will consume about >> 1.74M, using double precision. Considering some extra memory used to store >> integer index, 2G memory should still be way enough. >> >>> >> >>> Is there a way to find out which part of KSPSolve uses the most >> memory? >> >>> Thank you so much. >> >>> >> >>> BTW, there are 4 options remains unused and I don't understand why >> they are omitted: >> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >> >>> >> >>> >> >>> Regards, >> >>> Frank >> >>> >> >>> On 07/13/2016 05:47 PM, Dave May wrote: >> >>>> >> >>>> On 14 July 2016 at 01:07, frank wrote: >> >>>> Hi Dave, >> >>>> >> >>>> Sorry for the late reply. >> >>>> Thank you so much for your detailed reply. >> >>>> >> >>>> I have a question about the estimation of the memory usage. There >> are 4223139840 allocated non-zeros and 18432 MPI processes. Double >> precision is used. So the memory per process is: >> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >> >>>> Did I do sth wrong here? Because this seems too small. >> >>>> >> >>>> No - I totally f***ed it up. You are correct. That'll teach me for >> fumbling around with my iphone calculator and not using my brain. (Note >> that to convert to MB just divide by 1e6, not 1024^2 - although I >> apparently cannot convert between units correctly....) >> >>>> >> >>>> From the PETSc objects associated with the solver, It looks like it >> _should_ run with 2GB per MPI rank. Sorry for my mistake. 
Possibilities >> are: somewhere in your usage of PETSc you've introduced a memory leak; >> PETSc is doing a huge over allocation (e.g. as per our discussion of >> MatPtAP); or in your application code there are other objects you have >> forgotten to log the memory for. >> >>>> >> >>>> >> >>>> >> >>>> I am running this job on Bluewater >> >>>> I am using the 7 points FD stencil in 3D. >> >>>> >> >>>> I thought so on both counts. >> >>>> >> >>>> I apologize that I made a stupid mistake in computing the memory per >> core. My settings render each core can access only 2G memory on average >> instead of 8G which I mentioned in previous email. I re-run the job with 8G >> memory per core on average and there is no "Out Of Memory" error. I would >> do more test to see if there is still some memory issue. >> >>>> >> >>>> Ok. I'd still like to know where the memory was being used since my >> estimates were off. >> >>>> >> >>>> >> >>>> Thanks, >> >>>> Dave >> >>>> >> >>>> Regards, >> >>>> Frank >> >>>> >> >>>> >> >>>> >> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >> >>>>> Hi Frank, >> >>>>> >> >>>>> >> >>>>> On 11 July 2016 at 19:14, frank wrote: >> >>>>> Hi Dave, >> >>>>> >> >>>>> I re-run the test using bjacobi as the preconditioner on the coarse >> mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. >> The petsc option file is attached. >> >>>>> I still got the "Out Of Memory" error. The error occurred before >> the linear solver finished one step. So I don't have the full info from >> ksp_view. The info from ksp_view_pre is attached. >> >>>>> >> >>>>> Okay - that is essentially useless (sorry) >> >>>>> >> >>>>> It seems to me that the error occurred when the decomposition was >> going to be changed. >> >>>>> >> >>>>> Based on what information? >> >>>>> Running with -info would give us more clues, but will create a ton >> of output. >> >>>>> Please try running the case which failed with -info >> >>>>> I had another test with a grid of 1536*128*384 and the same >> process mesh as above. There was no error. The ksp_view info is attached >> for comparison. >> >>>>> Thank you. >> >>>>> >> >>>>> >> >>>>> [3] Here is my crude estimate of your memory usage. >> >>>>> I'll target the biggest memory hogs only to get an order of >> magnitude estimate >> >>>>> >> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB >> per MPI rank assuming double precision. >> >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming 32 >> bit integers) >> >>>>> >> >>>>> * You use 5 levels of coarsening, so the other operators should >> represent (collectively) >> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the >> communicator with 18432 ranks. >> >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the >> communicator with 18432 ranks. >> >>>>> >> >>>>> * You use a reduction factor of 64, making the new communicator >> with 288 MPI ranks. >> >>>>> PCTelescope will first gather a temporary matrix associated with >> your coarse level operator assuming a comm size of 288 living on the comm >> with size 18432. >> >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core on >> the 288 ranks. >> >>>>> This matrix is then used to form a new MPIAIJ matrix on the >> subcomm, thus require another 32 MB per rank. >> >>>>> The temporary matrix is now destroyed. >> >>>>> >> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. >> >>>>> This requires 2 doubles per point in the DMDA. 
>> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >> >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on the >> sub-comm. >> >>>>> >> >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the >> resulting operator will have the same memory footprint as the unpermuted >> matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB >> are held in memory when the DMDA is provided. >> >>>>> >> >>>>> From my rough estimates, the worst case memory foot print for any >> given core, given your options is approximately >> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >> >>>>> This is way below 8 GB. >> >>>>> >> >>>>> Note this estimate completely ignores: >> >>>>> (1) the memory required for the restriction operator, >> >>>>> (2) the potential growth in the number of non-zeros per row due to >> Galerkin coarsening (I wished -ksp_view_pre reported the output from >> MatView so we could see the number of non-zeros required by the coarse >> level operators) >> >>>>> (3) all temporary vectors required by the CG solver, and those >> required by the smoothers. >> >>>>> (4) internal memory allocated by MatPtAP >> >>>>> (5) memory associated with IS's used within PCTelescope >> >>>>> >> >>>>> So either I am completely off in my estimates, or you have not >> carefully estimated the memory usage of your application code. Hopefully >> others might examine/correct my rough estimates >> >>>>> >> >>>>> Since I don't have your code I cannot access the latter. >> >>>>> Since I don't have access to the same machine you are running on, I >> think we need to take a step back. >> >>>>> >> >>>>> [1] What machine are you running on? Send me a URL if its available >> >>>>> >> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 >> point FD stencil) >> >>>>> If it's a 7 point FD stencil, we should be able to examine the >> memory usage of your solver configuration using a standard, light weight >> existing PETSc example, run on your machine at the same scale. >> >>>>> This would hopefully enable us to correctly evaluate the actual >> memory usage required by the solver configuration you are using. >> >>>>> >> >>>>> Thanks, >> >>>>> Dave >> >>>>> >> >>>>> >> >>>>> Frank >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >> >>>>>> >> >>>>>> On Saturday, 9 July 2016, frank wrote: >> >>>>>> Hi Barry and Dave, >> >>>>>> >> >>>>>> Thank both of you for the advice. >> >>>>>> >> >>>>>> @Barry >> >>>>>> I made a mistake in the file names in last email. I attached the >> correct files this time. >> >>>>>> For all the three tests, 'Telescope' is used as the coarse >> preconditioner. >> >>>>>> >> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >> >>>>>> Part of the memory usage: Vector 125 124 3971904 >> 0. >> >>>>>> Matrix 101 101 >> 9462372 0 >> >>>>>> >> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> >>>>>> Part of the memory usage: Vector 125 124 681672 >> 0. >> >>>>>> Matrix 101 101 >> 1462180 0. >> >>>>>> >> >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. >> In my case, it is about 6 times. >> >>>>>> >> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain >> per process: 32*32*32 >> >>>>>> Here I get the out of memory error. >> >>>>>> >> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to set >> -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >> >>>>>> The linear solver didn't work in this case. 
Petsc output some >> errors. >> >>>>>> >> >>>>>> @Dave >> >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse >> mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >> >>>>>> If my set the levels correctly, then on the last coarse mesh of MG >> where it calls 'Telescope', the sub-domain per process is 2*2*2. >> >>>>>> On the last coarse mesh of 'Telescope', there is only one grid >> point per process. >> >>>>>> I still got the OOM error. The detailed petsc option file is >> attached. >> >>>>>> >> >>>>>> Do you understand the expected memory usage for the particular >> parallel LU implementation you are using? I don't (seriously). Replace LU >> with bjacobi and re-run this test. My point about solver debugging is still >> valid. >> >>>>>> >> >>>>>> And please send the result of KSPView so we can see what is >> actually used in the computations >> >>>>>> >> >>>>>> Thanks >> >>>>>> Dave >> >>>>>> >> >>>>>> >> >>>>>> Thank you so much. >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >> >>>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >> >>>>>> >> >>>>>> Hi Barry, >> >>>>>> >> >>>>>> Thank you for you advice. >> >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and >> the process mesh is 96*8*24. >> >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and >> 'telescope' is used as the preconditioner at the coarse mesh. >> >>>>>> The system gives me the "Out of Memory" error before the linear >> system is completely solved. >> >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that the >> error occurs when it reaches the coarse mesh. >> >>>>>> >> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is >> 96*8*24. The 3rd test uses the >> same grid but a different process mesh 48*4*12. >> >>>>>> Are you sure this is right? The total matrix and vector memory >> usage goes from 2nd test >> >>>>>> Vector 384 383 8,193,712 0. >> >>>>>> Matrix 103 103 11,508,688 0. >> >>>>>> to 3rd test >> >>>>>> Vector 384 383 1,590,520 0. >> >>>>>> Matrix 103 103 3,508,664 0. >> >>>>>> that is the memory usage got smaller but if you have only 1/8th >> the processes and the same grid it should have gotten about 8 times bigger. >> Did you maybe cut the grid by a factor of 8 also? If so that still doesn't >> explain it because the memory usage changed by a factor of 5 something for >> the vectors and 3 something for the matrices. >> >>>>>> >> >>>>>> >> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the >> same in 1st test. The linear solver works fine in both test. >> >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory >> info is from the option '-log_summary'. I tried to use '-momery_info' as >> you suggested, but in my case petsc treated it as an unused option. It >> output nothing about the memory. Do I need to add sth to my code so I can >> use '-memory_info'? >> >>>>>> Sorry, my mistake the option is -memory_view >> >>>>>> >> >>>>>> Can you run the one case with -memory_view and -mg_coarse >> jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much >> memory is used without the telescope? Also run case 2 the same way. >> >>>>>> >> >>>>>> Barry >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> In both tests the memory usage is not large. >> >>>>>> >> >>>>>> It seems to me that it might be the 'telescope' preconditioner >> that allocated a lot of memory and caused the error in the 1st test. 
>> >>>>>> Is there is a way to show how much memory it allocated? >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >> >>>>>> Frank, >> >>>>>> >> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP >> before the solve so hopefully it gets that far. >> >>>>>> >> >>>>>> Please run the problem that does fit with -memory_info when >> the problem completes it will show the "high water mark" for PETSc >> allocated memory and total memory used. We first want to look at these >> numbers to see if it is using more memory than you expect. You could also >> run with say half the grid spacing to see how the memory usage scaled with >> the increase in grid points. Make the runs also with -log_view and send all >> the output from these options. >> >>>>>> >> >>>>>> Barry >> >>>>>> >> >>>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >> >>>>>> >> >>>>>> Hi, >> >>>>>> >> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to >> solve a linear system in parallel. >> >>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse >> mesh for its good performance. >> >>>>>> The petsc options file is attached. >> >>>>>> >> >>>>>> The domain is a 3d box. >> >>>>>> It works well when the grid is 1536*128*384 and the process mesh >> is 96*8*24. When I double the size of grid and >> keep the same process mesh and petsc options, I get an >> "out of memory" error from the super-cluster I am using. >> >>>>>> Each process has access to at least 8G memory, which should be >> more than enough for my application. I am sure that all the other parts of >> my code( except the linear solver ) do not use much memory. So I doubt if >> there is something wrong with the linear solver. >> >>>>>> The error occurs before the linear system is completely solved so >> I don't have the info from ksp view. I am not able to re-produce the error >> with a smaller problem either. >> >>>>>> In addition, I tried to use the block jacobi as the >> preconditioner with the same grid and same decomposition. The linear solver >> runs extremely slow but there is no memory error. >> >>>>>> >> >>>>>> How can I diagnose what exactly cause the error? >> >>>>>> Thank you so much. >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> > _options.txt> >> >>>>>> >> >>>>> >> >>>> >> >>> > emory2.txt> >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Sep 14 22:55:29 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 15 Sep 2016 05:55:29 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> Message-ID: On Thursday, 15 September 2016, frank wrote: > Hi, > > I write a simple code to re-produce the error. I hope this can help to > diagnose the problem. > The code just solves a 3d poisson equation. > Why is the stencil width a runtime parameter?? And why is the default value 2? For 7-pnt FD Laplace, you only need a stencil width of 1. Was this choice made to mimic something in the real application code? > > I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. > That's when I re-produce the OOM error. 
Each core has about 2G memory. > I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The ksp > solver works fine. > I attached the code, ksp_view_pre's output and my petsc option file. > > Thank you. > Frank > > On 09/09/2016 06:38 PM, Hengjie Wang wrote: > > Hi Barry, > > I checked. On the supercomputer, I had the option "-ksp_view_pre" but it > is not in file I sent you. I am sorry for the confusion. > > Regards, > Frank > > On Friday, September 9, 2016, Barry Smith > wrote: > >> >> > On Sep 9, 2016, at 3:11 PM, frank wrote: >> > >> > Hi Barry, >> > >> > I think the first KSP view output is from -ksp_view_pre. Before I >> submitted the test, I was not sure whether there would be OOM error or not. >> So I added both -ksp_view_pre and -ksp_view. >> >> But the options file you sent specifically does NOT list the >> -ksp_view_pre so how could it be from that? >> >> Sorry to be pedantic but I've spent too much time in the past trying >> to debug from incorrect information and want to make sure that the >> information I have is correct before thinking. Please recheck exactly what >> happened. Rerun with the exact input file you emailed if that is needed. >> >> Barry >> >> > >> > Frank >> > >> > >> > On 09/09/2016 12:38 PM, Barry Smith wrote: >> >> Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt >> has only one KSPView in it? Did you run two different solves in the 2 case >> but not the one? >> >> >> >> Barry >> >> >> >> >> >> >> >>> On Sep 9, 2016, at 10:56 AM, frank wrote: >> >>> >> >>> Hi, >> >>> >> >>> I want to continue digging into the memory problem here. >> >>> I did find a work around in the past, which is to use less cores per >> node so that each core has 8G memory. However this is deficient and >> expensive. I hope to locate the place that uses the most memory. >> >>> >> >>> Here is a brief summary of the tests I did in past: >> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >> >>> Maximum (over computational time) process memory: total >> 7.0727e+08 >> >>> Current process memory: >> total 7.0727e+08 >> >>> Maximum (over computational time) space PetscMalloc()ed: total >> 6.3908e+11 >> >>> Current space PetscMalloc()ed: >> total 1.8275e+09 >> >>> >> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >> >>> Maximum (over computational time) process memory: total >> 5.9431e+09 >> >>> Current process memory: >> total 5.9431e+09 >> >>> Maximum (over computational time) space PetscMalloc()ed: total >> 5.3202e+12 >> >>> Current space PetscMalloc()ed: >> total 5.4844e+09 >> >>> >> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >> >>> OOM( Out Of Memory ) killer of the supercomputer terminated the >> job during "KSPSolve". >> >>> >> >>> I attached the output of ksp_view( the third test's output is from >> ksp_view_pre ), memory_view and also the petsc options. >> >>> >> >>> In all the tests, each core can access about 2G memory. In test3, >> there are 4223139840 non-zeros in the matrix. This will consume about >> 1.74M, using double precision. Considering some extra memory used to store >> integer index, 2G memory should still be way enough. >> >>> >> >>> Is there a way to find out which part of KSPSolve uses the most >> memory? >> >>> Thank you so much. 
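One way to narrow down where the memory goes, beyond the command-line options discussed in this thread, is to bracket the solve with PETSc's memory queries. The helper below is only a sketch with an invented name; it assumes the KSP, right-hand side, and solution vector already exist, and it reports numbers for the printing rank only.

#include <petscksp.h>

/* Sketch: report how much PetscMalloc'd space and process memory the solve
   itself adds, as seen by the rank that does the printing. */
PetscErrorCode SolveAndReportMemory(KSP ksp, Vec b, Vec x)
{
  PetscLogDouble malloc0,malloc1,rss0,rss1;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscMallocGetCurrentUsage(&malloc0);CHKERRQ(ierr);
  ierr = PetscMemoryGetCurrentUsage(&rss0);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = PetscMallocGetCurrentUsage(&malloc1);CHKERRQ(ierr);
  ierr = PetscMemoryGetCurrentUsage(&rss1);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"KSPSolve added %g bytes (PetscMalloc), %g bytes (process)\n",
                     (double)(malloc1-malloc0),(double)(rss1-rss0));CHKERRQ(ierr);
  PetscFunctionReturn(0);
}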
>> >>> >> >>> BTW, there are 4 options remains unused and I don't understand why >> they are omitted: >> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >> >>> >> >>> >> >>> Regards, >> >>> Frank >> >>> >> >>> On 07/13/2016 05:47 PM, Dave May wrote: >> >>>> >> >>>> On 14 July 2016 at 01:07, frank wrote: >> >>>> Hi Dave, >> >>>> >> >>>> Sorry for the late reply. >> >>>> Thank you so much for your detailed reply. >> >>>> >> >>>> I have a question about the estimation of the memory usage. There >> are 4223139840 allocated non-zeros and 18432 MPI processes. Double >> precision is used. So the memory per process is: >> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >> >>>> Did I do sth wrong here? Because this seems too small. >> >>>> >> >>>> No - I totally f***ed it up. You are correct. That'll teach me for >> fumbling around with my iphone calculator and not using my brain. (Note >> that to convert to MB just divide by 1e6, not 1024^2 - although I >> apparently cannot convert between units correctly....) >> >>>> >> >>>> From the PETSc objects associated with the solver, It looks like it >> _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities >> are: somewhere in your usage of PETSc you've introduced a memory leak; >> PETSc is doing a huge over allocation (e.g. as per our discussion of >> MatPtAP); or in your application code there are other objects you have >> forgotten to log the memory for. >> >>>> >> >>>> >> >>>> >> >>>> I am running this job on Bluewater >> >>>> I am using the 7 points FD stencil in 3D. >> >>>> >> >>>> I thought so on both counts. >> >>>> >> >>>> I apologize that I made a stupid mistake in computing the memory per >> core. My settings render each core can access only 2G memory on average >> instead of 8G which I mentioned in previous email. I re-run the job with 8G >> memory per core on average and there is no "Out Of Memory" error. I would >> do more test to see if there is still some memory issue. >> >>>> >> >>>> Ok. I'd still like to know where the memory was being used since my >> estimates were off. >> >>>> >> >>>> >> >>>> Thanks, >> >>>> Dave >> >>>> >> >>>> Regards, >> >>>> Frank >> >>>> >> >>>> >> >>>> >> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >> >>>>> Hi Frank, >> >>>>> >> >>>>> >> >>>>> On 11 July 2016 at 19:14, frank wrote: >> >>>>> Hi Dave, >> >>>>> >> >>>>> I re-run the test using bjacobi as the preconditioner on the coarse >> mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. >> The petsc option file is attached. >> >>>>> I still got the "Out Of Memory" error. The error occurred before >> the linear solver finished one step. So I don't have the full info from >> ksp_view. The info from ksp_view_pre is attached. >> >>>>> >> >>>>> Okay - that is essentially useless (sorry) >> >>>>> >> >>>>> It seems to me that the error occurred when the decomposition was >> going to be changed. >> >>>>> >> >>>>> Based on what information? >> >>>>> Running with -info would give us more clues, but will create a ton >> of output. >> >>>>> Please try running the case which failed with -info >> >>>>> I had another test with a grid of 1536*128*384 and the same >> process mesh as above. There was no error. The ksp_view info is attached >> for comparison. >> >>>>> Thank you. 
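For reference, the conversion discussed above works out as follows, assuming the 4223139840 nonzeros are spread evenly over the 18432 ranks and stored in AIJ format with 8-byte values and 32-bit column indices:

4223139840 * 8 bytes = ~33.8e9 bytes of matrix values in total
33.8e9 / 18432 ranks = ~1.83e6 bytes, i.e. ~1.8 MB (~1.75 MiB) of values per rank
4223139840 * 4 bytes / 18432 ranks = ~0.9 MB of column indices per rank

so the fine-grid operator itself amounts to a few MB per rank, not GB, consistent with the correction conceded just above.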
>> >>>>> >> >>>>> >> >>>>> [3] Here is my crude estimate of your memory usage. >> >>>>> I'll target the biggest memory hogs only to get an order of >> magnitude estimate >> >>>>> >> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB >> per MPI rank assuming double precision. >> >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming 32 >> bit integers) >> >>>>> >> >>>>> * You use 5 levels of coarsening, so the other operators should >> represent (collectively) >> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the >> communicator with 18432 ranks. >> >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the >> communicator with 18432 ranks. >> >>>>> >> >>>>> * You use a reduction factor of 64, making the new communicator >> with 288 MPI ranks. >> >>>>> PCTelescope will first gather a temporary matrix associated with >> your coarse level operator assuming a comm size of 288 living on the comm >> with size 18432. >> >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core on >> the 288 ranks. >> >>>>> This matrix is then used to form a new MPIAIJ matrix on the >> subcomm, thus require another 32 MB per rank. >> >>>>> The temporary matrix is now destroyed. >> >>>>> >> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. >> >>>>> This requires 2 doubles per point in the DMDA. >> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >> >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on the >> sub-comm. >> >>>>> >> >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the >> resulting operator will have the same memory footprint as the unpermuted >> matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB >> are held in memory when the DMDA is provided. >> >>>>> >> >>>>> From my rough estimates, the worst case memory foot print for any >> given core, given your options is approximately >> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >> >>>>> This is way below 8 GB. >> >>>>> >> >>>>> Note this estimate completely ignores: >> >>>>> (1) the memory required for the restriction operator, >> >>>>> (2) the potential growth in the number of non-zeros per row due to >> Galerkin coarsening (I wished -ksp_view_pre reported the output from >> MatView so we could see the number of non-zeros required by the coarse >> level operators) >> >>>>> (3) all temporary vectors required by the CG solver, and those >> required by the smoothers. >> >>>>> (4) internal memory allocated by MatPtAP >> >>>>> (5) memory associated with IS's used within PCTelescope >> >>>>> >> >>>>> So either I am completely off in my estimates, or you have not >> carefully estimated the memory usage of your application code. Hopefully >> others might examine/correct my rough estimates >> >>>>> >> >>>>> Since I don't have your code I cannot access the latter. >> >>>>> Since I don't have access to the same machine you are running on, I >> think we need to take a step back. >> >>>>> >> >>>>> [1] What machine are you running on? Send me a URL if its available >> >>>>> >> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 >> point FD stencil) >> >>>>> If it's a 7 point FD stencil, we should be able to examine the >> memory usage of your solver configuration using a standard, light weight >> existing PETSc example, run on your machine at the same scale. 
>> >>>>> This would hopefully enable us to correctly evaluate the actual >> memory usage required by the solver configuration you are using. >> >>>>> >> >>>>> Thanks, >> >>>>> Dave >> >>>>> >> >>>>> >> >>>>> Frank >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >> >>>>>> >> >>>>>> On Saturday, 9 July 2016, frank wrote: >> >>>>>> Hi Barry and Dave, >> >>>>>> >> >>>>>> Thank both of you for the advice. >> >>>>>> >> >>>>>> @Barry >> >>>>>> I made a mistake in the file names in last email. I attached the >> correct files this time. >> >>>>>> For all the three tests, 'Telescope' is used as the coarse >> preconditioner. >> >>>>>> >> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >> >>>>>> Part of the memory usage: Vector 125 124 3971904 >> 0. >> >>>>>> Matrix 101 101 >> 9462372 0 >> >>>>>> >> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> >>>>>> Part of the memory usage: Vector 125 124 681672 >> 0. >> >>>>>> Matrix 101 101 >> 1462180 0. >> >>>>>> >> >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. >> In my case, it is about 6 times. >> >>>>>> >> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain >> per process: 32*32*32 >> >>>>>> Here I get the out of memory error. >> >>>>>> >> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to set >> -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >> >>>>>> The linear solver didn't work in this case. Petsc output some >> errors. >> >>>>>> >> >>>>>> @Dave >> >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse >> mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >> >>>>>> If my set the levels correctly, then on the last coarse mesh of MG >> where it calls 'Telescope', the sub-domain per process is 2*2*2. >> >>>>>> On the last coarse mesh of 'Telescope', there is only one grid >> point per process. >> >>>>>> I still got the OOM error. The detailed petsc option file is >> attached. >> >>>>>> >> >>>>>> Do you understand the expected memory usage for the particular >> parallel LU implementation you are using? I don't (seriously). Replace LU >> with bjacobi and re-run this test. My point about solver debugging is still >> valid. >> >>>>>> >> >>>>>> And please send the result of KSPView so we can see what is >> actually used in the computations >> >>>>>> >> >>>>>> Thanks >> >>>>>> Dave >> >>>>>> >> >>>>>> >> >>>>>> Thank you so much. >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >> >>>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >> >>>>>> >> >>>>>> Hi Barry, >> >>>>>> >> >>>>>> Thank you for you advice. >> >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and >> the process mesh is 96*8*24. >> >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and >> 'telescope' is used as the preconditioner at the coarse mesh. >> >>>>>> The system gives me the "Out of Memory" error before the linear >> system is completely solved. >> >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that the >> error occurs when it reaches the coarse mesh. >> >>>>>> >> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is >> 96*8*24. The 3rd test uses the >> same grid but a different process mesh 48*4*12. >> >>>>>> Are you sure this is right? The total matrix and vector memory >> usage goes from 2nd test >> >>>>>> Vector 384 383 8,193,712 0. >> >>>>>> Matrix 103 103 11,508,688 0. 
>> >>>>>> to 3rd test >> >>>>>> Vector 384 383 1,590,520 0. >> >>>>>> Matrix 103 103 3,508,664 0. >> >>>>>> that is the memory usage got smaller but if you have only 1/8th >> the processes and the same grid it should have gotten about 8 times bigger. >> Did you maybe cut the grid by a factor of 8 also? If so that still doesn't >> explain it because the memory usage changed by a factor of 5 something for >> the vectors and 3 something for the matrices. >> >>>>>> >> >>>>>> >> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the >> same in 1st test. The linear solver works fine in both test. >> >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory >> info is from the option '-log_summary'. I tried to use '-momery_info' as >> you suggested, but in my case petsc treated it as an unused option. It >> output nothing about the memory. Do I need to add sth to my code so I can >> use '-memory_info'? >> >>>>>> Sorry, my mistake the option is -memory_view >> >>>>>> >> >>>>>> Can you run the one case with -memory_view and -mg_coarse >> jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much >> memory is used without the telescope? Also run case 2 the same way. >> >>>>>> >> >>>>>> Barry >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> In both tests the memory usage is not large. >> >>>>>> >> >>>>>> It seems to me that it might be the 'telescope' preconditioner >> that allocated a lot of memory and caused the error in the 1st test. >> >>>>>> Is there is a way to show how much memory it allocated? >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >> >>>>>> Frank, >> >>>>>> >> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP >> before the solve so hopefully it gets that far. >> >>>>>> >> >>>>>> Please run the problem that does fit with -memory_info when >> the problem completes it will show the "high water mark" for PETSc >> allocated memory and total memory used. We first want to look at these >> numbers to see if it is using more memory than you expect. You could also >> run with say half the grid spacing to see how the memory usage scaled with >> the increase in grid points. Make the runs also with -log_view and send all >> the output from these options. >> >>>>>> >> >>>>>> Barry >> >>>>>> >> >>>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >> >>>>>> >> >>>>>> Hi, >> >>>>>> >> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to >> solve a linear system in parallel. >> >>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse >> mesh for its good performance. >> >>>>>> The petsc options file is attached. >> >>>>>> >> >>>>>> The domain is a 3d box. >> >>>>>> It works well when the grid is 1536*128*384 and the process mesh >> is 96*8*24. When I double the size of grid and >> keep the same process mesh and petsc options, I get an >> "out of memory" error from the super-cluster I am using. >> >>>>>> Each process has access to at least 8G memory, which should be >> more than enough for my application. I am sure that all the other parts of >> my code( except the linear solver ) do not use much memory. So I doubt if >> there is something wrong with the linear solver. >> >>>>>> The error occurs before the linear system is completely solved so >> I don't have the info from ksp view. I am not able to re-produce the error >> with a smaller problem either. >> >>>>>> In addition, I tried to use the block jacobi as the >> preconditioner with the same grid and same decomposition. 
The linear solver >> runs extremely slow but there is no memory error. >> >>>>>> >> >>>>>> How can I diagnose what exactly cause the error? >> >>>>>> Thank you so much. >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> > _options.txt> >> >>>>>> >> >>>>> >> >>>> >> >>> > emory2.txt> >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Sep 14 23:05:48 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 15 Sep 2016 06:05:48 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> Message-ID: On Thursday, 15 September 2016, Dave May wrote: > > > On Thursday, 15 September 2016, frank > wrote: > >> Hi, >> >> I write a simple code to re-produce the error. I hope this can help to >> diagnose the problem. >> The code just solves a 3d poisson equation. >> > > Why is the stencil width a runtime parameter?? And why is the default > value 2? For 7-pnt FD Laplace, you only need a stencil width of 1. > > Was this choice made to mimic something in the real application code? > Please ignore - I misunderstood your usage of the param set by -P > > >> >> I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. >> That's when I re-produce the OOM error. Each core has about 2G memory. >> I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The ksp >> solver works fine. >> I attached the code, ksp_view_pre's output and my petsc option file. >> >> Thank you. >> Frank >> >> On 09/09/2016 06:38 PM, Hengjie Wang wrote: >> >> Hi Barry, >> >> I checked. On the supercomputer, I had the option "-ksp_view_pre" but it >> is not in file I sent you. I am sorry for the confusion. >> >> Regards, >> Frank >> >> On Friday, September 9, 2016, Barry Smith wrote: >> >>> >>> > On Sep 9, 2016, at 3:11 PM, frank wrote: >>> > >>> > Hi Barry, >>> > >>> > I think the first KSP view output is from -ksp_view_pre. Before I >>> submitted the test, I was not sure whether there would be OOM error or not. >>> So I added both -ksp_view_pre and -ksp_view. >>> >>> But the options file you sent specifically does NOT list the >>> -ksp_view_pre so how could it be from that? >>> >>> Sorry to be pedantic but I've spent too much time in the past trying >>> to debug from incorrect information and want to make sure that the >>> information I have is correct before thinking. Please recheck exactly what >>> happened. Rerun with the exact input file you emailed if that is needed. >>> >>> Barry >>> >>> > >>> > Frank >>> > >>> > >>> > On 09/09/2016 12:38 PM, Barry Smith wrote: >>> >> Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt >>> has only one KSPView in it? Did you run two different solves in the 2 case >>> but not the one? >>> >> >>> >> Barry >>> >> >>> >> >>> >> >>> >>> On Sep 9, 2016, at 10:56 AM, frank wrote: >>> >>> >>> >>> Hi, >>> >>> >>> >>> I want to continue digging into the memory problem here. >>> >>> I did find a work around in the past, which is to use less cores per >>> node so that each core has 8G memory. However this is deficient and >>> expensive. I hope to locate the place that uses the most memory. 
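One practical way to locate "the place that uses the most memory", not spelled out in the thread, is to give the solve its own PETSc logging stage so that the object and memory tables printed by -log_summary / -log_view are reported for that stage separately from the rest of the run. The helper below is a sketch with an invented name; it assumes the KSP and vectors already exist.

#include <petscksp.h>

/* Sketch: wrap KSPSolve in its own logging stage so its object creations and
   memory show up separately in the -log_view / -log_summary output. */
PetscErrorCode SolveInOwnStage(KSP ksp, Vec b, Vec x)
{
  PetscLogStage  stage;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscLogStageRegister("KSPSolve stage",&stage);CHKERRQ(ierr);
  ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = PetscLogStagePop();CHKERRQ(ierr);
  PetscFunctionReturn(0);
}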
>>> >>> >>> >>> Here is a brief summary of the tests I did in past: >>> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >>> >>> Maximum (over computational time) process memory: total >>> 7.0727e+08 >>> >>> Current process memory: >>> total 7.0727e+08 >>> >>> Maximum (over computational time) space PetscMalloc()ed: total >>> 6.3908e+11 >>> >>> Current space PetscMalloc()ed: >>> total 1.8275e+09 >>> >>> >>> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >>> >>> Maximum (over computational time) process memory: total >>> 5.9431e+09 >>> >>> Current process memory: >>> total 5.9431e+09 >>> >>> Maximum (over computational time) space PetscMalloc()ed: total >>> 5.3202e+12 >>> >>> Current space PetscMalloc()ed: >>> total 5.4844e+09 >>> >>> >>> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >>> >>> OOM( Out Of Memory ) killer of the supercomputer terminated the >>> job during "KSPSolve". >>> >>> >>> >>> I attached the output of ksp_view( the third test's output is from >>> ksp_view_pre ), memory_view and also the petsc options. >>> >>> >>> >>> In all the tests, each core can access about 2G memory. In test3, >>> there are 4223139840 non-zeros in the matrix. This will consume about >>> 1.74M, using double precision. Considering some extra memory used to store >>> integer index, 2G memory should still be way enough. >>> >>> >>> >>> Is there a way to find out which part of KSPSolve uses the most >>> memory? >>> >>> Thank you so much. >>> >>> >>> >>> BTW, there are 4 options remains unused and I don't understand why >>> they are omitted: >>> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >>> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >>> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >>> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >>> >>> >>> >>> >>> >>> Regards, >>> >>> Frank >>> >>> >>> >>> On 07/13/2016 05:47 PM, Dave May wrote: >>> >>>> >>> >>>> On 14 July 2016 at 01:07, frank wrote: >>> >>>> Hi Dave, >>> >>>> >>> >>>> Sorry for the late reply. >>> >>>> Thank you so much for your detailed reply. >>> >>>> >>> >>>> I have a question about the estimation of the memory usage. There >>> are 4223139840 allocated non-zeros and 18432 MPI processes. Double >>> precision is used. So the memory per process is: >>> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >>> >>>> Did I do sth wrong here? Because this seems too small. >>> >>>> >>> >>>> No - I totally f***ed it up. You are correct. That'll teach me for >>> fumbling around with my iphone calculator and not using my brain. (Note >>> that to convert to MB just divide by 1e6, not 1024^2 - although I >>> apparently cannot convert between units correctly....) >>> >>>> >>> >>>> From the PETSc objects associated with the solver, It looks like it >>> _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities >>> are: somewhere in your usage of PETSc you've introduced a memory leak; >>> PETSc is doing a huge over allocation (e.g. as per our discussion of >>> MatPtAP); or in your application code there are other objects you have >>> forgotten to log the memory for. >>> >>>> >>> >>>> >>> >>>> >>> >>>> I am running this job on Bluewater >>> >>>> I am using the 7 points FD stencil in 3D. >>> >>>> >>> >>>> I thought so on both counts. >>> >>>> >>> >>>> I apologize that I made a stupid mistake in computing the memory >>> per core. My settings render each core can access only 2G memory on average >>> instead of 8G which I mentioned in previous email. 
I re-run the job with 8G >>> memory per core on average and there is no "Out Of Memory" error. I would >>> do more test to see if there is still some memory issue. >>> >>>> >>> >>>> Ok. I'd still like to know where the memory was being used since my >>> estimates were off. >>> >>>> >>> >>>> >>> >>>> Thanks, >>> >>>> Dave >>> >>>> >>> >>>> Regards, >>> >>>> Frank >>> >>>> >>> >>>> >>> >>>> >>> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >>> >>>>> Hi Frank, >>> >>>>> >>> >>>>> >>> >>>>> On 11 July 2016 at 19:14, frank wrote: >>> >>>>> Hi Dave, >>> >>>>> >>> >>>>> I re-run the test using bjacobi as the preconditioner on the >>> coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is >>> 96*8*24. The petsc option file is attached. >>> >>>>> I still got the "Out Of Memory" error. The error occurred before >>> the linear solver finished one step. So I don't have the full info from >>> ksp_view. The info from ksp_view_pre is attached. >>> >>>>> >>> >>>>> Okay - that is essentially useless (sorry) >>> >>>>> >>> >>>>> It seems to me that the error occurred when the decomposition was >>> going to be changed. >>> >>>>> >>> >>>>> Based on what information? >>> >>>>> Running with -info would give us more clues, but will create a ton >>> of output. >>> >>>>> Please try running the case which failed with -info >>> >>>>> I had another test with a grid of 1536*128*384 and the same >>> process mesh as above. There was no error. The ksp_view info is attached >>> for comparison. >>> >>>>> Thank you. >>> >>>>> >>> >>>>> >>> >>>>> [3] Here is my crude estimate of your memory usage. >>> >>>>> I'll target the biggest memory hogs only to get an order of >>> magnitude estimate >>> >>>>> >>> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB >>> per MPI rank assuming double precision. >>> >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming >>> 32 bit integers) >>> >>>>> >>> >>>>> * You use 5 levels of coarsening, so the other operators should >>> represent (collectively) >>> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on >>> the communicator with 18432 ranks. >>> >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the >>> communicator with 18432 ranks. >>> >>>>> >>> >>>>> * You use a reduction factor of 64, making the new communicator >>> with 288 MPI ranks. >>> >>>>> PCTelescope will first gather a temporary matrix associated with >>> your coarse level operator assuming a comm size of 288 living on the comm >>> with size 18432. >>> >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core >>> on the 288 ranks. >>> >>>>> This matrix is then used to form a new MPIAIJ matrix on the >>> subcomm, thus require another 32 MB per rank. >>> >>>>> The temporary matrix is now destroyed. >>> >>>>> >>> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. >>> >>>>> This requires 2 doubles per point in the DMDA. >>> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >>> >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on >>> the sub-comm. >>> >>>>> >>> >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the >>> resulting operator will have the same memory footprint as the unpermuted >>> matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB >>> are held in memory when the DMDA is provided. 
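As a quick check of the coarse-operator series quoted above (assuming a 2.1 GB fine-grid operator and a factor-of-8 reduction in size per level): 2.1/8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 = 0.2625 + 0.0328 + 0.0041 + 0.0005, which is approximately 0.30 GB, i.e. the ~300 MB per rank figure used in the total below.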
>>> >>>>> >>> >>>>> From my rough estimates, the worst case memory foot print for any >>> given core, given your options is approximately >>> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >>> >>>>> This is way below 8 GB. >>> >>>>> >>> >>>>> Note this estimate completely ignores: >>> >>>>> (1) the memory required for the restriction operator, >>> >>>>> (2) the potential growth in the number of non-zeros per row due to >>> Galerkin coarsening (I wished -ksp_view_pre reported the output from >>> MatView so we could see the number of non-zeros required by the coarse >>> level operators) >>> >>>>> (3) all temporary vectors required by the CG solver, and those >>> required by the smoothers. >>> >>>>> (4) internal memory allocated by MatPtAP >>> >>>>> (5) memory associated with IS's used within PCTelescope >>> >>>>> >>> >>>>> So either I am completely off in my estimates, or you have not >>> carefully estimated the memory usage of your application code. Hopefully >>> others might examine/correct my rough estimates >>> >>>>> >>> >>>>> Since I don't have your code I cannot access the latter. >>> >>>>> Since I don't have access to the same machine you are running on, >>> I think we need to take a step back. >>> >>>>> >>> >>>>> [1] What machine are you running on? Send me a URL if its available >>> >>>>> >>> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 >>> point FD stencil) >>> >>>>> If it's a 7 point FD stencil, we should be able to examine the >>> memory usage of your solver configuration using a standard, light weight >>> existing PETSc example, run on your machine at the same scale. >>> >>>>> This would hopefully enable us to correctly evaluate the actual >>> memory usage required by the solver configuration you are using. >>> >>>>> >>> >>>>> Thanks, >>> >>>>> Dave >>> >>>>> >>> >>>>> >>> >>>>> Frank >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >>> >>>>>> >>> >>>>>> On Saturday, 9 July 2016, frank wrote: >>> >>>>>> Hi Barry and Dave, >>> >>>>>> >>> >>>>>> Thank both of you for the advice. >>> >>>>>> >>> >>>>>> @Barry >>> >>>>>> I made a mistake in the file names in last email. I attached the >>> correct files this time. >>> >>>>>> For all the three tests, 'Telescope' is used as the coarse >>> preconditioner. >>> >>>>>> >>> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>> >>>>>> Part of the memory usage: Vector 125 124 3971904 >>> 0. >>> >>>>>> Matrix 101 101 >>> 9462372 0 >>> >>>>>> >>> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>> >>>>>> Part of the memory usage: Vector 125 124 681672 >>> 0. >>> >>>>>> Matrix 101 101 >>> 1462180 0. >>> >>>>>> >>> >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. >>> In my case, it is about 6 times. >>> >>>>>> >>> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain >>> per process: 32*32*32 >>> >>>>>> Here I get the out of memory error. >>> >>>>>> >>> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to >>> set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>> >>>>>> The linear solver didn't work in this case. Petsc output some >>> errors. >>> >>>>>> >>> >>>>>> @Dave >>> >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse >>> mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>> >>>>>> If my set the levels correctly, then on the last coarse mesh of >>> MG where it calls 'Telescope', the sub-domain per process is 2*2*2. 
>>> >>>>>> On the last coarse mesh of 'Telescope', there is only one grid >>> point per process. >>> >>>>>> I still got the OOM error. The detailed petsc option file is >>> attached. >>> >>>>>> >>> >>>>>> Do you understand the expected memory usage for the particular >>> parallel LU implementation you are using? I don't (seriously). Replace LU >>> with bjacobi and re-run this test. My point about solver debugging is still >>> valid. >>> >>>>>> >>> >>>>>> And please send the result of KSPView so we can see what is >>> actually used in the computations >>> >>>>>> >>> >>>>>> Thanks >>> >>>>>> Dave >>> >>>>>> >>> >>>>>> >>> >>>>>> Thank you so much. >>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>> >>>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>> >>>>>> >>> >>>>>> Hi Barry, >>> >>>>>> >>> >>>>>> Thank you for you advice. >>> >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and >>> the process mesh is 96*8*24. >>> >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and >>> 'telescope' is used as the preconditioner at the coarse mesh. >>> >>>>>> The system gives me the "Out of Memory" error before the linear >>> system is completely solved. >>> >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that the >>> error occurs when it reaches the coarse mesh. >>> >>>>>> >>> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is >>> 96*8*24. The 3rd test uses the >>> same grid but a different process mesh 48*4*12. >>> >>>>>> Are you sure this is right? The total matrix and vector >>> memory usage goes from 2nd test >>> >>>>>> Vector 384 383 8,193,712 0. >>> >>>>>> Matrix 103 103 11,508,688 0. >>> >>>>>> to 3rd test >>> >>>>>> Vector 384 383 1,590,520 0. >>> >>>>>> Matrix 103 103 3,508,664 0. >>> >>>>>> that is the memory usage got smaller but if you have only 1/8th >>> the processes and the same grid it should have gotten about 8 times bigger. >>> Did you maybe cut the grid by a factor of 8 also? If so that still doesn't >>> explain it because the memory usage changed by a factor of 5 something for >>> the vectors and 3 something for the matrices. >>> >>>>>> >>> >>>>>> >>> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the >>> same in 1st test. The linear solver works fine in both test. >>> >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory >>> info is from the option '-log_summary'. I tried to use '-momery_info' as >>> you suggested, but in my case petsc treated it as an unused option. It >>> output nothing about the memory. Do I need to add sth to my code so I can >>> use '-memory_info'? >>> >>>>>> Sorry, my mistake the option is -memory_view >>> >>>>>> >>> >>>>>> Can you run the one case with -memory_view and -mg_coarse >>> jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much >>> memory is used without the telescope? Also run case 2 the same way. >>> >>>>>> >>> >>>>>> Barry >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> In both tests the memory usage is not large. >>> >>>>>> >>> >>>>>> It seems to me that it might be the 'telescope' preconditioner >>> that allocated a lot of memory and caused the error in the 1st test. >>> >>>>>> Is there is a way to show how much memory it allocated? 
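(For reference, the per-process high-water marks that -memory_view prints at PetscFinalize can also be queried directly from the calling code around the solve. A minimal, untested Fortran sketch, assuming the Fortran stubs for these PETSc routines are present in your build; maxrss and maxmalloc are new local variables introduced here for illustration:

  PetscLogDouble :: maxrss, maxmalloc
  ! enable tracking of the resident-set-size high-water mark (call once, early on)
  CALL PetscMemorySetGetMaximumUsage( ierr )
  ! ... set up and solve as usual ...
  CALL KSPSolve( ksp, b, x, ierr )
  ! per-process high-water marks, reported in bytes
  CALL PetscMemoryGetMaximumUsage( maxrss, ierr )
  CALL PetscMallocGetMaximumUsage( maxmalloc, ierr )
  PRINT *, 'max rss (bytes):', maxrss, '  max PetscMalloc()ed (bytes):', maxmalloc

The values are per process, so printing them from every rank, or reducing with MPI_MAX first, shows whether a few ranks dominate the usage.)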
>>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> >>>>>> Frank, >>> >>>>>> >>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP >>> before the solve so hopefully it gets that far. >>> >>>>>> >>> >>>>>> Please run the problem that does fit with -memory_info when >>> the problem completes it will show the "high water mark" for PETSc >>> allocated memory and total memory used. We first want to look at these >>> numbers to see if it is using more memory than you expect. You could also >>> run with say half the grid spacing to see how the memory usage scaled with >>> the increase in grid points. Make the runs also with -log_view and send all >>> the output from these options. >>> >>>>>> >>> >>>>>> Barry >>> >>>>>> >>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>> >>>>>> >>> >>>>>> Hi, >>> >>>>>> >>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to >>> solve a linear system in parallel. >>> >>>>>> I chose to use the 'Telescope' as the preconditioner on the >>> coarse mesh for its good performance. >>> >>>>>> The petsc options file is attached. >>> >>>>>> >>> >>>>>> The domain is a 3d box. >>> >>>>>> It works well when the grid is 1536*128*384 and the process mesh >>> is 96*8*24. When I double the size of grid and >>> keep the same process mesh and petsc options, I get an >>> "out of memory" error from the super-cluster I am using. >>> >>>>>> Each process has access to at least 8G memory, which should be >>> more than enough for my application. I am sure that all the other parts of >>> my code( except the linear solver ) do not use much memory. So I doubt if >>> there is something wrong with the linear solver. >>> >>>>>> The error occurs before the linear system is completely solved so >>> I don't have the info from ksp view. I am not able to re-produce the error >>> with a smaller problem either. >>> >>>>>> In addition, I tried to use the block jacobi as the >>> preconditioner with the same grid and same decomposition. The linear solver >>> runs extremely slow but there is no memory error. >>> >>>>>> >>> >>>>>> How can I diagnose what exactly cause the error? >>> >>>>>> Thank you so much. >>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> >> _options.txt> >>> >>>>>> >>> >>>>> >>> >>>> >>> >>> >> emory2.txt> >>> > >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hengjiew at uci.edu Thu Sep 15 01:03:39 2016 From: hengjiew at uci.edu (Hengjie Wang) Date: Wed, 14 Sep 2016 23:03:39 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> Message-ID: <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> Hi Dave, Sorry, I should have put more comment to explain the code. The number of process in each dimension is the same: Px = Py=Pz=P. So is the domain size. So if the you want to run the code for a 512^3 grid points on 16^3 cores, you need to set "-N 512 -P 16" in the command line. I add more comments and also fix an error in the attached code. ( The error only effects the accuracy of solution but not the memory usage. ) Thank you. 
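(So, to take the failing case discussed later in the thread, the 1024^3 mesh on a 32*32*32 process grid would be launched as, for example:

  mpirun -n 32768 ./test_ksp.exe -N 1024 -P 32

with the solver options read from petsc_options.txt in the working directory, as noted in the header of the attached source.)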
Frank On 9/14/2016 9:05 PM, Dave May wrote: > > > On Thursday, 15 September 2016, Dave May > wrote: > > > > On Thursday, 15 September 2016, frank > wrote: > > Hi, > > I write a simple code to re-produce the error. I hope this can > help to diagnose the problem. > The code just solves a 3d poisson equation. > > > Why is the stencil width a runtime parameter?? And why is the > default value 2? For 7-pnt FD Laplace, you only need a stencil > width of 1. > > Was this choice made to mimic something in the real application code? > > > Please ignore - I misunderstood your usage of the param set by -P > > > I run the code on a 1024^3 mesh. The process partition is 32 * > 32 * 32. That's when I re-produce the OOM error. Each core has > about 2G memory. > I also run the code on a 512^3 mesh with 16 * 16 * 16 > processes. The ksp solver works fine. > I attached the code, ksp_view_pre's output and my petsc option > file. > > Thank you. > Frank > > On 09/09/2016 06:38 PM, Hengjie Wang wrote: >> Hi Barry, >> >> I checked. On the supercomputer, I had the option >> "-ksp_view_pre" but it is not in file I sent you. I am sorry >> for the confusion. >> >> Regards, >> Frank >> >> On Friday, September 9, 2016, Barry Smith >> wrote: >> >> >> > On Sep 9, 2016, at 3:11 PM, frank wrote: >> > >> > Hi Barry, >> > >> > I think the first KSP view output is from >> -ksp_view_pre. Before I submitted the test, I was not >> sure whether there would be OOM error or not. So I added >> both -ksp_view_pre and -ksp_view. >> >> But the options file you sent specifically does NOT >> list the -ksp_view_pre so how could it be from that? >> >> Sorry to be pedantic but I've spent too much time in >> the past trying to debug from incorrect information and >> want to make sure that the information I have is correct >> before thinking. Please recheck exactly what happened. >> Rerun with the exact input file you emailed if that is >> needed. >> >> Barry >> >> > >> > Frank >> > >> > >> > On 09/09/2016 12:38 PM, Barry Smith wrote: >> >> Why does ksp_view2.txt have two KSP views in it >> while ksp_view1.txt has only one KSPView in it? Did you >> run two different solves in the 2 case but not the one? >> >> >> >> Barry >> >> >> >> >> >> >> >>> On Sep 9, 2016, at 10:56 AM, frank >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> I want to continue digging into the memory problem here. >> >>> I did find a work around in the past, which is to use >> less cores per node so that each core has 8G memory. >> However this is deficient and expensive. I hope to locate >> the place that uses the most memory. >> >>> >> >>> Here is a brief summary of the tests I did in past: >> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >> >>> Maximum (over computational time) process memory: >> total 7.0727e+08 >> >>> Current process memory: >> total 7.0727e+08 >> >>> Maximum (over computational time) space >> PetscMalloc()ed: total 6.3908e+11 >> >>> Current space PetscMalloc()ed: >> total 1.8275e+09 >> >>> >> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >> >>> Maximum (over computational time) process memory: >> total 5.9431e+09 >> >>> Current process memory: >> total 5.9431e+09 >> >>> Maximum (over computational time) space >> PetscMalloc()ed: total 5.3202e+12 >> >>> Current space PetscMalloc()ed: >> total 5.4844e+09 >> >>> >> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >> >>> OOM( Out Of Memory ) killer of the supercomputer >> terminated the job during "KSPSolve". 
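(Reading the memory summary lines above, which come from -memory_view: if "total" is, as usual, the sum over all MPI ranks, the per-rank PetscMalloc() high-water marks work out to roughly 6.3908e+11 / (48*4*12) = 2.8e+08 bytes, about 280 MB, for Test1, and 5.3202e+12 / (96*8*24) = 2.9e+08 bytes, about 290 MB, for Test2, both well below the roughly 2 GB available per core in these runs.)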
>> >>> >> >>> I attached the output of ksp_view( the third test's >> output is from ksp_view_pre ), memory_view and also the >> petsc options. >> >>> >> >>> In all the tests, each core can access about 2G >> memory. In test3, there are 4223139840 non-zeros in the >> matrix. This will consume about 1.74M, using double >> precision. Considering some extra memory used to store >> integer index, 2G memory should still be way enough. >> >>> >> >>> Is there a way to find out which part of KSPSolve >> uses the most memory? >> >>> Thank you so much. >> >>> >> >>> BTW, there are 4 options remains unused and I don't >> understand why they are omitted: >> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >> >>> >> >>> >> >>> Regards, >> >>> Frank >> >>> >> >>> On 07/13/2016 05:47 PM, Dave May wrote: >> >>>> >> >>>> On 14 July 2016 at 01:07, frank >> wrote: >> >>>> Hi Dave, >> >>>> >> >>>> Sorry for the late reply. >> >>>> Thank you so much for your detailed reply. >> >>>> >> >>>> I have a question about the estimation of the memory >> usage. There are 4223139840 allocated non-zeros and 18432 >> MPI processes. Double precision is used. So the memory >> per process is: >> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >> >>>> Did I do sth wrong here? Because this seems too small. >> >>>> >> >>>> No - I totally f***ed it up. You are correct. >> That'll teach me for fumbling around with my iphone >> calculator and not using my brain. (Note that to convert >> to MB just divide by 1e6, not 1024^2 - although I >> apparently cannot convert between units correctly....) >> >>>> >> >>>> From the PETSc objects associated with the solver, >> It looks like it _should_ run with 2GB per MPI rank. >> Sorry for my mistake. Possibilities are: somewhere in >> your usage of PETSc you've introduced a memory leak; >> PETSc is doing a huge over allocation (e.g. as per our >> discussion of MatPtAP); or in your application code there >> are other objects you have forgotten to log the memory for. >> >>>> >> >>>> >> >>>> >> >>>> I am running this job on Bluewater >> >>>> I am using the 7 points FD stencil in 3D. >> >>>> >> >>>> I thought so on both counts. >> >>>> >> >>>> I apologize that I made a stupid mistake in >> computing the memory per core. My settings render each >> core can access only 2G memory on average instead of 8G >> which I mentioned in previous email. I re-run the job >> with 8G memory per core on average and there is no "Out >> Of Memory" error. I would do more test to see if there is >> still some memory issue. >> >>>> >> >>>> Ok. I'd still like to know where the memory was >> being used since my estimates were off. >> >>>> >> >>>> >> >>>> Thanks, >> >>>> Dave >> >>>> >> >>>> Regards, >> >>>> Frank >> >>>> >> >>>> >> >>>> >> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >> >>>>> Hi Frank, >> >>>>> >> >>>>> >> >>>>> On 11 July 2016 at 19:14, frank >> wrote: >> >>>>> Hi Dave, >> >>>>> >> >>>>> I re-run the test using bjacobi as the >> preconditioner on the coarse mesh of telescope. The Grid >> is 3072*256*768 and process mesh is 96*8*24. The petsc >> option file is attached. >> >>>>> I still got the "Out Of Memory" error. The error >> occurred before the linear solver finished one step. So I >> don't have the full info from ksp_view. The info from >> ksp_view_pre is attached. 
>> >>>>> >> >>>>> Okay - that is essentially useless (sorry) >> >>>>> >> >>>>> It seems to me that the error occurred when the >> decomposition was going to be changed. >> >>>>> >> >>>>> Based on what information? >> >>>>> Running with -info would give us more clues, but >> will create a ton of output. >> >>>>> Please try running the case which failed with -info >> >>>>> I had another test with a grid of 1536*128*384 and >> the same process mesh as above. There was no error. The >> ksp_view info is attached for comparison. >> >>>>> Thank you. >> >>>>> >> >>>>> >> >>>>> [3] Here is my crude estimate of your memory usage. >> >>>>> I'll target the biggest memory hogs only to get an >> order of magnitude estimate >> >>>>> >> >>>>> * The Fine grid operator contains 4223139840 >> non-zeros --> 1.8 GB per MPI rank assuming double precision. >> >>>>> The indices for the AIJ could amount to another 0.3 >> GB (assuming 32 bit integers) >> >>>>> >> >>>>> * You use 5 levels of coarsening, so the other >> operators should represent (collectively) >> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per >> MPI rank on the communicator with 18432 ranks. >> >>>>> The coarse grid should consume ~ 0.5 MB per MPI >> rank on the communicator with 18432 ranks. >> >>>>> >> >>>>> * You use a reduction factor of 64, making the new >> communicator with 288 MPI ranks. >> >>>>> PCTelescope will first gather a temporary matrix >> associated with your coarse level operator assuming a >> comm size of 288 living on the comm with size 18432. >> >>>>> This matrix will require approximately 0.5 * 64 = >> 32 MB per core on the 288 ranks. >> >>>>> This matrix is then used to form a new MPIAIJ >> matrix on the subcomm, thus require another 32 MB per rank. >> >>>>> The temporary matrix is now destroyed. >> >>>>> >> >>>>> * Because a DMDA is detected, a permutation matrix >> is assembled. >> >>>>> This requires 2 doubles per point in the DMDA. >> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >> >>>>> Thus the permutation matrix will require < 1 MB per >> MPI rank on the sub-comm. >> >>>>> >> >>>>> * Lastly, the matrix is permuted. This uses >> MatPtAP(), but the resulting operator will have the same >> memory footprint as the unpermuted matrix (32 MB). At any >> stage in PCTelescope, only 2 operators of size 32 MB are >> held in memory when the DMDA is provided. >> >>>>> >> >>>>> From my rough estimates, the worst case memory foot >> print for any given core, given your options is approximately >> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >> >>>>> This is way below 8 GB. >> >>>>> >> >>>>> Note this estimate completely ignores: >> >>>>> (1) the memory required for the restriction operator, >> >>>>> (2) the potential growth in the number of non-zeros >> per row due to Galerkin coarsening (I wished >> -ksp_view_pre reported the output from MatView so we >> could see the number of non-zeros required by the coarse >> level operators) >> >>>>> (3) all temporary vectors required by the CG >> solver, and those required by the smoothers. >> >>>>> (4) internal memory allocated by MatPtAP >> >>>>> (5) memory associated with IS's used within PCTelescope >> >>>>> >> >>>>> So either I am completely off in my estimates, or >> you have not carefully estimated the memory usage of your >> application code. Hopefully others might examine/correct >> my rough estimates >> >>>>> >> >>>>> Since I don't have your code I cannot access the >> latter. 
>> >>>>> Since I don't have access to the same machine you >> are running on, I think we need to take a step back. >> >>>>> >> >>>>> [1] What machine are you running on? Send me a URL >> if its available >> >>>>> >> >>>>> [2] What discretization are you using? (I am >> guessing a scalar 7 point FD stencil) >> >>>>> If it's a 7 point FD stencil, we should be able to >> examine the memory usage of your solver configuration >> using a standard, light weight existing PETSc example, >> run on your machine at the same scale. >> >>>>> This would hopefully enable us to correctly >> evaluate the actual memory usage required by the solver >> configuration you are using. >> >>>>> >> >>>>> Thanks, >> >>>>> Dave >> >>>>> >> >>>>> >> >>>>> Frank >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >> >>>>>> >> >>>>>> On Saturday, 9 July 2016, frank >> wrote: >> >>>>>> Hi Barry and Dave, >> >>>>>> >> >>>>>> Thank both of you for the advice. >> >>>>>> >> >>>>>> @Barry >> >>>>>> I made a mistake in the file names in last email. >> I attached the correct files this time. >> >>>>>> For all the three tests, 'Telescope' is used as >> the coarse preconditioner. >> >>>>>> >> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: >> 48*4*12 >> >>>>>> Part of the memory usage: Vector 125 >> 124 3971904 0. >> >>>>>> Matrix 101 101 9462372 0 >> >>>>>> >> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> >>>>>> Part of the memory usage: Vector 125 >> 124 681672 0. >> >>>>>> Matrix 101 101 1462180 0. >> >>>>>> >> >>>>>> In theory, the memory usage in Test1 should be 8 >> times of Test2. In my case, it is about 6 times. >> >>>>>> >> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: >> 96*8*24. Sub-domain per process: 32*32*32 >> >>>>>> Here I get the out of memory error. >> >>>>>> >> >>>>>> I tried to use -mg_coarse jacobi. In this way, I >> don't need to set -mg_coarse_ksp_type and >> -mg_coarse_pc_type explicitly, right? >> >>>>>> The linear solver didn't work in this case. Petsc >> output some errors. >> >>>>>> >> >>>>>> @Dave >> >>>>>> In test3, I use only one instance of 'Telescope'. >> On the coarse mesh of 'Telescope', I used LU as the >> preconditioner instead of SVD. >> >>>>>> If my set the levels correctly, then on the last >> coarse mesh of MG where it calls 'Telescope', the >> sub-domain per process is 2*2*2. >> >>>>>> On the last coarse mesh of 'Telescope', there is >> only one grid point per process. >> >>>>>> I still got the OOM error. The detailed petsc >> option file is attached. >> >>>>>> >> >>>>>> Do you understand the expected memory usage for >> the particular parallel LU implementation you are using? >> I don't (seriously). Replace LU with bjacobi and re-run >> this test. My point about solver debugging is still valid. >> >>>>>> >> >>>>>> And please send the result of KSPView so we can >> see what is actually used in the computations >> >>>>>> >> >>>>>> Thanks >> >>>>>> Dave >> >>>>>> >> >>>>>> >> >>>>>> Thank you so much. >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >> >>>>>> On Jul 6, 2016, at 4:19 PM, frank >> wrote: >> >>>>>> >> >>>>>> Hi Barry, >> >>>>>> >> >>>>>> Thank you for you advice. >> >>>>>> I tried three test. In the 1st test, the grid is >> 3072*256*768 and the process mesh is 96*8*24. >> >>>>>> The linear solver is 'cg' the preconditioner is >> 'mg' and 'telescope' is used as the preconditioner at the >> coarse mesh. 
>> >>>>>> The system gives me the "Out of Memory" error >> before the linear system is completely solved. >> >>>>>> The info from '-ksp_view_pre' is attached. I seems >> to me that the error occurs when it reaches the coarse mesh. >> >>>>>> >> >>>>>> The 2nd test uses a grid of 1536*128*384 and >> process mesh is 96*8*24. The 3rd >> test uses the same grid but a different >> process mesh 48*4*12. >> >>>>>> Are you sure this is right? The total matrix >> and vector memory usage goes from 2nd test >> >>>>>> Vector 384 383 >> 8,193,712 0. >> >>>>>> Matrix 103 103 >> 11,508,688 0. >> >>>>>> to 3rd test >> >>>>>> Vector 384 383 >> 1,590,520 0. >> >>>>>> Matrix 103 103 >> 3,508,664 0. >> >>>>>> that is the memory usage got smaller but if you >> have only 1/8th the processes and the same grid it should >> have gotten about 8 times bigger. Did you maybe cut the >> grid by a factor of 8 also? If so that still doesn't >> explain it because the memory usage changed by a factor >> of 5 something for the vectors and 3 something for the >> matrices. >> >>>>>> >> >>>>>> >> >>>>>> The linear solver and petsc options in 2nd and 3rd >> tests are the same in 1st test. The linear solver works >> fine in both test. >> >>>>>> I attached the memory usage of the 2nd and 3rd >> tests. The memory info is from the option '-log_summary'. >> I tried to use '-momery_info' as you suggested, but in my >> case petsc treated it as an unused option. It output >> nothing about the memory. Do I need to add sth to my code >> so I can use '-memory_info'? >> >>>>>> Sorry, my mistake the option is -memory_view >> >>>>>> >> >>>>>> Can you run the one case with -memory_view and >> -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't >> iterate forever) to see how much memory is used without >> the telescope? Also run case 2 the same way. >> >>>>>> >> >>>>>> Barry >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> In both tests the memory usage is not large. >> >>>>>> >> >>>>>> It seems to me that it might be the 'telescope' >> preconditioner that allocated a lot of memory and caused >> the error in the 1st test. >> >>>>>> Is there is a way to show how much memory it >> allocated? >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >> >>>>>> Frank, >> >>>>>> >> >>>>>> You can run with -ksp_view_pre to have it >> "view" the KSP before the solve so hopefully it gets that >> far. >> >>>>>> >> >>>>>> Please run the problem that does fit with >> -memory_info when the problem completes it will show the >> "high water mark" for PETSc allocated memory and total >> memory used. We first want to look at these numbers to >> see if it is using more memory than you expect. You could >> also run with say half the grid spacing to see how the >> memory usage scaled with the increase in grid points. >> Make the runs also with -log_view and send all the output >> from these options. >> >>>>>> >> >>>>>> Barry >> >>>>>> >> >>>>>> On Jul 5, 2016, at 5:23 PM, frank >> wrote: >> >>>>>> >> >>>>>> Hi, >> >>>>>> >> >>>>>> I am using the CG ksp solver and Multigrid >> preconditioner to solve a linear system in parallel. >> >>>>>> I chose to use the 'Telescope' as the >> preconditioner on the coarse mesh for its good performance. >> >>>>>> The petsc options file is attached. >> >>>>>> >> >>>>>> The domain is a 3d box. >> >>>>>> It works well when the grid is 1536*128*384 and >> the process mesh is 96*8*24. 
When I double the size of >> grid and keep the same >> process mesh and petsc options, I get an "out of memory" >> error from the super-cluster I am using. >> >>>>>> Each process has access to at least 8G memory, >> which should be more than enough for my application. I am >> sure that all the other parts of my code( except the >> linear solver ) do not use much memory. So I doubt if >> there is something wrong with the linear solver. >> >>>>>> The error occurs before the linear system is >> completely solved so I don't have the info from ksp view. >> I am not able to re-produce the error with a smaller >> problem either. >> >>>>>> In addition, I tried to use the block jacobi as >> the preconditioner with the same grid and same >> decomposition. The linear solver runs extremely slow but >> there is no memory error. >> >>>>>> >> >>>>>> How can I diagnose what exactly cause the error? >> >>>>>> Thank you so much. >> >>>>>> >> >>>>>> Frank >> >>>>>> >> >>>>>> >> >> >>>>>> >> >>>>> >> >>>> >> >>> >> >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ! Purpose: Test ksp solver with a 3d poisson eqn. ! ! Discrip: The domain is a periodic cube. The exact solution is sin(x)*sin(y)*sin(z). ! The eqn is discretized with 2nd order central difference. ! The rhs of the eqn would be -3*sin(x)*(y)*sin(z) * (-h**2), where h is the space step. ! In the matrix, the diagram element is 6 and the other 6 non-zero elements in the row is -1. ! ! Usage: The length of the cube is set by command line input. ! The number of process in each dimension is the same and also set by command line input. ! To run the code on 512^3 grid points and 16^3 cores: ! mpirun -n 4096 ./test_ksp.exe -N 512 -P 16 ! Petsc options are passed in a file "pestc_options.txt" in the working directory. ! ! Author: Frank ! hengjiew at uci.edu ! ! Date: 09/14/2016 PROGRAM test_ksp #include #include #include #include USE petscdmda USE petscsys IMPLICIT NONE INTEGER, PARAMETER :: pr = 8 CHARACTER(*), PARAMETER :: petsc_options_file="petsc_options.txt" DM :: decomp DMBoundaryType :: BType(3) INTEGER :: N ! length of cube integer :: P ! process # in each dimension INTEGER :: is, js ,ks, nxl, nyl, nzl ! subdomain index INTEGER :: rank INTEGER :: i, j, k REAL(pr) :: h ! space step Vec :: x,b ! solution and rhs REAL(pr), POINTER :: ptr_vec(:,:,:) => NULL() ! pointer to petsc vector Mat :: A MatNullSpace :: nullspace MatStencil :: row(4), col(4,7) PetscInt :: i1 = 1, i7 = 7 REAL(pr) :: value(7) KSP :: ksp PetscErrorCode :: ierr REAL(pr) :: q2 = 0.5_pr REAL(pr) :: erri = 0, err1 = 0 LOGICAL :: is_converged, is_set CALL PetscInitialize( petsc_options_file, ierr ) CALL MPI_Comm_rank( PETSC_COMM_WORLD, rank, ierr ) ! default values N = 64 P = 2 BType = DM_BOUNDARY_PERIODIC ! read domain size and process# from command line CALL PetscOptionsGetInt( PETSC_NULL_OBJECT, PETSC_NULL_CHARACTER, '-N', N, is_set, ierr ) CALL PetscOptionsGetInt( PETSC_NULL_OBJECT, PETSC_NULL_CHARACTER, '-P', P, is_set, ierr ) CALL DMDACreate3d( PETSC_COMM_WORLD, BType(1), BType(2), BType(3), & & DMDA_STENCIL_STAR, N, N, N, P, P, P, 1, 1, & & PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, & & decomp, ierr ) CALL DMDAGetCorners( decomp, is, js, ks, nxl, nyl, nzl, ierr) !WRITE(*,'(7I6)') rank, is, js, ks, nxl, nyl, nzl ! create vector for rhs and solution CALL DMCreateGlobalVector( decomp, b, ierr ) CALL VecDuplicate( b, x, ierr ) CALL VecSet( x, 0.0_pr, ierr ) ! 
create matrix
  CALL DMSetMatType( decomp, MATAIJ, ierr )
  CALL DMCreateMatrix( decomp, A, ierr )

  ! create ksp solver
  CALL KSPCreate( PETSC_COMM_WORLD, ksp, ierr )
  CALL KSPSetDM( ksp, decomp, ierr )
  CALL KSPSetDMActive( ksp, PETSC_FALSE, ierr )
  CALL KSPSetFromOptions( ksp, ierr )

  ! create nullspace, periodic bc give the matrix a nullspace
  CALL MatNullSpaceCreate( PETSC_COMM_WORLD, PETSC_TRUE, PETSC_NULL_INTEGER, &
     & PETSC_NULL_INTEGER, nullspace, ierr )

  ! grid spacing; must be set before it is used in the rhs loop below
  h = 1.0_pr / N

  ! set rhs
  CALL DMDAVecGetArrayF90( decomp, b, ptr_vec, ierr )
  DO i = is, is+nxl-1
    DO j = js, js+nyl-1
      DO k = ks, ks+nzl-1
        ptr_vec(i,j,k) = -3 * SIN((i+q2)*h) * SIN((j+q2)*h) * SIN((k+q2)*h)
      END DO
    END DO
  END DO
  ptr_vec = - h**2 * ptr_vec

  ! assemble rhs
  CALL VecAssemblyBegin( b, ierr )
  CALL VecAssemblyEnd( b, ierr )
  CALL DMDAVecRestoreArrayF90( decomp, b, ptr_vec, ierr )

  ! set matrix
  DO i = is, is+nxl-1
    DO j = js, js+nyl-1
      DO k = ks, ks+nzl-1
        ! row index of current point
        row(MatStencil_i) = i
        row(MatStencil_j) = j
        row(MatStencil_k) = k
        ! column index of current point and its neighbors
        col(MatStencil_i,1) = i;   col(MatStencil_j,1) = j;   col(MatStencil_k,1) = k
        col(MatStencil_i,2) = i-1; col(MatStencil_j,2) = j;   col(MatStencil_k,2) = k
        col(MatStencil_i,3) = i+1; col(MatStencil_j,3) = j;   col(MatStencil_k,3) = k
        col(MatStencil_i,4) = i;   col(MatStencil_j,4) = j-1; col(MatStencil_k,4) = k
        col(MatStencil_i,5) = i;   col(MatStencil_j,5) = j+1; col(MatStencil_k,5) = k
        col(MatStencil_i,6) = i;   col(MatStencil_j,6) = j;   col(MatStencil_k,6) = k-1
        col(MatStencil_i,7) = i;   col(MatStencil_j,7) = j;   col(MatStencil_k,7) = k+1
        ! set values at current point and its neighbors
        value(1)  = 6.0_pr
        value(2:) = -1.0_pr
        ! set the matrix's elements
        CALL MatSetValuesStencil( A, i1, row, i7, col, value, INSERT_VALUES, ierr )
      END DO
    END DO
  END DO

  ! assemble the matrix
  CALL MatAssemblyBegin( A, MAT_FINAL_ASSEMBLY, ierr )
  CALL MatAssemblyEnd( A, MAT_FINAL_ASSEMBLY, ierr )

  ! remove the nullspace
  CALL MatSetNullSpace( A, nullspace, ierr )
  CALL MatNullSpaceRemove( nullspace, b, PETSC_NULL_OBJECT, ierr )

  ! Solve system
  CALL KSPSetOperators( ksp, A, A, ierr )
  !CALL KSPSetReusePreconditioner( ksp, reuse_pc, ierr )
  CALL KSPSolve( ksp, b, x, ierr )
  !CALL KSPGetIterationNumber( ksp, niter, ierr )
  !CALL KSPGetConvergedReason( ksp, is_converged, ierr )

  ! get solution
  CALL VecAssemblyBegin( x, ierr )
  CALL VecAssemblyEnd( x, ierr )
  CALL DMDAVecGetArrayF90( decomp, x, ptr_vec, ierr )

  !
check the error DO i = is, is+nxl-1 DO j = js, js+nyl-1 DO k = ks, ks+nzl-1 erri = MAX(erri, ABS(ptr_vec(i,j,k) - SIN((i+q2)*h)*SIN((j+q2)*h)*SIN((k+q2)*h))) err1 = err1 + ABS(ptr_vec(i,j,k) - SIN((i+q2)*h)*SIN((j+q2)*h)*SIN((k+q2)*h)) END DO END DO END DO CALL DMDAVecRestoreArrayF90( decomp, x, ptr_vec, ierr ) CALL MPI_Allreduce( erri, MPI_IN_PLACE, 1, MPI_REAL8, MPI_MAX, PETSC_COMM_WORLD, ierr ) CALL MPI_Allreduce( err1, MPI_IN_PLACE, 1, MPI_REAL8, MPI_SUM, PETSC_COMM_WORLD, ierr ) IF( rank == 0 ) THEN PRINT*, 'norm1 error: ', err1 / N**3 PRINT*, 'norm inf error:', erri END IF CALL VecDestroy( x, ierr ) CALL VecDestroy( b, ierr ) CALL MatDestroy( A, ierr ) CALL KSPDestroy( ksp, ierr ) END PROGRAM test_ksp From dave.mayhem23 at gmail.com Thu Sep 15 01:25:37 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 15 Sep 2016 08:25:37 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> Message-ID: On Thursday, 15 September 2016, Hengjie Wang wrote: > Hi Dave, > > Sorry, I should have put more comment to explain the code. > No problem. I was looking at the code after only 3 hrs of sleep.... > > The number of process in each dimension is the same: Px = Py=Pz=P. So is > the domain size. > So if the you want to run the code for a 512^3 grid points on 16^3 cores, > you need to set "-N 512 -P 16" in the command line. > I add more comments and also fix an error in the attached code. ( The > error only effects the accuracy of solution but not the memory usage. ) > Yep thanks, I see that now. I know this is only a test, but this is kinda clunky. The dmda can automatically choose the partition, and if the user wants control over it, they can use the command line options -da_processors_{x,y,z} (as in your options file). For my testing purposes I'll have to tweak your code as I don't want to always have to change two options when changing the partition size or mesh size (as I'll certainly get it wrong every second time leading to a lose of my time due to queue wait times) Thanks, Dave > > > Thank you. > Frank > > On 9/14/2016 9:05 PM, Dave May wrote: > > > > On Thursday, 15 September 2016, Dave May > wrote: > >> >> >> On Thursday, 15 September 2016, frank wrote: >> >>> Hi, >>> >>> I write a simple code to re-produce the error. I hope this can help to >>> diagnose the problem. >>> The code just solves a 3d poisson equation. >>> >> >> Why is the stencil width a runtime parameter?? And why is the default >> value 2? For 7-pnt FD Laplace, you only need a stencil width of 1. >> >> Was this choice made to mimic something in the real application code? >> > > Please ignore - I misunderstood your usage of the param set by -P > > >> >> >>> >>> I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. >>> That's when I re-produce the OOM error. Each core has about 2G memory. >>> I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The ksp >>> solver works fine. >>> I attached the code, ksp_view_pre's output and my petsc option file. >>> >>> Thank you. 
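(On the point above about letting the DMDA choose the partition itself: a minimal, untested tweak to the attached test is to pass PETSC_DECIDE for the three process counts in DMDACreate3d, so that only -N has to change between runs:

  ! untested sketch: let PETSc pick the process grid instead of requiring a matching -P
  CALL DMDACreate3d( PETSC_COMM_WORLD, BType(1), BType(2), BType(3), &
     &               DMDA_STENCIL_STAR, N, N, N, &
     &               PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, &
     &               PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, &
     &               decomp, ierr )

A specific layout can then still be forced with -da_processors_{x,y,z} from the options file, assuming the DM is given a chance to read options, e.g. via DMSetFromOptions.)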
>>> Frank >>> >>> On 09/09/2016 06:38 PM, Hengjie Wang wrote: >>> >>> Hi Barry, >>> >>> I checked. On the supercomputer, I had the option "-ksp_view_pre" but it >>> is not in file I sent you. I am sorry for the confusion. >>> >>> Regards, >>> Frank >>> >>> On Friday, September 9, 2016, Barry Smith wrote: >>> >>>> >>>> > On Sep 9, 2016, at 3:11 PM, frank wrote: >>>> > >>>> > Hi Barry, >>>> > >>>> > I think the first KSP view output is from -ksp_view_pre. Before I >>>> submitted the test, I was not sure whether there would be OOM error or not. >>>> So I added both -ksp_view_pre and -ksp_view. >>>> >>>> But the options file you sent specifically does NOT list the >>>> -ksp_view_pre so how could it be from that? >>>> >>>> Sorry to be pedantic but I've spent too much time in the past trying >>>> to debug from incorrect information and want to make sure that the >>>> information I have is correct before thinking. Please recheck exactly what >>>> happened. Rerun with the exact input file you emailed if that is needed. >>>> >>>> Barry >>>> >>>> > >>>> > Frank >>>> > >>>> > >>>> > On 09/09/2016 12:38 PM, Barry Smith wrote: >>>> >> Why does ksp_view2.txt have two KSP views in it while >>>> ksp_view1.txt has only one KSPView in it? Did you run two different solves >>>> in the 2 case but not the one? >>>> >> >>>> >> Barry >>>> >> >>>> >> >>>> >> >>>> >>> On Sep 9, 2016, at 10:56 AM, frank wrote: >>>> >>> >>>> >>> Hi, >>>> >>> >>>> >>> I want to continue digging into the memory problem here. >>>> >>> I did find a work around in the past, which is to use less cores >>>> per node so that each core has 8G memory. However this is deficient and >>>> expensive. I hope to locate the place that uses the most memory. >>>> >>> >>>> >>> Here is a brief summary of the tests I did in past: >>>> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >>>> >>> Maximum (over computational time) process memory: total >>>> 7.0727e+08 >>>> >>> Current process memory: >>>> total 7.0727e+08 >>>> >>> Maximum (over computational time) space PetscMalloc()ed: total >>>> 6.3908e+11 >>>> >>> Current space PetscMalloc()ed: >>>> total 1.8275e+09 >>>> >>> >>>> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >>>> >>> Maximum (over computational time) process memory: total >>>> 5.9431e+09 >>>> >>> Current process memory: >>>> total 5.9431e+09 >>>> >>> Maximum (over computational time) space PetscMalloc()ed: total >>>> 5.3202e+12 >>>> >>> Current space PetscMalloc()ed: >>>> total 5.4844e+09 >>>> >>> >>>> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >>>> >>> OOM( Out Of Memory ) killer of the supercomputer terminated the >>>> job during "KSPSolve". >>>> >>> >>>> >>> I attached the output of ksp_view( the third test's output is from >>>> ksp_view_pre ), memory_view and also the petsc options. >>>> >>> >>>> >>> In all the tests, each core can access about 2G memory. In test3, >>>> there are 4223139840 non-zeros in the matrix. This will consume about >>>> 1.74M, using double precision. Considering some extra memory used to store >>>> integer index, 2G memory should still be way enough. >>>> >>> >>>> >>> Is there a way to find out which part of KSPSolve uses the most >>>> memory? >>>> >>> Thank you so much. 
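(Spelling that arithmetic out: 4223139840 nonzeros / 18432 ranks is roughly 2.3e+05 nonzeros per rank, i.e. about 1.8 MB of matrix values at 8 bytes each plus roughly 0.9 MB of 32-bit column indices, so the fine-level operator itself accounts for only a few MB of the roughly 2 GB available per rank.)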
>>>> >>> >>>> >>> BTW, there are 4 options remains unused and I don't understand why >>>> they are omitted: >>>> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >>>> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >>>> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >>>> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >>>> >>> >>>> >>> >>>> >>> Regards, >>>> >>> Frank >>>> >>> >>>> >>> On 07/13/2016 05:47 PM, Dave May wrote: >>>> >>>> >>>> >>>> On 14 July 2016 at 01:07, frank wrote: >>>> >>>> Hi Dave, >>>> >>>> >>>> >>>> Sorry for the late reply. >>>> >>>> Thank you so much for your detailed reply. >>>> >>>> >>>> >>>> I have a question about the estimation of the memory usage. There >>>> are 4223139840 allocated non-zeros and 18432 MPI processes. Double >>>> precision is used. So the memory per process is: >>>> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >>>> >>>> Did I do sth wrong here? Because this seems too small. >>>> >>>> >>>> >>>> No - I totally f***ed it up. You are correct. That'll teach me for >>>> fumbling around with my iphone calculator and not using my brain. (Note >>>> that to convert to MB just divide by 1e6, not 1024^2 - although I >>>> apparently cannot convert between units correctly....) >>>> >>>> >>>> >>>> From the PETSc objects associated with the solver, It looks like >>>> it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities >>>> are: somewhere in your usage of PETSc you've introduced a memory leak; >>>> PETSc is doing a huge over allocation (e.g. as per our discussion of >>>> MatPtAP); or in your application code there are other objects you have >>>> forgotten to log the memory for. >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> I am running this job on Bluewater >>>> >>>> I am using the 7 points FD stencil in 3D. >>>> >>>> >>>> >>>> I thought so on both counts. >>>> >>>> >>>> >>>> I apologize that I made a stupid mistake in computing the memory >>>> per core. My settings render each core can access only 2G memory on average >>>> instead of 8G which I mentioned in previous email. I re-run the job with 8G >>>> memory per core on average and there is no "Out Of Memory" error. I would >>>> do more test to see if there is still some memory issue. >>>> >>>> >>>> >>>> Ok. I'd still like to know where the memory was being used since >>>> my estimates were off. >>>> >>>> >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Dave >>>> >>>> >>>> >>>> Regards, >>>> >>>> Frank >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >>>> >>>>> Hi Frank, >>>> >>>>> >>>> >>>>> >>>> >>>>> On 11 July 2016 at 19:14, frank wrote: >>>> >>>>> Hi Dave, >>>> >>>>> >>>> >>>>> I re-run the test using bjacobi as the preconditioner on the >>>> coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is >>>> 96*8*24. The petsc option file is attached. >>>> >>>>> I still got the "Out Of Memory" error. The error occurred before >>>> the linear solver finished one step. So I don't have the full info from >>>> ksp_view. The info from ksp_view_pre is attached. >>>> >>>>> >>>> >>>>> Okay - that is essentially useless (sorry) >>>> >>>>> >>>> >>>>> It seems to me that the error occurred when the decomposition was >>>> going to be changed. >>>> >>>>> >>>> >>>>> Based on what information? >>>> >>>>> Running with -info would give us more clues, but will create a >>>> ton of output. 
>>>> >>>>> Please try running the case which failed with -info >>>> >>>>> I had another test with a grid of 1536*128*384 and the same >>>> process mesh as above. There was no error. The ksp_view info is attached >>>> for comparison. >>>> >>>>> Thank you. >>>> >>>>> >>>> >>>>> >>>> >>>>> [3] Here is my crude estimate of your memory usage. >>>> >>>>> I'll target the biggest memory hogs only to get an order of >>>> magnitude estimate >>>> >>>>> >>>> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB >>>> per MPI rank assuming double precision. >>>> >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming >>>> 32 bit integers) >>>> >>>>> >>>> >>>>> * You use 5 levels of coarsening, so the other operators should >>>> represent (collectively) >>>> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on >>>> the communicator with 18432 ranks. >>>> >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the >>>> communicator with 18432 ranks. >>>> >>>>> >>>> >>>>> * You use a reduction factor of 64, making the new communicator >>>> with 288 MPI ranks. >>>> >>>>> PCTelescope will first gather a temporary matrix associated with >>>> your coarse level operator assuming a comm size of 288 living on the comm >>>> with size 18432. >>>> >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core >>>> on the 288 ranks. >>>> >>>>> This matrix is then used to form a new MPIAIJ matrix on the >>>> subcomm, thus require another 32 MB per rank. >>>> >>>>> The temporary matrix is now destroyed. >>>> >>>>> >>>> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. >>>> >>>>> This requires 2 doubles per point in the DMDA. >>>> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >>>> >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on >>>> the sub-comm. >>>> >>>>> >>>> >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the >>>> resulting operator will have the same memory footprint as the unpermuted >>>> matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB >>>> are held in memory when the DMDA is provided. >>>> >>>>> >>>> >>>>> From my rough estimates, the worst case memory foot print for any >>>> given core, given your options is approximately >>>> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >>>> >>>>> This is way below 8 GB. >>>> >>>>> >>>> >>>>> Note this estimate completely ignores: >>>> >>>>> (1) the memory required for the restriction operator, >>>> >>>>> (2) the potential growth in the number of non-zeros per row due >>>> to Galerkin coarsening (I wished -ksp_view_pre reported the output from >>>> MatView so we could see the number of non-zeros required by the coarse >>>> level operators) >>>> >>>>> (3) all temporary vectors required by the CG solver, and those >>>> required by the smoothers. >>>> >>>>> (4) internal memory allocated by MatPtAP >>>> >>>>> (5) memory associated with IS's used within PCTelescope >>>> >>>>> >>>> >>>>> So either I am completely off in my estimates, or you have not >>>> carefully estimated the memory usage of your application code. Hopefully >>>> others might examine/correct my rough estimates >>>> >>>>> >>>> >>>>> Since I don't have your code I cannot access the latter. >>>> >>>>> Since I don't have access to the same machine you are running on, >>>> I think we need to take a step back. >>>> >>>>> >>>> >>>>> [1] What machine are you running on? 
Send me a URL if its >>>> available >>>> >>>>> >>>> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 >>>> point FD stencil) >>>> >>>>> If it's a 7 point FD stencil, we should be able to examine the >>>> memory usage of your solver configuration using a standard, light weight >>>> existing PETSc example, run on your machine at the same scale. >>>> >>>>> This would hopefully enable us to correctly evaluate the actual >>>> memory usage required by the solver configuration you are using. >>>> >>>>> >>>> >>>>> Thanks, >>>> >>>>> Dave >>>> >>>>> >>>> >>>>> >>>> >>>>> Frank >>>> >>>>> >>>> >>>>> >>>> >>>>> >>>> >>>>> >>>> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >>>> >>>>>> >>>> >>>>>> On Saturday, 9 July 2016, frank wrote: >>>> >>>>>> Hi Barry and Dave, >>>> >>>>>> >>>> >>>>>> Thank both of you for the advice. >>>> >>>>>> >>>> >>>>>> @Barry >>>> >>>>>> I made a mistake in the file names in last email. I attached the >>>> correct files this time. >>>> >>>>>> For all the three tests, 'Telescope' is used as the coarse >>>> preconditioner. >>>> >>>>>> >>>> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>>> >>>>>> Part of the memory usage: Vector 125 124 3971904 >>>> 0. >>>> >>>>>> Matrix 101 101 >>>> 9462372 0 >>>> >>>>>> >>>> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>>> >>>>>> Part of the memory usage: Vector 125 124 681672 >>>> 0. >>>> >>>>>> Matrix 101 101 >>>> 1462180 0. >>>> >>>>>> >>>> >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. >>>> In my case, it is about 6 times. >>>> >>>>>> >>>> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. >>>> Sub-domain per process: 32*32*32 >>>> >>>>>> Here I get the out of memory error. >>>> >>>>>> >>>> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to >>>> set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>>> >>>>>> The linear solver didn't work in this case. Petsc output some >>>> errors. >>>> >>>>>> >>>> >>>>>> @Dave >>>> >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse >>>> mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>>> >>>>>> If my set the levels correctly, then on the last coarse mesh of >>>> MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >>>> >>>>>> On the last coarse mesh of 'Telescope', there is only one grid >>>> point per process. >>>> >>>>>> I still got the OOM error. The detailed petsc option file is >>>> attached. >>>> >>>>>> >>>> >>>>>> Do you understand the expected memory usage for the particular >>>> parallel LU implementation you are using? I don't (seriously). Replace LU >>>> with bjacobi and re-run this test. My point about solver debugging is still >>>> valid. >>>> >>>>>> >>>> >>>>>> And please send the result of KSPView so we can see what is >>>> actually used in the computations >>>> >>>>>> >>>> >>>>>> Thanks >>>> >>>>>> Dave >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> Thank you so much. >>>> >>>>>> >>>> >>>>>> Frank >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>>> >>>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>>> >>>>>> >>>> >>>>>> Hi Barry, >>>> >>>>>> >>>> >>>>>> Thank you for you advice. >>>> >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 >>>> and the process mesh is 96*8*24. >>>> >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and >>>> 'telescope' is used as the preconditioner at the coarse mesh. 
>>>> >>>>>> The system gives me the "Out of Memory" error before the linear >>>> system is completely solved. >>>> >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that >>>> the error occurs when it reaches the coarse mesh. >>>> >>>>>> >>>> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is >>>> 96*8*24. The 3rd test uses the >>>> same grid but a different process mesh 48*4*12. >>>> >>>>>> Are you sure this is right? The total matrix and vector >>>> memory usage goes from 2nd test >>>> >>>>>> Vector 384 383 8,193,712 0. >>>> >>>>>> Matrix 103 103 11,508,688 0. >>>> >>>>>> to 3rd test >>>> >>>>>> Vector 384 383 1,590,520 0. >>>> >>>>>> Matrix 103 103 3,508,664 0. >>>> >>>>>> that is the memory usage got smaller but if you have only 1/8th >>>> the processes and the same grid it should have gotten about 8 times bigger. >>>> Did you maybe cut the grid by a factor of 8 also? If so that still doesn't >>>> explain it because the memory usage changed by a factor of 5 something for >>>> the vectors and 3 something for the matrices. >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the >>>> same in 1st test. The linear solver works fine in both test. >>>> >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory >>>> info is from the option '-log_summary'. I tried to use '-momery_info' as >>>> you suggested, but in my case petsc treated it as an unused option. It >>>> output nothing about the memory. Do I need to add sth to my code so I can >>>> use '-memory_info'? >>>> >>>>>> Sorry, my mistake the option is -memory_view >>>> >>>>>> >>>> >>>>>> Can you run the one case with -memory_view and -mg_coarse >>>> jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much >>>> memory is used without the telescope? Also run case 2 the same way. >>>> >>>>>> >>>> >>>>>> Barry >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> In both tests the memory usage is not large. >>>> >>>>>> >>>> >>>>>> It seems to me that it might be the 'telescope' preconditioner >>>> that allocated a lot of memory and caused the error in the 1st test. >>>> >>>>>> Is there is a way to show how much memory it allocated? >>>> >>>>>> >>>> >>>>>> Frank >>>> >>>>>> >>>> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>> >>>>>> Frank, >>>> >>>>>> >>>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP >>>> before the solve so hopefully it gets that far. >>>> >>>>>> >>>> >>>>>> Please run the problem that does fit with -memory_info >>>> when the problem completes it will show the "high water mark" for PETSc >>>> allocated memory and total memory used. We first want to look at these >>>> numbers to see if it is using more memory than you expect. You could also >>>> run with say half the grid spacing to see how the memory usage scaled with >>>> the increase in grid points. Make the runs also with -log_view and send all >>>> the output from these options. >>>> >>>>>> >>>> >>>>>> Barry >>>> >>>>>> >>>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>> >>>>>> >>>> >>>>>> Hi, >>>> >>>>>> >>>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to >>>> solve a linear system in parallel. >>>> >>>>>> I chose to use the 'Telescope' as the preconditioner on the >>>> coarse mesh for its good performance. >>>> >>>>>> The petsc options file is attached. >>>> >>>>>> >>>> >>>>>> The domain is a 3d box. >>>> >>>>>> It works well when the grid is 1536*128*384 and the process >>>> mesh is 96*8*24. 
When I double the size of grid and >>>> keep the same process mesh and petsc options, I >>>> get an "out of memory" error from the super-cluster I am using. >>>> >>>>>> Each process has access to at least 8G memory, which should be >>>> more than enough for my application. I am sure that all the other parts of >>>> my code( except the linear solver ) do not use much memory. So I doubt if >>>> there is something wrong with the linear solver. >>>> >>>>>> The error occurs before the linear system is completely solved >>>> so I don't have the info from ksp view. I am not able to re-produce the >>>> error with a smaller problem either. >>>> >>>>>> In addition, I tried to use the block jacobi as the >>>> preconditioner with the same grid and same decomposition. The linear solver >>>> runs extremely slow but there is no memory error. >>>> >>>>>> >>>> >>>>>> How can I diagnose what exactly cause the error? >>>> >>>>>> Thank you so much. >>>> >>>>>> >>>> >>>>>> Frank >>>> >>>>>> >>>> >>>>>> >>> _options.txt> >>>> >>>>>> >>>> >>>>> >>>> >>>> >>>> >>> >>> emory2.txt> >>>> > >>>> >>>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Thu Sep 15 04:11:27 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Thu, 15 Sep 2016 11:11:27 +0200 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: Message-ID: Dear Barry Thanks for the clarification. I got exactly what you said if the code changed to ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); Residual norms for stokes_ solve. 0 KSP Residual norm 1.327791371202e-02 Residual norms for stokes_fieldsplit_p_ solve. 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan 1 KSP Residual norm 3.997711925708e-17 but I guess we solve a different problem if B is used for the linear system. in addition, changed to ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); also works but inner iteration converged not in one iteration Residual norms for stokes_ solve. 0 KSP Residual norm 1.327791371202e-02 Residual norms for stokes_fieldsplit_p_ solve. 
0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 1 KSP Residual norm 1.445282647127e-16 Seem like zero pivot does not happen, but why the solver for Schur takes 13 steps if the preconditioner is direct solver? I also so tried another problem which I known does have a nonsingular Schur (at least A11 != 0) and it also have the same problem: 1 step outer convergence but multiple step inner convergence. Any ideas? Giang On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: > > Normally you'd be absolutely correct to expect convergence in one > iteration. However in this example note the call > > ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); > > It is solving the linear system defined by A but building the > preconditioner (i.e. the entire fieldsplit process) from a different matrix > B. Since A is not B you should not expect convergence in one iteration. If > you change the code to > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > you will see exactly what you expect, convergence in one iteration. > > Sorry about this, the example is lacking clarity and documentation its > author obviously knew too well what he was doing that he didn't realize > everyone else in the world would need more comments in the code. If you > change the code to > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > it will stop without being able to build the preconditioner because LU > factorization of the Sp matrix will result in a zero pivot. This is why > this "auxiliary" matrix B is used to define the preconditioner instead of A. > > Barry > > > > > > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui wrote: > > > > Sorry I slept quite a while in this thread. Now I start to look at it > again. In the last try, the previous setting doesn't work either (in fact > diverge). So I would speculate if the Schur complement in my case is > actually not invertible. It's also possible that the code is wrong > somewhere. 
However, before looking at that, I want to understand thoroughly > the settings for Schur complement > > > > I experimented ex42 with the settings: > > mpirun -np 1 ex42 \ > > -stokes_ksp_monitor \ > > -stokes_ksp_type fgmres \ > > -stokes_pc_type fieldsplit \ > > -stokes_pc_fieldsplit_type schur \ > > -stokes_pc_fieldsplit_schur_fact_type full \ > > -stokes_pc_fieldsplit_schur_precondition selfp \ > > -stokes_fieldsplit_u_ksp_type preonly \ > > -stokes_fieldsplit_u_pc_type lu \ > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > -stokes_fieldsplit_p_ksp_type gmres \ > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > -stokes_fieldsplit_p_pc_type lu \ > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > > > In my understanding, the solver should converge in 1 (outer) step. > Execution gives: > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm > 0.000000000000e+00 ||r(i)||/||b|| -nan > > 1 KSP Residual norm 7.656238881621e-04 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm > 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > > 2 KSP Residual norm 9.895645456398e-06 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm > 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > > 3 KSP Residual norm 1.608206702571e-06 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm > 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > > 4 KSP Residual norm 2.453618999882e-07 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm > 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > > 5 KSP Residual norm 4.230399264750e-08 > > > > Looks like the "selfp" does construct the Schur nicely. But does "full" > really construct the full block preconditioner? > > > > Giang > > P/S: I'm also generating a smaller size of the previous problem for > checking again. > > > > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley > wrote: > > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui > wrote: > > > > It could be taking time in the MatMatMult() here if that matrix is > dense. Is there any reason to > > believe that is a good preconditioner for your problem? > > > > This is the first approach to the problem, so I chose the most simple > setting. Do you have any other recommendation? > > > > This is in no way the simplest PC. We need to make it simpler first. 
> > > > 1) Run on only 1 proc > > > > 2) Use -pc_fieldsplit_schur_fact_type full > > > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_ > true_residual > > > > This should converge in 1 outer iteration, but we will see how good your > Schur complement preconditioner > > is for this problem. > > > > You need to start out from something you understand and then start > making approximations. > > > > Matt > > > > For any solver question, please send us the output of > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > I sent here the full output (after changed to fgmres), again it takes > long at the first iteration but after that, it does not converge > > > > -ksp_type fgmres > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type diag > > -pc_fieldsplit_schur_precondition selfp > > -pc_fieldsplit_detect_saddle_point > > -fieldsplit_u_ksp_type preonly > > -fieldsplit_u_pc_type lu > > -fieldsplit_u_pc_factor_mat_solver_package mumps > > -fieldsplit_lu_ksp_type preonly > > -fieldsplit_lu_pc_type lu > > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm > 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm > 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm > 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm > 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm > 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm > 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm > 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm > 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm > 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm > 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm > 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm > 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm > 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 > > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm > 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 > > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm > 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm > 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm > 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm > 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > > 18 KSP unpreconditioned resid norm 
1.888442703508e+02 true resid norm > 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm > 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm > 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm > 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm > 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm > 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm > 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm > 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm > 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm > 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm > 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm > 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm > 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm > 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm > 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm > 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm > 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm > 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm > 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm > 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm > 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm > 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 > > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm > 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm > 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 > > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm > 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm > 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm > 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm > 1.760315735899e+02 ||r(i)||/||b|| 
5.794758372005e-05 > > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm > 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm > 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 > > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm > 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm > 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm > 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm > 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm > 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm > 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm > 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm > 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm > 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm > 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm > 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm > 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm > 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm > 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm > 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm > 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm > 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm > 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm > 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 > > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm > 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm > 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm > 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm > 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm > 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm > 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > > 73 KSP unpreconditioned resid norm 
1.481400704979e+02 true resid norm > 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm > 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm > 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm > 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm > 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm > 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm > 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm > 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm > 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm > 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm > 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm > 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm > 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm > 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm > 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm > 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm > 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm > 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm > 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm > 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 > > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm > 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 > > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm > 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm > 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm > 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm > 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm > 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm > 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm > 1.519470484364e+02 ||r(i)||/||b|| 
5.001923308823e-05 > > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm > 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm > 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm > 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm > 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm > 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm > 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm > 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm > 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm > 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm > 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm > 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm > 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm > 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm > 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm > 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm > 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm > 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm > 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm > 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm > 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 > > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm > 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm > 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm > 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm > 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm > 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm > 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm > 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > > 128 KSP 
unpreconditioned resid norm 1.446838883996e+02 true resid norm > 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm > 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm > 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 > > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm > 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm > 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 > > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm > 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm > 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm > 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm > 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm > 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm > 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm > 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm > 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm > 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm > 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm > 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm > 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm > 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 > > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm > 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 > > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm > 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm > 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm > 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm > 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm > 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm > 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm > 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm > 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > > 155 KSP unpreconditioned resid norm 
1.419032748919e+02 true resid norm > 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm > 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm > 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 > > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm > 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm > 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm > 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 > > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm > 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm > 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm > 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm > 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm > 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm > 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm > 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm > 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm > 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm > 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm > 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm > 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm > 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm > 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm > 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm > 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm > 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm > 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm > 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm > 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 > > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm > 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm > 
1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm > 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm > 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm > 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 > > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm > 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm > 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm > 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm > 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm > 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm > 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm > 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm > 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm > 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm > 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm > 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm > 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm > 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 > > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm > 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm > 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm > 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm > 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm > 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm > 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm > 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm > 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm > 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm > 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm > 1.187516613498e+02 ||r(i)||/||b|| 
3.909169075539e-05 > > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm > 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm > 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm > 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm > 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm > 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm > 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm > 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm > 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm > 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm > 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm > 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm > 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm > 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm > 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm > 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm > 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm > 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm > 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm > 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm > 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm > 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm > 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm > 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm > 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm > 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm > 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm > 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > > 237 KSP 
unpreconditioned resid norm 6.413921210094e+00 true resid norm > 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm > 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm > 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 > > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm > 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm > 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm > 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm > 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm > 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm > 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm > 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm > 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm > 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm > 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm > 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm > 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 > > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm > 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm > 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm > 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm > 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 > > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm > 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm > 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm > 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm > 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm > 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm > 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm > 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm > 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > > 264 KSP unpreconditioned resid norm 
1.137617003277e+00 true resid norm > 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm > 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm > 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm > 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm > 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm > 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm > 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm > 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm > 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm > 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm > 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm > 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm > 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm > 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm > 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm > 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm > 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm > 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm > 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 > > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm > 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm > 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm > 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm > 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm > 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm > 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm > 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm > 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm > 
6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 > > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm > 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm > 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm > 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm > 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm > 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 > > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm > 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm > 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm > 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm > 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > > Linear solve did not converge due to DIVERGED_ITS iterations 300 > > KSP Object: 2 MPI processes > > type: fgmres > > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=300, initial guess is zero > > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > > right preconditioning > > using UNPRECONDITIONED norm type for convergence test > > PC Object: 2 MPI processes > > type: fieldsplit > > FieldSplit with Schur preconditioner, factorization DIAG > > Preconditioner for the Schur complement formed from Sp, an assembled > approximation to S, which uses (lumped, if requested) A00's diagonal's > inverse > > Split info: > > Split number 0 Defined by IS > > Split number 1 Defined by IS > > KSP solver for A00 block > > KSP Object: (fieldsplit_u_) 2 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_u_) 2 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326 > > package used to perform factorization: mumps > > total: nonzeros=4.03041e+08, allocated > nonzeros=4.03041e+08 > > total number of mallocs used during MatSetValues calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): > 1 > > ICNTL(13) (efficiency control): > 0 > > ICNTL(14) (percentage of estimated workspace > increase): 20 > > ICNTL(18) (input mat struct): > 3 > > ICNTL(19) (Shur complement info): > 0 > > ICNTL(20) (rhs sparse pattern): > 0 > > ICNTL(21) (solution struct): > 1 > > 
ICNTL(22) (in-core/out-of-core facility): > 0 > > ICNTL(23) (max size of memory can be allocated > locally):0 > > ICNTL(24) (detection of null pivot rows): > 0 > > ICNTL(25) (computation of a null space basis): > 0 > > ICNTL(26) (Schur options for rhs or solution): > 0 > > ICNTL(27) (experimental parameter): > -24 > > ICNTL(28) (use parallel or sequential ordering): > 1 > > ICNTL(29) (parallel ordering): > 0 > > ICNTL(30) (user-specified set of entries in > inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve > phase): 0 > > ICNTL(33) (compute determinant): > 0 > > CNTL(1) (relative pivoting threshold): 0.01 > > CNTL(2) (stopping criterion of refinement): > 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 0 > > CNTL(4) (value of static pivoting): -1 > > CNTL(5) (fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the elimination > after analysis): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > RINFO(2) (local estimated flops for the assembly > after factorization): > > [0] 4.2839e+08 > > [1] 3.799e+08 > > RINFO(3) (local estimated flops for the elimination > after factorization): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > INFO(15) (estimated size of (in MB) MUMPS internal > data for running numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(16) (size of (in MB) MUMPS internal data used > during numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(23) (num of pivots eliminated on this processor > after factorization): > > [0] 90423 > > [1] 93903 > > RINFOG(1) (global estimated flops for the > elimination after analysis): 1.09445e+12 > > RINFOG(2) (global estimated flops for the assembly > after factorization): 8.0829e+08 > > RINFOG(3) (global estimated flops for the > elimination after factorization): 1.09445e+12 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0,0)*(2^0) > > INFOG(3) (estimated real workspace for factors on > all processors after analysis): 403041366 > > INFOG(4) (estimated integer workspace for factors on > all processors after analysis): 2265748 > > INFOG(5) (estimated maximum front size in the > complete tree): 6663 > > INFOG(6) (number of nodes in the complete tree): 2812 > > INFOG(7) (ordering option effectively use after > analysis): 5 > > INFOG(8) (structural symmetry in percent of the > permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to store the > matrix factors after factorization): 403041366 > > INFOG(10) (total integer space store the matrix > factors after factorization): 2265766 > > INFOG(11) (order of largest frontal matrix after > factorization): 6663 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after > factorization): 0 > > INFOG(14) (number of memory compress after > factorization): 0 > > INFOG(15) (number of steps of iterative refinement > after solution): 0 > > INFOG(16) (estimated size (in MB) of all MUMPS > internal data for factorization after analysis: value on the most memory > consuming processor): 2649 > > INFOG(17) (estimated size of all MUMPS internal data > for factorization after analysis: sum over all processors): 5270 > > INFOG(18) (size of all MUMPS internal data allocated > during factorization: value on the most memory consuming processor): 2649 > > INFOG(19) (size of all MUMPS internal data allocated > during factorization: sum over all processors): 5270 > > INFOG(20) (estimated number of entries in the > factors): 403041366 > > INFOG(21) (size in MB of memory effectively used > during factorization - value 
on the most memory consuming processor): 2121 > > INFOG(22) (size in MB of memory effectively used > during factorization - sum over all processors): 4174 > > INFOG(23) (after analysis: value of ICNTL(6) > effectively used): 0 > > INFOG(24) (after analysis: value of ICNTL(12) > effectively used): 1 > > INFOG(25) (after factorization: number of pivots > modified by static pivoting): 0 > > INFOG(28) (after factorization: number of null > pivots encountered): 0 > > INFOG(29) (after factorization: effective number of > entries in the factors (sum over all processors)): 403041366 > > INFOG(30, 31) (after solution: size in Mbytes of > memory used during solution phase): 2467, 4922 > > INFOG(32) (after analysis: type of analysis done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) (exponent of the determinant if > determinant is requested): 0 > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_u_) 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326, bs=3 > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 26829 nodes, > limit used is 5 > > KSP solver for S = A11 - A10 inv(A00) A01 > > KSP Object: (fieldsplit_lu_) 2 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_lu_) 2 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=2583 > > package used to perform factorization: mumps > > total: nonzeros=2.17621e+06, allocated > nonzeros=2.17621e+06 > > total number of mallocs used during MatSetValues calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): > 1 > > ICNTL(13) (efficiency control): > 0 > > ICNTL(14) (percentage of estimated workspace > increase): 20 > > ICNTL(18) (input mat struct): > 3 > > ICNTL(19) (Shur complement info): > 0 > > ICNTL(20) (rhs sparse pattern): > 0 > > ICNTL(21) (solution struct): > 1 > > ICNTL(22) (in-core/out-of-core facility): > 0 > > ICNTL(23) (max size of memory can be allocated > locally):0 > > ICNTL(24) (detection of null pivot rows): > 0 > > ICNTL(25) (computation of a null space basis): > 0 > > ICNTL(26) (Schur options for rhs or solution): > 0 > > ICNTL(27) (experimental parameter): > -24 > > ICNTL(28) (use parallel or sequential ordering): > 1 > > ICNTL(29) (parallel ordering): > 0 > > ICNTL(30) (user-specified set of entries in > inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve > phase): 0 > > ICNTL(33) (compute determinant): > 0 > > CNTL(1) (relative pivoting threshold): 0.01 > > CNTL(2) (stopping criterion of refinement): > 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 0 > > CNTL(4) (value of static pivoting): -1 > > CNTL(5) 
(fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the elimination > after analysis): > > [0] 5.12794e+08 > > [1] 5.02142e+08 > > RINFO(2) (local estimated flops for the assembly > after factorization): > > [0] 815031 > > [1] 745263 > > RINFO(3) (local estimated flops for the elimination > after factorization): > > [0] 5.12794e+08 > > [1] 5.02142e+08 > > INFO(15) (estimated size of (in MB) MUMPS internal > data for running numerical factorization): > > [0] 34 > > [1] 34 > > INFO(16) (size of (in MB) MUMPS internal data used > during numerical factorization): > > [0] 34 > > [1] 34 > > INFO(23) (num of pivots eliminated on this processor > after factorization): > > [0] 1158 > > [1] 1425 > > RINFOG(1) (global estimated flops for the > elimination after analysis): 1.01494e+09 > > RINFOG(2) (global estimated flops for the assembly > after factorization): 1.56029e+06 > > RINFOG(3) (global estimated flops for the > elimination after factorization): 1.01494e+09 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0,0)*(2^0) > > INFOG(3) (estimated real workspace for factors on > all processors after analysis): 2176209 > > INFOG(4) (estimated integer workspace for factors on > all processors after analysis): 14427 > > INFOG(5) (estimated maximum front size in the > complete tree): 699 > > INFOG(6) (number of nodes in the complete tree): 15 > > INFOG(7) (ordering option effectively use after > analysis): 2 > > INFOG(8) (structural symmetry in percent of the > permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to store the > matrix factors after factorization): 2176209 > > INFOG(10) (total integer space store the matrix > factors after factorization): 14427 > > INFOG(11) (order of largest frontal matrix after > factorization): 699 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after > factorization): 0 > > INFOG(14) (number of memory compress after > factorization): 0 > > INFOG(15) (number of steps of iterative refinement > after solution): 0 > > INFOG(16) (estimated size (in MB) of all MUMPS > internal data for factorization after analysis: value on the most memory > consuming processor): 34 > > INFOG(17) (estimated size of all MUMPS internal data > for factorization after analysis: sum over all processors): 68 > > INFOG(18) (size of all MUMPS internal data allocated > during factorization: value on the most memory consuming processor): 34 > > INFOG(19) (size of all MUMPS internal data allocated > during factorization: sum over all processors): 68 > > INFOG(20) (estimated number of entries in the > factors): 2176209 > > INFOG(21) (size in MB of memory effectively used > during factorization - value on the most memory consuming processor): 30 > > INFOG(22) (size in MB of memory effectively used > during factorization - sum over all processors): 59 > > INFOG(23) (after analysis: value of ICNTL(6) > effectively used): 0 > > INFOG(24) (after analysis: value of ICNTL(12) > effectively used): 1 > > INFOG(25) (after factorization: number of pivots > modified by static pivoting): 0 > > INFOG(28) (after factorization: number of null > pivots encountered): 0 > > INFOG(29) (after factorization: effective number of > entries in the factors (sum over all processors)): 2176209 > > INFOG(30, 31) (after solution: size in Mbytes of > memory used during solution phase): 16, 32 > > INFOG(32) (after analysis: type of analysis done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) (exponent of the determinant if > 
determinant is requested): 0 > > linear system matrix followed by preconditioner matrix: > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > type: schurcomplement > > rows=2583, cols=2583 > > Schur complement A11 - A10 inv(A00) A01 > > A11 > > Mat Object: (fieldsplit_lu_) 2 > MPI processes > > type: mpiaij > > rows=2583, cols=2583, bs=3 > > total: nonzeros=117369, allocated nonzeros=117369 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > A10 > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=184326, rbs=3, cbs = 1 > > total: nonzeros=292770, allocated nonzeros=292770 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > KSP of A00 > > KSP Object: (fieldsplit_u_) 2 > MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_u_) 2 > MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326 > > package used to perform factorization: mumps > > total: nonzeros=4.03041e+08, allocated > nonzeros=4.03041e+08 > > total number of mallocs used during MatSetValues > calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): > 1 > > ICNTL(13) (efficiency control): > 0 > > ICNTL(14) (percentage of estimated workspace > increase): 20 > > ICNTL(18) (input mat struct): > 3 > > ICNTL(19) (Shur complement info): > 0 > > ICNTL(20) (rhs sparse pattern): > 0 > > ICNTL(21) (solution struct): > 1 > > ICNTL(22) (in-core/out-of-core facility): > 0 > > ICNTL(23) (max size of memory can be > allocated locally):0 > > ICNTL(24) (detection of null pivot rows): > 0 > > ICNTL(25) (computation of a null space > basis): 0 > > ICNTL(26) (Schur options for rhs or > solution): 0 > > ICNTL(27) (experimental parameter): > -24 > > ICNTL(28) (use parallel or sequential > ordering): 1 > > ICNTL(29) (parallel ordering): > 0 > > ICNTL(30) (user-specified set of entries in > inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve > phase): 0 > > ICNTL(33) (compute determinant): > 0 > > CNTL(1) (relative pivoting threshold): > 0.01 > > CNTL(2) (stopping criterion of refinement): > 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 0 > > CNTL(4) (value of static pivoting): > -1 > > CNTL(5) (fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the > elimination after analysis): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > RINFO(2) (local estimated flops for the > assembly after factorization): > > [0] 4.2839e+08 > > [1] 3.799e+08 > > RINFO(3) (local estimated flops for the > elimination after factorization): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > INFO(15) (estimated size of (in MB) MUMPS > internal data for running numerical 
factorization): > > [0] 2621 > > [1] 2649 > > INFO(16) (size of (in MB) MUMPS internal > data used during numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(23) (num of pivots eliminated on this > processor after factorization): > > [0] 90423 > > [1] 93903 > > RINFOG(1) (global estimated flops for the > elimination after analysis): 1.09445e+12 > > RINFOG(2) (global estimated flops for the > assembly after factorization): 8.0829e+08 > > RINFOG(3) (global estimated flops for the > elimination after factorization): 1.09445e+12 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) > (determinant): (0,0)*(2^0) > > INFOG(3) (estimated real workspace for > factors on all processors after analysis): 403041366 > > INFOG(4) (estimated integer workspace for > factors on all processors after analysis): 2265748 > > INFOG(5) (estimated maximum front size in > the complete tree): 6663 > > INFOG(6) (number of nodes in the complete > tree): 2812 > > INFOG(7) (ordering option effectively use > after analysis): 5 > > INFOG(8) (structural symmetry in percent of > the permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to > store the matrix factors after factorization): 403041366 > > INFOG(10) (total integer space store the > matrix factors after factorization): 2265766 > > INFOG(11) (order of largest frontal matrix > after factorization): 6663 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after > factorization): 0 > > INFOG(14) (number of memory compress after > factorization): 0 > > INFOG(15) (number of steps of iterative > refinement after solution): 0 > > INFOG(16) (estimated size (in MB) of all > MUMPS internal data for factorization after analysis: value on the most > memory consuming processor): 2649 > > INFOG(17) (estimated size of all MUMPS > internal data for factorization after analysis: sum over all processors): > 5270 > > INFOG(18) (size of all MUMPS internal data > allocated during factorization: value on the most memory consuming > processor): 2649 > > INFOG(19) (size of all MUMPS internal data > allocated during factorization: sum over all processors): 5270 > > INFOG(20) (estimated number of entries in > the factors): 403041366 > > INFOG(21) (size in MB of memory effectively > used during factorization - value on the most memory consuming processor): > 2121 > > INFOG(22) (size in MB of memory effectively > used during factorization - sum over all processors): 4174 > > INFOG(23) (after analysis: value of ICNTL(6) > effectively used): 0 > > INFOG(24) (after analysis: value of > ICNTL(12) effectively used): 1 > > INFOG(25) (after factorization: number of > pivots modified by static pivoting): 0 > > INFOG(28) (after factorization: number of > null pivots encountered): 0 > > INFOG(29) (after factorization: effective > number of entries in the factors (sum over all processors)): 403041366 > > INFOG(30, 31) (after solution: size in > Mbytes of memory used during solution phase): 2467, 4922 > > INFOG(32) (after analysis: type of analysis > done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) (exponent of the determinant if > determinant is requested): 0 > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_u_) > 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326, bs=3 > > total: nonzeros=3.32649e+07, allocated > nonzeros=3.32649e+07 > > total number of mallocs used during MatSetValues calls > =0 > > using I-node (on process 0) routines: found 26829 > nodes, limit used is 5 > > A01 > > Mat Object: 
2 MPI processes > > type: mpiaij > > rows=184326, cols=2583, rbs=3, cbs = 1 > > total: nonzeros=292770, allocated nonzeros=292770 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 16098 > nodes, limit used is 5 > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=2583, rbs=3, cbs = 1 > > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > linear system matrix = precond matrix: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=186909, cols=186909 > > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 26829 nodes, limit > used is 5 > > KSPSolve completed > > > > > > Giang > > > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley > wrote: > > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui > wrote: > > Hello > > > > I'm solving an indefinite problem arising from mesh tying/contact using > Lagrange multiplier, the matrix has the form > > > > K = [A P^T > > P 0] > > > > I used the FIELDSPLIT preconditioner with one field is the main variable > (displacement) and the other field for dual variable (Lagrange multiplier). > The block size for each field is 3. According to the manual, I first chose > the preconditioner based on Schur complement to treat this problem. > > > > > > For any solver question, please send us the output of > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > However, I will comment below > > > > The parameters used for the solve is > > -ksp_type gmres > > > > You need 'fgmres' here with the options you have below. > > > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type diag > > -pc_fieldsplit_schur_precondition selfp > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is > dense. Is there any reason to > > believe that is a good preconditioner for your problem? > > > > > > -pc_fieldsplit_detect_saddle_point > > -fieldsplit_u_pc_type hypre > > > > I would just use MUMPS here to start, especially if it works on the > whole problem. Same with the one below. > > > > Matt > > > > -fieldsplit_u_pc_hypre_type boomeramg > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_lu_pc_type hypre > > -fieldsplit_lu_pc_hypre_type boomeramg > > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > > > For the test case, a small problem is solved on 2 processes. Due to the > decomposition, the contact only happens in 1 proc, so the size of Lagrange > multiplier dofs on proc 0 is 0. > > > > 0: mIndexU.size(): 80490 > > 0: mIndexLU.size(): 0 > > 1: mIndexU.size(): 103836 > > 1: mIndexLU.size(): 2583 > > > > However, with this setup the solver takes very long at KSPSolve before > going to iteration, and the first iteration seems forever so I have to stop > the calculation. I guessed that the solver takes time to compute the Schur > complement, but according to the manual only the diagonal of A is used to > approximate the Schur complement, so it should not take long to compute > this. > > > > Note that I ran the same problem with direct solver (MUMPS) and it's > able to produce the valid results. 
The parameter for the solve is pretty > standard > > -ksp_type preonly > > -pc_type lu > > -pc_factor_mat_solver_package mumps > > > > Hence the matrix/rhs must not have any problem here. Do you have any > idea or suggestion for this case? > > > > > > Giang > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gotofd at gmail.com Thu Sep 15 04:23:54 2016 From: gotofd at gmail.com (Ji Zhang) Date: Thu, 15 Sep 2016 17:23:54 +0800 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: Thanks Matt. It works well for signal core. But is there any solution if I need a MPI program? Thanks. Wayne On Tue, Sep 13, 2016 at 9:30 AM, Matthew Knepley wrote: > On Mon, Sep 12, 2016 at 8:24 PM, Ji Zhang wrote: > >> Dear all, >> >> I'm using petsc4py and now face some problems. >> I have a number of small petsc dense matrices mij, and I want to >> construct them to a big matrix M like this: >> >> [ m11 m12 m13 ] >> M = | m21 m22 m23 | , >> [ m31 m32 m33 ] >> How could I do it effectively? >> >> Now I'm using the code below: >> >> # get indexes of matrix mij >> index1_begin, index1_end = getindex_i( ) >> index2_begin, index2_end = getindex_j( ) >> M[index1_begin:index1_end, index2_begin:index2_end] = mij[:, :] >> which report such error messages: >> >> petsc4py.PETSc.Error: error code 56 >> [0] MatGetValues() line 1818 in /home/zhangji/PycharmProjects/ >> petsc-petsc-31a1859eaff6/src/mat/interface/matrix.c >> [0] MatGetValues_MPIDense() line 154 in /home/zhangji/PycharmProjects/ >> petsc-petsc-31a1859eaff6/src/mat/impls/dense/mpi/mpidense.c >> > > Make M a sequential dense matrix. > > Matt > > >> [0] No support for this operation for this object type >> [0] Only local values currently supported >> >> Thanks. >> >> >> 2016-09-13 >> Best, >> Regards, >> Zhang Ji >> Beijing Computational Science Research Center >> E-mail: gotofd at gmail.com >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Sep 15 05:35:47 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 15 Sep 2016 12:35:47 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> Message-ID: HI all, I the only unexpected memory usage I can see is associated with the call to MatPtAP(). Here is something you can try immediately. 
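A minimal petsc4py sketch for the block-assembly question in the previous message, covering the MPI case: the traceback there comes from MatGetValues_MPIDense(), which only supports locally owned values, so instead of the NumPy-style slice assignment one can create M as a parallel matrix and insert each block with setValues(), restricted to the rows the calling rank owns. The block size nb, the 3-by-3 block layout and the helper make_block() are illustrative assumptions, not part of the original code.

import numpy as np
from petsc4py import PETSc

comm = PETSc.COMM_WORLD
nb = 100                   # rows/cols per small block (assumed size)
nblk = 3                   # 3 x 3 layout of blocks, as in the question
N = nb * nblk              # global size of M

M = PETSc.Mat().createDense((N, N), comm=comm)
M.setUp()
rstart, rend = M.getOwnershipRange()          # rows owned by this rank

for i in range(nblk):
    rows = np.arange(i * nb, (i + 1) * nb, dtype=PETSc.IntType)
    owned = rows[(rows >= rstart) & (rows < rend)]   # keep only local rows
    if owned.size == 0:
        continue
    for j in range(nblk):
        cols = np.arange(j * nb, (j + 1) * nb, dtype=PETSc.IntType)
        # make_block() is a hypothetical stand-in for however each m_ij is
        # computed; it must return a len(owned) x nb array (owned rows only)
        mij = make_block(i, j, owned, cols)
        M.setValues(owned, cols, mij, addv=PETSc.InsertMode.INSERT_VALUES)

M.assemblyBegin()
M.assemblyEnd()

The same loop works with a sparse createAIJ() matrix plus preallocation if memory is a concern; the essential change from the original snippet is inserting with setValues() on locally owned rows rather than slicing.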
Run your code with the additional options -matrap 0 -matptap_scalable I didn't realize this before, but the default behaviour of MatPtAP in parallel is actually to to explicitly form the transpose of P (e.g. assemble R = P^T) and then compute R.A.P. You don't want to do this. The option -matrap 0 resolves this issue. The implementation of P^T.A.P has two variants. The scalable implementation (with respect to memory usage) is selected via the second option -matptap_scalable. Try it out - I see a significant memory reduction using these options for particular mesh sizes / partitions. I've attached a cleaned up version of the code you sent me. There were a number of memory leaks and other issues. The main points being * You should call DMDAVecGetArrayF90() before VecAssembly{Begin,End} * You should call PetscFinalize(), otherwise the option -log_summary (-log_view) will not display anything once the program has completed. Thanks, Dave On 15 September 2016 at 08:03, Hengjie Wang wrote: > Hi Dave, > > Sorry, I should have put more comment to explain the code. > The number of process in each dimension is the same: Px = Py=Pz=P. So is > the domain size. > So if the you want to run the code for a 512^3 grid points on 16^3 cores, > you need to set "-N 512 -P 16" in the command line. > I add more comments and also fix an error in the attached code. ( The > error only effects the accuracy of solution but not the memory usage. ) > > Thank you. > Frank > > > On 9/14/2016 9:05 PM, Dave May wrote: > > > > On Thursday, 15 September 2016, Dave May wrote: > >> >> >> On Thursday, 15 September 2016, frank wrote: >> >>> Hi, >>> >>> I write a simple code to re-produce the error. I hope this can help to >>> diagnose the problem. >>> The code just solves a 3d poisson equation. >>> >> >> Why is the stencil width a runtime parameter?? And why is the default >> value 2? For 7-pnt FD Laplace, you only need a stencil width of 1. >> >> Was this choice made to mimic something in the real application code? >> > > Please ignore - I misunderstood your usage of the param set by -P > > >> >> >>> >>> I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. >>> That's when I re-produce the OOM error. Each core has about 2G memory. >>> I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The ksp >>> solver works fine. >>> I attached the code, ksp_view_pre's output and my petsc option file. >>> >>> Thank you. >>> Frank >>> >>> On 09/09/2016 06:38 PM, Hengjie Wang wrote: >>> >>> Hi Barry, >>> >>> I checked. On the supercomputer, I had the option "-ksp_view_pre" but it >>> is not in file I sent you. I am sorry for the confusion. >>> >>> Regards, >>> Frank >>> >>> On Friday, September 9, 2016, Barry Smith wrote: >>> >>>> >>>> > On Sep 9, 2016, at 3:11 PM, frank wrote: >>>> > >>>> > Hi Barry, >>>> > >>>> > I think the first KSP view output is from -ksp_view_pre. Before I >>>> submitted the test, I was not sure whether there would be OOM error or not. >>>> So I added both -ksp_view_pre and -ksp_view. >>>> >>>> But the options file you sent specifically does NOT list the >>>> -ksp_view_pre so how could it be from that? >>>> >>>> Sorry to be pedantic but I've spent too much time in the past trying >>>> to debug from incorrect information and want to make sure that the >>>> information I have is correct before thinking. Please recheck exactly what >>>> happened. Rerun with the exact input file you emailed if that is needed. 
>>>> >>>> Barry >>>> >>>> > >>>> > Frank >>>> > >>>> > >>>> > On 09/09/2016 12:38 PM, Barry Smith wrote: >>>> >> Why does ksp_view2.txt have two KSP views in it while >>>> ksp_view1.txt has only one KSPView in it? Did you run two different solves >>>> in the 2 case but not the one? >>>> >> >>>> >> Barry >>>> >> >>>> >> >>>> >> >>>> >>> On Sep 9, 2016, at 10:56 AM, frank wrote: >>>> >>> >>>> >>> Hi, >>>> >>> >>>> >>> I want to continue digging into the memory problem here. >>>> >>> I did find a work around in the past, which is to use less cores >>>> per node so that each core has 8G memory. However this is deficient and >>>> expensive. I hope to locate the place that uses the most memory. >>>> >>> >>>> >>> Here is a brief summary of the tests I did in past: >>>> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >>>> >>> Maximum (over computational time) process memory: total >>>> 7.0727e+08 >>>> >>> Current process memory: >>>> total 7.0727e+08 >>>> >>> Maximum (over computational time) space PetscMalloc()ed: total >>>> 6.3908e+11 >>>> >>> Current space PetscMalloc()ed: >>>> total 1.8275e+09 >>>> >>> >>>> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >>>> >>> Maximum (over computational time) process memory: total >>>> 5.9431e+09 >>>> >>> Current process memory: >>>> total 5.9431e+09 >>>> >>> Maximum (over computational time) space PetscMalloc()ed: total >>>> 5.3202e+12 >>>> >>> Current space PetscMalloc()ed: >>>> total 5.4844e+09 >>>> >>> >>>> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >>>> >>> OOM( Out Of Memory ) killer of the supercomputer terminated the >>>> job during "KSPSolve". >>>> >>> >>>> >>> I attached the output of ksp_view( the third test's output is from >>>> ksp_view_pre ), memory_view and also the petsc options. >>>> >>> >>>> >>> In all the tests, each core can access about 2G memory. In test3, >>>> there are 4223139840 non-zeros in the matrix. This will consume about >>>> 1.74M, using double precision. Considering some extra memory used to store >>>> integer index, 2G memory should still be way enough. >>>> >>> >>>> >>> Is there a way to find out which part of KSPSolve uses the most >>>> memory? >>>> >>> Thank you so much. >>>> >>> >>>> >>> BTW, there are 4 options remains unused and I don't understand why >>>> they are omitted: >>>> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >>>> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >>>> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >>>> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >>>> >>> >>>> >>> >>>> >>> Regards, >>>> >>> Frank >>>> >>> >>>> >>> On 07/13/2016 05:47 PM, Dave May wrote: >>>> >>>> >>>> >>>> On 14 July 2016 at 01:07, frank wrote: >>>> >>>> Hi Dave, >>>> >>>> >>>> >>>> Sorry for the late reply. >>>> >>>> Thank you so much for your detailed reply. >>>> >>>> >>>> >>>> I have a question about the estimation of the memory usage. There >>>> are 4223139840 allocated non-zeros and 18432 MPI processes. Double >>>> precision is used. So the memory per process is: >>>> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >>>> >>>> Did I do sth wrong here? Because this seems too small. >>>> >>>> >>>> >>>> No - I totally f***ed it up. You are correct. That'll teach me for >>>> fumbling around with my iphone calculator and not using my brain. (Note >>>> that to convert to MB just divide by 1e6, not 1024^2 - although I >>>> apparently cannot convert between units correctly....) 
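To spell out the unit point in the parenthetical above, using the numbers reported earlier in this thread (a quick Python check; only the matrix values are counted, not the AIJ integer indices):

nnz = 4223139840                    # allocated non-zeros reported in this thread
ranks = 18432                       # MPI processes
bytes_per_rank = nnz * 8.0 / ranks  # double-precision values only
print(bytes_per_rank / 1e6)         # ~1.83  MB  (megabytes)
print(bytes_per_rank / 2**20)       # ~1.75  MiB (what dividing by 1024^2 gives)

Either way this is under 2 MB per rank for the matrix values themselves, which is why the discussion below turns to what else is allocating memory.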
>>>> >>>> >>>> >>>> From the PETSc objects associated with the solver, It looks like >>>> it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities >>>> are: somewhere in your usage of PETSc you've introduced a memory leak; >>>> PETSc is doing a huge over allocation (e.g. as per our discussion of >>>> MatPtAP); or in your application code there are other objects you have >>>> forgotten to log the memory for. >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> I am running this job on Bluewater >>>> >>>> I am using the 7 points FD stencil in 3D. >>>> >>>> >>>> >>>> I thought so on both counts. >>>> >>>> >>>> >>>> I apologize that I made a stupid mistake in computing the memory >>>> per core. My settings render each core can access only 2G memory on average >>>> instead of 8G which I mentioned in previous email. I re-run the job with 8G >>>> memory per core on average and there is no "Out Of Memory" error. I would >>>> do more test to see if there is still some memory issue. >>>> >>>> >>>> >>>> Ok. I'd still like to know where the memory was being used since >>>> my estimates were off. >>>> >>>> >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Dave >>>> >>>> >>>> >>>> Regards, >>>> >>>> Frank >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >>>> >>>>> Hi Frank, >>>> >>>>> >>>> >>>>> >>>> >>>>> On 11 July 2016 at 19:14, frank wrote: >>>> >>>>> Hi Dave, >>>> >>>>> >>>> >>>>> I re-run the test using bjacobi as the preconditioner on the >>>> coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is >>>> 96*8*24. The petsc option file is attached. >>>> >>>>> I still got the "Out Of Memory" error. The error occurred before >>>> the linear solver finished one step. So I don't have the full info from >>>> ksp_view. The info from ksp_view_pre is attached. >>>> >>>>> >>>> >>>>> Okay - that is essentially useless (sorry) >>>> >>>>> >>>> >>>>> It seems to me that the error occurred when the decomposition was >>>> going to be changed. >>>> >>>>> >>>> >>>>> Based on what information? >>>> >>>>> Running with -info would give us more clues, but will create a >>>> ton of output. >>>> >>>>> Please try running the case which failed with -info >>>> >>>>> I had another test with a grid of 1536*128*384 and the same >>>> process mesh as above. There was no error. The ksp_view info is attached >>>> for comparison. >>>> >>>>> Thank you. >>>> >>>>> >>>> >>>>> >>>> >>>>> [3] Here is my crude estimate of your memory usage. >>>> >>>>> I'll target the biggest memory hogs only to get an order of >>>> magnitude estimate >>>> >>>>> >>>> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB >>>> per MPI rank assuming double precision. >>>> >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming >>>> 32 bit integers) >>>> >>>>> >>>> >>>>> * You use 5 levels of coarsening, so the other operators should >>>> represent (collectively) >>>> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on >>>> the communicator with 18432 ranks. >>>> >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the >>>> communicator with 18432 ranks. >>>> >>>>> >>>> >>>>> * You use a reduction factor of 64, making the new communicator >>>> with 288 MPI ranks. >>>> >>>>> PCTelescope will first gather a temporary matrix associated with >>>> your coarse level operator assuming a comm size of 288 living on the comm >>>> with size 18432. >>>> >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core >>>> on the 288 ranks. 
>>>> >>>>> This matrix is then used to form a new MPIAIJ matrix on the >>>> subcomm, thus require another 32 MB per rank. >>>> >>>>> The temporary matrix is now destroyed. >>>> >>>>> >>>> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. >>>> >>>>> This requires 2 doubles per point in the DMDA. >>>> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >>>> >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on >>>> the sub-comm. >>>> >>>>> >>>> >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the >>>> resulting operator will have the same memory footprint as the unpermuted >>>> matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB >>>> are held in memory when the DMDA is provided. >>>> >>>>> >>>> >>>>> From my rough estimates, the worst case memory foot print for any >>>> given core, given your options is approximately >>>> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >>>> >>>>> This is way below 8 GB. >>>> >>>>> >>>> >>>>> Note this estimate completely ignores: >>>> >>>>> (1) the memory required for the restriction operator, >>>> >>>>> (2) the potential growth in the number of non-zeros per row due >>>> to Galerkin coarsening (I wished -ksp_view_pre reported the output from >>>> MatView so we could see the number of non-zeros required by the coarse >>>> level operators) >>>> >>>>> (3) all temporary vectors required by the CG solver, and those >>>> required by the smoothers. >>>> >>>>> (4) internal memory allocated by MatPtAP >>>> >>>>> (5) memory associated with IS's used within PCTelescope >>>> >>>>> >>>> >>>>> So either I am completely off in my estimates, or you have not >>>> carefully estimated the memory usage of your application code. Hopefully >>>> others might examine/correct my rough estimates >>>> >>>>> >>>> >>>>> Since I don't have your code I cannot access the latter. >>>> >>>>> Since I don't have access to the same machine you are running on, >>>> I think we need to take a step back. >>>> >>>>> >>>> >>>>> [1] What machine are you running on? Send me a URL if its >>>> available >>>> >>>>> >>>> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 >>>> point FD stencil) >>>> >>>>> If it's a 7 point FD stencil, we should be able to examine the >>>> memory usage of your solver configuration using a standard, light weight >>>> existing PETSc example, run on your machine at the same scale. >>>> >>>>> This would hopefully enable us to correctly evaluate the actual >>>> memory usage required by the solver configuration you are using. >>>> >>>>> >>>> >>>>> Thanks, >>>> >>>>> Dave >>>> >>>>> >>>> >>>>> >>>> >>>>> Frank >>>> >>>>> >>>> >>>>> >>>> >>>>> >>>> >>>>> >>>> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >>>> >>>>>> >>>> >>>>>> On Saturday, 9 July 2016, frank wrote: >>>> >>>>>> Hi Barry and Dave, >>>> >>>>>> >>>> >>>>>> Thank both of you for the advice. >>>> >>>>>> >>>> >>>>>> @Barry >>>> >>>>>> I made a mistake in the file names in last email. I attached the >>>> correct files this time. >>>> >>>>>> For all the three tests, 'Telescope' is used as the coarse >>>> preconditioner. >>>> >>>>>> >>>> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>>> >>>>>> Part of the memory usage: Vector 125 124 3971904 >>>> 0. >>>> >>>>>> Matrix 101 101 >>>> 9462372 0 >>>> >>>>>> >>>> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>>> >>>>>> Part of the memory usage: Vector 125 124 681672 >>>> 0. >>>> >>>>>> Matrix 101 101 >>>> 1462180 0. 
>>>> >>>>>> >>>> >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. >>>> In my case, it is about 6 times. >>>> >>>>>> >>>> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. >>>> Sub-domain per process: 32*32*32 >>>> >>>>>> Here I get the out of memory error. >>>> >>>>>> >>>> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to >>>> set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>>> >>>>>> The linear solver didn't work in this case. Petsc output some >>>> errors. >>>> >>>>>> >>>> >>>>>> @Dave >>>> >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse >>>> mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>>> >>>>>> If my set the levels correctly, then on the last coarse mesh of >>>> MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >>>> >>>>>> On the last coarse mesh of 'Telescope', there is only one grid >>>> point per process. >>>> >>>>>> I still got the OOM error. The detailed petsc option file is >>>> attached. >>>> >>>>>> >>>> >>>>>> Do you understand the expected memory usage for the particular >>>> parallel LU implementation you are using? I don't (seriously). Replace LU >>>> with bjacobi and re-run this test. My point about solver debugging is still >>>> valid. >>>> >>>>>> >>>> >>>>>> And please send the result of KSPView so we can see what is >>>> actually used in the computations >>>> >>>>>> >>>> >>>>>> Thanks >>>> >>>>>> Dave >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> Thank you so much. >>>> >>>>>> >>>> >>>>>> Frank >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>>> >>>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>>> >>>>>> >>>> >>>>>> Hi Barry, >>>> >>>>>> >>>> >>>>>> Thank you for you advice. >>>> >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 >>>> and the process mesh is 96*8*24. >>>> >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and >>>> 'telescope' is used as the preconditioner at the coarse mesh. >>>> >>>>>> The system gives me the "Out of Memory" error before the linear >>>> system is completely solved. >>>> >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that >>>> the error occurs when it reaches the coarse mesh. >>>> >>>>>> >>>> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is >>>> 96*8*24. The 3rd test uses the >>>> same grid but a different process mesh 48*4*12. >>>> >>>>>> Are you sure this is right? The total matrix and vector >>>> memory usage goes from 2nd test >>>> >>>>>> Vector 384 383 8,193,712 0. >>>> >>>>>> Matrix 103 103 11,508,688 0. >>>> >>>>>> to 3rd test >>>> >>>>>> Vector 384 383 1,590,520 0. >>>> >>>>>> Matrix 103 103 3,508,664 0. >>>> >>>>>> that is the memory usage got smaller but if you have only 1/8th >>>> the processes and the same grid it should have gotten about 8 times bigger. >>>> Did you maybe cut the grid by a factor of 8 also? If so that still doesn't >>>> explain it because the memory usage changed by a factor of 5 something for >>>> the vectors and 3 something for the matrices. >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the >>>> same in 1st test. The linear solver works fine in both test. >>>> >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory >>>> info is from the option '-log_summary'. I tried to use '-momery_info' as >>>> you suggested, but in my case petsc treated it as an unused option. It >>>> output nothing about the memory. 
Do I need to add sth to my code so I can >>>> use '-memory_info'? >>>> >>>>>> Sorry, my mistake the option is -memory_view >>>> >>>>>> >>>> >>>>>> Can you run the one case with -memory_view and -mg_coarse >>>> jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much >>>> memory is used without the telescope? Also run case 2 the same way. >>>> >>>>>> >>>> >>>>>> Barry >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> In both tests the memory usage is not large. >>>> >>>>>> >>>> >>>>>> It seems to me that it might be the 'telescope' preconditioner >>>> that allocated a lot of memory and caused the error in the 1st test. >>>> >>>>>> Is there is a way to show how much memory it allocated? >>>> >>>>>> >>>> >>>>>> Frank >>>> >>>>>> >>>> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>> >>>>>> Frank, >>>> >>>>>> >>>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP >>>> before the solve so hopefully it gets that far. >>>> >>>>>> >>>> >>>>>> Please run the problem that does fit with -memory_info >>>> when the problem completes it will show the "high water mark" for PETSc >>>> allocated memory and total memory used. We first want to look at these >>>> numbers to see if it is using more memory than you expect. You could also >>>> run with say half the grid spacing to see how the memory usage scaled with >>>> the increase in grid points. Make the runs also with -log_view and send all >>>> the output from these options. >>>> >>>>>> >>>> >>>>>> Barry >>>> >>>>>> >>>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>> >>>>>> >>>> >>>>>> Hi, >>>> >>>>>> >>>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to >>>> solve a linear system in parallel. >>>> >>>>>> I chose to use the 'Telescope' as the preconditioner on the >>>> coarse mesh for its good performance. >>>> >>>>>> The petsc options file is attached. >>>> >>>>>> >>>> >>>>>> The domain is a 3d box. >>>> >>>>>> It works well when the grid is 1536*128*384 and the process >>>> mesh is 96*8*24. When I double the size of grid and >>>> keep the same process mesh and petsc options, I >>>> get an "out of memory" error from the super-cluster I am using. >>>> >>>>>> Each process has access to at least 8G memory, which should be >>>> more than enough for my application. I am sure that all the other parts of >>>> my code( except the linear solver ) do not use much memory. So I doubt if >>>> there is something wrong with the linear solver. >>>> >>>>>> The error occurs before the linear system is completely solved >>>> so I don't have the info from ksp view. I am not able to re-produce the >>>> error with a smaller problem either. >>>> >>>>>> In addition, I tried to use the block jacobi as the >>>> preconditioner with the same grid and same decomposition. The linear solver >>>> runs extremely slow but there is no memory error. >>>> >>>>>> >>>> >>>>>> How can I diagnose what exactly cause the error? >>>> >>>>>> Thank you so much. >>>> >>>>>> >>>> >>>>>> Frank >>>> >>>>>> >>>> >>>>>> >>> _options.txt> >>>> >>>>>> >>>> >>>>> >>>> >>>> >>>> >>> >>> emory2.txt> >>>> > >>>> >>>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test_ksp.F90 Type: application/octet-stream Size: 5589 bytes Desc: not available URL: From knepley at gmail.com Thu Sep 15 07:56:54 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Sep 2016 07:56:54 -0500 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: Message-ID: On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui wrote: > Dear Barry > > Thanks for the clarification. I got exactly what you said if the code > changed to > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > Residual norms for stokes_ solve. > 0 KSP Residual norm 1.327791371202e-02 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm > 0.000000000000e+00 ||r(i)||/||b|| -nan > 1 KSP Residual norm 3.997711925708e-17 > > but I guess we solve a different problem if B is used for the linear > system. > > in addition, changed to > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > also works but inner iteration converged not in one iteration > > Residual norms for stokes_ solve. > 0 KSP Residual norm 1.327791371202e-02 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm > 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm > 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 > 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm > 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 > 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm > 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 > 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm > 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 > 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm > 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 > 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm > 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 > 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm > 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 > 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm > 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 > 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm > 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 > 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm > 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 > 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm > 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 > 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm > 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 > 1 KSP Residual norm 1.445282647127e-16 > > > Seem like zero pivot does not happen, but why the solver for Schur takes > 13 steps if the preconditioner is direct solver? > Look at the -ksp_view. I will bet that the default is to shift (add a multiple of the identity) the matrix instead of failing. This gives an inexact PC, but as you see it can converge. Thanks, Matt > > I also so tried another problem which I known does have a nonsingular > Schur (at least A11 != 0) and it also have the same problem: 1 step outer > convergence but multiple step inner convergence. > > Any ideas? > > Giang > > On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: > >> >> Normally you'd be absolutely correct to expect convergence in one >> iteration. 
However in this example note the call >> >> ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); >> >> It is solving the linear system defined by A but building the >> preconditioner (i.e. the entire fieldsplit process) from a different matrix >> B. Since A is not B you should not expect convergence in one iteration. If >> you change the code to >> >> ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); >> >> you will see exactly what you expect, convergence in one iteration. >> >> Sorry about this, the example is lacking clarity and documentation its >> author obviously knew too well what he was doing that he didn't realize >> everyone else in the world would need more comments in the code. If you >> change the code to >> >> ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); >> >> it will stop without being able to build the preconditioner because LU >> factorization of the Sp matrix will result in a zero pivot. This is why >> this "auxiliary" matrix B is used to define the preconditioner instead of A. >> >> Barry >> >> >> >> >> > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui wrote: >> > >> > Sorry I slept quite a while in this thread. Now I start to look at it >> again. In the last try, the previous setting doesn't work either (in fact >> diverge). So I would speculate if the Schur complement in my case is >> actually not invertible. It's also possible that the code is wrong >> somewhere. However, before looking at that, I want to understand thoroughly >> the settings for Schur complement >> > >> > I experimented ex42 with the settings: >> > mpirun -np 1 ex42 \ >> > -stokes_ksp_monitor \ >> > -stokes_ksp_type fgmres \ >> > -stokes_pc_type fieldsplit \ >> > -stokes_pc_fieldsplit_type schur \ >> > -stokes_pc_fieldsplit_schur_fact_type full \ >> > -stokes_pc_fieldsplit_schur_precondition selfp \ >> > -stokes_fieldsplit_u_ksp_type preonly \ >> > -stokes_fieldsplit_u_pc_type lu \ >> > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ >> > -stokes_fieldsplit_p_ksp_type gmres \ >> > -stokes_fieldsplit_p_ksp_monitor_true_residual \ >> > -stokes_fieldsplit_p_ksp_max_it 300 \ >> > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ >> > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ >> > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ >> > -stokes_fieldsplit_p_pc_type lu \ >> > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps >> > >> > In my understanding, the solver should converge in 1 (outer) step. >> Execution gives: >> > Residual norms for stokes_ solve. >> > 0 KSP Residual norm 1.327791371202e-02 >> > Residual norms for stokes_fieldsplit_p_ solve. >> > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm >> 0.000000000000e+00 ||r(i)||/||b|| -nan >> > 1 KSP Residual norm 7.656238881621e-04 >> > Residual norms for stokes_fieldsplit_p_ solve. >> > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm >> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >> > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm >> 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 >> > 2 KSP Residual norm 9.895645456398e-06 >> > Residual norms for stokes_fieldsplit_p_ solve. >> > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm >> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >> > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm >> 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 >> > 3 KSP Residual norm 1.608206702571e-06 >> > Residual norms for stokes_fieldsplit_p_ solve. 
>> > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm >> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >> > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm >> 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 >> > 4 KSP Residual norm 2.453618999882e-07 >> > Residual norms for stokes_fieldsplit_p_ solve. >> > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm >> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >> > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm >> 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 >> > 5 KSP Residual norm 4.230399264750e-08 >> > >> > Looks like the "selfp" does construct the Schur nicely. But does "full" >> really construct the full block preconditioner? >> > >> > Giang >> > P/S: I'm also generating a smaller size of the previous problem for >> checking again. >> > >> > >> > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley >> wrote: >> > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui >> wrote: >> > >> > It could be taking time in the MatMatMult() here if that matrix is >> dense. Is there any reason to >> > believe that is a good preconditioner for your problem? >> > >> > This is the first approach to the problem, so I chose the most simple >> setting. Do you have any other recommendation? >> > >> > This is in no way the simplest PC. We need to make it simpler first. >> > >> > 1) Run on only 1 proc >> > >> > 2) Use -pc_fieldsplit_schur_fact_type full >> > >> > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_tru >> e_residual >> > >> > This should converge in 1 outer iteration, but we will see how good >> your Schur complement preconditioner >> > is for this problem. >> > >> > You need to start out from something you understand and then start >> making approximations. 
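For reference, both questions above (does "full" really build the full block preconditioner, and why one outer iteration is expected) can be read off the standard block factorization, written in the notation K = [A P^T; P 0] used earlier in the thread, with S the exact Schur complement and S_p the "selfp" approximation; this is a textbook identity, not output from these runs:

\[
K \;=\; \begin{bmatrix} A & P^{T} \\ P & 0 \end{bmatrix}
  \;=\; \begin{bmatrix} I & 0 \\ P A^{-1} & I \end{bmatrix}
        \begin{bmatrix} A & 0 \\ 0 & S \end{bmatrix}
        \begin{bmatrix} I & A^{-1} P^{T} \\ 0 & I \end{bmatrix},
\qquad S \;=\; -P A^{-1} P^{T},
\qquad S_{p} \;=\; -P \,\operatorname{diag}(A)^{-1} P^{T}.
\]

With -pc_fieldsplit_schur_fact_type full and exact inner solves (preonly plus LU on A, and the S solve driven to tolerance), applying this factorization amounts to applying K^{-1}, which is why a single outer iteration is expected, provided the preconditioner is built from the same matrix that defines the outer system (the KSPSetOperators(ksp_S,B,B) case Barry describes above). The selfp option never forms S itself: it assembles the sparse approximation S_p from diag(A) and uses it to precondition the inner Krylov solve on S, so any additional inner iterations measure how well diag(A) stands in for A.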
>> > >> > Matt >> > >> > For any solver question, please send us the output of >> > >> > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >> > >> > >> > I sent here the full output (after changed to fgmres), again it takes >> long at the first iteration but after that, it does not converge >> > >> > -ksp_type fgmres >> > -ksp_max_it 300 >> > -ksp_gmres_restart 300 >> > -ksp_gmres_modifiedgramschmidt >> > -pc_fieldsplit_type schur >> > -pc_fieldsplit_schur_fact_type diag >> > -pc_fieldsplit_schur_precondition selfp >> > -pc_fieldsplit_detect_saddle_point >> > -fieldsplit_u_ksp_type preonly >> > -fieldsplit_u_pc_type lu >> > -fieldsplit_u_pc_factor_mat_solver_package mumps >> > -fieldsplit_lu_ksp_type preonly >> > -fieldsplit_lu_pc_type lu >> > -fieldsplit_lu_pc_factor_mat_solver_package mumps >> > >> > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm >> 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm >> 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 >> > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm >> 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 >> > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm >> 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 >> > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm >> 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 >> > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm >> 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 >> > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm >> 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 >> > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm >> 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 >> > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm >> 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 >> > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm >> 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 >> > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm >> 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 >> > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm >> 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 >> > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm >> 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 >> > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm >> 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 >> > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm >> 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 >> > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm >> 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 >> > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm >> 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 >> > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm >> 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 >> > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm >> 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 >> > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm >> 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 >> > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm >> 1.814080717355e+02 ||r(i)||/||b|| 
5.971746550920e-05 >> > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm >> 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 >> > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm >> 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 >> > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm >> 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 >> > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm >> 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 >> > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm >> 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 >> > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm >> 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 >> > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm >> 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 >> > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm >> 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 >> > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm >> 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 >> > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm >> 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 >> > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm >> 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 >> > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm >> 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 >> > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm >> 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 >> > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm >> 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 >> > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm >> 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 >> > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm >> 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 >> > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm >> 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 >> > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm >> 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 >> > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm >> 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 >> > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm >> 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 >> > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm >> 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 >> > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm >> 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 >> > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm >> 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 >> > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm >> 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 >> > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm >> 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 >> > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm >> 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 >> > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm >> 1.729436852773e+02 ||r(i)||/||b|| 
5.693108615167e-05 >> > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm >> 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 >> > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm >> 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 >> > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm >> 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 >> > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm >> 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 >> > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm >> 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 >> > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm >> 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 >> > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm >> 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 >> > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm >> 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 >> > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm >> 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 >> > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm >> 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 >> > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm >> 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 >> > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm >> 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 >> > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm >> 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 >> > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm >> 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 >> > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm >> 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 >> > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm >> 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 >> > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm >> 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 >> > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm >> 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 >> > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm >> 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 >> > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm >> 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 >> > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm >> 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 >> > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm >> 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 >> > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm >> 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 >> > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm >> 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 >> > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm >> 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 >> > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm >> 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 >> > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm >> 1.513298398754e+02 ||r(i)||/||b|| 
4.981605507858e-05 >> > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm >> 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 >> > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm >> 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 >> > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm >> 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 >> > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm >> 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 >> > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm >> 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 >> > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm >> 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 >> > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm >> 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 >> > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm >> 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 >> > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm >> 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 >> > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm >> 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 >> > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm >> 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 >> > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm >> 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 >> > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm >> 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 >> > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm >> 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 >> > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm >> 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 >> > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm >> 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 >> > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm >> 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 >> > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm >> 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 >> > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm >> 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 >> > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm >> 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 >> > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm >> 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 >> > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm >> 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 >> > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm >> 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 >> > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm >> 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 >> > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm >> 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 >> > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm >> 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 >> > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm >> 1.489814197970e+02 ||r(i)||/||b|| 
4.904298200804e-05 >> > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm >> 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 >> > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm >> 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 >> > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm >> 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 >> > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm >> 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 >> > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm >> 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 >> > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm >> 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 >> > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm >> 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 >> > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm >> 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 >> > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm >> 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 >> > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm >> 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 >> > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm >> 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 >> > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm >> 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 >> > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm >> 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 >> > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm >> 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 >> > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm >> 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 >> > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm >> 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 >> > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm >> 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 >> > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm >> 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 >> > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm >> 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 >> > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm >> 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 >> > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm >> 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 >> > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm >> 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 >> > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm >> 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 >> > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm >> 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 >> > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm >> 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 >> > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm >> 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 >> > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm >> 1.510097796622e+02 
||r(i)||/||b|| 4.971069491153e-05 >> > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm >> 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 >> > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm >> 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 >> > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm >> 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 >> > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm >> 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 >> > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm >> 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 >> > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm >> 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 >> > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm >> 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 >> > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm >> 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 >> > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm >> 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 >> > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm >> 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 >> > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm >> 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 >> > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm >> 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 >> > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm >> 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 >> > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm >> 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 >> > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm >> 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 >> > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm >> 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 >> > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm >> 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 >> > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm >> 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 >> > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm >> 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 >> > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm >> 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 >> > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm >> 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 >> > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm >> 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 >> > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm >> 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 >> > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm >> 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 >> > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm >> 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 >> > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm >> 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 >> > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm >> 
1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 >> > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm >> 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 >> > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm >> 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 >> > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm >> 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 >> > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm >> 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 >> > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm >> 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 >> > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm >> 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 >> > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm >> 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 >> > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm >> 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 >> > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm >> 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 >> > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm >> 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 >> > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm >> 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 >> > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm >> 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 >> > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm >> 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 >> > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm >> 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 >> > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm >> 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 >> > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm >> 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 >> > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm >> 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 >> > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm >> 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 >> > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm >> 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 >> > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm >> 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 >> > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm >> 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 >> > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm >> 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 >> > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm >> 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 >> > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm >> 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 >> > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm >> 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 >> > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm >> 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 >> > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true 
resid norm >> 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 >> > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm >> 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 >> > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm >> 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 >> > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm >> 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 >> > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm >> 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 >> > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm >> 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 >> > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm >> 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 >> > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm >> 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 >> > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm >> 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 >> > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm >> 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 >> > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm >> 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 >> > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm >> 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 >> > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm >> 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 >> > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm >> 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 >> > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm >> 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 >> > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm >> 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 >> > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm >> 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 >> > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm >> 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 >> > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm >> 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 >> > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm >> 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 >> > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm >> 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 >> > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm >> 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 >> > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm >> 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 >> > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm >> 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 >> > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm >> 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 >> > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm >> 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 >> > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm >> 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 >> > 209 KSP unpreconditioned resid norm 
1.059802897214e+02 true resid norm >> 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 >> > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm >> 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 >> > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm >> 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 >> > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm >> 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 >> > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm >> 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 >> > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm >> 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 >> > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm >> 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 >> > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm >> 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 >> > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm >> 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 >> > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm >> 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 >> > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm >> 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 >> > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm >> 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 >> > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm >> 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 >> > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm >> 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 >> > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm >> 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 >> > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm >> 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 >> > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm >> 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 >> > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm >> 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 >> > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm >> 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 >> > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm >> 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 >> > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm >> 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 >> > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm >> 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 >> > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm >> 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 >> > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm >> 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 >> > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm >> 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 >> > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm >> 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 >> > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm >> 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 >> > 236 KSP unpreconditioned 
resid norm 6.422855779486e+00 true resid norm >> 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 >> > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm >> 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 >> > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm >> 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 >> > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm >> 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 >> > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm >> 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 >> > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm >> 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 >> > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm >> 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 >> > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm >> 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 >> > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm >> 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 >> > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm >> 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 >> > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm >> 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 >> > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm >> 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 >> > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm >> 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 >> > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm >> 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 >> > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm >> 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 >> > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm >> 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 >> > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm >> 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 >> > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm >> 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 >> > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm >> 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 >> > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm >> 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 >> > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm >> 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 >> > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm >> 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 >> > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm >> 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 >> > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm >> 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 >> > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm >> 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 >> > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm >> 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 >> > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm >> 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 >> > 263 KSP 
unpreconditioned resid norm 1.150882463576e+00 true resid norm >> 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 >> > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm >> 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 >> > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm >> 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 >> > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm >> 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 >> > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm >> 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 >> > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm >> 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 >> > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm >> 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 >> > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm >> 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 >> > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm >> 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 >> > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm >> 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 >> > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm >> 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 >> > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm >> 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 >> > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm >> 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 >> > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm >> 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 >> > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm >> 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 >> > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm >> 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 >> > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm >> 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 >> > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm >> 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 >> > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm >> 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 >> > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm >> 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 >> > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm >> 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 >> > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm >> 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 >> > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm >> 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 >> > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm >> 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 >> > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm >> 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 >> > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm >> 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 >> > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm >> 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 >> 
> 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm >> 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 >> > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm >> 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 >> > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm >> 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 >> > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm >> 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 >> > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm >> 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 >> > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm >> 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 >> > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm >> 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 >> > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm >> 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 >> > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm >> 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 >> > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm >> 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 >> > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm >> 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 >> > Linear solve did not converge due to DIVERGED_ITS iterations 300 >> > KSP Object: 2 MPI processes >> > type: fgmres >> > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization >> > GMRES: happy breakdown tolerance 1e-30 >> > maximum iterations=300, initial guess is zero >> > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 >> > right preconditioning >> > using UNPRECONDITIONED norm type for convergence test >> > PC Object: 2 MPI processes >> > type: fieldsplit >> > FieldSplit with Schur preconditioner, factorization DIAG >> > Preconditioner for the Schur complement formed from Sp, an >> assembled approximation to S, which uses (lumped, if requested) A00's >> diagonal's inverse >> > Split info: >> > Split number 0 Defined by IS >> > Split number 1 Defined by IS >> > KSP solver for A00 block >> > KSP Object: (fieldsplit_u_) 2 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (fieldsplit_u_) 2 MPI processes >> > type: lu >> > LU: out-of-place factorization >> > tolerance for zero pivot 2.22045e-14 >> > matrix ordering: natural >> > factor fill ratio given 0, needed 0 >> > Factored matrix follows: >> > Mat Object: 2 MPI processes >> > type: mpiaij >> > rows=184326, cols=184326 >> > package used to perform factorization: mumps >> > total: nonzeros=4.03041e+08, allocated >> nonzeros=4.03041e+08 >> > total number of mallocs used during MatSetValues calls >> =0 >> > MUMPS run parameters: >> > SYM (matrix type): 0 >> > PAR (host participation): 1 >> > ICNTL(1) (output for error): 6 >> > ICNTL(2) (output of diagnostic msg): 0 >> > ICNTL(3) (output for global info): 0 >> > ICNTL(4) (level of printing): 0 >> > ICNTL(5) (input mat struct): 0 >> > ICNTL(6) (matrix prescaling): 7 >> > ICNTL(7) (sequentia matrix ordering):7 >> > ICNTL(8) (scalling strategy): 77 >> > ICNTL(10) (max num of refinements): 0 >> > ICNTL(11) (error analysis): 0 >> > ICNTL(12) (efficiency 
control): >> 1 >> > ICNTL(13) (efficiency control): >> 0 >> > ICNTL(14) (percentage of estimated workspace >> increase): 20 >> > ICNTL(18) (input mat struct): >> 3 >> > ICNTL(19) (Shur complement info): >> 0 >> > ICNTL(20) (rhs sparse pattern): >> 0 >> > ICNTL(21) (solution struct): >> 1 >> > ICNTL(22) (in-core/out-of-core facility): >> 0 >> > ICNTL(23) (max size of memory can be allocated >> locally):0 >> > ICNTL(24) (detection of null pivot rows): >> 0 >> > ICNTL(25) (computation of a null space basis): >> 0 >> > ICNTL(26) (Schur options for rhs or solution): >> 0 >> > ICNTL(27) (experimental parameter): >> -24 >> > ICNTL(28) (use parallel or sequential ordering): >> 1 >> > ICNTL(29) (parallel ordering): >> 0 >> > ICNTL(30) (user-specified set of entries in >> inv(A)): 0 >> > ICNTL(31) (factors is discarded in the solve >> phase): 0 >> > ICNTL(33) (compute determinant): >> 0 >> > CNTL(1) (relative pivoting threshold): 0.01 >> > CNTL(2) (stopping criterion of refinement): >> 1.49012e-08 >> > CNTL(3) (absolute pivoting threshold): 0 >> > CNTL(4) (value of static pivoting): -1 >> > CNTL(5) (fixation for null pivots): 0 >> > RINFO(1) (local estimated flops for the elimination >> after analysis): >> > [0] 5.59214e+11 >> > [1] 5.35237e+11 >> > RINFO(2) (local estimated flops for the assembly >> after factorization): >> > [0] 4.2839e+08 >> > [1] 3.799e+08 >> > RINFO(3) (local estimated flops for the elimination >> after factorization): >> > [0] 5.59214e+11 >> > [1] 5.35237e+11 >> > INFO(15) (estimated size of (in MB) MUMPS internal >> data for running numerical factorization): >> > [0] 2621 >> > [1] 2649 >> > INFO(16) (size of (in MB) MUMPS internal data used >> during numerical factorization): >> > [0] 2621 >> > [1] 2649 >> > INFO(23) (num of pivots eliminated on this >> processor after factorization): >> > [0] 90423 >> > [1] 93903 >> > RINFOG(1) (global estimated flops for the >> elimination after analysis): 1.09445e+12 >> > RINFOG(2) (global estimated flops for the assembly >> after factorization): 8.0829e+08 >> > RINFOG(3) (global estimated flops for the >> elimination after factorization): 1.09445e+12 >> > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >> (0,0)*(2^0) >> > INFOG(3) (estimated real workspace for factors on >> all processors after analysis): 403041366 >> > INFOG(4) (estimated integer workspace for factors >> on all processors after analysis): 2265748 >> > INFOG(5) (estimated maximum front size in the >> complete tree): 6663 >> > INFOG(6) (number of nodes in the complete tree): >> 2812 >> > INFOG(7) (ordering option effectively use after >> analysis): 5 >> > INFOG(8) (structural symmetry in percent of the >> permuted matrix after analysis): 100 >> > INFOG(9) (total real/complex workspace to store the >> matrix factors after factorization): 403041366 >> > INFOG(10) (total integer space store the matrix >> factors after factorization): 2265766 >> > INFOG(11) (order of largest frontal matrix after >> factorization): 6663 >> > INFOG(12) (number of off-diagonal pivots): 0 >> > INFOG(13) (number of delayed pivots after >> factorization): 0 >> > INFOG(14) (number of memory compress after >> factorization): 0 >> > INFOG(15) (number of steps of iterative refinement >> after solution): 0 >> > INFOG(16) (estimated size (in MB) of all MUMPS >> internal data for factorization after analysis: value on the most memory >> consuming processor): 2649 >> > INFOG(17) (estimated size of all MUMPS internal >> data for factorization after analysis: sum over all processors): 5270 >> > INFOG(18) 
(size of all MUMPS internal data >> allocated during factorization: value on the most memory consuming >> processor): 2649 >> > INFOG(19) (size of all MUMPS internal data >> allocated during factorization: sum over all processors): 5270 >> > INFOG(20) (estimated number of entries in the >> factors): 403041366 >> > INFOG(21) (size in MB of memory effectively used >> during factorization - value on the most memory consuming processor): 2121 >> > INFOG(22) (size in MB of memory effectively used >> during factorization - sum over all processors): 4174 >> > INFOG(23) (after analysis: value of ICNTL(6) >> effectively used): 0 >> > INFOG(24) (after analysis: value of ICNTL(12) >> effectively used): 1 >> > INFOG(25) (after factorization: number of pivots >> modified by static pivoting): 0 >> > INFOG(28) (after factorization: number of null >> pivots encountered): 0 >> > INFOG(29) (after factorization: effective number of >> entries in the factors (sum over all processors)): 403041366 >> > INFOG(30, 31) (after solution: size in Mbytes of >> memory used during solution phase): 2467, 4922 >> > INFOG(32) (after analysis: type of analysis done): 1 >> > INFOG(33) (value used for ICNTL(8)): 7 >> > INFOG(34) (exponent of the determinant if >> determinant is requested): 0 >> > linear system matrix = precond matrix: >> > Mat Object: (fieldsplit_u_) 2 MPI processes >> > type: mpiaij >> > rows=184326, cols=184326, bs=3 >> > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 >> > total number of mallocs used during MatSetValues calls =0 >> > using I-node (on process 0) routines: found 26829 nodes, >> limit used is 5 >> > KSP solver for S = A11 - A10 inv(A00) A01 >> > KSP Object: (fieldsplit_lu_) 2 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (fieldsplit_lu_) 2 MPI processes >> > type: lu >> > LU: out-of-place factorization >> > tolerance for zero pivot 2.22045e-14 >> > matrix ordering: natural >> > factor fill ratio given 0, needed 0 >> > Factored matrix follows: >> > Mat Object: 2 MPI processes >> > type: mpiaij >> > rows=2583, cols=2583 >> > package used to perform factorization: mumps >> > total: nonzeros=2.17621e+06, allocated >> nonzeros=2.17621e+06 >> > total number of mallocs used during MatSetValues calls >> =0 >> > MUMPS run parameters: >> > SYM (matrix type): 0 >> > PAR (host participation): 1 >> > ICNTL(1) (output for error): 6 >> > ICNTL(2) (output of diagnostic msg): 0 >> > ICNTL(3) (output for global info): 0 >> > ICNTL(4) (level of printing): 0 >> > ICNTL(5) (input mat struct): 0 >> > ICNTL(6) (matrix prescaling): 7 >> > ICNTL(7) (sequentia matrix ordering):7 >> > ICNTL(8) (scalling strategy): 77 >> > ICNTL(10) (max num of refinements): 0 >> > ICNTL(11) (error analysis): 0 >> > ICNTL(12) (efficiency control): >> 1 >> > ICNTL(13) (efficiency control): >> 0 >> > ICNTL(14) (percentage of estimated workspace >> increase): 20 >> > ICNTL(18) (input mat struct): >> 3 >> > ICNTL(19) (Shur complement info): >> 0 >> > ICNTL(20) (rhs sparse pattern): >> 0 >> > ICNTL(21) (solution struct): >> 1 >> > ICNTL(22) (in-core/out-of-core facility): >> 0 >> > ICNTL(23) (max size of memory can be allocated >> locally):0 >> > ICNTL(24) (detection of null pivot rows): >> 0 >> > ICNTL(25) (computation of a null space basis): >> 0 >> > ICNTL(26) (Schur options for rhs or solution): >> 0 >> > ICNTL(27) (experimental 
parameter): >> -24 >> > ICNTL(28) (use parallel or sequential ordering): >> 1 >> > ICNTL(29) (parallel ordering): >> 0 >> > ICNTL(30) (user-specified set of entries in >> inv(A)): 0 >> > ICNTL(31) (factors is discarded in the solve >> phase): 0 >> > ICNTL(33) (compute determinant): >> 0 >> > CNTL(1) (relative pivoting threshold): 0.01 >> > CNTL(2) (stopping criterion of refinement): >> 1.49012e-08 >> > CNTL(3) (absolute pivoting threshold): 0 >> > CNTL(4) (value of static pivoting): -1 >> > CNTL(5) (fixation for null pivots): 0 >> > RINFO(1) (local estimated flops for the elimination >> after analysis): >> > [0] 5.12794e+08 >> > [1] 5.02142e+08 >> > RINFO(2) (local estimated flops for the assembly >> after factorization): >> > [0] 815031 >> > [1] 745263 >> > RINFO(3) (local estimated flops for the elimination >> after factorization): >> > [0] 5.12794e+08 >> > [1] 5.02142e+08 >> > INFO(15) (estimated size of (in MB) MUMPS internal >> data for running numerical factorization): >> > [0] 34 >> > [1] 34 >> > INFO(16) (size of (in MB) MUMPS internal data used >> during numerical factorization): >> > [0] 34 >> > [1] 34 >> > INFO(23) (num of pivots eliminated on this >> processor after factorization): >> > [0] 1158 >> > [1] 1425 >> > RINFOG(1) (global estimated flops for the >> elimination after analysis): 1.01494e+09 >> > RINFOG(2) (global estimated flops for the assembly >> after factorization): 1.56029e+06 >> > RINFOG(3) (global estimated flops for the >> elimination after factorization): 1.01494e+09 >> > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >> (0,0)*(2^0) >> > INFOG(3) (estimated real workspace for factors on >> all processors after analysis): 2176209 >> > INFOG(4) (estimated integer workspace for factors >> on all processors after analysis): 14427 >> > INFOG(5) (estimated maximum front size in the >> complete tree): 699 >> > INFOG(6) (number of nodes in the complete tree): 15 >> > INFOG(7) (ordering option effectively use after >> analysis): 2 >> > INFOG(8) (structural symmetry in percent of the >> permuted matrix after analysis): 100 >> > INFOG(9) (total real/complex workspace to store the >> matrix factors after factorization): 2176209 >> > INFOG(10) (total integer space store the matrix >> factors after factorization): 14427 >> > INFOG(11) (order of largest frontal matrix after >> factorization): 699 >> > INFOG(12) (number of off-diagonal pivots): 0 >> > INFOG(13) (number of delayed pivots after >> factorization): 0 >> > INFOG(14) (number of memory compress after >> factorization): 0 >> > INFOG(15) (number of steps of iterative refinement >> after solution): 0 >> > INFOG(16) (estimated size (in MB) of all MUMPS >> internal data for factorization after analysis: value on the most memory >> consuming processor): 34 >> > INFOG(17) (estimated size of all MUMPS internal >> data for factorization after analysis: sum over all processors): 68 >> > INFOG(18) (size of all MUMPS internal data >> allocated during factorization: value on the most memory consuming >> processor): 34 >> > INFOG(19) (size of all MUMPS internal data >> allocated during factorization: sum over all processors): 68 >> > INFOG(20) (estimated number of entries in the >> factors): 2176209 >> > INFOG(21) (size in MB of memory effectively used >> during factorization - value on the most memory consuming processor): 30 >> > INFOG(22) (size in MB of memory effectively used >> during factorization - sum over all processors): 59 >> > INFOG(23) (after analysis: value of ICNTL(6) >> effectively used): 0 >> > INFOG(24) 
(after analysis: value of ICNTL(12) >> effectively used): 1 >> > INFOG(25) (after factorization: number of pivots >> modified by static pivoting): 0 >> > INFOG(28) (after factorization: number of null >> pivots encountered): 0 >> > INFOG(29) (after factorization: effective number of >> entries in the factors (sum over all processors)): 2176209 >> > INFOG(30, 31) (after solution: size in Mbytes of >> memory used during solution phase): 16, 32 >> > INFOG(32) (after analysis: type of analysis done): 1 >> > INFOG(33) (value used for ICNTL(8)): 7 >> > INFOG(34) (exponent of the determinant if >> determinant is requested): 0 >> > linear system matrix followed by preconditioner matrix: >> > Mat Object: (fieldsplit_lu_) 2 MPI processes >> > type: schurcomplement >> > rows=2583, cols=2583 >> > Schur complement A11 - A10 inv(A00) A01 >> > A11 >> > Mat Object: (fieldsplit_lu_) 2 >> MPI processes >> > type: mpiaij >> > rows=2583, cols=2583, bs=3 >> > total: nonzeros=117369, allocated nonzeros=117369 >> > total number of mallocs used during MatSetValues calls >> =0 >> > not using I-node (on process 0) routines >> > A10 >> > Mat Object: 2 MPI processes >> > type: mpiaij >> > rows=2583, cols=184326, rbs=3, cbs = 1 >> > total: nonzeros=292770, allocated nonzeros=292770 >> > total number of mallocs used during MatSetValues calls >> =0 >> > not using I-node (on process 0) routines >> > KSP of A00 >> > KSP Object: (fieldsplit_u_) 2 >> MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > left preconditioning >> > using NONE norm type for convergence test >> > PC Object: (fieldsplit_u_) 2 >> MPI processes >> > type: lu >> > LU: out-of-place factorization >> > tolerance for zero pivot 2.22045e-14 >> > matrix ordering: natural >> > factor fill ratio given 0, needed 0 >> > Factored matrix follows: >> > Mat Object: 2 MPI processes >> > type: mpiaij >> > rows=184326, cols=184326 >> > package used to perform factorization: mumps >> > total: nonzeros=4.03041e+08, allocated >> nonzeros=4.03041e+08 >> > total number of mallocs used during >> MatSetValues calls =0 >> > MUMPS run parameters: >> > SYM (matrix type): 0 >> > PAR (host participation): 1 >> > ICNTL(1) (output for error): 6 >> > ICNTL(2) (output of diagnostic msg): 0 >> > ICNTL(3) (output for global info): 0 >> > ICNTL(4) (level of printing): 0 >> > ICNTL(5) (input mat struct): 0 >> > ICNTL(6) (matrix prescaling): 7 >> > ICNTL(7) (sequentia matrix ordering):7 >> > ICNTL(8) (scalling strategy): 77 >> > ICNTL(10) (max num of refinements): 0 >> > ICNTL(11) (error analysis): 0 >> > ICNTL(12) (efficiency control): >> 1 >> > ICNTL(13) (efficiency control): >> 0 >> > ICNTL(14) (percentage of estimated >> workspace increase): 20 >> > ICNTL(18) (input mat struct): >> 3 >> > ICNTL(19) (Shur complement info): >> 0 >> > ICNTL(20) (rhs sparse pattern): >> 0 >> > ICNTL(21) (solution struct): >> 1 >> > ICNTL(22) (in-core/out-of-core facility): >> 0 >> > ICNTL(23) (max size of memory can be >> allocated locally):0 >> > ICNTL(24) (detection of null pivot rows): >> 0 >> > ICNTL(25) (computation of a null space >> basis): 0 >> > ICNTL(26) (Schur options for rhs or >> solution): 0 >> > ICNTL(27) (experimental parameter): >> -24 >> > ICNTL(28) (use parallel or sequential >> ordering): 1 >> > ICNTL(29) (parallel ordering): >> 0 >> > ICNTL(30) (user-specified set of entries in >> inv(A)): 0 >> > ICNTL(31) (factors is discarded in the >> solve phase): 0 >> > ICNTL(33) (compute 
determinant): >> 0 >> > CNTL(1) (relative pivoting threshold): >> 0.01 >> > CNTL(2) (stopping criterion of refinement): >> 1.49012e-08 >> > CNTL(3) (absolute pivoting threshold): >> 0 >> > CNTL(4) (value of static pivoting): >> -1 >> > CNTL(5) (fixation for null pivots): >> 0 >> > RINFO(1) (local estimated flops for the >> elimination after analysis): >> > [0] 5.59214e+11 >> > [1] 5.35237e+11 >> > RINFO(2) (local estimated flops for the >> assembly after factorization): >> > [0] 4.2839e+08 >> > [1] 3.799e+08 >> > RINFO(3) (local estimated flops for the >> elimination after factorization): >> > [0] 5.59214e+11 >> > [1] 5.35237e+11 >> > INFO(15) (estimated size of (in MB) MUMPS >> internal data for running numerical factorization): >> > [0] 2621 >> > [1] 2649 >> > INFO(16) (size of (in MB) MUMPS internal >> data used during numerical factorization): >> > [0] 2621 >> > [1] 2649 >> > INFO(23) (num of pivots eliminated on this >> processor after factorization): >> > [0] 90423 >> > [1] 93903 >> > RINFOG(1) (global estimated flops for the >> elimination after analysis): 1.09445e+12 >> > RINFOG(2) (global estimated flops for the >> assembly after factorization): 8.0829e+08 >> > RINFOG(3) (global estimated flops for the >> elimination after factorization): 1.09445e+12 >> > (RINFOG(12) RINFOG(13))*2^INFOG(34) >> (determinant): (0,0)*(2^0) >> > INFOG(3) (estimated real workspace for >> factors on all processors after analysis): 403041366 >> > INFOG(4) (estimated integer workspace for >> factors on all processors after analysis): 2265748 >> > INFOG(5) (estimated maximum front size in >> the complete tree): 6663 >> > INFOG(6) (number of nodes in the complete >> tree): 2812 >> > INFOG(7) (ordering option effectively use >> after analysis): 5 >> > INFOG(8) (structural symmetry in percent of >> the permuted matrix after analysis): 100 >> > INFOG(9) (total real/complex workspace to >> store the matrix factors after factorization): 403041366 >> > INFOG(10) (total integer space store the >> matrix factors after factorization): 2265766 >> > INFOG(11) (order of largest frontal matrix >> after factorization): 6663 >> > INFOG(12) (number of off-diagonal pivots): 0 >> > INFOG(13) (number of delayed pivots after >> factorization): 0 >> > INFOG(14) (number of memory compress after >> factorization): 0 >> > INFOG(15) (number of steps of iterative >> refinement after solution): 0 >> > INFOG(16) (estimated size (in MB) of all >> MUMPS internal data for factorization after analysis: value on the most >> memory consuming processor): 2649 >> > INFOG(17) (estimated size of all MUMPS >> internal data for factorization after analysis: sum over all processors): >> 5270 >> > INFOG(18) (size of all MUMPS internal data >> allocated during factorization: value on the most memory consuming >> processor): 2649 >> > INFOG(19) (size of all MUMPS internal data >> allocated during factorization: sum over all processors): 5270 >> > INFOG(20) (estimated number of entries in >> the factors): 403041366 >> > INFOG(21) (size in MB of memory effectively >> used during factorization - value on the most memory consuming processor): >> 2121 >> > INFOG(22) (size in MB of memory effectively >> used during factorization - sum over all processors): 4174 >> > INFOG(23) (after analysis: value of >> ICNTL(6) effectively used): 0 >> > INFOG(24) (after analysis: value of >> ICNTL(12) effectively used): 1 >> > INFOG(25) (after factorization: number of >> pivots modified by static pivoting): 0 >> > INFOG(28) (after factorization: number of >> null pivots 
encountered): 0 >> > INFOG(29) (after factorization: effective >> number of entries in the factors (sum over all processors)): 403041366 >> > INFOG(30, 31) (after solution: size in >> Mbytes of memory used during solution phase): 2467, 4922 >> > INFOG(32) (after analysis: type of analysis >> done): 1 >> > INFOG(33) (value used for ICNTL(8)): 7 >> > INFOG(34) (exponent of the determinant if >> determinant is requested): 0 >> > linear system matrix = precond matrix: >> > Mat Object: (fieldsplit_u_) >> 2 MPI processes >> > type: mpiaij >> > rows=184326, cols=184326, bs=3 >> > total: nonzeros=3.32649e+07, allocated >> nonzeros=3.32649e+07 >> > total number of mallocs used during MatSetValues >> calls =0 >> > using I-node (on process 0) routines: found 26829 >> nodes, limit used is 5 >> > A01 >> > Mat Object: 2 MPI processes >> > type: mpiaij >> > rows=184326, cols=2583, rbs=3, cbs = 1 >> > total: nonzeros=292770, allocated nonzeros=292770 >> > total number of mallocs used during MatSetValues calls >> =0 >> > using I-node (on process 0) routines: found 16098 >> nodes, limit used is 5 >> > Mat Object: 2 MPI processes >> > type: mpiaij >> > rows=2583, cols=2583, rbs=3, cbs = 1 >> > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 >> > total number of mallocs used during MatSetValues calls =0 >> > not using I-node (on process 0) routines >> > linear system matrix = precond matrix: >> > Mat Object: 2 MPI processes >> > type: mpiaij >> > rows=186909, cols=186909 >> > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 >> > total number of mallocs used during MatSetValues calls =0 >> > using I-node (on process 0) routines: found 26829 nodes, limit >> used is 5 >> > KSPSolve completed >> > >> > >> > Giang >> > >> > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley >> wrote: >> > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui >> wrote: >> > Hello >> > >> > I'm solving an indefinite problem arising from mesh tying/contact using >> Lagrange multiplier, the matrix has the form >> > >> > K = [A P^T >> > P 0] >> > >> > I used the FIELDSPLIT preconditioner with one field is the main >> variable (displacement) and the other field for dual variable (Lagrange >> multiplier). The block size for each field is 3. According to the manual, I >> first chose the preconditioner based on Schur complement to treat this >> problem. >> > >> > >> > For any solver question, please send us the output of >> > >> > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >> > >> > >> > However, I will comment below >> > >> > The parameters used for the solve is >> > -ksp_type gmres >> > >> > You need 'fgmres' here with the options you have below. >> > >> > -ksp_max_it 300 >> > -ksp_gmres_restart 300 >> > -ksp_gmres_modifiedgramschmidt >> > -pc_fieldsplit_type schur >> > -pc_fieldsplit_schur_fact_type diag >> > -pc_fieldsplit_schur_precondition selfp >> > >> > >> > >> > It could be taking time in the MatMatMult() here if that matrix is >> dense. Is there any reason to >> > believe that is a good preconditioner for your problem? >> > >> > >> > -pc_fieldsplit_detect_saddle_point >> > -fieldsplit_u_pc_type hypre >> > >> > I would just use MUMPS here to start, especially if it works on the >> whole problem. Same with the one below. 
>> > >> > Matt >> > >> > -fieldsplit_u_pc_hypre_type boomeramg >> > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >> > -fieldsplit_lu_pc_type hypre >> > -fieldsplit_lu_pc_hypre_type boomeramg >> > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS >> > >> > For the test case, a small problem is solved on 2 processes. Due to the >> decomposition, the contact only happens in 1 proc, so the size of Lagrange >> multiplier dofs on proc 0 is 0. >> > >> > 0: mIndexU.size(): 80490 >> > 0: mIndexLU.size(): 0 >> > 1: mIndexU.size(): 103836 >> > 1: mIndexLU.size(): 2583 >> > >> > However, with this setup the solver takes very long at KSPSolve before >> going to iteration, and the first iteration seems forever so I have to stop >> the calculation. I guessed that the solver takes time to compute the Schur >> complement, but according to the manual only the diagonal of A is used to >> approximate the Schur complement, so it should not take long to compute >> this. >> > >> > Note that I ran the same problem with direct solver (MUMPS) and it's >> able to produce the valid results. The parameter for the solve is pretty >> standard >> > -ksp_type preonly >> > -pc_type lu >> > -pc_factor_mat_solver_package mumps >> > >> > Hence the matrix/rhs must not have any problem here. Do you have any >> idea or suggestion for this case? >> > >> > >> > Giang >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 15 07:58:45 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Sep 2016 07:58:45 -0500 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: On Thu, Sep 15, 2016 at 4:23 AM, Ji Zhang wrote: > Thanks Matt. It works well for signal core. But is there any solution if I > need a MPI program? > It unclear what the stuff below would mean in parallel. If you want to assemble several blocks of a parallel matrix that looks like serial matrices, then use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html Thanks, Matt > Thanks. > > Wayne > > On Tue, Sep 13, 2016 at 9:30 AM, Matthew Knepley > wrote: > >> On Mon, Sep 12, 2016 at 8:24 PM, Ji Zhang wrote: >> >>> Dear all, >>> >>> I'm using petsc4py and now face some problems. >>> I have a number of small petsc dense matrices mij, and I want to >>> construct them to a big matrix M like this: >>> >>> [ m11 m12 m13 ] >>> M = | m21 m22 m23 | , >>> [ m31 m32 m33 ] >>> How could I do it effectively? 
>>> >>> Now I'm using the code below: >>> >>> # get indexes of matrix mij >>> index1_begin, index1_end = getindex_i( ) >>> index2_begin, index2_end = getindex_j( ) >>> M[index1_begin:index1_end, index2_begin:index2_end] = mij[:, :] >>> which report such error messages: >>> >>> petsc4py.PETSc.Error: error code 56 >>> [0] MatGetValues() line 1818 in /home/zhangji/PycharmProjects/ >>> petsc-petsc-31a1859eaff6/src/mat/interface/matrix.c >>> [0] MatGetValues_MPIDense() line 154 in >>> /home/zhangji/PycharmProjects/petsc-petsc-31a1859eaff6/src/m >>> at/impls/dense/mpi/mpidense.c >>> >> >> Make M a sequential dense matrix. >> >> Matt >> >> >>> [0] No support for this operation for this object type >>> [0] Only local values currently supported >>> >>> Thanks. >>> >>> >>> 2016-09-13 >>> Best, >>> Regards, >>> Zhang Ji >>> Beijing Computational Science Research Center >>> E-mail: gotofd at gmail.com >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Thu Sep 15 09:07:30 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Thu, 15 Sep 2016 16:07:30 +0200 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: Message-ID: Hi Matt Thanks for the comment. After looking carefully into the manual again, the key take away is that with selfp there is no option to compute the exact Schur, there are only two options to approximate the inv(A00) for selfp, which are lump and diag (diag by default). I misunderstood this previously. There is online manual entry mentioned about PC_FIELDSPLIT_SCHUR_PRE_FULL, which is not documented elsewhere in the offline manual. I tried to access that by setting -pc_fieldsplit_schur_precondition full but it gives the error [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Arguments are incompatible [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with B, seqaij [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui Thu Sep 15 15:46:56 2016 [0]PETSC ERROR: Configure options --with-shared-libraries --with-debugging=0 --with-pic --download-fblaslapack=yes --download-suitesparse --download-ptscotch=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes --download-mumps=yes --download-hypre=yes --download-ml=yes --download-pastix=yes --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 [0]PETSC ERROR: #1 MatMatMult() line 9514 in /home/hbui/sw/petsc-3.7.3/src/mat/interface/matrix.c [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line 526 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c Please excuse me to insist on forming the exact Schur complement, but as you said, I would like to track down what creates problem in my code by starting from a very exact but ineffective solution. Giang On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley wrote: > On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui > wrote: > >> Dear Barry >> >> Thanks for the clarification. I got exactly what you said if the code >> changed to >> ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); >> Residual norms for stokes_ solve. >> 0 KSP Residual norm 1.327791371202e-02 >> Residual norms for stokes_fieldsplit_p_ solve. >> 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm >> 0.000000000000e+00 ||r(i)||/||b|| -nan >> 1 KSP Residual norm 3.997711925708e-17 >> >> but I guess we solve a different problem if B is used for the linear >> system. >> >> in addition, changed to >> ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); >> also works but inner iteration converged not in one iteration >> >> Residual norms for stokes_ solve. >> 0 KSP Residual norm 1.327791371202e-02 >> Residual norms for stokes_fieldsplit_p_ solve. 
>> 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm >> 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm >> 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 >> 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm >> 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 >> 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm >> 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 >> 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm >> 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 >> 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm >> 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 >> 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm >> 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 >> 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm >> 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 >> 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm >> 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 >> 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm >> 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 >> 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm >> 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 >> 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm >> 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 >> 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm >> 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 >> 1 KSP Residual norm 1.445282647127e-16 >> >> >> Seem like zero pivot does not happen, but why the solver for Schur takes >> 13 steps if the preconditioner is direct solver? >> > > Look at the -ksp_view. I will bet that the default is to shift (add a > multiple of the identity) the matrix instead of failing. This > gives an inexact PC, but as you see it can converge. > > Thanks, > > Matt > > >> >> I also so tried another problem which I known does have a nonsingular >> Schur (at least A11 != 0) and it also have the same problem: 1 step outer >> convergence but multiple step inner convergence. >> >> Any ideas? >> >> Giang >> >> On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: >> >>> >>> Normally you'd be absolutely correct to expect convergence in one >>> iteration. However in this example note the call >>> >>> ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); >>> >>> It is solving the linear system defined by A but building the >>> preconditioner (i.e. the entire fieldsplit process) from a different matrix >>> B. Since A is not B you should not expect convergence in one iteration. If >>> you change the code to >>> >>> ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); >>> >>> you will see exactly what you expect, convergence in one iteration. >>> >>> Sorry about this, the example is lacking clarity and documentation its >>> author obviously knew too well what he was doing that he didn't realize >>> everyone else in the world would need more comments in the code. If you >>> change the code to >>> >>> ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); >>> >>> it will stop without being able to build the preconditioner because LU >>> factorization of the Sp matrix will result in a zero pivot. This is why >>> this "auxiliary" matrix B is used to define the preconditioner instead of A. 
>>> >>> Barry >>> >>> >>> >>> >>> > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui >>> wrote: >>> > >>> > Sorry I slept quite a while in this thread. Now I start to look at it >>> again. In the last try, the previous setting doesn't work either (in fact >>> diverge). So I would speculate if the Schur complement in my case is >>> actually not invertible. It's also possible that the code is wrong >>> somewhere. However, before looking at that, I want to understand thoroughly >>> the settings for Schur complement >>> > >>> > I experimented ex42 with the settings: >>> > mpirun -np 1 ex42 \ >>> > -stokes_ksp_monitor \ >>> > -stokes_ksp_type fgmres \ >>> > -stokes_pc_type fieldsplit \ >>> > -stokes_pc_fieldsplit_type schur \ >>> > -stokes_pc_fieldsplit_schur_fact_type full \ >>> > -stokes_pc_fieldsplit_schur_precondition selfp \ >>> > -stokes_fieldsplit_u_ksp_type preonly \ >>> > -stokes_fieldsplit_u_pc_type lu \ >>> > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ >>> > -stokes_fieldsplit_p_ksp_type gmres \ >>> > -stokes_fieldsplit_p_ksp_monitor_true_residual \ >>> > -stokes_fieldsplit_p_ksp_max_it 300 \ >>> > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ >>> > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ >>> > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ >>> > -stokes_fieldsplit_p_pc_type lu \ >>> > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps >>> > >>> > In my understanding, the solver should converge in 1 (outer) step. >>> Execution gives: >>> > Residual norms for stokes_ solve. >>> > 0 KSP Residual norm 1.327791371202e-02 >>> > Residual norms for stokes_fieldsplit_p_ solve. >>> > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm >>> 0.000000000000e+00 ||r(i)||/||b|| -nan >>> > 1 KSP Residual norm 7.656238881621e-04 >>> > Residual norms for stokes_fieldsplit_p_ solve. >>> > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm >>> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>> > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm >>> 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 >>> > 2 KSP Residual norm 9.895645456398e-06 >>> > Residual norms for stokes_fieldsplit_p_ solve. >>> > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm >>> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>> > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm >>> 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 >>> > 3 KSP Residual norm 1.608206702571e-06 >>> > Residual norms for stokes_fieldsplit_p_ solve. >>> > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm >>> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>> > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm >>> 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 >>> > 4 KSP Residual norm 2.453618999882e-07 >>> > Residual norms for stokes_fieldsplit_p_ solve. >>> > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm >>> 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>> > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm >>> 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 >>> > 5 KSP Residual norm 4.230399264750e-08 >>> > >>> > Looks like the "selfp" does construct the Schur nicely. But does >>> "full" really construct the full block preconditioner? >>> > >>> > Giang >>> > P/S: I'm also generating a smaller size of the previous problem for >>> checking again. 
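For reference, a sketch of what those two options are expected to do, written in the block notation already used in this thread and assuming the splitting K = [A00 A01; A10 A11]: -pc_fieldsplit_schur_precondition selfp assembles the preconditioner for the Schur block from

    Sp = A11 - A10 inv(diag(A00)) A01      (lumped A00 instead of diag(A00) if requested)

rather than from the exact S = A11 - A10 inv(A00) A01, while -pc_fieldsplit_schur_fact_type full applies the complete block factorization

    K = [ I             0 ] [ A00  0 ] [ I  inv(A00) A01 ]
        [ A10 inv(A00)  I ] [  0   S ] [ 0       I       ]

so convergence in one outer iteration is only expected when both the A00 solves and the Schur solve are (close to) exact.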
>>> > >>> > >>> > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley >>> wrote: >>> > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui >>> wrote: >>> > >>> > It could be taking time in the MatMatMult() here if that matrix is >>> dense. Is there any reason to >>> > believe that is a good preconditioner for your problem? >>> > >>> > This is the first approach to the problem, so I chose the most simple >>> setting. Do you have any other recommendation? >>> > >>> > This is in no way the simplest PC. We need to make it simpler first. >>> > >>> > 1) Run on only 1 proc >>> > >>> > 2) Use -pc_fieldsplit_schur_fact_type full >>> > >>> > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_tru >>> e_residual >>> > >>> > This should converge in 1 outer iteration, but we will see how good >>> your Schur complement preconditioner >>> > is for this problem. >>> > >>> > You need to start out from something you understand and then start >>> making approximations. >>> > >>> > Matt >>> > >>> > For any solver question, please send us the output of >>> > >>> > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>> > >>> > >>> > I sent here the full output (after changed to fgmres), again it takes >>> long at the first iteration but after that, it does not converge >>> > >>> > -ksp_type fgmres >>> > -ksp_max_it 300 >>> > -ksp_gmres_restart 300 >>> > -ksp_gmres_modifiedgramschmidt >>> > -pc_fieldsplit_type schur >>> > -pc_fieldsplit_schur_fact_type diag >>> > -pc_fieldsplit_schur_precondition selfp >>> > -pc_fieldsplit_detect_saddle_point >>> > -fieldsplit_u_ksp_type preonly >>> > -fieldsplit_u_pc_type lu >>> > -fieldsplit_u_pc_factor_mat_solver_package mumps >>> > -fieldsplit_lu_ksp_type preonly >>> > -fieldsplit_lu_pc_type lu >>> > -fieldsplit_lu_pc_factor_mat_solver_package mumps >>> > >>> > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm >>> 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm >>> 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 >>> > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm >>> 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 >>> > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm >>> 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 >>> > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm >>> 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 >>> > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm >>> 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 >>> > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm >>> 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 >>> > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm >>> 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 >>> > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm >>> 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 >>> > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm >>> 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 >>> > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm >>> 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 >>> > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm >>> 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 >>> > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm >>> 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 >>> > 13 KSP 
unpreconditioned resid norm 1.891414702834e+02 true resid norm >>> 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 >>> > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm >>> 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 >>> > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm >>> 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 >>> > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm >>> 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 >>> > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm >>> 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 >>> > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm >>> 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 >>> > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm >>> 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 >>> > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm >>> 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 >>> > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm >>> 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 >>> > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm >>> 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 >>> > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm >>> 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 >>> > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm >>> 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 >>> > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm >>> 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 >>> > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm >>> 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 >>> > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm >>> 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 >>> > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm >>> 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 >>> > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm >>> 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 >>> > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm >>> 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 >>> > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm >>> 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 >>> > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm >>> 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 >>> > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm >>> 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 >>> > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm >>> 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 >>> > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm >>> 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 >>> > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm >>> 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 >>> > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm >>> 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 >>> > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm >>> 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 >>> > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm >>> 1.747985717965e+02 
||r(i)||/||b|| 5.754169361071e-05 >>> > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm >>> 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 >>> > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm >>> 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 >>> > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm >>> 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 >>> > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm >>> 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 >>> > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm >>> 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 >>> > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm >>> 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 >>> > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm >>> 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 >>> > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm >>> 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 >>> > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm >>> 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 >>> > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm >>> 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 >>> > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm >>> 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 >>> > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm >>> 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 >>> > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm >>> 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 >>> > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm >>> 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 >>> > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm >>> 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 >>> > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm >>> 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 >>> > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm >>> 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 >>> > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm >>> 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 >>> > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm >>> 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 >>> > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm >>> 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 >>> > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm >>> 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 >>> > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm >>> 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 >>> > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm >>> 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 >>> > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm >>> 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 >>> > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm >>> 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 >>> > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm >>> 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 >>> > 66 KSP unpreconditioned resid norm 1.599192711038e+02 
true resid norm >>> 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 >>> > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm >>> 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 >>> > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm >>> 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 >>> > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm >>> 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 >>> > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm >>> 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 >>> > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm >>> 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 >>> > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm >>> 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 >>> > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm >>> 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 >>> > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm >>> 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 >>> > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm >>> 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 >>> > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm >>> 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 >>> > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm >>> 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 >>> > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm >>> 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 >>> > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm >>> 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 >>> > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm >>> 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 >>> > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm >>> 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 >>> > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm >>> 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 >>> > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm >>> 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 >>> > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm >>> 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 >>> > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm >>> 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 >>> > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm >>> 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 >>> > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm >>> 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 >>> > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm >>> 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 >>> > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm >>> 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 >>> > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm >>> 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 >>> > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm >>> 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 >>> > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm >>> 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 >>> > 93 KSP 
unpreconditioned resid norm 1.468779696240e+02 true resid norm >>> 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 >>> > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm >>> 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 >>> > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm >>> 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 >>> > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm >>> 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 >>> > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm >>> 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 >>> > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm >>> 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 >>> > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm >>> 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 >>> > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm >>> 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 >>> > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm >>> 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 >>> > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm >>> 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 >>> > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm >>> 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 >>> > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm >>> 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 >>> > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm >>> 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 >>> > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm >>> 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 >>> > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm >>> 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 >>> > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm >>> 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 >>> > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm >>> 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 >>> > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm >>> 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 >>> > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm >>> 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 >>> > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm >>> 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 >>> > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm >>> 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 >>> > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm >>> 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 >>> > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm >>> 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 >>> > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm >>> 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 >>> > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm >>> 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 >>> > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm >>> 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 >>> > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm >>> 
1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 >>> > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm >>> 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 >>> > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm >>> 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 >>> > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm >>> 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 >>> > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm >>> 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 >>> > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm >>> 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 >>> > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm >>> 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 >>> > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm >>> 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 >>> > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm >>> 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 >>> > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm >>> 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 >>> > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm >>> 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 >>> > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm >>> 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 >>> > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm >>> 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 >>> > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm >>> 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 >>> > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm >>> 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 >>> > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm >>> 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 >>> > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm >>> 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 >>> > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm >>> 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 >>> > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm >>> 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 >>> > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm >>> 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 >>> > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm >>> 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 >>> > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm >>> 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 >>> > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm >>> 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 >>> > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm >>> 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 >>> > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm >>> 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 >>> > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm >>> 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 >>> > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm >>> 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 >>> > 146 KSP 
unpreconditioned resid norm 1.433157223870e+02 true resid norm >>> 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 >>> > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm >>> 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 >>> > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm >>> 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 >>> > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm >>> 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 >>> > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm >>> 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 >>> > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm >>> 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 >>> > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm >>> 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 >>> > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm >>> 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 >>> > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm >>> 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 >>> > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm >>> 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 >>> > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm >>> 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 >>> > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm >>> 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 >>> > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm >>> 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 >>> > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm >>> 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 >>> > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm >>> 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 >>> > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm >>> 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 >>> > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm >>> 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 >>> > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm >>> 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 >>> > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm >>> 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 >>> > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm >>> 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 >>> > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm >>> 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 >>> > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm >>> 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 >>> > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm >>> 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 >>> > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm >>> 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 >>> > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm >>> 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 >>> > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm >>> 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 >>> > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm >>> 
1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 >>> > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm >>> 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 >>> > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm >>> 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 >>> > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm >>> 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 >>> > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm >>> 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 >>> > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm >>> 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 >>> > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm >>> 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 >>> > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm >>> 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 >>> > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm >>> 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 >>> > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm >>> 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 >>> > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm >>> 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 >>> > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm >>> 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 >>> > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm >>> 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 >>> > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm >>> 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 >>> > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm >>> 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 >>> > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm >>> 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 >>> > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm >>> 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 >>> > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm >>> 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 >>> > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm >>> 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 >>> > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm >>> 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 >>> > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm >>> 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 >>> > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm >>> 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 >>> > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm >>> 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 >>> > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm >>> 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 >>> > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm >>> 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 >>> > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm >>> 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 >>> > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm >>> 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 >>> > 199 KSP 
unpreconditioned resid norm 1.293598434732e+02 true resid norm >>> 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 >>> > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm >>> 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 >>> > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm >>> 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 >>> > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm >>> 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 >>> > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm >>> 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 >>> > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm >>> 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 >>> > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm >>> 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 >>> > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm >>> 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 >>> > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm >>> 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 >>> > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm >>> 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 >>> > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm >>> 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 >>> > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm >>> 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 >>> > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm >>> 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 >>> > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm >>> 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 >>> > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm >>> 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 >>> > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm >>> 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 >>> > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm >>> 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 >>> > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm >>> 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 >>> > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm >>> 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 >>> > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm >>> 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 >>> > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm >>> 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 >>> > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm >>> 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 >>> > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm >>> 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 >>> > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm >>> 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 >>> > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm >>> 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 >>> > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm >>> 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 >>> > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm >>> 
7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 >>> > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm >>> 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 >>> > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm >>> 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 >>> > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm >>> 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 >>> > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm >>> 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 >>> > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm >>> 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 >>> > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm >>> 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 >>> > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm >>> 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 >>> > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm >>> 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 >>> > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm >>> 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 >>> > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm >>> 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 >>> > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm >>> 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 >>> > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm >>> 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 >>> > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm >>> 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 >>> > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm >>> 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 >>> > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm >>> 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 >>> > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm >>> 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 >>> > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm >>> 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 >>> > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm >>> 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 >>> > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm >>> 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 >>> > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm >>> 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 >>> > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm >>> 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 >>> > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm >>> 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 >>> > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm >>> 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 >>> > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm >>> 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 >>> > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm >>> 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 >>> > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm >>> 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 >>> > 252 KSP 
unpreconditioned resid norm 1.335008887441e+00 true resid norm >>> 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 >>> > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm >>> 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 >>> > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm >>> 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 >>> > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm >>> 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 >>> > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm >>> 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 >>> > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm >>> 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 >>> > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm >>> 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 >>> > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm >>> 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 >>> > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm >>> 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 >>> > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm >>> 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 >>> > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm >>> 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 >>> > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm >>> 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 >>> > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm >>> 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 >>> > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm >>> 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 >>> > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm >>> 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 >>> > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm >>> 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 >>> > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm >>> 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 >>> > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm >>> 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 >>> > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm >>> 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 >>> > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm >>> 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 >>> > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm >>> 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 >>> > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm >>> 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 >>> > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm >>> 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 >>> > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm >>> 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 >>> > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm >>> 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 >>> > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm >>> 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 >>> > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm >>> 
7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 >>> > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm >>> 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 >>> > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm >>> 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 >>> > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm >>> 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 >>> > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm >>> 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 >>> > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm >>> 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 >>> > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm >>> 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 >>> > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm >>> 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 >>> > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm >>> 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 >>> > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm >>> 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 >>> > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm >>> 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 >>> > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm >>> 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 >>> > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm >>> 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 >>> > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm >>> 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 >>> > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm >>> 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 >>> > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm >>> 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 >>> > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm >>> 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 >>> > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm >>> 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 >>> > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm >>> 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 >>> > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm >>> 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 >>> > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm >>> 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 >>> > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm >>> 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 >>> > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm >>> 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 >>> > Linear solve did not converge due to DIVERGED_ITS iterations 300 >>> > KSP Object: 2 MPI processes >>> > type: fgmres >>> > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization >>> > GMRES: happy breakdown tolerance 1e-30 >>> > maximum iterations=300, initial guess is zero >>> > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 >>> > right preconditioning >>> > using UNPRECONDITIONED norm type for convergence test >>> > PC Object: 2 MPI processes >>> > type: fieldsplit >>> > FieldSplit with Schur preconditioner, 
factorization DIAG >>> > Preconditioner for the Schur complement formed from Sp, an >>> assembled approximation to S, which uses (lumped, if requested) A00's >>> diagonal's inverse >>> > Split info: >>> > Split number 0 Defined by IS >>> > Split number 1 Defined by IS >>> > KSP solver for A00 block >>> > KSP Object: (fieldsplit_u_) 2 MPI processes >>> > type: preonly >>> > maximum iterations=10000, initial guess is zero >>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> > left preconditioning >>> > using NONE norm type for convergence test >>> > PC Object: (fieldsplit_u_) 2 MPI processes >>> > type: lu >>> > LU: out-of-place factorization >>> > tolerance for zero pivot 2.22045e-14 >>> > matrix ordering: natural >>> > factor fill ratio given 0, needed 0 >>> > Factored matrix follows: >>> > Mat Object: 2 MPI processes >>> > type: mpiaij >>> > rows=184326, cols=184326 >>> > package used to perform factorization: mumps >>> > total: nonzeros=4.03041e+08, allocated >>> nonzeros=4.03041e+08 >>> > total number of mallocs used during MatSetValues calls >>> =0 >>> > MUMPS run parameters: >>> > SYM (matrix type): 0 >>> > PAR (host participation): 1 >>> > ICNTL(1) (output for error): 6 >>> > ICNTL(2) (output of diagnostic msg): 0 >>> > ICNTL(3) (output for global info): 0 >>> > ICNTL(4) (level of printing): 0 >>> > ICNTL(5) (input mat struct): 0 >>> > ICNTL(6) (matrix prescaling): 7 >>> > ICNTL(7) (sequentia matrix ordering):7 >>> > ICNTL(8) (scalling strategy): 77 >>> > ICNTL(10) (max num of refinements): 0 >>> > ICNTL(11) (error analysis): 0 >>> > ICNTL(12) (efficiency control): >>> 1 >>> > ICNTL(13) (efficiency control): >>> 0 >>> > ICNTL(14) (percentage of estimated workspace >>> increase): 20 >>> > ICNTL(18) (input mat struct): >>> 3 >>> > ICNTL(19) (Shur complement info): >>> 0 >>> > ICNTL(20) (rhs sparse pattern): >>> 0 >>> > ICNTL(21) (solution struct): >>> 1 >>> > ICNTL(22) (in-core/out-of-core facility): >>> 0 >>> > ICNTL(23) (max size of memory can be allocated >>> locally):0 >>> > ICNTL(24) (detection of null pivot rows): >>> 0 >>> > ICNTL(25) (computation of a null space basis): >>> 0 >>> > ICNTL(26) (Schur options for rhs or solution): >>> 0 >>> > ICNTL(27) (experimental parameter): >>> -24 >>> > ICNTL(28) (use parallel or sequential ordering): >>> 1 >>> > ICNTL(29) (parallel ordering): >>> 0 >>> > ICNTL(30) (user-specified set of entries in >>> inv(A)): 0 >>> > ICNTL(31) (factors is discarded in the solve >>> phase): 0 >>> > ICNTL(33) (compute determinant): >>> 0 >>> > CNTL(1) (relative pivoting threshold): 0.01 >>> > CNTL(2) (stopping criterion of refinement): >>> 1.49012e-08 >>> > CNTL(3) (absolute pivoting threshold): 0 >>> > CNTL(4) (value of static pivoting): -1 >>> > CNTL(5) (fixation for null pivots): 0 >>> > RINFO(1) (local estimated flops for the >>> elimination after analysis): >>> > [0] 5.59214e+11 >>> > [1] 5.35237e+11 >>> > RINFO(2) (local estimated flops for the assembly >>> after factorization): >>> > [0] 4.2839e+08 >>> > [1] 3.799e+08 >>> > RINFO(3) (local estimated flops for the >>> elimination after factorization): >>> > [0] 5.59214e+11 >>> > [1] 5.35237e+11 >>> > INFO(15) (estimated size of (in MB) MUMPS internal >>> data for running numerical factorization): >>> > [0] 2621 >>> > [1] 2649 >>> > INFO(16) (size of (in MB) MUMPS internal data used >>> during numerical factorization): >>> > [0] 2621 >>> > [1] 2649 >>> > INFO(23) (num of pivots eliminated on this >>> processor after factorization): >>> > [0] 90423 >>> > [1] 93903 >>> > RINFOG(1) (global 
estimated flops for the >>> elimination after analysis): 1.09445e+12 >>> > RINFOG(2) (global estimated flops for the assembly >>> after factorization): 8.0829e+08 >>> > RINFOG(3) (global estimated flops for the >>> elimination after factorization): 1.09445e+12 >>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>> (0,0)*(2^0) >>> > INFOG(3) (estimated real workspace for factors on >>> all processors after analysis): 403041366 >>> > INFOG(4) (estimated integer workspace for factors >>> on all processors after analysis): 2265748 >>> > INFOG(5) (estimated maximum front size in the >>> complete tree): 6663 >>> > INFOG(6) (number of nodes in the complete tree): >>> 2812 >>> > INFOG(7) (ordering option effectively use after >>> analysis): 5 >>> > INFOG(8) (structural symmetry in percent of the >>> permuted matrix after analysis): 100 >>> > INFOG(9) (total real/complex workspace to store >>> the matrix factors after factorization): 403041366 >>> > INFOG(10) (total integer space store the matrix >>> factors after factorization): 2265766 >>> > INFOG(11) (order of largest frontal matrix after >>> factorization): 6663 >>> > INFOG(12) (number of off-diagonal pivots): 0 >>> > INFOG(13) (number of delayed pivots after >>> factorization): 0 >>> > INFOG(14) (number of memory compress after >>> factorization): 0 >>> > INFOG(15) (number of steps of iterative refinement >>> after solution): 0 >>> > INFOG(16) (estimated size (in MB) of all MUMPS >>> internal data for factorization after analysis: value on the most memory >>> consuming processor): 2649 >>> > INFOG(17) (estimated size of all MUMPS internal >>> data for factorization after analysis: sum over all processors): 5270 >>> > INFOG(18) (size of all MUMPS internal data >>> allocated during factorization: value on the most memory consuming >>> processor): 2649 >>> > INFOG(19) (size of all MUMPS internal data >>> allocated during factorization: sum over all processors): 5270 >>> > INFOG(20) (estimated number of entries in the >>> factors): 403041366 >>> > INFOG(21) (size in MB of memory effectively used >>> during factorization - value on the most memory consuming processor): 2121 >>> > INFOG(22) (size in MB of memory effectively used >>> during factorization - sum over all processors): 4174 >>> > INFOG(23) (after analysis: value of ICNTL(6) >>> effectively used): 0 >>> > INFOG(24) (after analysis: value of ICNTL(12) >>> effectively used): 1 >>> > INFOG(25) (after factorization: number of pivots >>> modified by static pivoting): 0 >>> > INFOG(28) (after factorization: number of null >>> pivots encountered): 0 >>> > INFOG(29) (after factorization: effective number >>> of entries in the factors (sum over all processors)): 403041366 >>> > INFOG(30, 31) (after solution: size in Mbytes of >>> memory used during solution phase): 2467, 4922 >>> > INFOG(32) (after analysis: type of analysis done): >>> 1 >>> > INFOG(33) (value used for ICNTL(8)): 7 >>> > INFOG(34) (exponent of the determinant if >>> determinant is requested): 0 >>> > linear system matrix = precond matrix: >>> > Mat Object: (fieldsplit_u_) 2 MPI processes >>> > type: mpiaij >>> > rows=184326, cols=184326, bs=3 >>> > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 >>> > total number of mallocs used during MatSetValues calls =0 >>> > using I-node (on process 0) routines: found 26829 nodes, >>> limit used is 5 >>> > KSP solver for S = A11 - A10 inv(A00) A01 >>> > KSP Object: (fieldsplit_lu_) 2 MPI processes >>> > type: preonly >>> > maximum iterations=10000, initial guess is zero >>> 
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> > left preconditioning >>> > using NONE norm type for convergence test >>> > PC Object: (fieldsplit_lu_) 2 MPI processes >>> > type: lu >>> > LU: out-of-place factorization >>> > tolerance for zero pivot 2.22045e-14 >>> > matrix ordering: natural >>> > factor fill ratio given 0, needed 0 >>> > Factored matrix follows: >>> > Mat Object: 2 MPI processes >>> > type: mpiaij >>> > rows=2583, cols=2583 >>> > package used to perform factorization: mumps >>> > total: nonzeros=2.17621e+06, allocated >>> nonzeros=2.17621e+06 >>> > total number of mallocs used during MatSetValues calls >>> =0 >>> > MUMPS run parameters: >>> > SYM (matrix type): 0 >>> > PAR (host participation): 1 >>> > ICNTL(1) (output for error): 6 >>> > ICNTL(2) (output of diagnostic msg): 0 >>> > ICNTL(3) (output for global info): 0 >>> > ICNTL(4) (level of printing): 0 >>> > ICNTL(5) (input mat struct): 0 >>> > ICNTL(6) (matrix prescaling): 7 >>> > ICNTL(7) (sequentia matrix ordering):7 >>> > ICNTL(8) (scalling strategy): 77 >>> > ICNTL(10) (max num of refinements): 0 >>> > ICNTL(11) (error analysis): 0 >>> > ICNTL(12) (efficiency control): >>> 1 >>> > ICNTL(13) (efficiency control): >>> 0 >>> > ICNTL(14) (percentage of estimated workspace >>> increase): 20 >>> > ICNTL(18) (input mat struct): >>> 3 >>> > ICNTL(19) (Shur complement info): >>> 0 >>> > ICNTL(20) (rhs sparse pattern): >>> 0 >>> > ICNTL(21) (solution struct): >>> 1 >>> > ICNTL(22) (in-core/out-of-core facility): >>> 0 >>> > ICNTL(23) (max size of memory can be allocated >>> locally):0 >>> > ICNTL(24) (detection of null pivot rows): >>> 0 >>> > ICNTL(25) (computation of a null space basis): >>> 0 >>> > ICNTL(26) (Schur options for rhs or solution): >>> 0 >>> > ICNTL(27) (experimental parameter): >>> -24 >>> > ICNTL(28) (use parallel or sequential ordering): >>> 1 >>> > ICNTL(29) (parallel ordering): >>> 0 >>> > ICNTL(30) (user-specified set of entries in >>> inv(A)): 0 >>> > ICNTL(31) (factors is discarded in the solve >>> phase): 0 >>> > ICNTL(33) (compute determinant): >>> 0 >>> > CNTL(1) (relative pivoting threshold): 0.01 >>> > CNTL(2) (stopping criterion of refinement): >>> 1.49012e-08 >>> > CNTL(3) (absolute pivoting threshold): 0 >>> > CNTL(4) (value of static pivoting): -1 >>> > CNTL(5) (fixation for null pivots): 0 >>> > RINFO(1) (local estimated flops for the >>> elimination after analysis): >>> > [0] 5.12794e+08 >>> > [1] 5.02142e+08 >>> > RINFO(2) (local estimated flops for the assembly >>> after factorization): >>> > [0] 815031 >>> > [1] 745263 >>> > RINFO(3) (local estimated flops for the >>> elimination after factorization): >>> > [0] 5.12794e+08 >>> > [1] 5.02142e+08 >>> > INFO(15) (estimated size of (in MB) MUMPS internal >>> data for running numerical factorization): >>> > [0] 34 >>> > [1] 34 >>> > INFO(16) (size of (in MB) MUMPS internal data used >>> during numerical factorization): >>> > [0] 34 >>> > [1] 34 >>> > INFO(23) (num of pivots eliminated on this >>> processor after factorization): >>> > [0] 1158 >>> > [1] 1425 >>> > RINFOG(1) (global estimated flops for the >>> elimination after analysis): 1.01494e+09 >>> > RINFOG(2) (global estimated flops for the assembly >>> after factorization): 1.56029e+06 >>> > RINFOG(3) (global estimated flops for the >>> elimination after factorization): 1.01494e+09 >>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>> (0,0)*(2^0) >>> > INFOG(3) (estimated real workspace for factors on >>> all processors after analysis): 2176209 >>> > 
INFOG(4) (estimated integer workspace for factors >>> on all processors after analysis): 14427 >>> > INFOG(5) (estimated maximum front size in the >>> complete tree): 699 >>> > INFOG(6) (number of nodes in the complete tree): 15 >>> > INFOG(7) (ordering option effectively use after >>> analysis): 2 >>> > INFOG(8) (structural symmetry in percent of the >>> permuted matrix after analysis): 100 >>> > INFOG(9) (total real/complex workspace to store >>> the matrix factors after factorization): 2176209 >>> > INFOG(10) (total integer space store the matrix >>> factors after factorization): 14427 >>> > INFOG(11) (order of largest frontal matrix after >>> factorization): 699 >>> > INFOG(12) (number of off-diagonal pivots): 0 >>> > INFOG(13) (number of delayed pivots after >>> factorization): 0 >>> > INFOG(14) (number of memory compress after >>> factorization): 0 >>> > INFOG(15) (number of steps of iterative refinement >>> after solution): 0 >>> > INFOG(16) (estimated size (in MB) of all MUMPS >>> internal data for factorization after analysis: value on the most memory >>> consuming processor): 34 >>> > INFOG(17) (estimated size of all MUMPS internal >>> data for factorization after analysis: sum over all processors): 68 >>> > INFOG(18) (size of all MUMPS internal data >>> allocated during factorization: value on the most memory consuming >>> processor): 34 >>> > INFOG(19) (size of all MUMPS internal data >>> allocated during factorization: sum over all processors): 68 >>> > INFOG(20) (estimated number of entries in the >>> factors): 2176209 >>> > INFOG(21) (size in MB of memory effectively used >>> during factorization - value on the most memory consuming processor): 30 >>> > INFOG(22) (size in MB of memory effectively used >>> during factorization - sum over all processors): 59 >>> > INFOG(23) (after analysis: value of ICNTL(6) >>> effectively used): 0 >>> > INFOG(24) (after analysis: value of ICNTL(12) >>> effectively used): 1 >>> > INFOG(25) (after factorization: number of pivots >>> modified by static pivoting): 0 >>> > INFOG(28) (after factorization: number of null >>> pivots encountered): 0 >>> > INFOG(29) (after factorization: effective number >>> of entries in the factors (sum over all processors)): 2176209 >>> > INFOG(30, 31) (after solution: size in Mbytes of >>> memory used during solution phase): 16, 32 >>> > INFOG(32) (after analysis: type of analysis done): >>> 1 >>> > INFOG(33) (value used for ICNTL(8)): 7 >>> > INFOG(34) (exponent of the determinant if >>> determinant is requested): 0 >>> > linear system matrix followed by preconditioner matrix: >>> > Mat Object: (fieldsplit_lu_) 2 MPI processes >>> > type: schurcomplement >>> > rows=2583, cols=2583 >>> > Schur complement A11 - A10 inv(A00) A01 >>> > A11 >>> > Mat Object: (fieldsplit_lu_) >>> 2 MPI processes >>> > type: mpiaij >>> > rows=2583, cols=2583, bs=3 >>> > total: nonzeros=117369, allocated nonzeros=117369 >>> > total number of mallocs used during MatSetValues calls >>> =0 >>> > not using I-node (on process 0) routines >>> > A10 >>> > Mat Object: 2 MPI processes >>> > type: mpiaij >>> > rows=2583, cols=184326, rbs=3, cbs = 1 >>> > total: nonzeros=292770, allocated nonzeros=292770 >>> > total number of mallocs used during MatSetValues calls >>> =0 >>> > not using I-node (on process 0) routines >>> > KSP of A00 >>> > KSP Object: (fieldsplit_u_) 2 >>> MPI processes >>> > type: preonly >>> > maximum iterations=10000, initial guess is zero >>> > tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000 >>> > left 
preconditioning >>> > using NONE norm type for convergence test >>> > PC Object: (fieldsplit_u_) 2 >>> MPI processes >>> > type: lu >>> > LU: out-of-place factorization >>> > tolerance for zero pivot 2.22045e-14 >>> > matrix ordering: natural >>> > factor fill ratio given 0, needed 0 >>> > Factored matrix follows: >>> > Mat Object: 2 MPI processes >>> > type: mpiaij >>> > rows=184326, cols=184326 >>> > package used to perform factorization: mumps >>> > total: nonzeros=4.03041e+08, allocated >>> nonzeros=4.03041e+08 >>> > total number of mallocs used during >>> MatSetValues calls =0 >>> > MUMPS run parameters: >>> > SYM (matrix type): 0 >>> > PAR (host participation): 1 >>> > ICNTL(1) (output for error): 6 >>> > ICNTL(2) (output of diagnostic msg): 0 >>> > ICNTL(3) (output for global info): 0 >>> > ICNTL(4) (level of printing): 0 >>> > ICNTL(5) (input mat struct): 0 >>> > ICNTL(6) (matrix prescaling): 7 >>> > ICNTL(7) (sequentia matrix ordering):7 >>> > ICNTL(8) (scalling strategy): 77 >>> > ICNTL(10) (max num of refinements): 0 >>> > ICNTL(11) (error analysis): 0 >>> > ICNTL(12) (efficiency control): >>> 1 >>> > ICNTL(13) (efficiency control): >>> 0 >>> > ICNTL(14) (percentage of estimated >>> workspace increase): 20 >>> > ICNTL(18) (input mat struct): >>> 3 >>> > ICNTL(19) (Shur complement info): >>> 0 >>> > ICNTL(20) (rhs sparse pattern): >>> 0 >>> > ICNTL(21) (solution struct): >>> 1 >>> > ICNTL(22) (in-core/out-of-core facility): >>> 0 >>> > ICNTL(23) (max size of memory can be >>> allocated locally):0 >>> > ICNTL(24) (detection of null pivot rows): >>> 0 >>> > ICNTL(25) (computation of a null space >>> basis): 0 >>> > ICNTL(26) (Schur options for rhs or >>> solution): 0 >>> > ICNTL(27) (experimental parameter): >>> -24 >>> > ICNTL(28) (use parallel or sequential >>> ordering): 1 >>> > ICNTL(29) (parallel ordering): >>> 0 >>> > ICNTL(30) (user-specified set of entries >>> in inv(A)): 0 >>> > ICNTL(31) (factors is discarded in the >>> solve phase): 0 >>> > ICNTL(33) (compute determinant): >>> 0 >>> > CNTL(1) (relative pivoting threshold): >>> 0.01 >>> > CNTL(2) (stopping criterion of >>> refinement): 1.49012e-08 >>> > CNTL(3) (absolute pivoting threshold): >>> 0 >>> > CNTL(4) (value of static pivoting): >>> -1 >>> > CNTL(5) (fixation for null pivots): >>> 0 >>> > RINFO(1) (local estimated flops for the >>> elimination after analysis): >>> > [0] 5.59214e+11 >>> > [1] 5.35237e+11 >>> > RINFO(2) (local estimated flops for the >>> assembly after factorization): >>> > [0] 4.2839e+08 >>> > [1] 3.799e+08 >>> > RINFO(3) (local estimated flops for the >>> elimination after factorization): >>> > [0] 5.59214e+11 >>> > [1] 5.35237e+11 >>> > INFO(15) (estimated size of (in MB) MUMPS >>> internal data for running numerical factorization): >>> > [0] 2621 >>> > [1] 2649 >>> > INFO(16) (size of (in MB) MUMPS internal >>> data used during numerical factorization): >>> > [0] 2621 >>> > [1] 2649 >>> > INFO(23) (num of pivots eliminated on this >>> processor after factorization): >>> > [0] 90423 >>> > [1] 93903 >>> > RINFOG(1) (global estimated flops for the >>> elimination after analysis): 1.09445e+12 >>> > RINFOG(2) (global estimated flops for the >>> assembly after factorization): 8.0829e+08 >>> > RINFOG(3) (global estimated flops for the >>> elimination after factorization): 1.09445e+12 >>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) >>> (determinant): (0,0)*(2^0) >>> > INFOG(3) (estimated real workspace for >>> factors on all processors after analysis): 403041366 >>> > INFOG(4) (estimated integer workspace 
for >>> factors on all processors after analysis): 2265748 >>> > INFOG(5) (estimated maximum front size in >>> the complete tree): 6663 >>> > INFOG(6) (number of nodes in the complete >>> tree): 2812 >>> > INFOG(7) (ordering option effectively use >>> after analysis): 5 >>> > INFOG(8) (structural symmetry in percent >>> of the permuted matrix after analysis): 100 >>> > INFOG(9) (total real/complex workspace to >>> store the matrix factors after factorization): 403041366 >>> > INFOG(10) (total integer space store the >>> matrix factors after factorization): 2265766 >>> > INFOG(11) (order of largest frontal matrix >>> after factorization): 6663 >>> > INFOG(12) (number of off-diagonal pivots): >>> 0 >>> > INFOG(13) (number of delayed pivots after >>> factorization): 0 >>> > INFOG(14) (number of memory compress after >>> factorization): 0 >>> > INFOG(15) (number of steps of iterative >>> refinement after solution): 0 >>> > INFOG(16) (estimated size (in MB) of all >>> MUMPS internal data for factorization after analysis: value on the most >>> memory consuming processor): 2649 >>> > INFOG(17) (estimated size of all MUMPS >>> internal data for factorization after analysis: sum over all processors): >>> 5270 >>> > INFOG(18) (size of all MUMPS internal data >>> allocated during factorization: value on the most memory consuming >>> processor): 2649 >>> > INFOG(19) (size of all MUMPS internal data >>> allocated during factorization: sum over all processors): 5270 >>> > INFOG(20) (estimated number of entries in >>> the factors): 403041366 >>> > INFOG(21) (size in MB of memory >>> effectively used during factorization - value on the most memory consuming >>> processor): 2121 >>> > INFOG(22) (size in MB of memory >>> effectively used during factorization - sum over all processors): 4174 >>> > INFOG(23) (after analysis: value of >>> ICNTL(6) effectively used): 0 >>> > INFOG(24) (after analysis: value of >>> ICNTL(12) effectively used): 1 >>> > INFOG(25) (after factorization: number of >>> pivots modified by static pivoting): 0 >>> > INFOG(28) (after factorization: number of >>> null pivots encountered): 0 >>> > INFOG(29) (after factorization: effective >>> number of entries in the factors (sum over all processors)): 403041366 >>> > INFOG(30, 31) (after solution: size in >>> Mbytes of memory used during solution phase): 2467, 4922 >>> > INFOG(32) (after analysis: type of >>> analysis done): 1 >>> > INFOG(33) (value used for ICNTL(8)): 7 >>> > INFOG(34) (exponent of the determinant if >>> determinant is requested): 0 >>> > linear system matrix = precond matrix: >>> > Mat Object: (fieldsplit_u_) >>> 2 MPI processes >>> > type: mpiaij >>> > rows=184326, cols=184326, bs=3 >>> > total: nonzeros=3.32649e+07, allocated >>> nonzeros=3.32649e+07 >>> > total number of mallocs used during MatSetValues >>> calls =0 >>> > using I-node (on process 0) routines: found 26829 >>> nodes, limit used is 5 >>> > A01 >>> > Mat Object: 2 MPI processes >>> > type: mpiaij >>> > rows=184326, cols=2583, rbs=3, cbs = 1 >>> > total: nonzeros=292770, allocated nonzeros=292770 >>> > total number of mallocs used during MatSetValues calls >>> =0 >>> > using I-node (on process 0) routines: found 16098 >>> nodes, limit used is 5 >>> > Mat Object: 2 MPI processes >>> > type: mpiaij >>> > rows=2583, cols=2583, rbs=3, cbs = 1 >>> > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 >>> > total number of mallocs used during MatSetValues calls =0 >>> > not using I-node (on process 0) routines >>> > linear system matrix = precond matrix: 
>>> > Mat Object: 2 MPI processes >>> > type: mpiaij >>> > rows=186909, cols=186909 >>> > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 >>> > total number of mallocs used during MatSetValues calls =0 >>> > using I-node (on process 0) routines: found 26829 nodes, limit >>> used is 5 >>> > KSPSolve completed >>> > >>> > >>> > Giang >>> > >>> > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley >>> wrote: >>> > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui >>> wrote: >>> > Hello >>> > >>> > I'm solving an indefinite problem arising from mesh tying/contact >>> using Lagrange multiplier, the matrix has the form >>> > >>> > K = [A P^T >>> > P 0] >>> > >>> > I used the FIELDSPLIT preconditioner with one field is the main >>> variable (displacement) and the other field for dual variable (Lagrange >>> multiplier). The block size for each field is 3. According to the manual, I >>> first chose the preconditioner based on Schur complement to treat this >>> problem. >>> > >>> > >>> > For any solver question, please send us the output of >>> > >>> > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>> > >>> > >>> > However, I will comment below >>> > >>> > The parameters used for the solve is >>> > -ksp_type gmres >>> > >>> > You need 'fgmres' here with the options you have below. >>> > >>> > -ksp_max_it 300 >>> > -ksp_gmres_restart 300 >>> > -ksp_gmres_modifiedgramschmidt >>> > -pc_fieldsplit_type schur >>> > -pc_fieldsplit_schur_fact_type diag >>> > -pc_fieldsplit_schur_precondition selfp >>> > >>> > >>> > >>> > It could be taking time in the MatMatMult() here if that matrix is >>> dense. Is there any reason to >>> > believe that is a good preconditioner for your problem? >>> > >>> > >>> > -pc_fieldsplit_detect_saddle_point >>> > -fieldsplit_u_pc_type hypre >>> > >>> > I would just use MUMPS here to start, especially if it works on the >>> whole problem. Same with the one below. >>> > >>> > Matt >>> > >>> > -fieldsplit_u_pc_hypre_type boomeramg >>> > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >>> > -fieldsplit_lu_pc_type hypre >>> > -fieldsplit_lu_pc_hypre_type boomeramg >>> > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS >>> > >>> > For the test case, a small problem is solved on 2 processes. Due to >>> the decomposition, the contact only happens in 1 proc, so the size of >>> Lagrange multiplier dofs on proc 0 is 0. >>> > >>> > 0: mIndexU.size(): 80490 >>> > 0: mIndexLU.size(): 0 >>> > 1: mIndexU.size(): 103836 >>> > 1: mIndexLU.size(): 2583 >>> > >>> > However, with this setup the solver takes very long at KSPSolve before >>> going to iteration, and the first iteration seems forever so I have to stop >>> the calculation. I guessed that the solver takes time to compute the Schur >>> complement, but according to the manual only the diagonal of A is used to >>> approximate the Schur complement, so it should not take long to compute >>> this. >>> > >>> > Note that I ran the same problem with direct solver (MUMPS) and it's >>> able to produce the valid results. The parameter for the solve is pretty >>> standard >>> > -ksp_type preonly >>> > -pc_type lu >>> > -pc_factor_mat_solver_package mumps >>> > >>> > Hence the matrix/rhs must not have any problem here. Do you have any >>> idea or suggestion for this case? >>> > >>> > >>> > Giang >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> > -- Norbert Wiener >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 15 09:28:55 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Sep 2016 09:28:55 -0500 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: Message-ID: On Thu, Sep 15, 2016 at 9:07 AM, Hoang Giang Bui wrote: > Hi Matt > > Thanks for the comment. After looking carefully into the manual again, the > key take away is that with selfp there is no option to compute the exact > Schur, there are only two options to approximate the inv(A00) for selfp, > which are lump and diag (diag by default). I misunderstood this previously. > > There is online manual entry mentioned about PC_FIELDSPLIT_SCHUR_PRE_FULL, > which is not documented elsewhere in the offline manual. I tried to access > that by setting > -pc_fieldsplit_schur_precondition full > Yep, I wrote that specifically for testing, but its very slow so I did not document it to prevent people from complaining. > but it gives the error > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Arguments are incompatible > [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with B, > seqaij > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui Thu > Sep 15 15:46:56 2016 > [0]PETSC ERROR: Configure options --with-shared-libraries > --with-debugging=0 --with-pic --download-fblaslapack=yes > --download-suitesparse --download-ptscotch=yes --download-metis=yes > --download-parmetis=yes --download-scalapack=yes --download-mumps=yes > --download-hypre=yes --download-ml=yes --download-pastix=yes > --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 > [0]PETSC ERROR: #1 MatMatMult() line 9514 in /home/hbui/sw/petsc-3.7.3/src/ > mat/interface/matrix.c > [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line 526 > in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c > [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in > /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ > ksp/pc/interface/precon.c > [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/hbui/sw/petsc-3.7.3/src/ > ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/hbui/sw/petsc-3.7.3/src/ > ksp/ksp/interface/itfunc.c > > Please excuse me to insist on forming the exact Schur complement, but as > you said, I would like to track down what creates problem in my code by > starting from a very exact but ineffective solution. > Sure, I understand. I do not understand how A can be MPI and B can be Seq. Do you know how that happens? 
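One quick way to find that out is to print the type and communicator of every Mat involved just before KSPSolve. A minimal sketch, not code from this thread: ReportMat and the labels are hypothetical names, the rest is the stock PETSc 3.7 C API.

/* Hypothetical helper: report a Mat's type and the size of the communicator
   it lives on, to locate where an mpiaij matrix ends up next to a seqaij one. */
#include <petscksp.h>

static PetscErrorCode ReportMat(Mat A, const char *label)
{
  MatType        type;
  PetscInt       m, n;
  PetscMPIInt    size;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetType(A, &type);CHKERRQ(ierr);
  ierr = MatGetSize(A, &m, &n);CHKERRQ(ierr);
  ierr = MPI_Comm_size(PetscObjectComm((PetscObject)A), &size);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "%s: type %s, %D x %D, comm size %d\n",
                     label, type, m, n, size);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* usage sketch: ReportMat(Amat, "Amat"); ReportMat(Pmat, "Pmat"); before KSPSolve() */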
Thanks, Matt > Giang > > On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley > wrote: > >> On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui >> wrote: >> >>> Dear Barry >>> >>> Thanks for the clarification. I got exactly what you said if the code >>> changed to >>> ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); >>> Residual norms for stokes_ solve. >>> 0 KSP Residual norm 1.327791371202e-02 >>> Residual norms for stokes_fieldsplit_p_ solve. >>> 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm >>> 0.000000000000e+00 ||r(i)||/||b|| -nan >>> 1 KSP Residual norm 3.997711925708e-17 >>> >>> but I guess we solve a different problem if B is used for the linear >>> system. >>> >>> in addition, changed to >>> ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); >>> also works but inner iteration converged not in one iteration >>> >>> Residual norms for stokes_ solve. >>> 0 KSP Residual norm 1.327791371202e-02 >>> Residual norms for stokes_fieldsplit_p_ solve. >>> 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm >>> 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm >>> 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 >>> 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm >>> 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 >>> 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm >>> 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 >>> 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm >>> 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 >>> 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm >>> 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 >>> 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm >>> 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 >>> 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm >>> 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 >>> 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm >>> 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 >>> 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm >>> 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 >>> 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm >>> 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 >>> 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm >>> 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 >>> 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm >>> 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 >>> 1 KSP Residual norm 1.445282647127e-16 >>> >>> >>> Seem like zero pivot does not happen, but why the solver for Schur takes >>> 13 steps if the preconditioner is direct solver? >>> >> >> Look at the -ksp_view. I will bet that the default is to shift (add a >> multiple of the identity) the matrix instead of failing. This >> gives an inexact PC, but as you see it can converge. >> >> Thanks, >> >> Matt >> >> >>> >>> I also so tried another problem which I known does have a nonsingular >>> Schur (at least A11 != 0) and it also have the same problem: 1 step outer >>> convergence but multiple step inner convergence. >>> >>> Any ideas? >>> >>> Giang >>> >>> On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: >>> >>>> >>>> Normally you'd be absolutely correct to expect convergence in one >>>> iteration. 
However in this example note the call >>>> >>>> ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); >>>> >>>> It is solving the linear system defined by A but building the >>>> preconditioner (i.e. the entire fieldsplit process) from a different matrix >>>> B. Since A is not B you should not expect convergence in one iteration. If >>>> you change the code to >>>> >>>> ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); >>>> >>>> you will see exactly what you expect, convergence in one iteration. >>>> >>>> Sorry about this, the example is lacking clarity and documentation >>>> its author obviously knew too well what he was doing that he didn't realize >>>> everyone else in the world would need more comments in the code. If you >>>> change the code to >>>> >>>> ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); >>>> >>>> it will stop without being able to build the preconditioner because LU >>>> factorization of the Sp matrix will result in a zero pivot. This is why >>>> this "auxiliary" matrix B is used to define the preconditioner instead of A. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui >>>> wrote: >>>> > >>>> > Sorry I slept quite a while in this thread. Now I start to look at it >>>> again. In the last try, the previous setting doesn't work either (in fact >>>> diverge). So I would speculate if the Schur complement in my case is >>>> actually not invertible. It's also possible that the code is wrong >>>> somewhere. However, before looking at that, I want to understand thoroughly >>>> the settings for Schur complement >>>> > >>>> > I experimented ex42 with the settings: >>>> > mpirun -np 1 ex42 \ >>>> > -stokes_ksp_monitor \ >>>> > -stokes_ksp_type fgmres \ >>>> > -stokes_pc_type fieldsplit \ >>>> > -stokes_pc_fieldsplit_type schur \ >>>> > -stokes_pc_fieldsplit_schur_fact_type full \ >>>> > -stokes_pc_fieldsplit_schur_precondition selfp \ >>>> > -stokes_fieldsplit_u_ksp_type preonly \ >>>> > -stokes_fieldsplit_u_pc_type lu \ >>>> > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ >>>> > -stokes_fieldsplit_p_ksp_type gmres \ >>>> > -stokes_fieldsplit_p_ksp_monitor_true_residual \ >>>> > -stokes_fieldsplit_p_ksp_max_it 300 \ >>>> > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ >>>> > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ >>>> > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ >>>> > -stokes_fieldsplit_p_pc_type lu \ >>>> > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps >>>> > >>>> > In my understanding, the solver should converge in 1 (outer) step. >>>> Execution gives: >>>> > Residual norms for stokes_ solve. >>>> > 0 KSP Residual norm 1.327791371202e-02 >>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>> > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid >>>> norm 0.000000000000e+00 ||r(i)||/||b|| -nan >>>> > 1 KSP Residual norm 7.656238881621e-04 >>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>> > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid >>>> norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>> > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid >>>> norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 >>>> > 2 KSP Residual norm 9.895645456398e-06 >>>> > Residual norms for stokes_fieldsplit_p_ solve. 
>>>> > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid >>>> norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>> > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid >>>> norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 >>>> > 3 KSP Residual norm 1.608206702571e-06 >>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>> > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid >>>> norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>> > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid >>>> norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 >>>> > 4 KSP Residual norm 2.453618999882e-07 >>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>> > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid >>>> norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>> > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid >>>> norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 >>>> > 5 KSP Residual norm 4.230399264750e-08 >>>> > >>>> > Looks like the "selfp" does construct the Schur nicely. But does >>>> "full" really construct the full block preconditioner? >>>> > >>>> > Giang >>>> > P/S: I'm also generating a smaller size of the previous problem for >>>> checking again. >>>> > >>>> > >>>> > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley >>>> wrote: >>>> > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui >>>> wrote: >>>> > >>>> > It could be taking time in the MatMatMult() here if that matrix is >>>> dense. Is there any reason to >>>> > believe that is a good preconditioner for your problem? >>>> > >>>> > This is the first approach to the problem, so I chose the most simple >>>> setting. Do you have any other recommendation? >>>> > >>>> > This is in no way the simplest PC. We need to make it simpler first. >>>> > >>>> > 1) Run on only 1 proc >>>> > >>>> > 2) Use -pc_fieldsplit_schur_fact_type full >>>> > >>>> > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_tru >>>> e_residual >>>> > >>>> > This should converge in 1 outer iteration, but we will see how good >>>> your Schur complement preconditioner >>>> > is for this problem. >>>> > >>>> > You need to start out from something you understand and then start >>>> making approximations. 
>>>> > >>>> > Matt >>>> > >>>> > For any solver question, please send us the output of >>>> > >>>> > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>> > >>>> > >>>> > I sent here the full output (after changed to fgmres), again it takes >>>> long at the first iteration but after that, it does not converge >>>> > >>>> > -ksp_type fgmres >>>> > -ksp_max_it 300 >>>> > -ksp_gmres_restart 300 >>>> > -ksp_gmres_modifiedgramschmidt >>>> > -pc_fieldsplit_type schur >>>> > -pc_fieldsplit_schur_fact_type diag >>>> > -pc_fieldsplit_schur_precondition selfp >>>> > -pc_fieldsplit_detect_saddle_point >>>> > -fieldsplit_u_ksp_type preonly >>>> > -fieldsplit_u_pc_type lu >>>> > -fieldsplit_u_pc_factor_mat_solver_package mumps >>>> > -fieldsplit_lu_ksp_type preonly >>>> > -fieldsplit_lu_pc_type lu >>>> > -fieldsplit_lu_pc_factor_mat_solver_package mumps >>>> > >>>> > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid >>>> norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 >>>> > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid >>>> norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 >>>> > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid >>>> norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 >>>> > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid >>>> norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 >>>> > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid >>>> norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 >>>> > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid >>>> norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 >>>> > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid >>>> norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 >>>> > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid >>>> norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 >>>> > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid >>>> norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 >>>> > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid >>>> norm 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 >>>> > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid >>>> norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 >>>> > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid >>>> norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 >>>> > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid >>>> norm 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 >>>> > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid >>>> norm 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 >>>> > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid >>>> norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 >>>> > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid >>>> norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 >>>> > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid >>>> norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 >>>> > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid >>>> norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 >>>> > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid >>>> norm 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 >>>> > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid >>>> norm 1.810111295757e+02 ||r(i)||/||b|| 
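For reference, steps 1)-3) rolled into a single invocation. This is purely illustrative: ./my_app stands in for the actual executable, -pc_type fieldsplit is included on the assumption that the PC type is not already set in the code (add the -stokes_ prefix when running ex42), and every other option is quoted from this thread.

mpirun -np 1 ./my_app \
    -ksp_type fgmres -ksp_view -ksp_monitor_true_residual -ksp_converged_reason \
    -pc_type fieldsplit -pc_fieldsplit_type schur \
    -pc_fieldsplit_schur_fact_type full \
    -pc_fieldsplit_schur_precondition selfp \
    -pc_fieldsplit_detect_saddle_point \
    -fieldsplit_u_ksp_type preonly -fieldsplit_u_pc_type lu \
    -fieldsplit_u_pc_factor_mat_solver_package mumps \
    -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_true_residual \
    -fieldsplit_lu_pc_type lu -fieldsplit_lu_pc_factor_mat_solver_package mumps

With exact solves on both blocks the outer FGMRES should converge in one iteration, and the inner fieldsplit_lu iteration count then shows how good the selfp approximation of the Schur complement really is.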
5.958679668335e-05 >>>> > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid >>>> norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 >>>> > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid >>>> norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 >>>> > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid >>>> norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 >>>> > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid >>>> norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 >>>> > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid >>>> norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 >>>> > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid >>>> norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 >>>> > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid >>>> norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 >>>> > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid >>>> norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 >>>> > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid >>>> norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 >>>> > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid >>>> norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 >>>> > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid >>>> norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 >>>> > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid >>>> norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 >>>> > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid >>>> norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 >>>> > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid >>>> norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 >>>> > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid >>>> norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 >>>> > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid >>>> norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 >>>> > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid >>>> norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 >>>> > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid >>>> norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 >>>> > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid >>>> norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 >>>> > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid >>>> norm 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 >>>> > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid >>>> norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 >>>> > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid >>>> norm 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 >>>> > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid >>>> norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 >>>> > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid >>>> norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 >>>> > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid >>>> norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 >>>> > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid >>>> norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 >>>> > 46 KSP 
unpreconditioned resid norm 1.780557458973e+02 true resid >>>> norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 >>>> > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid >>>> norm 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 >>>> > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid >>>> norm 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 >>>> > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid >>>> norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 >>>> > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid >>>> norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 >>>> > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid >>>> norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 >>>> > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid >>>> norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 >>>> > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid >>>> norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 >>>> > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid >>>> norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 >>>> > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid >>>> norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 >>>> > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid >>>> norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 >>>> > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid >>>> norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 >>>> > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid >>>> norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 >>>> > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid >>>> norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 >>>> > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid >>>> norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 >>>> > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid >>>> norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 >>>> > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid >>>> norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 >>>> > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid >>>> norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 >>>> > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid >>>> norm 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 >>>> > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid >>>> norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 >>>> > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid >>>> norm 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 >>>> > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid >>>> norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 >>>> > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid >>>> norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 >>>> > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid >>>> norm 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 >>>> > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid >>>> norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 >>>> > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid >>>> norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 >>>> > 72 KSP unpreconditioned resid norm 
1.481979533059e+02 true resid >>>> norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 >>>> > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid >>>> norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 >>>> > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid >>>> norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 >>>> > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid >>>> norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 >>>> > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid >>>> norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 >>>> > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid >>>> norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 >>>> > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid >>>> norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 >>>> > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid >>>> norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 >>>> > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid >>>> norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 >>>> > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid >>>> norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 >>>> > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid >>>> norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 >>>> > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid >>>> norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 >>>> > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid >>>> norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 >>>> > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid >>>> norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 >>>> > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid >>>> norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 >>>> > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid >>>> norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 >>>> > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid >>>> norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 >>>> > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid >>>> norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 >>>> > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid >>>> norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 >>>> > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid >>>> norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 >>>> > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid >>>> norm 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 >>>> > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid >>>> norm 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 >>>> > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid >>>> norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 >>>> > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid >>>> norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 >>>> > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid >>>> norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 >>>> > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid >>>> norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 >>>> > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid >>>> norm 
1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 >>>> > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid >>>> norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 >>>> > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid >>>> norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 >>>> > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid >>>> norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 >>>> > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid >>>> norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 >>>> > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid >>>> norm 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 >>>> > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid >>>> norm 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 >>>> > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid >>>> norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 >>>> > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid >>>> norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 >>>> > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid >>>> norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 >>>> > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid >>>> norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 >>>> > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid >>>> norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 >>>> > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid >>>> norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 >>>> > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid >>>> norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 >>>> > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid >>>> norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 >>>> > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid >>>> norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 >>>> > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid >>>> norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 >>>> > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid >>>> norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 >>>> > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid >>>> norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 >>>> > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid >>>> norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 >>>> > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid >>>> norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 >>>> > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid >>>> norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 >>>> > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid >>>> norm 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 >>>> > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid >>>> norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 >>>> > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid >>>> norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 >>>> > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid >>>> norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 >>>> > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid >>>> norm 1.507801939332e+02 
||r(i)||/||b|| 4.963511791142e-05 >>>> > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid >>>> norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 >>>> > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid >>>> norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 >>>> > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid >>>> norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 >>>> > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid >>>> norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 >>>> > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid >>>> norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 >>>> > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid >>>> norm 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 >>>> > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid >>>> norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 >>>> > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid >>>> norm 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 >>>> > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid >>>> norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 >>>> > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid >>>> norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 >>>> > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid >>>> norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 >>>> > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid >>>> norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 >>>> > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid >>>> norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 >>>> > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid >>>> norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 >>>> > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid >>>> norm 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 >>>> > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid >>>> norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 >>>> > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid >>>> norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 >>>> > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid >>>> norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 >>>> > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid >>>> norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 >>>> > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid >>>> norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 >>>> > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid >>>> norm 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 >>>> > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid >>>> norm 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 >>>> > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid >>>> norm 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 >>>> > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid >>>> norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 >>>> > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid >>>> norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 >>>> > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid >>>> norm 1.460277537447e+02 ||r(i)||/||b|| 
4.807066887493e-05 >>>> > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid >>>> norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 >>>> > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid >>>> norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 >>>> > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid >>>> norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 >>>> > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid >>>> norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 >>>> > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid >>>> norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 >>>> > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid >>>> norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 >>>> > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid >>>> norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 >>>> > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid >>>> norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 >>>> > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid >>>> norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 >>>> > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid >>>> norm 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 >>>> > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid >>>> norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 >>>> > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid >>>> norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 >>>> > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid >>>> norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 >>>> > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid >>>> norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 >>>> > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid >>>> norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 >>>> > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid >>>> norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 >>>> > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid >>>> norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 >>>> > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid >>>> norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 >>>> > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid >>>> norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 >>>> > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid >>>> norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 >>>> > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid >>>> norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 >>>> > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid >>>> norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 >>>> > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid >>>> norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 >>>> > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid >>>> norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 >>>> > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid >>>> norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 >>>> > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid >>>> norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 
>>>> > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid >>>> norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 >>>> > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid >>>> norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 >>>> > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid >>>> norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 >>>> > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid >>>> norm 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 >>>> > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid >>>> norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 >>>> > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid >>>> norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 >>>> > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid >>>> norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 >>>> > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid >>>> norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 >>>> > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid >>>> norm 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 >>>> > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid >>>> norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 >>>> > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid >>>> norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 >>>> > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid >>>> norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 >>>> > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid >>>> norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 >>>> > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid >>>> norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 >>>> > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid >>>> norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 >>>> > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid >>>> norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 >>>> > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid >>>> norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 >>>> > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid >>>> norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 >>>> > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid >>>> norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 >>>> > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid >>>> norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 >>>> > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid >>>> norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 >>>> > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid >>>> norm 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 >>>> > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid >>>> norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 >>>> > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid >>>> norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 >>>> > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid >>>> norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 >>>> > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid >>>> norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 >>>> > 203 KSP 
unpreconditioned resid norm 1.230747513665e+02 true resid >>>> norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 >>>> > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid >>>> norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 >>>> > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid >>>> norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 >>>> > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid >>>> norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 >>>> > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid >>>> norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 >>>> > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid >>>> norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 >>>> > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid >>>> norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 >>>> > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid >>>> norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 >>>> > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid >>>> norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 >>>> > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid >>>> norm 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 >>>> > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid >>>> norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 >>>> > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid >>>> norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 >>>> > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid >>>> norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 >>>> > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid >>>> norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 >>>> > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid >>>> norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 >>>> > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid >>>> norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 >>>> > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid >>>> norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 >>>> > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid >>>> norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 >>>> > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid >>>> norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 >>>> > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid >>>> norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 >>>> > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid >>>> norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 >>>> > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid >>>> norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 >>>> > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid >>>> norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 >>>> > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid >>>> norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 >>>> > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid >>>> norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 >>>> > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid >>>> norm 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 >>>> > 229 KSP unpreconditioned 
resid norm 1.975927240058e+01 true resid >>>> norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 >>>> > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid >>>> norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 >>>> > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid >>>> norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 >>>> > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid >>>> norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 >>>> > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid >>>> norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 >>>> > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid >>>> norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 >>>> > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid >>>> norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 >>>> > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid >>>> norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 >>>> > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid >>>> norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 >>>> > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid >>>> norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 >>>> > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid >>>> norm 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 >>>> > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid >>>> norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 >>>> > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid >>>> norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 >>>> > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid >>>> norm 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 >>>> > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid >>>> norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 >>>> > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid >>>> norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 >>>> > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid >>>> norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 >>>> > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid >>>> norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 >>>> > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid >>>> norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 >>>> > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid >>>> norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 >>>> > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid >>>> norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 >>>> > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid >>>> norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 >>>> > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid >>>> norm 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 >>>> > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid >>>> norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 >>>> > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid >>>> norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 >>>> > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid >>>> norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 >>>> > 255 KSP unpreconditioned resid norm 
1.248246596771e+00 true resid >>>> norm 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 >>>> > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid >>>> norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 >>>> > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid >>>> norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 >>>> > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid >>>> norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 >>>> > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid >>>> norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 >>>> > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid >>>> norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 >>>> > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid >>>> norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 >>>> > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid >>>> norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 >>>> > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid >>>> norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 >>>> > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid >>>> norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 >>>> > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid >>>> norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 >>>> > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid >>>> norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 >>>> > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid >>>> norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 >>>> > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid >>>> norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 >>>> > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid >>>> norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 >>>> > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid >>>> norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 >>>> > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid >>>> norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 >>>> > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid >>>> norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 >>>> > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid >>>> norm 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 >>>> > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid >>>> norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 >>>> > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid >>>> norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 >>>> > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid >>>> norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 >>>> > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid >>>> norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 >>>> > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid >>>> norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 >>>> > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid >>>> norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 >>>> > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid >>>> norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 >>>> > 281 KSP unpreconditioned resid norm 1.100686341559e+00 
true resid >>>> norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 >>>> > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid >>>> norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 >>>> > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid >>>> norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 >>>> > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid >>>> norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 >>>> > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid >>>> norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 >>>> > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid >>>> norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 >>>> > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid >>>> norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 >>>> > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid >>>> norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 >>>> > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid >>>> norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 >>>> > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid >>>> norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 >>>> > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid >>>> norm 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 >>>> > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid >>>> norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 >>>> > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid >>>> norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 >>>> > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid >>>> norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 >>>> > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid >>>> norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 >>>> > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid >>>> norm 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 >>>> > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid >>>> norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 >>>> > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid >>>> norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 >>>> > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid >>>> norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 >>>> > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid >>>> norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 >>>> > Linear solve did not converge due to DIVERGED_ITS iterations 300 >>>> > KSP Object: 2 MPI processes >>>> > type: fgmres >>>> > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization >>>> > GMRES: happy breakdown tolerance 1e-30 >>>> > maximum iterations=300, initial guess is zero >>>> > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 >>>> > right preconditioning >>>> > using UNPRECONDITIONED norm type for convergence test >>>> > PC Object: 2 MPI processes >>>> > type: fieldsplit >>>> > FieldSplit with Schur preconditioner, factorization DIAG >>>> > Preconditioner for the Schur complement formed from Sp, an >>>> assembled approximation to S, which uses (lumped, if requested) A00's >>>> diagonal's inverse >>>> > Split info: >>>> > Split number 0 Defined by IS >>>> > Split number 1 Defined by IS >>>> > KSP solver for A00 block >>>> > KSP Object: 
(fieldsplit_u_) 2 MPI processes >>>> > type: preonly >>>> > maximum iterations=10000, initial guess is zero >>>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>>> > left preconditioning >>>> > using NONE norm type for convergence test >>>> > PC Object: (fieldsplit_u_) 2 MPI processes >>>> > type: lu >>>> > LU: out-of-place factorization >>>> > tolerance for zero pivot 2.22045e-14 >>>> > matrix ordering: natural >>>> > factor fill ratio given 0, needed 0 >>>> > Factored matrix follows: >>>> > Mat Object: 2 MPI processes >>>> > type: mpiaij >>>> > rows=184326, cols=184326 >>>> > package used to perform factorization: mumps >>>> > total: nonzeros=4.03041e+08, allocated >>>> nonzeros=4.03041e+08 >>>> > total number of mallocs used during MatSetValues >>>> calls =0 >>>> > MUMPS run parameters: >>>> > SYM (matrix type): 0 >>>> > PAR (host participation): 1 >>>> > ICNTL(1) (output for error): 6 >>>> > ICNTL(2) (output of diagnostic msg): 0 >>>> > ICNTL(3) (output for global info): 0 >>>> > ICNTL(4) (level of printing): 0 >>>> > ICNTL(5) (input mat struct): 0 >>>> > ICNTL(6) (matrix prescaling): 7 >>>> > ICNTL(7) (sequentia matrix ordering):7 >>>> > ICNTL(8) (scalling strategy): 77 >>>> > ICNTL(10) (max num of refinements): 0 >>>> > ICNTL(11) (error analysis): 0 >>>> > ICNTL(12) (efficiency control): >>>> 1 >>>> > ICNTL(13) (efficiency control): >>>> 0 >>>> > ICNTL(14) (percentage of estimated workspace >>>> increase): 20 >>>> > ICNTL(18) (input mat struct): >>>> 3 >>>> > ICNTL(19) (Shur complement info): >>>> 0 >>>> > ICNTL(20) (rhs sparse pattern): >>>> 0 >>>> > ICNTL(21) (solution struct): >>>> 1 >>>> > ICNTL(22) (in-core/out-of-core facility): >>>> 0 >>>> > ICNTL(23) (max size of memory can be allocated >>>> locally):0 >>>> > ICNTL(24) (detection of null pivot rows): >>>> 0 >>>> > ICNTL(25) (computation of a null space basis): >>>> 0 >>>> > ICNTL(26) (Schur options for rhs or solution): >>>> 0 >>>> > ICNTL(27) (experimental parameter): >>>> -24 >>>> > ICNTL(28) (use parallel or sequential ordering): >>>> 1 >>>> > ICNTL(29) (parallel ordering): >>>> 0 >>>> > ICNTL(30) (user-specified set of entries in >>>> inv(A)): 0 >>>> > ICNTL(31) (factors is discarded in the solve >>>> phase): 0 >>>> > ICNTL(33) (compute determinant): >>>> 0 >>>> > CNTL(1) (relative pivoting threshold): 0.01 >>>> > CNTL(2) (stopping criterion of refinement): >>>> 1.49012e-08 >>>> > CNTL(3) (absolute pivoting threshold): 0 >>>> > CNTL(4) (value of static pivoting): -1 >>>> > CNTL(5) (fixation for null pivots): 0 >>>> > RINFO(1) (local estimated flops for the >>>> elimination after analysis): >>>> > [0] 5.59214e+11 >>>> > [1] 5.35237e+11 >>>> > RINFO(2) (local estimated flops for the assembly >>>> after factorization): >>>> > [0] 4.2839e+08 >>>> > [1] 3.799e+08 >>>> > RINFO(3) (local estimated flops for the >>>> elimination after factorization): >>>> > [0] 5.59214e+11 >>>> > [1] 5.35237e+11 >>>> > INFO(15) (estimated size of (in MB) MUMPS >>>> internal data for running numerical factorization): >>>> > [0] 2621 >>>> > [1] 2649 >>>> > INFO(16) (size of (in MB) MUMPS internal data >>>> used during numerical factorization): >>>> > [0] 2621 >>>> > [1] 2649 >>>> > INFO(23) (num of pivots eliminated on this >>>> processor after factorization): >>>> > [0] 90423 >>>> > [1] 93903 >>>> > RINFOG(1) (global estimated flops for the >>>> elimination after analysis): 1.09445e+12 >>>> > RINFOG(2) (global estimated flops for the >>>> assembly after factorization): 8.0829e+08 >>>> > RINFOG(3) (global estimated flops for the 
>>>> elimination after factorization): 1.09445e+12 >>>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) >>>> (determinant): (0,0)*(2^0) >>>> > INFOG(3) (estimated real workspace for factors on >>>> all processors after analysis): 403041366 >>>> > INFOG(4) (estimated integer workspace for factors >>>> on all processors after analysis): 2265748 >>>> > INFOG(5) (estimated maximum front size in the >>>> complete tree): 6663 >>>> > INFOG(6) (number of nodes in the complete tree): >>>> 2812 >>>> > INFOG(7) (ordering option effectively use after >>>> analysis): 5 >>>> > INFOG(8) (structural symmetry in percent of the >>>> permuted matrix after analysis): 100 >>>> > INFOG(9) (total real/complex workspace to store >>>> the matrix factors after factorization): 403041366 >>>> > INFOG(10) (total integer space store the matrix >>>> factors after factorization): 2265766 >>>> > INFOG(11) (order of largest frontal matrix after >>>> factorization): 6663 >>>> > INFOG(12) (number of off-diagonal pivots): 0 >>>> > INFOG(13) (number of delayed pivots after >>>> factorization): 0 >>>> > INFOG(14) (number of memory compress after >>>> factorization): 0 >>>> > INFOG(15) (number of steps of iterative >>>> refinement after solution): 0 >>>> > INFOG(16) (estimated size (in MB) of all MUMPS >>>> internal data for factorization after analysis: value on the most memory >>>> consuming processor): 2649 >>>> > INFOG(17) (estimated size of all MUMPS internal >>>> data for factorization after analysis: sum over all processors): 5270 >>>> > INFOG(18) (size of all MUMPS internal data >>>> allocated during factorization: value on the most memory consuming >>>> processor): 2649 >>>> > INFOG(19) (size of all MUMPS internal data >>>> allocated during factorization: sum over all processors): 5270 >>>> > INFOG(20) (estimated number of entries in the >>>> factors): 403041366 >>>> > INFOG(21) (size in MB of memory effectively used >>>> during factorization - value on the most memory consuming processor): 2121 >>>> > INFOG(22) (size in MB of memory effectively used >>>> during factorization - sum over all processors): 4174 >>>> > INFOG(23) (after analysis: value of ICNTL(6) >>>> effectively used): 0 >>>> > INFOG(24) (after analysis: value of ICNTL(12) >>>> effectively used): 1 >>>> > INFOG(25) (after factorization: number of pivots >>>> modified by static pivoting): 0 >>>> > INFOG(28) (after factorization: number of null >>>> pivots encountered): 0 >>>> > INFOG(29) (after factorization: effective number >>>> of entries in the factors (sum over all processors)): 403041366 >>>> > INFOG(30, 31) (after solution: size in Mbytes of >>>> memory used during solution phase): 2467, 4922 >>>> > INFOG(32) (after analysis: type of analysis >>>> done): 1 >>>> > INFOG(33) (value used for ICNTL(8)): 7 >>>> > INFOG(34) (exponent of the determinant if >>>> determinant is requested): 0 >>>> > linear system matrix = precond matrix: >>>> > Mat Object: (fieldsplit_u_) 2 MPI processes >>>> > type: mpiaij >>>> > rows=184326, cols=184326, bs=3 >>>> > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 >>>> > total number of mallocs used during MatSetValues calls =0 >>>> > using I-node (on process 0) routines: found 26829 nodes, >>>> limit used is 5 >>>> > KSP solver for S = A11 - A10 inv(A00) A01 >>>> > KSP Object: (fieldsplit_lu_) 2 MPI processes >>>> > type: preonly >>>> > maximum iterations=10000, initial guess is zero >>>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>>> > left preconditioning >>>> > using NONE norm type for convergence 
test >>>> > PC Object: (fieldsplit_lu_) 2 MPI processes >>>> > type: lu >>>> > LU: out-of-place factorization >>>> > tolerance for zero pivot 2.22045e-14 >>>> > matrix ordering: natural >>>> > factor fill ratio given 0, needed 0 >>>> > Factored matrix follows: >>>> > Mat Object: 2 MPI processes >>>> > type: mpiaij >>>> > rows=2583, cols=2583 >>>> > package used to perform factorization: mumps >>>> > total: nonzeros=2.17621e+06, allocated >>>> nonzeros=2.17621e+06 >>>> > total number of mallocs used during MatSetValues >>>> calls =0 >>>> > MUMPS run parameters: >>>> > SYM (matrix type): 0 >>>> > PAR (host participation): 1 >>>> > ICNTL(1) (output for error): 6 >>>> > ICNTL(2) (output of diagnostic msg): 0 >>>> > ICNTL(3) (output for global info): 0 >>>> > ICNTL(4) (level of printing): 0 >>>> > ICNTL(5) (input mat struct): 0 >>>> > ICNTL(6) (matrix prescaling): 7 >>>> > ICNTL(7) (sequentia matrix ordering):7 >>>> > ICNTL(8) (scalling strategy): 77 >>>> > ICNTL(10) (max num of refinements): 0 >>>> > ICNTL(11) (error analysis): 0 >>>> > ICNTL(12) (efficiency control): >>>> 1 >>>> > ICNTL(13) (efficiency control): >>>> 0 >>>> > ICNTL(14) (percentage of estimated workspace >>>> increase): 20 >>>> > ICNTL(18) (input mat struct): >>>> 3 >>>> > ICNTL(19) (Shur complement info): >>>> 0 >>>> > ICNTL(20) (rhs sparse pattern): >>>> 0 >>>> > ICNTL(21) (solution struct): >>>> 1 >>>> > ICNTL(22) (in-core/out-of-core facility): >>>> 0 >>>> > ICNTL(23) (max size of memory can be allocated >>>> locally):0 >>>> > ICNTL(24) (detection of null pivot rows): >>>> 0 >>>> > ICNTL(25) (computation of a null space basis): >>>> 0 >>>> > ICNTL(26) (Schur options for rhs or solution): >>>> 0 >>>> > ICNTL(27) (experimental parameter): >>>> -24 >>>> > ICNTL(28) (use parallel or sequential ordering): >>>> 1 >>>> > ICNTL(29) (parallel ordering): >>>> 0 >>>> > ICNTL(30) (user-specified set of entries in >>>> inv(A)): 0 >>>> > ICNTL(31) (factors is discarded in the solve >>>> phase): 0 >>>> > ICNTL(33) (compute determinant): >>>> 0 >>>> > CNTL(1) (relative pivoting threshold): 0.01 >>>> > CNTL(2) (stopping criterion of refinement): >>>> 1.49012e-08 >>>> > CNTL(3) (absolute pivoting threshold): 0 >>>> > CNTL(4) (value of static pivoting): -1 >>>> > CNTL(5) (fixation for null pivots): 0 >>>> > RINFO(1) (local estimated flops for the >>>> elimination after analysis): >>>> > [0] 5.12794e+08 >>>> > [1] 5.02142e+08 >>>> > RINFO(2) (local estimated flops for the assembly >>>> after factorization): >>>> > [0] 815031 >>>> > [1] 745263 >>>> > RINFO(3) (local estimated flops for the >>>> elimination after factorization): >>>> > [0] 5.12794e+08 >>>> > [1] 5.02142e+08 >>>> > INFO(15) (estimated size of (in MB) MUMPS >>>> internal data for running numerical factorization): >>>> > [0] 34 >>>> > [1] 34 >>>> > INFO(16) (size of (in MB) MUMPS internal data >>>> used during numerical factorization): >>>> > [0] 34 >>>> > [1] 34 >>>> > INFO(23) (num of pivots eliminated on this >>>> processor after factorization): >>>> > [0] 1158 >>>> > [1] 1425 >>>> > RINFOG(1) (global estimated flops for the >>>> elimination after analysis): 1.01494e+09 >>>> > RINFOG(2) (global estimated flops for the >>>> assembly after factorization): 1.56029e+06 >>>> > RINFOG(3) (global estimated flops for the >>>> elimination after factorization): 1.01494e+09 >>>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) >>>> (determinant): (0,0)*(2^0) >>>> > INFOG(3) (estimated real workspace for factors on >>>> all processors after analysis): 2176209 >>>> > INFOG(4) (estimated integer 
workspace for factors >>>> on all processors after analysis): 14427 >>>> > INFOG(5) (estimated maximum front size in the >>>> complete tree): 699 >>>> > INFOG(6) (number of nodes in the complete tree): >>>> 15 >>>> > INFOG(7) (ordering option effectively use after >>>> analysis): 2 >>>> > INFOG(8) (structural symmetry in percent of the >>>> permuted matrix after analysis): 100 >>>> > INFOG(9) (total real/complex workspace to store >>>> the matrix factors after factorization): 2176209 >>>> > INFOG(10) (total integer space store the matrix >>>> factors after factorization): 14427 >>>> > INFOG(11) (order of largest frontal matrix after >>>> factorization): 699 >>>> > INFOG(12) (number of off-diagonal pivots): 0 >>>> > INFOG(13) (number of delayed pivots after >>>> factorization): 0 >>>> > INFOG(14) (number of memory compress after >>>> factorization): 0 >>>> > INFOG(15) (number of steps of iterative >>>> refinement after solution): 0 >>>> > INFOG(16) (estimated size (in MB) of all MUMPS >>>> internal data for factorization after analysis: value on the most memory >>>> consuming processor): 34 >>>> > INFOG(17) (estimated size of all MUMPS internal >>>> data for factorization after analysis: sum over all processors): 68 >>>> > INFOG(18) (size of all MUMPS internal data >>>> allocated during factorization: value on the most memory consuming >>>> processor): 34 >>>> > INFOG(19) (size of all MUMPS internal data >>>> allocated during factorization: sum over all processors): 68 >>>> > INFOG(20) (estimated number of entries in the >>>> factors): 2176209 >>>> > INFOG(21) (size in MB of memory effectively used >>>> during factorization - value on the most memory consuming processor): 30 >>>> > INFOG(22) (size in MB of memory effectively used >>>> during factorization - sum over all processors): 59 >>>> > INFOG(23) (after analysis: value of ICNTL(6) >>>> effectively used): 0 >>>> > INFOG(24) (after analysis: value of ICNTL(12) >>>> effectively used): 1 >>>> > INFOG(25) (after factorization: number of pivots >>>> modified by static pivoting): 0 >>>> > INFOG(28) (after factorization: number of null >>>> pivots encountered): 0 >>>> > INFOG(29) (after factorization: effective number >>>> of entries in the factors (sum over all processors)): 2176209 >>>> > INFOG(30, 31) (after solution: size in Mbytes of >>>> memory used during solution phase): 16, 32 >>>> > INFOG(32) (after analysis: type of analysis >>>> done): 1 >>>> > INFOG(33) (value used for ICNTL(8)): 7 >>>> > INFOG(34) (exponent of the determinant if >>>> determinant is requested): 0 >>>> > linear system matrix followed by preconditioner matrix: >>>> > Mat Object: (fieldsplit_lu_) 2 MPI processes >>>> > type: schurcomplement >>>> > rows=2583, cols=2583 >>>> > Schur complement A11 - A10 inv(A00) A01 >>>> > A11 >>>> > Mat Object: (fieldsplit_lu_) >>>> 2 MPI processes >>>> > type: mpiaij >>>> > rows=2583, cols=2583, bs=3 >>>> > total: nonzeros=117369, allocated nonzeros=117369 >>>> > total number of mallocs used during MatSetValues >>>> calls =0 >>>> > not using I-node (on process 0) routines >>>> > A10 >>>> > Mat Object: 2 MPI processes >>>> > type: mpiaij >>>> > rows=2583, cols=184326, rbs=3, cbs = 1 >>>> > total: nonzeros=292770, allocated nonzeros=292770 >>>> > total number of mallocs used during MatSetValues >>>> calls =0 >>>> > not using I-node (on process 0) routines >>>> > KSP of A00 >>>> > KSP Object: (fieldsplit_u_) >>>> 2 MPI processes >>>> > type: preonly >>>> > maximum iterations=10000, initial guess is zero >>>> > tolerances: 
relative=1e-05, absolute=1e-50, >>>> divergence=10000 >>>> > left preconditioning >>>> > using NONE norm type for convergence test >>>> > PC Object: (fieldsplit_u_) 2 >>>> MPI processes >>>> > type: lu >>>> > LU: out-of-place factorization >>>> > tolerance for zero pivot 2.22045e-14 >>>> > matrix ordering: natural >>>> > factor fill ratio given 0, needed 0 >>>> > Factored matrix follows: >>>> > Mat Object: 2 MPI >>>> processes >>>> > type: mpiaij >>>> > rows=184326, cols=184326 >>>> > package used to perform factorization: mumps >>>> > total: nonzeros=4.03041e+08, allocated >>>> nonzeros=4.03041e+08 >>>> > total number of mallocs used during >>>> MatSetValues calls =0 >>>> > MUMPS run parameters: >>>> > SYM (matrix type): 0 >>>> > PAR (host participation): 1 >>>> > ICNTL(1) (output for error): 6 >>>> > ICNTL(2) (output of diagnostic msg): 0 >>>> > ICNTL(3) (output for global info): 0 >>>> > ICNTL(4) (level of printing): 0 >>>> > ICNTL(5) (input mat struct): 0 >>>> > ICNTL(6) (matrix prescaling): 7 >>>> > ICNTL(7) (sequentia matrix ordering):7 >>>> > ICNTL(8) (scalling strategy): 77 >>>> > ICNTL(10) (max num of refinements): 0 >>>> > ICNTL(11) (error analysis): 0 >>>> > ICNTL(12) (efficiency control): >>>> 1 >>>> > ICNTL(13) (efficiency control): >>>> 0 >>>> > ICNTL(14) (percentage of estimated >>>> workspace increase): 20 >>>> > ICNTL(18) (input mat struct): >>>> 3 >>>> > ICNTL(19) (Shur complement info): >>>> 0 >>>> > ICNTL(20) (rhs sparse pattern): >>>> 0 >>>> > ICNTL(21) (solution struct): >>>> 1 >>>> > ICNTL(22) (in-core/out-of-core >>>> facility): 0 >>>> > ICNTL(23) (max size of memory can be >>>> allocated locally):0 >>>> > ICNTL(24) (detection of null pivot >>>> rows): 0 >>>> > ICNTL(25) (computation of a null space >>>> basis): 0 >>>> > ICNTL(26) (Schur options for rhs or >>>> solution): 0 >>>> > ICNTL(27) (experimental parameter): >>>> -24 >>>> > ICNTL(28) (use parallel or sequential >>>> ordering): 1 >>>> > ICNTL(29) (parallel ordering): >>>> 0 >>>> > ICNTL(30) (user-specified set of entries >>>> in inv(A)): 0 >>>> > ICNTL(31) (factors is discarded in the >>>> solve phase): 0 >>>> > ICNTL(33) (compute determinant): >>>> 0 >>>> > CNTL(1) (relative pivoting threshold): >>>> 0.01 >>>> > CNTL(2) (stopping criterion of >>>> refinement): 1.49012e-08 >>>> > CNTL(3) (absolute pivoting threshold): >>>> 0 >>>> > CNTL(4) (value of static pivoting): >>>> -1 >>>> > CNTL(5) (fixation for null pivots): >>>> 0 >>>> > RINFO(1) (local estimated flops for the >>>> elimination after analysis): >>>> > [0] 5.59214e+11 >>>> > [1] 5.35237e+11 >>>> > RINFO(2) (local estimated flops for the >>>> assembly after factorization): >>>> > [0] 4.2839e+08 >>>> > [1] 3.799e+08 >>>> > RINFO(3) (local estimated flops for the >>>> elimination after factorization): >>>> > [0] 5.59214e+11 >>>> > [1] 5.35237e+11 >>>> > INFO(15) (estimated size of (in MB) MUMPS >>>> internal data for running numerical factorization): >>>> > [0] 2621 >>>> > [1] 2649 >>>> > INFO(16) (size of (in MB) MUMPS internal >>>> data used during numerical factorization): >>>> > [0] 2621 >>>> > [1] 2649 >>>> > INFO(23) (num of pivots eliminated on >>>> this processor after factorization): >>>> > [0] 90423 >>>> > [1] 93903 >>>> > RINFOG(1) (global estimated flops for the >>>> elimination after analysis): 1.09445e+12 >>>> > RINFOG(2) (global estimated flops for the >>>> assembly after factorization): 8.0829e+08 >>>> > RINFOG(3) (global estimated flops for the >>>> elimination after factorization): 1.09445e+12 >>>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) 
>>>> (determinant): (0,0)*(2^0) >>>> > INFOG(3) (estimated real workspace for >>>> factors on all processors after analysis): 403041366 >>>> > INFOG(4) (estimated integer workspace for >>>> factors on all processors after analysis): 2265748 >>>> > INFOG(5) (estimated maximum front size in >>>> the complete tree): 6663 >>>> > INFOG(6) (number of nodes in the complete >>>> tree): 2812 >>>> > INFOG(7) (ordering option effectively use >>>> after analysis): 5 >>>> > INFOG(8) (structural symmetry in percent >>>> of the permuted matrix after analysis): 100 >>>> > INFOG(9) (total real/complex workspace to >>>> store the matrix factors after factorization): 403041366 >>>> > INFOG(10) (total integer space store the >>>> matrix factors after factorization): 2265766 >>>> > INFOG(11) (order of largest frontal >>>> matrix after factorization): 6663 >>>> > INFOG(12) (number of off-diagonal >>>> pivots): 0 >>>> > INFOG(13) (number of delayed pivots after >>>> factorization): 0 >>>> > INFOG(14) (number of memory compress >>>> after factorization): 0 >>>> > INFOG(15) (number of steps of iterative >>>> refinement after solution): 0 >>>> > INFOG(16) (estimated size (in MB) of all >>>> MUMPS internal data for factorization after analysis: value on the most >>>> memory consuming processor): 2649 >>>> > INFOG(17) (estimated size of all MUMPS >>>> internal data for factorization after analysis: sum over all processors): >>>> 5270 >>>> > INFOG(18) (size of all MUMPS internal >>>> data allocated during factorization: value on the most memory consuming >>>> processor): 2649 >>>> > INFOG(19) (size of all MUMPS internal >>>> data allocated during factorization: sum over all processors): 5270 >>>> > INFOG(20) (estimated number of entries in >>>> the factors): 403041366 >>>> > INFOG(21) (size in MB of memory >>>> effectively used during factorization - value on the most memory consuming >>>> processor): 2121 >>>> > INFOG(22) (size in MB of memory >>>> effectively used during factorization - sum over all processors): 4174 >>>> > INFOG(23) (after analysis: value of >>>> ICNTL(6) effectively used): 0 >>>> > INFOG(24) (after analysis: value of >>>> ICNTL(12) effectively used): 1 >>>> > INFOG(25) (after factorization: number of >>>> pivots modified by static pivoting): 0 >>>> > INFOG(28) (after factorization: number of >>>> null pivots encountered): 0 >>>> > INFOG(29) (after factorization: effective >>>> number of entries in the factors (sum over all processors)): 403041366 >>>> > INFOG(30, 31) (after solution: size in >>>> Mbytes of memory used during solution phase): 2467, 4922 >>>> > INFOG(32) (after analysis: type of >>>> analysis done): 1 >>>> > INFOG(33) (value used for ICNTL(8)): 7 >>>> > INFOG(34) (exponent of the determinant if >>>> determinant is requested): 0 >>>> > linear system matrix = precond matrix: >>>> > Mat Object: (fieldsplit_u_) >>>> 2 MPI processes >>>> > type: mpiaij >>>> > rows=184326, cols=184326, bs=3 >>>> > total: nonzeros=3.32649e+07, allocated >>>> nonzeros=3.32649e+07 >>>> > total number of mallocs used during MatSetValues >>>> calls =0 >>>> > using I-node (on process 0) routines: found 26829 >>>> nodes, limit used is 5 >>>> > A01 >>>> > Mat Object: 2 MPI processes >>>> > type: mpiaij >>>> > rows=184326, cols=2583, rbs=3, cbs = 1 >>>> > total: nonzeros=292770, allocated nonzeros=292770 >>>> > total number of mallocs used during MatSetValues >>>> calls =0 >>>> > using I-node (on process 0) routines: found 16098 >>>> nodes, limit used is 5 >>>> > Mat Object: 2 MPI processes >>>> > type: mpiaij 
>>>> > rows=2583, cols=2583, rbs=3, cbs = 1 >>>> > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 >>>> > total number of mallocs used during MatSetValues calls =0 >>>> > not using I-node (on process 0) routines >>>> > linear system matrix = precond matrix: >>>> > Mat Object: 2 MPI processes >>>> > type: mpiaij >>>> > rows=186909, cols=186909 >>>> > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 >>>> > total number of mallocs used during MatSetValues calls =0 >>>> > using I-node (on process 0) routines: found 26829 nodes, limit >>>> used is 5 >>>> > KSPSolve completed >>>> > >>>> > >>>> > Giang >>>> > >>>> > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley >>>> wrote: >>>> > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui >>>> wrote: >>>> > Hello >>>> > >>>> > I'm solving an indefinite problem arising from mesh tying/contact >>>> using Lagrange multiplier, the matrix has the form >>>> > >>>> > K = [A P^T >>>> > P 0] >>>> > >>>> > I used the FIELDSPLIT preconditioner with one field is the main >>>> variable (displacement) and the other field for dual variable (Lagrange >>>> multiplier). The block size for each field is 3. According to the manual, I >>>> first chose the preconditioner based on Schur complement to treat this >>>> problem. >>>> > >>>> > >>>> > For any solver question, please send us the output of >>>> > >>>> > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>> > >>>> > >>>> > However, I will comment below >>>> > >>>> > The parameters used for the solve is >>>> > -ksp_type gmres >>>> > >>>> > You need 'fgmres' here with the options you have below. >>>> > >>>> > -ksp_max_it 300 >>>> > -ksp_gmres_restart 300 >>>> > -ksp_gmres_modifiedgramschmidt >>>> > -pc_fieldsplit_type schur >>>> > -pc_fieldsplit_schur_fact_type diag >>>> > -pc_fieldsplit_schur_precondition selfp >>>> > >>>> > >>>> > >>>> > It could be taking time in the MatMatMult() here if that matrix is >>>> dense. Is there any reason to >>>> > believe that is a good preconditioner for your problem? >>>> > >>>> > >>>> > -pc_fieldsplit_detect_saddle_point >>>> > -fieldsplit_u_pc_type hypre >>>> > >>>> > I would just use MUMPS here to start, especially if it works on the >>>> whole problem. Same with the one below. >>>> > >>>> > Matt >>>> > >>>> > -fieldsplit_u_pc_hypre_type boomeramg >>>> > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >>>> > -fieldsplit_lu_pc_type hypre >>>> > -fieldsplit_lu_pc_hypre_type boomeramg >>>> > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS >>>> > >>>> > For the test case, a small problem is solved on 2 processes. Due to >>>> the decomposition, the contact only happens in 1 proc, so the size of >>>> Lagrange multiplier dofs on proc 0 is 0. >>>> > >>>> > 0: mIndexU.size(): 80490 >>>> > 0: mIndexLU.size(): 0 >>>> > 1: mIndexU.size(): 103836 >>>> > 1: mIndexLU.size(): 2583 >>>> > >>>> > However, with this setup the solver takes very long at KSPSolve >>>> before going to iteration, and the first iteration seems forever so I have >>>> to stop the calculation. I guessed that the solver takes time to compute >>>> the Schur complement, but according to the manual only the diagonal of A is >>>> used to approximate the Schur complement, so it should not take long to >>>> compute this. >>>> > >>>> > Note that I ran the same problem with direct solver (MUMPS) and it's >>>> able to produce the valid results. 
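For concreteness, the Schur fieldsplit configuration discussed above can also be set up in code rather than through runtime options. The sketch below only mirrors the quoted flags (fgmres outer solver, Schur fieldsplit with diag factorization and selfp); the matrix A and the index sets isU and isLU standing for the displacement and Lagrange-multiplier dofs are placeholders and are not taken from the poster's application.

   /* Minimal sketch, assuming A, isU, isLU are already built by the application. */
   #include <petscksp.h>

   PetscErrorCode SolveSaddlePoint(Mat A,Vec b,Vec x,IS isU,IS isLU)
   {
     KSP            ksp;
     PC             pc;
     PetscErrorCode ierr;

     ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
     ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
     ierr = KSPSetType(ksp,KSPFGMRES);CHKERRQ(ierr);            /* flexible outer Krylov, needed once the splits iterate */
     ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
     ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
     ierr = PCFieldSplitSetIS(pc,"u",isU);CHKERRQ(ierr);        /* displacement dofs */
     ierr = PCFieldSplitSetIS(pc,"lu",isLU);CHKERRQ(ierr);      /* Lagrange multiplier dofs */
     ierr = PCFieldSplitSetType(pc,PC_COMPOSITE_SCHUR);CHKERRQ(ierr);
     ierr = PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_DIAG);CHKERRQ(ierr);
     ierr = PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_SELFP,NULL);CHKERRQ(ierr);
     ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);               /* sub-KSP/PC choices (hypre, mumps, ...) stay on the command line */
     ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
     ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
     return 0;
   }

Leaving the per-split solvers to KSPSetFromOptions() keeps the experimentation suggested in the thread (hypre/boomeramg versus MUMPS on each block) on the command line.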
The parameter for the solve is pretty >>>> standard >>>> > -ksp_type preonly >>>> > -pc_type lu >>>> > -pc_factor_mat_solver_package mumps >>>> > >>>> > Hence the matrix/rhs must not have any problem here. Do you have any >>>> idea or suggestion for this case? >>>> > >>>> > >>>> > Giang >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> > -- Norbert Wiener >>>> > >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> > -- Norbert Wiener >>>> > >>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 15 11:25:15 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 15 Sep 2016 11:25:15 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> Message-ID: <535EFF3A-8BF9-4A95-8FBA-5AC1BE798659@mcs.anl.gov> Should we have some simple selection of default algorithms based on problem size/number of processes? For example if using more than 1000 processes then use scalable version etc? How would we decide on the parameter values? Barry > On Sep 15, 2016, at 5:35 AM, Dave May wrote: > > HI all, > > I the only unexpected memory usage I can see is associated with the call to MatPtAP(). > Here is something you can try immediately. > Run your code with the additional options > -matrap 0 -matptap_scalable > > I didn't realize this before, but the default behaviour of MatPtAP in parallel is actually to to explicitly form the transpose of P (e.g. assemble R = P^T) and then compute R.A.P. > You don't want to do this. The option -matrap 0 resolves this issue. > > The implementation of P^T.A.P has two variants. > The scalable implementation (with respect to memory usage) is selected via the second option -matptap_scalable. > > Try it out - I see a significant memory reduction using these options for particular mesh sizes / partitions. > > I've attached a cleaned up version of the code you sent me. > There were a number of memory leaks and other issues. > The main points being > * You should call DMDAVecGetArrayF90() before VecAssembly{Begin,End} > * You should call PetscFinalize(), otherwise the option -log_summary (-log_view) will not display anything once the program has completed. > > > Thanks, > Dave > > > On 15 September 2016 at 08:03, Hengjie Wang wrote: > Hi Dave, > > Sorry, I should have put more comment to explain the code. > The number of process in each dimension is the same: Px = Py=Pz=P. 
So is the domain size. > So if the you want to run the code for a 512^3 grid points on 16^3 cores, you need to set "-N 512 -P 16" in the command line. > I add more comments and also fix an error in the attached code. ( The error only effects the accuracy of solution but not the memory usage. ) > > Thank you. > Frank > > > On 9/14/2016 9:05 PM, Dave May wrote: >> >> >> On Thursday, 15 September 2016, Dave May wrote: >> >> >> On Thursday, 15 September 2016, frank wrote: >> Hi, >> >> I write a simple code to re-produce the error. I hope this can help to diagnose the problem. >> The code just solves a 3d poisson equation. >> >> Why is the stencil width a runtime parameter?? And why is the default value 2? For 7-pnt FD Laplace, you only need a stencil width of 1. >> >> Was this choice made to mimic something in the real application code? >> >> Please ignore - I misunderstood your usage of the param set by -P >> >> >> >> I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. That's when I re-produce the OOM error. Each core has about 2G memory. >> I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The ksp solver works fine. >> I attached the code, ksp_view_pre's output and my petsc option file. >> >> Thank you. >> Frank >> >> On 09/09/2016 06:38 PM, Hengjie Wang wrote: >>> Hi Barry, >>> >>> I checked. On the supercomputer, I had the option "-ksp_view_pre" but it is not in file I sent you. I am sorry for the confusion. >>> >>> Regards, >>> Frank >>> >>> On Friday, September 9, 2016, Barry Smith wrote: >>> >>> > On Sep 9, 2016, at 3:11 PM, frank wrote: >>> > >>> > Hi Barry, >>> > >>> > I think the first KSP view output is from -ksp_view_pre. Before I submitted the test, I was not sure whether there would be OOM error or not. So I added both -ksp_view_pre and -ksp_view. >>> >>> But the options file you sent specifically does NOT list the -ksp_view_pre so how could it be from that? >>> >>> Sorry to be pedantic but I've spent too much time in the past trying to debug from incorrect information and want to make sure that the information I have is correct before thinking. Please recheck exactly what happened. Rerun with the exact input file you emailed if that is needed. >>> >>> Barry >>> >>> > >>> > Frank >>> > >>> > >>> > On 09/09/2016 12:38 PM, Barry Smith wrote: >>> >> Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt has only one KSPView in it? Did you run two different solves in the 2 case but not the one? >>> >> >>> >> Barry >>> >> >>> >> >>> >> >>> >>> On Sep 9, 2016, at 10:56 AM, frank wrote: >>> >>> >>> >>> Hi, >>> >>> >>> >>> I want to continue digging into the memory problem here. >>> >>> I did find a work around in the past, which is to use less cores per node so that each core has 8G memory. However this is deficient and expensive. I hope to locate the place that uses the most memory. 
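On locating where the memory goes: besides -memory_view and -log_view, the same high-water marks can be queried directly around a suspect phase such as KSPSolve. The helper below is only a sketch of that idea; the function name and its placement are illustrative and do not come from the code attached in this thread.

   #include <petscksp.h>

   /* Sketch: report per-rank memory high-water marks for a named phase.
      PetscMemorySetGetMaximumUsage() must be called right after PetscInitialize()
      for PetscMemoryGetMaximumUsage() to track the resident-set maximum. */
   PetscErrorCode ReportMemory(MPI_Comm comm,const char *phase)
   {
     PetscLogDouble rss,mal,rssmax,malmax;
     PetscErrorCode ierr;

     ierr = PetscMemoryGetMaximumUsage(&rss);CHKERRQ(ierr);   /* process memory (resident set) */
     ierr = PetscMallocGetMaximumUsage(&mal);CHKERRQ(ierr);   /* space PetscMalloc()ed */
     ierr = MPI_Allreduce(&rss,&rssmax,1,MPI_DOUBLE,MPI_MAX,comm);CHKERRQ(ierr);
     ierr = MPI_Allreduce(&mal,&malmax,1,MPI_DOUBLE,MPI_MAX,comm);CHKERRQ(ierr);
     ierr = PetscPrintf(comm,"%s: max process memory %g MB, max PetscMalloc %g MB (per-rank maxima)\n",
                        phase,rssmax/1.048576e6,malmax/1.048576e6);CHKERRQ(ierr);
     return 0;
   }

   /* usage sketch, e.g. bracketing the solve:
        ReportMemory(PETSC_COMM_WORLD,"before KSPSolve");
        KSPSolve(ksp,b,x);
        ReportMemory(PETSC_COMM_WORLD,"after KSPSolve");      */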
>>> >>> >>> >>> Here is a brief summary of the tests I did in past: >>> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >>> >>> Maximum (over computational time) process memory: total 7.0727e+08 >>> >>> Current process memory: total 7.0727e+08 >>> >>> Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11 >>> >>> Current space PetscMalloc()ed: total 1.8275e+09 >>> >>> >>> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 >>> >>> Maximum (over computational time) process memory: total 5.9431e+09 >>> >>> Current process memory: total 5.9431e+09 >>> >>> Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12 >>> >>> Current space PetscMalloc()ed: total 5.4844e+09 >>> >>> >>> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 >>> >>> OOM( Out Of Memory ) killer of the supercomputer terminated the job during "KSPSolve". >>> >>> >>> >>> I attached the output of ksp_view( the third test's output is from ksp_view_pre ), memory_view and also the petsc options. >>> >>> >>> >>> In all the tests, each core can access about 2G memory. In test3, there are 4223139840 non-zeros in the matrix. This will consume about 1.74M, using double precision. Considering some extra memory used to store integer index, 2G memory should still be way enough. >>> >>> >>> >>> Is there a way to find out which part of KSPSolve uses the most memory? >>> >>> Thank you so much. >>> >>> >>> >>> BTW, there are 4 options remains unused and I don't understand why they are omitted: >>> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly >>> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi >>> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >>> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson >>> >>> >>> >>> >>> >>> Regards, >>> >>> Frank >>> >>> >>> >>> On 07/13/2016 05:47 PM, Dave May wrote: >>> >>>> >>> >>>> On 14 July 2016 at 01:07, frank wrote: >>> >>>> Hi Dave, >>> >>>> >>> >>>> Sorry for the late reply. >>> >>>> Thank you so much for your detailed reply. >>> >>>> >>> >>>> I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: >>> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? >>> >>>> Did I do sth wrong here? Because this seems too small. >>> >>>> >>> >>>> No - I totally f***ed it up. You are correct. That'll teach me for fumbling around with my iphone calculator and not using my brain. (Note that to convert to MB just divide by 1e6, not 1024^2 - although I apparently cannot convert between units correctly....) >>> >>>> >>> >>>> From the PETSc objects associated with the solver, It looks like it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities are: somewhere in your usage of PETSc you've introduced a memory leak; PETSc is doing a huge over allocation (e.g. as per our discussion of MatPtAP); or in your application code there are other objects you have forgotten to log the memory for. >>> >>>> >>> >>>> >>> >>>> >>> >>>> I am running this job on Bluewater >>> >>>> I am using the 7 points FD stencil in 3D. >>> >>>> >>> >>>> I thought so on both counts. >>> >>>> >>> >>>> I apologize that I made a stupid mistake in computing the memory per core. My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. 
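As a side note on the unit bookkeeping above, the nonzeros-to-memory conversion for the matrix values alone works out as follows (numbers taken from the quoted messages; the AIJ column indices and row offsets would add roughly half as much again with 32-bit integers):

   #include <stdio.h>

   int main(void)
   {
     double nnz   = 4223139840.0;                       /* global nonzeros quoted above */
     double ranks = 18432.0;                            /* MPI processes */
     double mib   = nnz*8.0/ranks/1024.0/1024.0;        /* matrix values only, 8-byte doubles */
     printf("matrix values per rank: %.2f MiB\n",mib);  /* ~1.75 MiB, i.e. megabytes, not gigabytes */
     return 0;
   }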
I would do more test to see if there is still some memory issue. >>> >>>> >>> >>>> Ok. I'd still like to know where the memory was being used since my estimates were off. >>> >>>> >>> >>>> >>> >>>> Thanks, >>> >>>> Dave >>> >>>> >>> >>>> Regards, >>> >>>> Frank >>> >>>> >>> >>>> >>> >>>> >>> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >>> >>>>> Hi Frank, >>> >>>>> >>> >>>>> >>> >>>>> On 11 July 2016 at 19:14, frank wrote: >>> >>>>> Hi Dave, >>> >>>>> >>> >>>>> I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. >>> >>>>> I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. >>> >>>>> >>> >>>>> Okay - that is essentially useless (sorry) >>> >>>>> >>> >>>>> It seems to me that the error occurred when the decomposition was going to be changed. >>> >>>>> >>> >>>>> Based on what information? >>> >>>>> Running with -info would give us more clues, but will create a ton of output. >>> >>>>> Please try running the case which failed with -info >>> >>>>> I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. >>> >>>>> Thank you. >>> >>>>> >>> >>>>> >>> >>>>> [3] Here is my crude estimate of your memory usage. >>> >>>>> I'll target the biggest memory hogs only to get an order of magnitude estimate >>> >>>>> >>> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. >>> >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) >>> >>>>> >>> >>>>> * You use 5 levels of coarsening, so the other operators should represent (collectively) >>> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. >>> >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. >>> >>>>> >>> >>>>> * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. >>> >>>>> PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. >>> >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. >>> >>>>> This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. >>> >>>>> The temporary matrix is now destroyed. >>> >>>>> >>> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. >>> >>>>> This requires 2 doubles per point in the DMDA. >>> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >>> >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. >>> >>>>> >>> >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. >>> >>>>> >>> >>>>> From my rough estimates, the worst case memory foot print for any given core, given your options is approximately >>> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >>> >>>>> This is way below 8 GB. 
>>> >>>>> >>> >>>>> Note this estimate completely ignores: >>> >>>>> (1) the memory required for the restriction operator, >>> >>>>> (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) >>> >>>>> (3) all temporary vectors required by the CG solver, and those required by the smoothers. >>> >>>>> (4) internal memory allocated by MatPtAP >>> >>>>> (5) memory associated with IS's used within PCTelescope >>> >>>>> >>> >>>>> So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates >>> >>>>> >>> >>>>> Since I don't have your code I cannot access the latter. >>> >>>>> Since I don't have access to the same machine you are running on, I think we need to take a step back. >>> >>>>> >>> >>>>> [1] What machine are you running on? Send me a URL if its available >>> >>>>> >>> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) >>> >>>>> If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. >>> >>>>> This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. >>> >>>>> >>> >>>>> Thanks, >>> >>>>> Dave >>> >>>>> >>> >>>>> >>> >>>>> Frank >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >>> >>>>>> >>> >>>>>> On Saturday, 9 July 2016, frank wrote: >>> >>>>>> Hi Barry and Dave, >>> >>>>>> >>> >>>>>> Thank both of you for the advice. >>> >>>>>> >>> >>>>>> @Barry >>> >>>>>> I made a mistake in the file names in last email. I attached the correct files this time. >>> >>>>>> For all the three tests, 'Telescope' is used as the coarse preconditioner. >>> >>>>>> >>> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>> >>>>>> Part of the memory usage: Vector 125 124 3971904 0. >>> >>>>>> Matrix 101 101 9462372 0 >>> >>>>>> >>> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>> >>>>>> Part of the memory usage: Vector 125 124 681672 0. >>> >>>>>> Matrix 101 101 1462180 0. >>> >>>>>> >>> >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. >>> >>>>>> >>> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 >>> >>>>>> Here I get the out of memory error. >>> >>>>>> >>> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>> >>>>>> The linear solver didn't work in this case. Petsc output some errors. >>> >>>>>> >>> >>>>>> @Dave >>> >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>> >>>>>> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >>> >>>>>> On the last coarse mesh of 'Telescope', there is only one grid point per process. >>> >>>>>> I still got the OOM error. The detailed petsc option file is attached. >>> >>>>>> >>> >>>>>> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). 
Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. >>> >>>>>> >>> >>>>>> And please send the result of KSPView so we can see what is actually used in the computations >>> >>>>>> >>> >>>>>> Thanks >>> >>>>>> Dave >>> >>>>>> >>> >>>>>> >>> >>>>>> Thank you so much. >>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>> >>>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>> >>>>>> >>> >>>>>> Hi Barry, >>> >>>>>> >>> >>>>>> Thank you for you advice. >>> >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >>> >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >>> >>>>>> The system gives me the "Out of Memory" error before the linear system is completely solved. >>> >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >>> >>>>>> >>> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >>> >>>>>> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >>> >>>>>> Vector 384 383 8,193,712 0. >>> >>>>>> Matrix 103 103 11,508,688 0. >>> >>>>>> to 3rd test >>> >>>>>> Vector 384 383 1,590,520 0. >>> >>>>>> Matrix 103 103 3,508,664 0. >>> >>>>>> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >>> >>>>>> >>> >>>>>> >>> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >>> >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >>> >>>>>> Sorry, my mistake the option is -memory_view >>> >>>>>> >>> >>>>>> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >>> >>>>>> >>> >>>>>> Barry >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> In both tests the memory usage is not large. >>> >>>>>> >>> >>>>>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >>> >>>>>> Is there is a way to show how much memory it allocated? >>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> >>>>>> Frank, >>> >>>>>> >>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>> >>>>>> >>> >>>>>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. 
Make the runs also with -log_view and send all the output from these options. >>> >>>>>> >>> >>>>>> Barry >>> >>>>>> >>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>> >>>>>> >>> >>>>>> Hi, >>> >>>>>> >>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>> >>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>> >>>>>> The petsc options file is attached. >>> >>>>>> >>> >>>>>> The domain is a 3d box. >>> >>>>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>> >>>>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>> >>>>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>> >>>>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>> >>>>>> >>> >>>>>> How can I diagnose what exactly cause the error? >>> >>>>>> Thank you so much. >>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>> >>> >>>> >>> >>> >>> > >>> >> >> > > > From bsmith at mcs.anl.gov Thu Sep 15 12:54:03 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 15 Sep 2016 12:54:03 -0500 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: Message-ID: <2BA49863-4DC2-462E-A80B-3123F21266B4@mcs.anl.gov> > On Sep 15, 2016, at 4:11 AM, Hoang Giang Bui wrote: > > Dear Barry > > > > Seem like zero pivot does not happen, but why the solver for Schur takes 13 steps if the preconditioner is direct solver? Because if you use KSPSetOperators(ksp_S,A,B) it is NOT a direct solver. It is only a direct solver if the two matrices you pass in are the SAME matrix! If you work through the algebra you can see why A,A gives you a direct solver with those fieldsplit options. Barry > > > I also so tried another problem which I known does have a nonsingular Schur (at least A11 != 0) and it also have the same problem: 1 step outer convergence but multiple step inner convergence. > > Any ideas? > > Giang > > On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: > > Normally you'd be absolutely correct to expect convergence in one iteration. However in this example note the call > > ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); > > It is solving the linear system defined by A but building the preconditioner (i.e. the entire fieldsplit process) from a different matrix B. Since A is not B you should not expect convergence in one iteration. If you change the code to > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > you will see exactly what you expect, convergence in one iteration. > > Sorry about this, the example is lacking clarity and documentation its author obviously knew too well what he was doing that he didn't realize everyone else in the world would need more comments in the code. 
If you change the code to > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > it will stop without being able to build the preconditioner because LU factorization of the Sp matrix will result in a zero pivot. This is why this "auxiliary" matrix B is used to define the preconditioner instead of A. > > Barry > > > > > > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui wrote: > > > > Sorry I slept quite a while in this thread. Now I start to look at it again. In the last try, the previous setting doesn't work either (in fact diverge). So I would speculate if the Schur complement in my case is actually not invertible. It's also possible that the code is wrong somewhere. However, before looking at that, I want to understand thoroughly the settings for Schur complement > > > > I experimented ex42 with the settings: > > mpirun -np 1 ex42 \ > > -stokes_ksp_monitor \ > > -stokes_ksp_type fgmres \ > > -stokes_pc_type fieldsplit \ > > -stokes_pc_fieldsplit_type schur \ > > -stokes_pc_fieldsplit_schur_fact_type full \ > > -stokes_pc_fieldsplit_schur_precondition selfp \ > > -stokes_fieldsplit_u_ksp_type preonly \ > > -stokes_fieldsplit_u_pc_type lu \ > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > -stokes_fieldsplit_p_ksp_type gmres \ > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > -stokes_fieldsplit_p_pc_type lu \ > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > > > In my understanding, the solver should converge in 1 (outer) step. Execution gives: > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > 1 KSP Residual norm 7.656238881621e-04 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > > 2 KSP Residual norm 9.895645456398e-06 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > > 3 KSP Residual norm 1.608206702571e-06 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > > 4 KSP Residual norm 2.453618999882e-07 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > > 5 KSP Residual norm 4.230399264750e-08 > > > > Looks like the "selfp" does construct the Schur nicely. But does "full" really construct the full block preconditioner? 
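To condense the Amat/Pmat distinction Barry describes above into one place: KSPSetOperators(ksp,Amat,Pmat) makes the Krylov method apply Amat while the whole preconditioner, here the entire fieldsplit machinery, is built from Pmat. The sketch below merely restates the three quoted cases; ksp_S, A (system matrix) and B (auxiliary matrix) follow the naming of the ex42 discussion.

   #include <petscksp.h>

   PetscErrorCode SetOperatorsVariants(KSP ksp_S,Mat A,Mat B)
   {
     PetscErrorCode ierr;

     /* 1) As in ex42: the Krylov method applies A, the fieldsplit PC is built from B.
           Since A != B, convergence in one outer iteration is not expected. */
     ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr);

     /* 2) Same matrix for both: with exact (LU/MUMPS) split solves and a full Schur
           factorization this acts as a direct solver -> one outer iteration. */
     /* ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); */

     /* 3) Building the PC from A itself fails here: LU of the assembled Sp
           hits a zero pivot, which is why the auxiliary B exists at all. */
     /* ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); */
     return 0;
   }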
> > > > Giang > > P/S: I'm also generating a smaller size of the previous problem for checking again. > > > > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley wrote: > > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui wrote: > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > believe that is a good preconditioner for your problem? > > > > This is the first approach to the problem, so I chose the most simple setting. Do you have any other recommendation? > > > > This is in no way the simplest PC. We need to make it simpler first. > > > > 1) Run on only 1 proc > > > > 2) Use -pc_fieldsplit_schur_fact_type full > > > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_true_residual > > > > This should converge in 1 outer iteration, but we will see how good your Schur complement preconditioner > > is for this problem. > > > > You need to start out from something you understand and then start making approximations. > > > > Matt > > > > For any solver question, please send us the output of > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > I sent here the full output (after changed to fgmres), again it takes long at the first iteration but after that, it does not converge > > > > -ksp_type fgmres > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type diag > > -pc_fieldsplit_schur_precondition selfp > > -pc_fieldsplit_detect_saddle_point > > -fieldsplit_u_ksp_type preonly > > -fieldsplit_u_pc_type lu > > -fieldsplit_u_pc_factor_mat_solver_package mumps > > -fieldsplit_lu_ksp_type preonly > > -fieldsplit_lu_pc_type lu > > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 > > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm 1.849895606391e+02 ||r(i)||/||b|| 
6.089645075515e-05 > > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 > > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm 1.766103879126e+02 
||r(i)||/||b|| 5.813812278494e-05 > > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 > > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 > > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 > > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm 
1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 > > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 > > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid 
norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 > > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 > > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > > 125 KSP unpreconditioned resid norm 
1.447662934870e+02 true resid norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 > > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 > > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 > > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 > > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > > 153 
KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 > > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 > > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm 1.457511594620e+02 
||r(i)||/||b|| 4.797961719582e-05 > > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 > > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 > > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true 
resid norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 > > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > > 236 KSP unpreconditioned resid 
norm 6.422855779486e+00 true resid norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 > > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 > > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 > > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > > 
264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 > > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm 6.942248339413e+01 
||r(i)||/||b|| 2.285308871866e-05 > > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 > > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > > Linear solve did not converge due to DIVERGED_ITS iterations 300 > > KSP Object: 2 MPI processes > > type: fgmres > > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=300, initial guess is zero > > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > > right preconditioning > > using UNPRECONDITIONED norm type for convergence test > > PC Object: 2 MPI processes > > type: fieldsplit > > FieldSplit with Schur preconditioner, factorization DIAG > > Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse > > Split info: > > Split number 0 Defined by IS > > Split number 1 Defined by IS > > KSP solver for A00 block > > KSP Object: (fieldsplit_u_) 2 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_u_) 2 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326 > > package used to perform factorization: mumps > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > total number of mallocs used during MatSetValues calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): 1 > > ICNTL(13) (efficiency control): 0 > > ICNTL(14) (percentage of estimated workspace increase): 20 > > ICNTL(18) (input mat struct): 3 > > ICNTL(19) (Shur complement info): 0 > > ICNTL(20) (rhs sparse pattern): 0 > > ICNTL(21) (solution struct): 1 > > ICNTL(22) (in-core/out-of-core facility): 0 > > ICNTL(23) (max 
size of memory can be allocated locally):0 > > ICNTL(24) (detection of null pivot rows): 0 > > ICNTL(25) (computation of a null space basis): 0 > > ICNTL(26) (Schur options for rhs or solution): 0 > > ICNTL(27) (experimental parameter): -24 > > ICNTL(28) (use parallel or sequential ordering): 1 > > ICNTL(29) (parallel ordering): 0 > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve phase): 0 > > ICNTL(33) (compute determinant): 0 > > CNTL(1) (relative pivoting threshold): 0.01 > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 0 > > CNTL(4) (value of static pivoting): -1 > > CNTL(5) (fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the elimination after analysis): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > RINFO(2) (local estimated flops for the assembly after factorization): > > [0] 4.2839e+08 > > [1] 3.799e+08 > > RINFO(3) (local estimated flops for the elimination after factorization): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(23) (num of pivots eliminated on this processor after factorization): > > [0] 90423 > > [1] 93903 > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > INFOG(6) (number of nodes in the complete tree): 2812 > > INFOG(7) (ordering option effectively use after analysis): 5 > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after factorization): 0 > > INFOG(14) (number of memory compress after factorization): 0 > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > INFOG(20) (estimated number of entries in the factors): 403041366 > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all 
processors): 4174 > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > INFOG(32) (after analysis: type of analysis done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_u_) 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326, bs=3 > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > KSP solver for S = A11 - A10 inv(A00) A01 > > KSP Object: (fieldsplit_lu_) 2 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_lu_) 2 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=2583 > > package used to perform factorization: mumps > > total: nonzeros=2.17621e+06, allocated nonzeros=2.17621e+06 > > total number of mallocs used during MatSetValues calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): 1 > > ICNTL(13) (efficiency control): 0 > > ICNTL(14) (percentage of estimated workspace increase): 20 > > ICNTL(18) (input mat struct): 3 > > ICNTL(19) (Shur complement info): 0 > > ICNTL(20) (rhs sparse pattern): 0 > > ICNTL(21) (solution struct): 1 > > ICNTL(22) (in-core/out-of-core facility): 0 > > ICNTL(23) (max size of memory can be allocated locally):0 > > ICNTL(24) (detection of null pivot rows): 0 > > ICNTL(25) (computation of a null space basis): 0 > > ICNTL(26) (Schur options for rhs or solution): 0 > > ICNTL(27) (experimental parameter): -24 > > ICNTL(28) (use parallel or sequential ordering): 1 > > ICNTL(29) (parallel ordering): 0 > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve phase): 0 > > ICNTL(33) (compute determinant): 0 > > CNTL(1) (relative pivoting threshold): 0.01 > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 0 > > CNTL(4) (value of static pivoting): -1 > > CNTL(5) (fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the elimination after analysis): > > [0] 5.12794e+08 > > [1] 5.02142e+08 > > RINFO(2) (local estimated flops for the assembly 
after factorization): > > [0] 815031 > > [1] 745263 > > RINFO(3) (local estimated flops for the elimination after factorization): > > [0] 5.12794e+08 > > [1] 5.02142e+08 > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > [0] 34 > > [1] 34 > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > [0] 34 > > [1] 34 > > INFO(23) (num of pivots eliminated on this processor after factorization): > > [0] 1158 > > [1] 1425 > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.01494e+09 > > RINFOG(2) (global estimated flops for the assembly after factorization): 1.56029e+06 > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.01494e+09 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 2176209 > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 14427 > > INFOG(5) (estimated maximum front size in the complete tree): 699 > > INFOG(6) (number of nodes in the complete tree): 15 > > INFOG(7) (ordering option effectively use after analysis): 2 > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 2176209 > > INFOG(10) (total integer space store the matrix factors after factorization): 14427 > > INFOG(11) (order of largest frontal matrix after factorization): 699 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after factorization): 0 > > INFOG(14) (number of memory compress after factorization): 0 > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 34 > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 68 > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 34 > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 68 > > INFOG(20) (estimated number of entries in the factors): 2176209 > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 30 > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 59 > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 2176209 > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 16, 32 > > INFOG(32) (after analysis: type of analysis done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > linear system matrix followed by preconditioner matrix: > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > type: schurcomplement > > rows=2583, cols=2583 > > Schur complement A11 - A10 inv(A00) A01 > > A11 > > Mat Object: 
(fieldsplit_lu_) 2 MPI processes > > type: mpiaij > > rows=2583, cols=2583, bs=3 > > total: nonzeros=117369, allocated nonzeros=117369 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > A10 > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=184326, rbs=3, cbs = 1 > > total: nonzeros=292770, allocated nonzeros=292770 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > KSP of A00 > > KSP Object: (fieldsplit_u_) 2 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_u_) 2 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326 > > package used to perform factorization: mumps > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > total number of mallocs used during MatSetValues calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): 1 > > ICNTL(13) (efficiency control): 0 > > ICNTL(14) (percentage of estimated workspace increase): 20 > > ICNTL(18) (input mat struct): 3 > > ICNTL(19) (Shur complement info): 0 > > ICNTL(20) (rhs sparse pattern): 0 > > ICNTL(21) (solution struct): 1 > > ICNTL(22) (in-core/out-of-core facility): 0 > > ICNTL(23) (max size of memory can be allocated locally):0 > > ICNTL(24) (detection of null pivot rows): 0 > > ICNTL(25) (computation of a null space basis): 0 > > ICNTL(26) (Schur options for rhs or solution): 0 > > ICNTL(27) (experimental parameter): -24 > > ICNTL(28) (use parallel or sequential ordering): 1 > > ICNTL(29) (parallel ordering): 0 > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve phase): 0 > > ICNTL(33) (compute determinant): 0 > > CNTL(1) (relative pivoting threshold): 0.01 > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 0 > > CNTL(4) (value of static pivoting): -1 > > CNTL(5) (fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the elimination after analysis): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > RINFO(2) (local estimated flops for the assembly after factorization): > > [0] 4.2839e+08 > > [1] 3.799e+08 > > RINFO(3) (local estimated flops for the elimination after factorization): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(23) (num of pivots eliminated on this processor after factorization): > > [0] 90423 > > [1] 93903 > > RINFOG(1) (global estimated flops for the elimination 
after analysis): 1.09445e+12 > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > INFOG(6) (number of nodes in the complete tree): 2812 > > INFOG(7) (ordering option effectively use after analysis): 5 > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after factorization): 0 > > INFOG(14) (number of memory compress after factorization): 0 > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > INFOG(20) (estimated number of entries in the factors): 403041366 > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > INFOG(32) (after analysis: type of analysis done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_u_) 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326, bs=3 > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > A01 > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=184326, cols=2583, rbs=3, cbs = 1 > > total: nonzeros=292770, allocated nonzeros=292770 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 16098 nodes, limit used is 5 > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=2583, rbs=3, cbs = 1 > > total: nonzeros=1.25158e+06, allocated 
nonzeros=1.25158e+06 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > linear system matrix = precond matrix: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=186909, cols=186909 > > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > KSPSolve completed > > > > > > Giang > > > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley wrote: > > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui wrote: > > Hello > > > > I'm solving an indefinite problem arising from mesh tying/contact using Lagrange multiplier, the matrix has the form > > > > K = [A P^T > > P 0] > > > > I used the FIELDSPLIT preconditioner with one field is the main variable (displacement) and the other field for dual variable (Lagrange multiplier). The block size for each field is 3. According to the manual, I first chose the preconditioner based on Schur complement to treat this problem. > > > > > > For any solver question, please send us the output of > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > However, I will comment below > > > > The parameters used for the solve is > > -ksp_type gmres > > > > You need 'fgmres' here with the options you have below. > > > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type diag > > -pc_fieldsplit_schur_precondition selfp > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > believe that is a good preconditioner for your problem? > > > > > > -pc_fieldsplit_detect_saddle_point > > -fieldsplit_u_pc_type hypre > > > > I would just use MUMPS here to start, especially if it works on the whole problem. Same with the one below. > > > > Matt > > > > -fieldsplit_u_pc_hypre_type boomeramg > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_lu_pc_type hypre > > -fieldsplit_lu_pc_hypre_type boomeramg > > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > > > For the test case, a small problem is solved on 2 processes. Due to the decomposition, the contact only happens in 1 proc, so the size of Lagrange multiplier dofs on proc 0 is 0. > > > > 0: mIndexU.size(): 80490 > > 0: mIndexLU.size(): 0 > > 1: mIndexU.size(): 103836 > > 1: mIndexLU.size(): 2583 > > > > However, with this setup the solver takes very long at KSPSolve before going to iteration, and the first iteration seems forever so I have to stop the calculation. I guessed that the solver takes time to compute the Schur complement, but according to the manual only the diagonal of A is used to approximate the Schur complement, so it should not take long to compute this. > > > > Note that I ran the same problem with direct solver (MUMPS) and it's able to produce the valid results. The parameter for the solve is pretty standard > > -ksp_type preonly > > -pc_type lu > > -pc_factor_mat_solver_package mumps > > > > Hence the matrix/rhs must not have any problem here. Do you have any idea or suggestion for this case? > > > > > > Giang > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> > -- Norbert Wiener > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > From dave.mayhem23 at gmail.com Thu Sep 15 13:10:55 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 15 Sep 2016 20:10:55 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <535EFF3A-8BF9-4A95-8FBA-5AC1BE798659@mcs.anl.gov> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> <535EFF3A-8BF9-4A95-8FBA-5AC1BE798659@mcs.anl.gov> Message-ID: On Thursday, 15 September 2016, Barry Smith wrote: > > Should we have some simple selection of default algorithms based on > problem size/number of processes? For example if using more than 1000 > processes then use scalable version etc? How would we decide on the > parameter values? I don't like the idea of having "smart" selection by default as it's terribly annoying for the user when they try and understand the performance characteristics of a given method when they do a strong/weak scaling test. If such a smart selection strategy was adopted, the details of it should be made abundantly clear to the user. These algs are dependent on many factors, thus making the smart selection for all use cases hard / impossible. I would be happy with unifying the three implementations with three different options AND having these implementation options documented in the man page. Maybe even the man page should advise users which to use in particular circumstances (I think there is something similar on the VecScatter page). I have these as suggestions for unifying the options names using bools -matptap_explicit_transpose -matptap_symbolic_transpose_dense -matptap_symbolic_transpose Or maybe enums are clearer -matptap_impl {explicit_pt,symbolic_pt_dense,symbolic_pt} which are equivalent to these options 1) the current default 2) -matrap 0 3) -matrap 0 -matptap_scalable Maybe there could be a fourth option -matptap_dynamic_selection which chooses the most appropriate alg given machine info, problem size, partition size,.... At least if the user explicitly chooses the dynamic_selection mode, they wouldn't be surprised if there were any bumps appearing in any scaling study they conducted. Cheers Dave > > Barry > > > On Sep 15, 2016, at 5:35 AM, Dave May > wrote: > > > > HI all, > > > > I the only unexpected memory usage I can see is associated with the call > to MatPtAP(). > > Here is something you can try immediately. > > Run your code with the additional options > > -matrap 0 -matptap_scalable > > > > I didn't realize this before, but the default behaviour of MatPtAP in > parallel is actually to to explicitly form the transpose of P (e.g. > assemble R = P^T) and then compute R.A.P. > > You don't want to do this. The option -matrap 0 resolves this issue. > > > > The implementation of P^T.A.P has two variants. > > The scalable implementation (with respect to memory usage) is selected > via the second option -matptap_scalable. > > > > Try it out - I see a significant memory reduction using these options > for particular mesh sizes / partitions.
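
For reference, a minimal petsc4py sketch of trying the two options suggested above. The option names (-matrap 0 and -matptap_scalable) are taken from the message above and refer to the PETSc version discussed in this thread; everything after init() stands in for the application's own DM/KSP setup and is illustrative only. The same effect is obtained by simply adding the options on the command line or in an options file.

import petsc4py
# equivalent to launching the application with "-matrap 0 -matptap_scalable"
petsc4py.init('-matrap 0 -matptap_scalable')
from petsc4py import PETSc

# ... build the DMDA, operators and KSP as in the application code and call
# KSPSolve(); the Galerkin products (MatPtAP) formed by PCMG / PCTelescope
# should then use the memory-scalable P^T.A.P variant instead of explicitly
# assembling R = P^T first.
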
> > > > I've attached a cleaned up version of the code you sent me. > > There were a number of memory leaks and other issues. > > The main points being > > * You should call DMDAVecGetArrayF90() before VecAssembly{Begin,End} > > * You should call PetscFinalize(), otherwise the option -log_summary > (-log_view) will not display anything once the program has completed. > > > > > > Thanks, > > Dave > > > > > > On 15 September 2016 at 08:03, Hengjie Wang > wrote: > > Hi Dave, > > > > Sorry, I should have put more comment to explain the code. > > The number of process in each dimension is the same: Px = Py=Pz=P. So is > the domain size. > > So if the you want to run the code for a 512^3 grid points on 16^3 > cores, you need to set "-N 512 -P 16" in the command line. > > I add more comments and also fix an error in the attached code. ( The > error only effects the accuracy of solution but not the memory usage. ) > > > > Thank you. > > Frank > > > > > > On 9/14/2016 9:05 PM, Dave May wrote: > >> > >> > >> On Thursday, 15 September 2016, Dave May > wrote: > >> > >> > >> On Thursday, 15 September 2016, frank > > wrote: > >> Hi, > >> > >> I write a simple code to re-produce the error. I hope this can help to > diagnose the problem. > >> The code just solves a 3d poisson equation. > >> > >> Why is the stencil width a runtime parameter?? And why is the default > value 2? For 7-pnt FD Laplace, you only need a stencil width of 1. > >> > >> Was this choice made to mimic something in the real application code? > >> > >> Please ignore - I misunderstood your usage of the param set by -P > >> > >> > >> > >> I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. > That's when I re-produce the OOM error. Each core has about 2G memory. > >> I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The > ksp solver works fine. > >> I attached the code, ksp_view_pre's output and my petsc option file. > >> > >> Thank you. > >> Frank > >> > >> On 09/09/2016 06:38 PM, Hengjie Wang wrote: > >>> Hi Barry, > >>> > >>> I checked. On the supercomputer, I had the option "-ksp_view_pre" but > it is not in file I sent you. I am sorry for the confusion. > >>> > >>> Regards, > >>> Frank > >>> > >>> On Friday, September 9, 2016, Barry Smith > wrote: > >>> > >>> > On Sep 9, 2016, at 3:11 PM, frank > > wrote: > >>> > > >>> > Hi Barry, > >>> > > >>> > I think the first KSP view output is from -ksp_view_pre. Before I > submitted the test, I was not sure whether there would be OOM error or not. > So I added both -ksp_view_pre and -ksp_view. > >>> > >>> But the options file you sent specifically does NOT list the > -ksp_view_pre so how could it be from that? > >>> > >>> Sorry to be pedantic but I've spent too much time in the past > trying to debug from incorrect information and want to make sure that the > information I have is correct before thinking. Please recheck exactly what > happened. Rerun with the exact input file you emailed if that is needed. > >>> > >>> Barry > >>> > >>> > > >>> > Frank > >>> > > >>> > > >>> > On 09/09/2016 12:38 PM, Barry Smith wrote: > >>> >> Why does ksp_view2.txt have two KSP views in it while > ksp_view1.txt has only one KSPView in it? Did you run two different solves > in the 2 case but not the one? > >>> >> > >>> >> Barry > >>> >> > >>> >> > >>> >> > >>> >>> On Sep 9, 2016, at 10:56 AM, frank > wrote: > >>> >>> > >>> >>> Hi, > >>> >>> > >>> >>> I want to continue digging into the memory problem here. 
> >>> >>> I did find a work around in the past, which is to use less cores > per node so that each core has 8G memory. However this is deficient and > expensive. I hope to locate the place that uses the most memory. > >>> >>> > >>> >>> Here is a brief summary of the tests I did in past: > >>> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 > >>> >>> Maximum (over computational time) process memory: total > 7.0727e+08 > >>> >>> Current process memory: > total 7.0727e+08 > >>> >>> Maximum (over computational time) space PetscMalloc()ed: total > 6.3908e+11 > >>> >>> Current space PetscMalloc()ed: > total 1.8275e+09 > >>> >>> > >>> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 > >>> >>> Maximum (over computational time) process memory: total > 5.9431e+09 > >>> >>> Current process memory: > total 5.9431e+09 > >>> >>> Maximum (over computational time) space PetscMalloc()ed: total > 5.3202e+12 > >>> >>> Current space PetscMalloc()ed: > total 5.4844e+09 > >>> >>> > >>> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 > >>> >>> OOM( Out Of Memory ) killer of the supercomputer terminated > the job during "KSPSolve". > >>> >>> > >>> >>> I attached the output of ksp_view( the third test's output is from > ksp_view_pre ), memory_view and also the petsc options. > >>> >>> > >>> >>> In all the tests, each core can access about 2G memory. In test3, > there are 4223139840 non-zeros in the matrix. This will consume about > 1.74M, using double precision. Considering some extra memory used to store > integer index, 2G memory should still be way enough. > >>> >>> > >>> >>> Is there a way to find out which part of KSPSolve uses the most > memory? > >>> >>> Thank you so much. > >>> >>> > >>> >>> BTW, there are 4 options remains unused and I don't understand why > they are omitted: > >>> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly > >>> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi > >>> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 > >>> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson > >>> >>> > >>> >>> > >>> >>> Regards, > >>> >>> Frank > >>> >>> > >>> >>> On 07/13/2016 05:47 PM, Dave May wrote: > >>> >>>> > >>> >>>> On 14 July 2016 at 01:07, frank > > wrote: > >>> >>>> Hi Dave, > >>> >>>> > >>> >>>> Sorry for the late reply. > >>> >>>> Thank you so much for your detailed reply. > >>> >>>> > >>> >>>> I have a question about the estimation of the memory usage. There > are 4223139840 allocated non-zeros and 18432 MPI processes. Double > precision is used. So the memory per process is: > >>> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? > >>> >>>> Did I do sth wrong here? Because this seems too small. > >>> >>>> > >>> >>>> No - I totally f***ed it up. You are correct. That'll teach me > for fumbling around with my iphone calculator and not using my brain. (Note > that to convert to MB just divide by 1e6, not 1024^2 - although I > apparently cannot convert between units correctly....) > >>> >>>> > >>> >>>> From the PETSc objects associated with the solver, It looks like > it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities > are: somewhere in your usage of PETSc you've introduced a memory leak; > PETSc is doing a huge over allocation (e.g. as per our discussion of > MatPtAP); or in your application code there are other objects you have > forgotten to log the memory for. > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> I am running this job on Bluewater > >>> >>>> I am using the 7 points FD stencil in 3D. 
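
The per-rank estimate discussed a few paragraphs above is easy to sanity-check. The short Python sketch below just redoes that arithmetic with the figures quoted in the thread (4223139840 allocated non-zeros, 18432 MPI ranks, 8-byte reals); it counts only the stored matrix values, not the AIJ index arrays or any other PETSc overhead.

nnz   = 4223139840                    # allocated non-zeros quoted in the thread
ranks = 18432                         # MPI processes in the failing run
bytes_per_rank = nnz * 8.0 / ranks    # double-precision values only
print(bytes_per_rank / 1024**2)       # ~1.75 MiB (dividing by 1024^2, as above)
print(bytes_per_rank / 1e6)           # ~1.83 MB  (dividing by 1e6)
# With 32-bit column indices the AIJ storage grows by roughly 1.5x, so the
# fine-grid operator itself is still only a few MB per rank at this scale.
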
> >>> >>>> > >>> >>>> I thought so on both counts. > >>> >>>> > >>> >>>> I apologize that I made a stupid mistake in computing the memory > per core. My settings render each core can access only 2G memory on average > instead of 8G which I mentioned in previous email. I re-run the job with 8G > memory per core on average and there is no "Out Of Memory" error. I would > do more test to see if there is still some memory issue. > >>> >>>> > >>> >>>> Ok. I'd still like to know where the memory was being used since > my estimates were off. > >>> >>>> > >>> >>>> > >>> >>>> Thanks, > >>> >>>> Dave > >>> >>>> > >>> >>>> Regards, > >>> >>>> Frank > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> On 07/11/2016 01:18 PM, Dave May wrote: > >>> >>>>> Hi Frank, > >>> >>>>> > >>> >>>>> > >>> >>>>> On 11 July 2016 at 19:14, frank > > wrote: > >>> >>>>> Hi Dave, > >>> >>>>> > >>> >>>>> I re-run the test using bjacobi as the preconditioner on the > coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is > 96*8*24. The petsc option file is attached. > >>> >>>>> I still got the "Out Of Memory" error. The error occurred before > the linear solver finished one step. So I don't have the full info from > ksp_view. The info from ksp_view_pre is attached. > >>> >>>>> > >>> >>>>> Okay - that is essentially useless (sorry) > >>> >>>>> > >>> >>>>> It seems to me that the error occurred when the decomposition > was going to be changed. > >>> >>>>> > >>> >>>>> Based on what information? > >>> >>>>> Running with -info would give us more clues, but will create a > ton of output. > >>> >>>>> Please try running the case which failed with -info > >>> >>>>> I had another test with a grid of 1536*128*384 and the same > process mesh as above. There was no error. The ksp_view info is attached > for comparison. > >>> >>>>> Thank you. > >>> >>>>> > >>> >>>>> > >>> >>>>> [3] Here is my crude estimate of your memory usage. > >>> >>>>> I'll target the biggest memory hogs only to get an order of > magnitude estimate > >>> >>>>> > >>> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 > GB per MPI rank assuming double precision. > >>> >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming > 32 bit integers) > >>> >>>>> > >>> >>>>> * You use 5 levels of coarsening, so the other operators should > represent (collectively) > >>> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on > the communicator with 18432 ranks. > >>> >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the > communicator with 18432 ranks. > >>> >>>>> > >>> >>>>> * You use a reduction factor of 64, making the new communicator > with 288 MPI ranks. > >>> >>>>> PCTelescope will first gather a temporary matrix associated with > your coarse level operator assuming a comm size of 288 living on the comm > with size 18432. > >>> >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core > on the 288 ranks. > >>> >>>>> This matrix is then used to form a new MPIAIJ matrix on the > subcomm, thus require another 32 MB per rank. > >>> >>>>> The temporary matrix is now destroyed. > >>> >>>>> > >>> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. > >>> >>>>> This requires 2 doubles per point in the DMDA. > >>> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. > >>> >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on > the sub-comm. > >>> >>>>> > >>> >>>>> * Lastly, the matrix is permuted. 
This uses MatPtAP(), but the > resulting operator will have the same memory footprint as the unpermuted > matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB > are held in memory when the DMDA is provided. > >>> >>>>> > >>> >>>>> From my rough estimates, the worst case memory foot print for > any given core, given your options is approximately > >>> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > >>> >>>>> This is way below 8 GB. > >>> >>>>> > >>> >>>>> Note this estimate completely ignores: > >>> >>>>> (1) the memory required for the restriction operator, > >>> >>>>> (2) the potential growth in the number of non-zeros per row due > to Galerkin coarsening (I wished -ksp_view_pre reported the output from > MatView so we could see the number of non-zeros required by the coarse > level operators) > >>> >>>>> (3) all temporary vectors required by the CG solver, and those > required by the smoothers. > >>> >>>>> (4) internal memory allocated by MatPtAP > >>> >>>>> (5) memory associated with IS's used within PCTelescope > >>> >>>>> > >>> >>>>> So either I am completely off in my estimates, or you have not > carefully estimated the memory usage of your application code. Hopefully > others might examine/correct my rough estimates > >>> >>>>> > >>> >>>>> Since I don't have your code I cannot access the latter. > >>> >>>>> Since I don't have access to the same machine you are running > on, I think we need to take a step back. > >>> >>>>> > >>> >>>>> [1] What machine are you running on? Send me a URL if its > available > >>> >>>>> > >>> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 > point FD stencil) > >>> >>>>> If it's a 7 point FD stencil, we should be able to examine the > memory usage of your solver configuration using a standard, light weight > existing PETSc example, run on your machine at the same scale. > >>> >>>>> This would hopefully enable us to correctly evaluate the actual > memory usage required by the solver configuration you are using. > >>> >>>>> > >>> >>>>> Thanks, > >>> >>>>> Dave > >>> >>>>> > >>> >>>>> > >>> >>>>> Frank > >>> >>>>> > >>> >>>>> > >>> >>>>> > >>> >>>>> > >>> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: > >>> >>>>>> > >>> >>>>>> On Saturday, 9 July 2016, frank > wrote: > >>> >>>>>> Hi Barry and Dave, > >>> >>>>>> > >>> >>>>>> Thank both of you for the advice. > >>> >>>>>> > >>> >>>>>> @Barry > >>> >>>>>> I made a mistake in the file names in last email. I attached > the correct files this time. > >>> >>>>>> For all the three tests, 'Telescope' is used as the coarse > preconditioner. > >>> >>>>>> > >>> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > >>> >>>>>> Part of the memory usage: Vector 125 124 3971904 > 0. > >>> >>>>>> Matrix 101 101 > 9462372 0 > >>> >>>>>> > >>> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > >>> >>>>>> Part of the memory usage: Vector 125 124 681672 > 0. > >>> >>>>>> Matrix 101 101 > 1462180 0. > >>> >>>>>> > >>> >>>>>> In theory, the memory usage in Test1 should be 8 times of > Test2. In my case, it is about 6 times. > >>> >>>>>> > >>> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. > Sub-domain per process: 32*32*32 > >>> >>>>>> Here I get the out of memory error. > >>> >>>>>> > >>> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to > set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > >>> >>>>>> The linear solver didn't work in this case. Petsc output some > errors. 
> >>> >>>>>> > >>> >>>>>> @Dave > >>> >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse > mesh of 'Telescope', I used LU as the preconditioner instead of SVD. > >>> >>>>>> If my set the levels correctly, then on the last coarse mesh of > MG where it calls 'Telescope', the sub-domain per process is 2*2*2. > >>> >>>>>> On the last coarse mesh of 'Telescope', there is only one grid > point per process. > >>> >>>>>> I still got the OOM error. The detailed petsc option file is > attached. > >>> >>>>>> > >>> >>>>>> Do you understand the expected memory usage for the particular > parallel LU implementation you are using? I don't (seriously). Replace LU > with bjacobi and re-run this test. My point about solver debugging is still > valid. > >>> >>>>>> > >>> >>>>>> And please send the result of KSPView so we can see what is > actually used in the computations > >>> >>>>>> > >>> >>>>>> Thanks > >>> >>>>>> Dave > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> Thank you so much. > >>> >>>>>> > >>> >>>>>> Frank > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: > >>> >>>>>> On Jul 6, 2016, at 4:19 PM, frank > wrote: > >>> >>>>>> > >>> >>>>>> Hi Barry, > >>> >>>>>> > >>> >>>>>> Thank you for you advice. > >>> >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 > and the process mesh is 96*8*24. > >>> >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and > 'telescope' is used as the preconditioner at the coarse mesh. > >>> >>>>>> The system gives me the "Out of Memory" error before the linear > system is completely solved. > >>> >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that > the error occurs when it reaches the coarse mesh. > >>> >>>>>> > >>> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is > 96*8*24. The 3rd test uses the > same grid but a different process mesh 48*4*12. > >>> >>>>>> Are you sure this is right? The total matrix and vector > memory usage goes from 2nd test > >>> >>>>>> Vector 384 383 8,193,712 0. > >>> >>>>>> Matrix 103 103 11,508,688 0. > >>> >>>>>> to 3rd test > >>> >>>>>> Vector 384 383 1,590,520 0. > >>> >>>>>> Matrix 103 103 3,508,664 0. > >>> >>>>>> that is the memory usage got smaller but if you have only 1/8th > the processes and the same grid it should have gotten about 8 times bigger. > Did you maybe cut the grid by a factor of 8 also? If so that still doesn't > explain it because the memory usage changed by a factor of 5 something for > the vectors and 3 something for the matrices. > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are > the same in 1st test. The linear solver works fine in both test. > >>> >>>>>> I attached the memory usage of the 2nd and 3rd tests. The > memory info is from the option '-log_summary'. I tried to use > '-momery_info' as you suggested, but in my case petsc treated it as an > unused option. It output nothing about the memory. Do I need to add sth to > my code so I can use '-memory_info'? > >>> >>>>>> Sorry, my mistake the option is -memory_view > >>> >>>>>> > >>> >>>>>> Can you run the one case with -memory_view and -mg_coarse > jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much > memory is used without the telescope? Also run case 2 the same way. > >>> >>>>>> > >>> >>>>>> Barry > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> In both tests the memory usage is not large. 
> >>> >>>>>> > >>> >>>>>> It seems to me that it might be the 'telescope' preconditioner > that allocated a lot of memory and caused the error in the 1st test. > >>> >>>>>> Is there is a way to show how much memory it allocated? > >>> >>>>>> > >>> >>>>>> Frank > >>> >>>>>> > >>> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: > >>> >>>>>> Frank, > >>> >>>>>> > >>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP > before the solve so hopefully it gets that far. > >>> >>>>>> > >>> >>>>>> Please run the problem that does fit with -memory_info > when the problem completes it will show the "high water mark" for PETSc > allocated memory and total memory used. We first want to look at these > numbers to see if it is using more memory than you expect. You could also > run with say half the grid spacing to see how the memory usage scaled with > the increase in grid points. Make the runs also with -log_view and send all > the output from these options. > >>> >>>>>> > >>> >>>>>> Barry > >>> >>>>>> > >>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank > wrote: > >>> >>>>>> > >>> >>>>>> Hi, > >>> >>>>>> > >>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to > solve a linear system in parallel. > >>> >>>>>> I chose to use the 'Telescope' as the preconditioner on the > coarse mesh for its good performance. > >>> >>>>>> The petsc options file is attached. > >>> >>>>>> > >>> >>>>>> The domain is a 3d box. > >>> >>>>>> It works well when the grid is 1536*128*384 and the process > mesh is 96*8*24. When I double the size of grid and > keep the same process mesh and petsc options, I > get an "out of memory" error from the super-cluster I am using. > >>> >>>>>> Each process has access to at least 8G memory, which should be > more than enough for my application. I am sure that all the other parts of > my code( except the linear solver ) do not use much memory. So I doubt if > there is something wrong with the linear solver. > >>> >>>>>> The error occurs before the linear system is completely solved > so I don't have the info from ksp view. I am not able to re-produce the > error with a smaller problem either. > >>> >>>>>> In addition, I tried to use the block jacobi as the > preconditioner with the same grid and same decomposition. The linear solver > runs extremely slow but there is no memory error. > >>> >>>>>> > >>> >>>>>> How can I diagnose what exactly cause the error? > >>> >>>>>> Thank you so much. > >>> >>>>>> > >>> >>>>>> Frank > >>> >>>>>> > >>> >>>>>> < > petsc_options.txt> > >>> >>>>>> > >>> >>>>> > >>> >>>> > >>> >>> < > memory2.txt> > >>> > > >>> > >> > >> > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu Sep 15 13:14:08 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 15 Sep 2016 13:14:08 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> <535EFF3A-8BF9-4A95-8FBA-5AC1BE798659@mcs.anl.gov> Message-ID: <2D96AD50-C582-414C-963F-B231F4445BCD@mcs.anl.gov> > On Sep 15, 2016, at 1:10 PM, Dave May wrote: > > > > On Thursday, 15 September 2016, Barry Smith wrote: > > Should we have some simple selection of default algorithms based on problem size/number of processes? For example if using more than 1000 processes then use scalable version etc? How would we decide on the parameter values? > > I don't like the idea of having "smart" selection by default as it's terribly annoying for the user when they try and understand the performance characteristics of a given method when they do a strong/weak scaling test. If such a smart selection strategy was adopted, the details of it should be made abundantly clear to the user. > > These algs are dependent on many some factors, thus making the smart selection for all use cases hard / impossible. > > I would be happy with unifying the three inplemtationa with three different options AND having these implantation options documented in the man page. Maybe even the man page should advise users which to use in particular circumstances (I think there is something similar on the VecScatter page). > > I have these as suggestions for unifying the options names using bools > > -matptap_explicit_transpose > -matptap_symbolic_transpose_dense > -matptap_symbolic_transpose > > Or maybe enums is more clear > -matptap_impl {explicit_pt,symbolic_pt_dense,symbolic_pt} > > which are equivalent to these options > 1) the current default > 2) -matrap 0 > 3) -matrap 0 -matptap_scalable > > Maybe there could be a fourth option > -matptap_dynamic_selection > which chooses the most appropriate alg given machine info, problem size, partition size,.... At least if the user explicitly chooses the dynamic_selection mode, they wouldn't be surprised if there were any bumps appearing in any scaling study they conducted. I like the idea of enum types with the final enum type being "dynamically select one for me". Barry > > Cheers > Dave > > > > Barry > > > On Sep 15, 2016, at 5:35 AM, Dave May wrote: > > > > HI all, > > > > I the only unexpected memory usage I can see is associated with the call to MatPtAP(). > > Here is something you can try immediately. > > Run your code with the additional options > > -matrap 0 -matptap_scalable > > > > I didn't realize this before, but the default behaviour of MatPtAP in parallel is actually to to explicitly form the transpose of P (e.g. assemble R = P^T) and then compute R.A.P. > > You don't want to do this. The option -matrap 0 resolves this issue. > > > > The implementation of P^T.A.P has two variants. > > The scalable implementation (with respect to memory usage) is selected via the second option -matptap_scalable. > > > > Try it out - I see a significant memory reduction using these options for particular mesh sizes / partitions. 
> > > > I've attached a cleaned up version of the code you sent me. > > There were a number of memory leaks and other issues. > > The main points being > > * You should call DMDAVecGetArrayF90() before VecAssembly{Begin,End} > > * You should call PetscFinalize(), otherwise the option -log_summary (-log_view) will not display anything once the program has completed. > > > > > > Thanks, > > Dave > > > > > > On 15 September 2016 at 08:03, Hengjie Wang wrote: > > Hi Dave, > > > > Sorry, I should have put more comment to explain the code. > > The number of process in each dimension is the same: Px = Py=Pz=P. So is the domain size. > > So if the you want to run the code for a 512^3 grid points on 16^3 cores, you need to set "-N 512 -P 16" in the command line. > > I add more comments and also fix an error in the attached code. ( The error only effects the accuracy of solution but not the memory usage. ) > > > > Thank you. > > Frank > > > > > > On 9/14/2016 9:05 PM, Dave May wrote: > >> > >> > >> On Thursday, 15 September 2016, Dave May wrote: > >> > >> > >> On Thursday, 15 September 2016, frank wrote: > >> Hi, > >> > >> I write a simple code to re-produce the error. I hope this can help to diagnose the problem. > >> The code just solves a 3d poisson equation. > >> > >> Why is the stencil width a runtime parameter?? And why is the default value 2? For 7-pnt FD Laplace, you only need a stencil width of 1. > >> > >> Was this choice made to mimic something in the real application code? > >> > >> Please ignore - I misunderstood your usage of the param set by -P > >> > >> > >> > >> I run the code on a 1024^3 mesh. The process partition is 32 * 32 * 32. That's when I re-produce the OOM error. Each core has about 2G memory. > >> I also run the code on a 512^3 mesh with 16 * 16 * 16 processes. The ksp solver works fine. > >> I attached the code, ksp_view_pre's output and my petsc option file. > >> > >> Thank you. > >> Frank > >> > >> On 09/09/2016 06:38 PM, Hengjie Wang wrote: > >>> Hi Barry, > >>> > >>> I checked. On the supercomputer, I had the option "-ksp_view_pre" but it is not in file I sent you. I am sorry for the confusion. > >>> > >>> Regards, > >>> Frank > >>> > >>> On Friday, September 9, 2016, Barry Smith wrote: > >>> > >>> > On Sep 9, 2016, at 3:11 PM, frank wrote: > >>> > > >>> > Hi Barry, > >>> > > >>> > I think the first KSP view output is from -ksp_view_pre. Before I submitted the test, I was not sure whether there would be OOM error or not. So I added both -ksp_view_pre and -ksp_view. > >>> > >>> But the options file you sent specifically does NOT list the -ksp_view_pre so how could it be from that? > >>> > >>> Sorry to be pedantic but I've spent too much time in the past trying to debug from incorrect information and want to make sure that the information I have is correct before thinking. Please recheck exactly what happened. Rerun with the exact input file you emailed if that is needed. > >>> > >>> Barry > >>> > >>> > > >>> > Frank > >>> > > >>> > > >>> > On 09/09/2016 12:38 PM, Barry Smith wrote: > >>> >> Why does ksp_view2.txt have two KSP views in it while ksp_view1.txt has only one KSPView in it? Did you run two different solves in the 2 case but not the one? > >>> >> > >>> >> Barry > >>> >> > >>> >> > >>> >> > >>> >>> On Sep 9, 2016, at 10:56 AM, frank wrote: > >>> >>> > >>> >>> Hi, > >>> >>> > >>> >>> I want to continue digging into the memory problem here. > >>> >>> I did find a work around in the past, which is to use less cores per node so that each core has 8G memory. 
However this is deficient and expensive. I hope to locate the place that uses the most memory. > >>> >>> > >>> >>> Here is a brief summary of the tests I did in past: > >>> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 > >>> >>> Maximum (over computational time) process memory: total 7.0727e+08 > >>> >>> Current process memory: total 7.0727e+08 > >>> >>> Maximum (over computational time) space PetscMalloc()ed: total 6.3908e+11 > >>> >>> Current space PetscMalloc()ed: total 1.8275e+09 > >>> >>> > >>> >>>> Test2: Mesh 1536*128*384 | Process Mesh 96*8*24 > >>> >>> Maximum (over computational time) process memory: total 5.9431e+09 > >>> >>> Current process memory: total 5.9431e+09 > >>> >>> Maximum (over computational time) space PetscMalloc()ed: total 5.3202e+12 > >>> >>> Current space PetscMalloc()ed: total 5.4844e+09 > >>> >>> > >>> >>>> Test3: Mesh 3072*256*768 | Process Mesh 96*8*24 > >>> >>> OOM( Out Of Memory ) killer of the supercomputer terminated the job during "KSPSolve". > >>> >>> > >>> >>> I attached the output of ksp_view( the third test's output is from ksp_view_pre ), memory_view and also the petsc options. > >>> >>> > >>> >>> In all the tests, each core can access about 2G memory. In test3, there are 4223139840 non-zeros in the matrix. This will consume about 1.74M, using double precision. Considering some extra memory used to store integer index, 2G memory should still be way enough. > >>> >>> > >>> >>> Is there a way to find out which part of KSPSolve uses the most memory? > >>> >>> Thank you so much. > >>> >>> > >>> >>> BTW, there are 4 options remains unused and I don't understand why they are omitted: > >>> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: preonly > >>> >>> -mg_coarse_telescope_mg_coarse_pc_type value: bjacobi > >>> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 > >>> >>> -mg_coarse_telescope_mg_levels_ksp_type value: richardson > >>> >>> > >>> >>> > >>> >>> Regards, > >>> >>> Frank > >>> >>> > >>> >>> On 07/13/2016 05:47 PM, Dave May wrote: > >>> >>>> > >>> >>>> On 14 July 2016 at 01:07, frank wrote: > >>> >>>> Hi Dave, > >>> >>>> > >>> >>>> Sorry for the late reply. > >>> >>>> Thank you so much for your detailed reply. > >>> >>>> > >>> >>>> I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: > >>> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? > >>> >>>> Did I do sth wrong here? Because this seems too small. > >>> >>>> > >>> >>>> No - I totally f***ed it up. You are correct. That'll teach me for fumbling around with my iphone calculator and not using my brain. (Note that to convert to MB just divide by 1e6, not 1024^2 - although I apparently cannot convert between units correctly....) > >>> >>>> > >>> >>>> From the PETSc objects associated with the solver, It looks like it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities are: somewhere in your usage of PETSc you've introduced a memory leak; PETSc is doing a huge over allocation (e.g. as per our discussion of MatPtAP); or in your application code there are other objects you have forgotten to log the memory for. > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> I am running this job on Bluewater > >>> >>>> I am using the 7 points FD stencil in 3D. > >>> >>>> > >>> >>>> I thought so on both counts. > >>> >>>> > >>> >>>> I apologize that I made a stupid mistake in computing the memory per core. 
My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. I would do more test to see if there is still some memory issue. > >>> >>>> > >>> >>>> Ok. I'd still like to know where the memory was being used since my estimates were off. > >>> >>>> > >>> >>>> > >>> >>>> Thanks, > >>> >>>> Dave > >>> >>>> > >>> >>>> Regards, > >>> >>>> Frank > >>> >>>> > >>> >>>> > >>> >>>> > >>> >>>> On 07/11/2016 01:18 PM, Dave May wrote: > >>> >>>>> Hi Frank, > >>> >>>>> > >>> >>>>> > >>> >>>>> On 11 July 2016 at 19:14, frank wrote: > >>> >>>>> Hi Dave, > >>> >>>>> > >>> >>>>> I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. > >>> >>>>> I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. > >>> >>>>> > >>> >>>>> Okay - that is essentially useless (sorry) > >>> >>>>> > >>> >>>>> It seems to me that the error occurred when the decomposition was going to be changed. > >>> >>>>> > >>> >>>>> Based on what information? > >>> >>>>> Running with -info would give us more clues, but will create a ton of output. > >>> >>>>> Please try running the case which failed with -info > >>> >>>>> I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. > >>> >>>>> Thank you. > >>> >>>>> > >>> >>>>> > >>> >>>>> [3] Here is my crude estimate of your memory usage. > >>> >>>>> I'll target the biggest memory hogs only to get an order of magnitude estimate > >>> >>>>> > >>> >>>>> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. > >>> >>>>> The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) > >>> >>>>> > >>> >>>>> * You use 5 levels of coarsening, so the other operators should represent (collectively) > >>> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. > >>> >>>>> The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. > >>> >>>>> > >>> >>>>> * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. > >>> >>>>> PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. > >>> >>>>> This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. > >>> >>>>> This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. > >>> >>>>> The temporary matrix is now destroyed. > >>> >>>>> > >>> >>>>> * Because a DMDA is detected, a permutation matrix is assembled. > >>> >>>>> This requires 2 doubles per point in the DMDA. > >>> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. > >>> >>>>> Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. > >>> >>>>> > >>> >>>>> * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. 
> >>> >>>>> > >>> >>>>> From my rough estimates, the worst case memory foot print for any given core, given your options is approximately > >>> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > >>> >>>>> This is way below 8 GB. > >>> >>>>> > >>> >>>>> Note this estimate completely ignores: > >>> >>>>> (1) the memory required for the restriction operator, > >>> >>>>> (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) > >>> >>>>> (3) all temporary vectors required by the CG solver, and those required by the smoothers. > >>> >>>>> (4) internal memory allocated by MatPtAP > >>> >>>>> (5) memory associated with IS's used within PCTelescope > >>> >>>>> > >>> >>>>> So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates > >>> >>>>> > >>> >>>>> Since I don't have your code I cannot access the latter. > >>> >>>>> Since I don't have access to the same machine you are running on, I think we need to take a step back. > >>> >>>>> > >>> >>>>> [1] What machine are you running on? Send me a URL if its available > >>> >>>>> > >>> >>>>> [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) > >>> >>>>> If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. > >>> >>>>> This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. > >>> >>>>> > >>> >>>>> Thanks, > >>> >>>>> Dave > >>> >>>>> > >>> >>>>> > >>> >>>>> Frank > >>> >>>>> > >>> >>>>> > >>> >>>>> > >>> >>>>> > >>> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: > >>> >>>>>> > >>> >>>>>> On Saturday, 9 July 2016, frank wrote: > >>> >>>>>> Hi Barry and Dave, > >>> >>>>>> > >>> >>>>>> Thank both of you for the advice. > >>> >>>>>> > >>> >>>>>> @Barry > >>> >>>>>> I made a mistake in the file names in last email. I attached the correct files this time. > >>> >>>>>> For all the three tests, 'Telescope' is used as the coarse preconditioner. > >>> >>>>>> > >>> >>>>>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > >>> >>>>>> Part of the memory usage: Vector 125 124 3971904 0. > >>> >>>>>> Matrix 101 101 9462372 0 > >>> >>>>>> > >>> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > >>> >>>>>> Part of the memory usage: Vector 125 124 681672 0. > >>> >>>>>> Matrix 101 101 1462180 0. > >>> >>>>>> > >>> >>>>>> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. > >>> >>>>>> > >>> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 > >>> >>>>>> Here I get the out of memory error. > >>> >>>>>> > >>> >>>>>> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > >>> >>>>>> The linear solver didn't work in this case. Petsc output some errors. > >>> >>>>>> > >>> >>>>>> @Dave > >>> >>>>>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. 
> >>> >>>>>> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. > >>> >>>>>> On the last coarse mesh of 'Telescope', there is only one grid point per process. > >>> >>>>>> I still got the OOM error. The detailed petsc option file is attached. > >>> >>>>>> > >>> >>>>>> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. > >>> >>>>>> > >>> >>>>>> And please send the result of KSPView so we can see what is actually used in the computations > >>> >>>>>> > >>> >>>>>> Thanks > >>> >>>>>> Dave > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> Thank you so much. > >>> >>>>>> > >>> >>>>>> Frank > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: > >>> >>>>>> On Jul 6, 2016, at 4:19 PM, frank wrote: > >>> >>>>>> > >>> >>>>>> Hi Barry, > >>> >>>>>> > >>> >>>>>> Thank you for you advice. > >>> >>>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. > >>> >>>>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. > >>> >>>>>> The system gives me the "Out of Memory" error before the linear system is completely solved. > >>> >>>>>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. > >>> >>>>>> > >>> >>>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. > >>> >>>>>> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test > >>> >>>>>> Vector 384 383 8,193,712 0. > >>> >>>>>> Matrix 103 103 11,508,688 0. > >>> >>>>>> to 3rd test > >>> >>>>>> Vector 384 383 1,590,520 0. > >>> >>>>>> Matrix 103 103 3,508,664 0. > >>> >>>>>> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. > >>> >>>>>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? > >>> >>>>>> Sorry, my mistake the option is -memory_view > >>> >>>>>> > >>> >>>>>> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. > >>> >>>>>> > >>> >>>>>> Barry > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> In both tests the memory usage is not large. > >>> >>>>>> > >>> >>>>>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. > >>> >>>>>> Is there is a way to show how much memory it allocated? 
> >>> >>>>>> > >>> >>>>>> Frank > >>> >>>>>> > >>> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: > >>> >>>>>> Frank, > >>> >>>>>> > >>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. > >>> >>>>>> > >>> >>>>>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. > >>> >>>>>> > >>> >>>>>> Barry > >>> >>>>>> > >>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: > >>> >>>>>> > >>> >>>>>> Hi, > >>> >>>>>> > >>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. > >>> >>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. > >>> >>>>>> The petsc options file is attached. > >>> >>>>>> > >>> >>>>>> The domain is a 3d box. > >>> >>>>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. > >>> >>>>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. > >>> >>>>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. > >>> >>>>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. > >>> >>>>>> > >>> >>>>>> How can I diagnose what exactly cause the error? > >>> >>>>>> Thank you so much. > >>> >>>>>> > >>> >>>>>> Frank > >>> >>>>>> > >>> >>>>>> > >>> >>>>>> > >>> >>>>> > >>> >>>> > >>> >>> > >>> > > >>> > >> > >> > > > > > > > From gotofd at gmail.com Thu Sep 15 21:21:32 2016 From: gotofd at gmail.com (Ji Zhang) Date: Fri, 16 Sep 2016 10:21:32 +0800 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: I'm so apologize for the ambiguity. Let me clarify it. I'm trying to simulation interactions among different bodies. Now I have calculated the interaction between two of them and stored in the sub-matrix m_ij. What I want to do is to consider the whole interaction and construct all sub-matrices m_ij into a big matrix M, just like this, imaging the problem contain 3 bodies, [ m11 m12 m13 ] M = | m21 m22 m23 | , [ m31 m32 m33 ] The system is huge that I have to use MPI and a lot of cups. A mcve code is showing below, and I'm using a python wrap of PETSc, however, their grammar is similar. 
import numpy as np from petsc4py import PETSc mSizes = (5, 8, 6) mij = [] # create sub-matrices mij for i in range(len(mSizes)): for j in range(len(mSizes)): temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) temp_m.setType('mpidense') temp_m.setFromOptions() temp_m.setUp() temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) temp_m.assemble() mij.append(temp_m) # Now we have four sub-matrices. I would like to construct them into a big matrix M. M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) M.setType('mpidense') M.setFromOptions() M.setUp() mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, mSizes] for i in range(len(mSizes)): for j in range(len(mSizes)): M[mLocations[i]:mLocations[i+1], mLocations[j]:mLocations[j+1]] = mij[i*len(mSizes)+j][:, :] M.assemble() Thanks. 2016-09-16 Best, Regards, Zhang Ji Beijing Computational Science Research Center E-mail: gotofd at gmail.com Wayne On Thu, Sep 15, 2016 at 8:58 PM, Matthew Knepley wrote: > On Thu, Sep 15, 2016 at 4:23 AM, Ji Zhang wrote: > >> Thanks Matt. It works well for signal core. But is there any solution if >> I need a MPI program? >> > > It unclear what the stuff below would mean in parallel. > > If you want to assemble several blocks of a parallel matrix that looks > like serial matrices, then use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatGetLocalSubMatrix.html > > Thanks, > > Matt > > >> Thanks. >> >> Wayne >> >> On Tue, Sep 13, 2016 at 9:30 AM, Matthew Knepley >> wrote: >> >>> On Mon, Sep 12, 2016 at 8:24 PM, Ji Zhang wrote: >>> >>>> Dear all, >>>> >>>> I'm using petsc4py and now face some problems. >>>> I have a number of small petsc dense matrices mij, and I want to >>>> construct them to a big matrix M like this: >>>> >>>> [ m11 m12 m13 ] >>>> M = | m21 m22 m23 | , >>>> [ m31 m32 m33 ] >>>> How could I do it effectively? >>>> >>>> Now I'm using the code below: >>>> >>>> # get indexes of matrix mij >>>> index1_begin, index1_end = getindex_i( ) >>>> index2_begin, index2_end = getindex_j( ) >>>> M[index1_begin:index1_end, index2_begin:index2_end] = mij[:, :] >>>> which report such error messages: >>>> >>>> petsc4py.PETSc.Error: error code 56 >>>> [0] MatGetValues() line 1818 in /home/zhangji/PycharmProjects/ >>>> petsc-petsc-31a1859eaff6/src/mat/interface/matrix.c >>>> [0] MatGetValues_MPIDense() line 154 in >>>> /home/zhangji/PycharmProjects/petsc-petsc-31a1859eaff6/src/m >>>> at/impls/dense/mpi/mpidense.c >>>> >>> >>> Make M a sequential dense matrix. >>> >>> Matt >>> >>> >>>> [0] No support for this operation for this object type >>>> [0] Only local values currently supported >>>> >>>> Thanks. >>>> >>>> >>>> 2016-09-13 >>>> Best, >>>> Regards, >>>> Zhang Ji >>>> Beijing Computational Science Research Center >>>> E-mail: gotofd at gmail.com >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu Sep 15 21:32:53 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 15 Sep 2016 21:32:53 -0500 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: You should create your small m_ij matrices as just dense two dimensional arrays and then set them into the big M matrix. Do not create the small dense matrices as PETSc matrices. Barry > On Sep 15, 2016, at 9:21 PM, Ji Zhang wrote: > > I'm so apologize for the ambiguity. Let me clarify it. > > I'm trying to simulation interactions among different bodies. Now I have calculated the interaction between two of them and stored in the sub-matrix m_ij. What I want to do is to consider the whole interaction and construct all sub-matrices m_ij into a big matrix M, just like this, imaging the problem contain 3 bodies, > > [ m11 m12 m13 ] > M = | m21 m22 m23 | , > [ m31 m32 m33 ] > > The system is huge that I have to use MPI and a lot of cups. A mcve code is showing below, and I'm using a python wrap of PETSc, however, their grammar is similar. > > import numpy as np > from petsc4py import PETSc > > mSizes = (5, 8, 6) > mij = [] > > # create sub-matrices mij > for i in range(len(mSizes)): > for j in range(len(mSizes)): > temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) > temp_m.setType('mpidense') > temp_m.setFromOptions() > temp_m.setUp() > temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) > temp_m.assemble() > mij.append(temp_m) > > # Now we have four sub-matrices. I would like to construct them into a big matrix M. > M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) > M.setType('mpidense') > M.setFromOptions() > M.setUp() > mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, mSizes] > for i in range(len(mSizes)): > for j in range(len(mSizes)): > M[mLocations[i]:mLocations[i+1], mLocations[j]:mLocations[j+1]] = mij[i*len(mSizes)+j][:, :] > M.assemble() > > Thanks. > > > 2016-09-16 > Best, > Regards, > Zhang Ji > Beijing Computational Science Research Center > E-mail: gotofd at gmail.com > > > > > > Wayne > > On Thu, Sep 15, 2016 at 8:58 PM, Matthew Knepley wrote: > On Thu, Sep 15, 2016 at 4:23 AM, Ji Zhang wrote: > Thanks Matt. It works well for signal core. But is there any solution if I need a MPI program? > > It unclear what the stuff below would mean in parallel. > > If you want to assemble several blocks of a parallel matrix that looks like serial matrices, then use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html > > Thanks, > > Matt > > Thanks. > > Wayne > > On Tue, Sep 13, 2016 at 9:30 AM, Matthew Knepley wrote: > On Mon, Sep 12, 2016 at 8:24 PM, Ji Zhang wrote: > Dear all, > > I'm using petsc4py and now face some problems. > I have a number of small petsc dense matrices mij, and I want to construct them to a big matrix M like this: > > [ m11 m12 m13 ] > M = | m21 m22 m23 | , > [ m31 m32 m33 ] > How could I do it effectively? 
> > Now I'm using the code below: > > # get indexes of matrix mij > index1_begin, index1_end = getindex_i( ) > index2_begin, index2_end = getindex_j( ) > M[index1_begin:index1_end, index2_begin:index2_end] = mij[:, :] > which report such error messages: > > petsc4py.PETSc.Error: error code 56 > [0] MatGetValues() line 1818 in /home/zhangji/PycharmProjects/petsc-petsc-31a1859eaff6/src/mat/interface/matrix.c > [0] MatGetValues_MPIDense() line 154 in /home/zhangji/PycharmProjects/petsc-petsc-31a1859eaff6/src/mat/impls/dense/mpi/mpidense.c > > Make M a sequential dense matrix. > > Matt > > [0] No support for this operation for this object type > [0] Only local values currently supported > > Thanks. > > > 2016-09-13 > Best, > Regards, > Zhang Ji > Beijing Computational Science Research Center > E-mail: gotofd at gmail.com > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From gotofd at gmail.com Thu Sep 15 22:33:25 2016 From: gotofd at gmail.com (Ji Zhang) Date: Fri, 16 Sep 2016 11:33:25 +0800 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: Thanks for your warm help. Could you please show me some necessary functions or a simple demo code? Wayne On Fri, Sep 16, 2016 at 10:32 AM, Barry Smith wrote: > > You should create your small m_ij matrices as just dense two dimensional > arrays and then set them into the big M matrix. Do not create the small > dense matrices as PETSc matrices. > > Barry > > > > On Sep 15, 2016, at 9:21 PM, Ji Zhang wrote: > > > > I'm so apologize for the ambiguity. Let me clarify it. > > > > I'm trying to simulation interactions among different bodies. Now I have > calculated the interaction between two of them and stored in the sub-matrix > m_ij. What I want to do is to consider the whole interaction and construct > all sub-matrices m_ij into a big matrix M, just like this, imaging the > problem contain 3 bodies, > > > > [ m11 m12 m13 ] > > M = | m21 m22 m23 | , > > [ m31 m32 m33 ] > > > > The system is huge that I have to use MPI and a lot of cups. A mcve code > is showing below, and I'm using a python wrap of PETSc, however, their > grammar is similar. > > > > import numpy as np > > from petsc4py import PETSc > > > > mSizes = (5, 8, 6) > > mij = [] > > > > # create sub-matrices mij > > for i in range(len(mSizes)): > > for j in range(len(mSizes)): > > temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) > > temp_m.setType('mpidense') > > temp_m.setFromOptions() > > temp_m.setUp() > > temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) > > temp_m.assemble() > > mij.append(temp_m) > > > > # Now we have four sub-matrices. I would like to construct them into a > big matrix M. > > M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) > > M.setType('mpidense') > > M.setFromOptions() > > M.setUp() > > mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, > mSizes] > > for i in range(len(mSizes)): > > for j in range(len(mSizes)): > > M[mLocations[i]:mLocations[i+1], mLocations[j]:mLocations[j+1]] > = mij[i*len(mSizes)+j][:, :] > > M.assemble() > > > > Thanks. 
> > > > > > 2016-09-16 > > Best, > > Regards, > > Zhang Ji > > Beijing Computational Science Research Center > > E-mail: gotofd at gmail.com > > > > > > > > > > > > Wayne > > > > On Thu, Sep 15, 2016 at 8:58 PM, Matthew Knepley > wrote: > > On Thu, Sep 15, 2016 at 4:23 AM, Ji Zhang wrote: > > Thanks Matt. It works well for signal core. But is there any solution if > I need a MPI program? > > > > It unclear what the stuff below would mean in parallel. > > > > If you want to assemble several blocks of a parallel matrix that looks > like serial matrices, then use > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatGetLocalSubMatrix.html > > > > Thanks, > > > > Matt > > > > Thanks. > > > > Wayne > > > > On Tue, Sep 13, 2016 at 9:30 AM, Matthew Knepley > wrote: > > On Mon, Sep 12, 2016 at 8:24 PM, Ji Zhang wrote: > > Dear all, > > > > I'm using petsc4py and now face some problems. > > I have a number of small petsc dense matrices mij, and I want to > construct them to a big matrix M like this: > > > > [ m11 m12 m13 ] > > M = | m21 m22 m23 | , > > [ m31 m32 m33 ] > > How could I do it effectively? > > > > Now I'm using the code below: > > > > # get indexes of matrix mij > > index1_begin, index1_end = getindex_i( ) > > index2_begin, index2_end = getindex_j( ) > > M[index1_begin:index1_end, index2_begin:index2_end] = mij[:, :] > > which report such error messages: > > > > petsc4py.PETSc.Error: error code 56 > > [0] MatGetValues() line 1818 in /home/zhangji/PycharmProjects/ > petsc-petsc-31a1859eaff6/src/mat/interface/matrix.c > > [0] MatGetValues_MPIDense() line 154 in > /home/zhangji/PycharmProjects/petsc-petsc-31a1859eaff6/src/ > mat/impls/dense/mpi/mpidense.c > > > > Make M a sequential dense matrix. > > > > Matt > > > > [0] No support for this operation for this object type > > [0] Only local values currently supported > > > > Thanks. > > > > > > 2016-09-13 > > Best, > > Regards, > > Zhang Ji > > Beijing Computational Science Research Center > > E-mail: gotofd at gmail.com > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gotofd at gmail.com Thu Sep 15 22:40:05 2016 From: gotofd at gmail.com (Ji Zhang) Date: Fri, 16 Sep 2016 11:40:05 +0800 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: Thanks. I think I find the right way. Wayne On Fri, Sep 16, 2016 at 11:33 AM, Ji Zhang wrote: > Thanks for your warm help. Could you please show me some necessary > functions or a simple demo code? > > > Wayne > > On Fri, Sep 16, 2016 at 10:32 AM, Barry Smith wrote: > >> >> You should create your small m_ij matrices as just dense two >> dimensional arrays and then set them into the big M matrix. Do not create >> the small dense matrices as PETSc matrices. >> >> Barry >> >> >> > On Sep 15, 2016, at 9:21 PM, Ji Zhang wrote: >> > >> > I'm so apologize for the ambiguity. Let me clarify it. >> > >> > I'm trying to simulation interactions among different bodies. Now I >> have calculated the interaction between two of them and stored in the >> sub-matrix m_ij. 
What I want to do is to consider the whole interaction and >> construct all sub-matrices m_ij into a big matrix M, just like this, >> imaging the problem contain 3 bodies, >> > >> > [ m11 m12 m13 ] >> > M = | m21 m22 m23 | , >> > [ m31 m32 m33 ] >> > >> > The system is huge that I have to use MPI and a lot of cups. A mcve >> code is showing below, and I'm using a python wrap of PETSc, however, their >> grammar is similar. >> > >> > import numpy as np >> > from petsc4py import PETSc >> > >> > mSizes = (5, 8, 6) >> > mij = [] >> > >> > # create sub-matrices mij >> > for i in range(len(mSizes)): >> > for j in range(len(mSizes)): >> > temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) >> > temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) >> > temp_m.setType('mpidense') >> > temp_m.setFromOptions() >> > temp_m.setUp() >> > temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) >> > temp_m.assemble() >> > mij.append(temp_m) >> > >> > # Now we have four sub-matrices. I would like to construct them into a >> big matrix M. >> > M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) >> > M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) >> > M.setType('mpidense') >> > M.setFromOptions() >> > M.setUp() >> > mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, >> mSizes] >> > for i in range(len(mSizes)): >> > for j in range(len(mSizes)): >> > M[mLocations[i]:mLocations[i+1], >> mLocations[j]:mLocations[j+1]] = mij[i*len(mSizes)+j][:, :] >> > M.assemble() >> > >> > Thanks. >> > >> > >> > 2016-09-16 >> > Best, >> > Regards, >> > Zhang Ji >> > Beijing Computational Science Research Center >> > E-mail: gotofd at gmail.com >> > >> > >> > >> > >> > >> > Wayne >> > >> > On Thu, Sep 15, 2016 at 8:58 PM, Matthew Knepley >> wrote: >> > On Thu, Sep 15, 2016 at 4:23 AM, Ji Zhang wrote: >> > Thanks Matt. It works well for signal core. But is there any solution >> if I need a MPI program? >> > >> > It unclear what the stuff below would mean in parallel. >> > >> > If you want to assemble several blocks of a parallel matrix that looks >> like serial matrices, then use >> > >> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages >> /Mat/MatGetLocalSubMatrix.html >> > >> > Thanks, >> > >> > Matt >> > >> > Thanks. >> > >> > Wayne >> > >> > On Tue, Sep 13, 2016 at 9:30 AM, Matthew Knepley >> wrote: >> > On Mon, Sep 12, 2016 at 8:24 PM, Ji Zhang wrote: >> > Dear all, >> > >> > I'm using petsc4py and now face some problems. >> > I have a number of small petsc dense matrices mij, and I want to >> construct them to a big matrix M like this: >> > >> > [ m11 m12 m13 ] >> > M = | m21 m22 m23 | , >> > [ m31 m32 m33 ] >> > How could I do it effectively? >> > >> > Now I'm using the code below: >> > >> > # get indexes of matrix mij >> > index1_begin, index1_end = getindex_i( ) >> > index2_begin, index2_end = getindex_j( ) >> > M[index1_begin:index1_end, index2_begin:index2_end] = mij[:, :] >> > which report such error messages: >> > >> > petsc4py.PETSc.Error: error code 56 >> > [0] MatGetValues() line 1818 in /home/zhangji/PycharmProjects/ >> petsc-petsc-31a1859eaff6/src/mat/interface/matrix.c >> > [0] MatGetValues_MPIDense() line 154 in >> /home/zhangji/PycharmProjects/petsc-petsc-31a1859eaff6/src/m >> at/impls/dense/mpi/mpidense.c >> > >> > Make M a sequential dense matrix. >> > >> > Matt >> > >> > [0] No support for this operation for this object type >> > [0] Only local values currently supported >> > >> > Thanks. 
>> >
>> >
>> > 2016-09-13
>> > Best, 
>> > Regards,
>> > Zhang Ji 
>> > Beijing Computational Science Research Center 
>> > E-mail: gotofd at gmail.com
>> >
>> >
>> >
>> >
>> >
>> > -- 
>> > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> > -- Norbert Wiener
>> >
>> >
>> >
>> >
>> > -- 
>> > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> > -- Norbert Wiener
>> >
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hgbk2008 at gmail.com  Fri Sep 16 03:31:01 2016
From: hgbk2008 at gmail.com (Hoang Giang Bui)
Date: Fri, 16 Sep 2016 10:31:01 +0200
Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix
In-Reply-To: 
References: 
Message-ID: 

Hi Matt

I believe at line 523, src/ksp/ksp/utils/schurm.c

ierr = MatMatMult(C, AinvB, MAT_INITIAL_MATRIX, fill, S);CHKERRQ(ierr);

in my test case C is MPIAIJ and AinvB is SEQAIJ, hence it throws the error.

In fact, I guess there are two issues with it:

line 521, ierr = MatConvert(AinvBd, MATAIJ, MAT_INITIAL_MATRIX, &AinvB);CHKERRQ(ierr);
shall we convert this to the type of the C matrix to ensure compatibility?

line 552, if(norm > PETSC_MACHINE_EPSILON) SETERRQ(PetscObjectComm((PetscObject) M), PETSC_ERR_SUP, "Not yet implemented for Schur complements with non-vanishing D");
with this, the Schur complement with A11 != 0 will be aborted.

Giang

On Thu, Sep 15, 2016 at 4:28 PM, Matthew Knepley wrote:

> On Thu, Sep 15, 2016 at 9:07 AM, Hoang Giang Bui wrote:
>
>> Hi Matt
>>
>> Thanks for the comment. After looking carefully into the manual again, the key takeaway is that with selfp there is no option to compute the exact Schur; there are only two options to approximate the inv(A00) for selfp, which are lump and diag (diag by default). I misunderstood this previously.
>>
>> There is an online manual entry about PC_FIELDSPLIT_SCHUR_PRE_FULL, which is not documented elsewhere in the offline manual. I tried to access that by setting
>> -pc_fieldsplit_schur_precondition full
>>
>
> Yep, I wrote that specifically for testing, but it's very slow so I did not document it to prevent people from complaining.
>
>
>> but it gives the error
>>
>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [0]PETSC ERROR: Arguments are incompatible
>> [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with B, seqaij
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui Thu >> Sep 15 15:46:56 2016 >> [0]PETSC ERROR: Configure options --with-shared-libraries >> --with-debugging=0 --with-pic --download-fblaslapack=yes >> --download-suitesparse --download-ptscotch=yes --download-metis=yes >> --download-parmetis=yes --download-scalapack=yes --download-mumps=yes >> --download-hypre=yes --download-ml=yes --download-pastix=yes >> --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 >> [0]PETSC ERROR: #1 MatMatMult() line 9514 in >> /home/hbui/sw/petsc-3.7.3/src/mat/interface/matrix.c >> [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line 526 >> in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c >> [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in >> /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c >> [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ >> ksp/pc/interface/precon.c >> [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/hbui/sw/petsc-3.7.3/src/ >> ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/hbui/sw/petsc-3.7.3/src/ >> ksp/ksp/interface/itfunc.c >> >> Please excuse me to insist on forming the exact Schur complement, but as >> you said, I would like to track down what creates problem in my code by >> starting from a very exact but ineffective solution. >> > > Sure, I understand. I do not understand how A can be MPI and B can be Seq. > Do you know how that happens? > > Thanks, > > Matt > > >> Giang >> >> On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley >> wrote: >> >>> On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui >>> wrote: >>> >>>> Dear Barry >>>> >>>> Thanks for the clarification. I got exactly what you said if the code >>>> changed to >>>> ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); >>>> Residual norms for stokes_ solve. >>>> 0 KSP Residual norm 1.327791371202e-02 >>>> Residual norms for stokes_fieldsplit_p_ solve. >>>> 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm >>>> 0.000000000000e+00 ||r(i)||/||b|| -nan >>>> 1 KSP Residual norm 3.997711925708e-17 >>>> >>>> but I guess we solve a different problem if B is used for the linear >>>> system. >>>> >>>> in addition, changed to >>>> ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); >>>> also works but inner iteration converged not in one iteration >>>> >>>> Residual norms for stokes_ solve. >>>> 0 KSP Residual norm 1.327791371202e-02 >>>> Residual norms for stokes_fieldsplit_p_ solve. 
>>>> 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm >>>> 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 >>>> 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm >>>> 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 >>>> 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm >>>> 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 >>>> 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm >>>> 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 >>>> 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm >>>> 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 >>>> 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm >>>> 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 >>>> 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm >>>> 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 >>>> 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm >>>> 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 >>>> 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm >>>> 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 >>>> 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm >>>> 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 >>>> 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm >>>> 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 >>>> 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm >>>> 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 >>>> 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm >>>> 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 >>>> 1 KSP Residual norm 1.445282647127e-16 >>>> >>>> >>>> Seem like zero pivot does not happen, but why the solver for Schur >>>> takes 13 steps if the preconditioner is direct solver? >>>> >>> >>> Look at the -ksp_view. I will bet that the default is to shift (add a >>> multiple of the identity) the matrix instead of failing. This >>> gives an inexact PC, but as you see it can converge. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> >>>> I also so tried another problem which I known does have a nonsingular >>>> Schur (at least A11 != 0) and it also have the same problem: 1 step outer >>>> convergence but multiple step inner convergence. >>>> >>>> Any ideas? >>>> >>>> Giang >>>> >>>> On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: >>>> >>>>> >>>>> Normally you'd be absolutely correct to expect convergence in one >>>>> iteration. However in this example note the call >>>>> >>>>> ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); >>>>> >>>>> It is solving the linear system defined by A but building the >>>>> preconditioner (i.e. the entire fieldsplit process) from a different matrix >>>>> B. Since A is not B you should not expect convergence in one iteration. If >>>>> you change the code to >>>>> >>>>> ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); >>>>> >>>>> you will see exactly what you expect, convergence in one iteration. >>>>> >>>>> Sorry about this, the example is lacking clarity and documentation >>>>> its author obviously knew too well what he was doing that he didn't realize >>>>> everyone else in the world would need more comments in the code. If you >>>>> change the code to >>>>> >>>>> ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); >>>>> >>>>> it will stop without being able to build the preconditioner because LU >>>>> factorization of the Sp matrix will result in a zero pivot. 
This is why >>>>> this "auxiliary" matrix B is used to define the preconditioner instead of A. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui >>>>> wrote: >>>>> > >>>>> > Sorry I slept quite a while in this thread. Now I start to look at >>>>> it again. In the last try, the previous setting doesn't work either (in >>>>> fact diverge). So I would speculate if the Schur complement in my case is >>>>> actually not invertible. It's also possible that the code is wrong >>>>> somewhere. However, before looking at that, I want to understand thoroughly >>>>> the settings for Schur complement >>>>> > >>>>> > I experimented ex42 with the settings: >>>>> > mpirun -np 1 ex42 \ >>>>> > -stokes_ksp_monitor \ >>>>> > -stokes_ksp_type fgmres \ >>>>> > -stokes_pc_type fieldsplit \ >>>>> > -stokes_pc_fieldsplit_type schur \ >>>>> > -stokes_pc_fieldsplit_schur_fact_type full \ >>>>> > -stokes_pc_fieldsplit_schur_precondition selfp \ >>>>> > -stokes_fieldsplit_u_ksp_type preonly \ >>>>> > -stokes_fieldsplit_u_pc_type lu \ >>>>> > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ >>>>> > -stokes_fieldsplit_p_ksp_type gmres \ >>>>> > -stokes_fieldsplit_p_ksp_monitor_true_residual \ >>>>> > -stokes_fieldsplit_p_ksp_max_it 300 \ >>>>> > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ >>>>> > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ >>>>> > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ >>>>> > -stokes_fieldsplit_p_pc_type lu \ >>>>> > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps >>>>> > >>>>> > In my understanding, the solver should converge in 1 (outer) step. >>>>> Execution gives: >>>>> > Residual norms for stokes_ solve. >>>>> > 0 KSP Residual norm 1.327791371202e-02 >>>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>>> > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid >>>>> norm 0.000000000000e+00 ||r(i)||/||b|| -nan >>>>> > 1 KSP Residual norm 7.656238881621e-04 >>>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>>> > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid >>>>> norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>>> > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid >>>>> norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 >>>>> > 2 KSP Residual norm 9.895645456398e-06 >>>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>>> > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid >>>>> norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>>> > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid >>>>> norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 >>>>> > 3 KSP Residual norm 1.608206702571e-06 >>>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>>> > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid >>>>> norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>>> > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid >>>>> norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 >>>>> > 4 KSP Residual norm 2.453618999882e-07 >>>>> > Residual norms for stokes_fieldsplit_p_ solve. >>>>> > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid >>>>> norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 >>>>> > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid >>>>> norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 >>>>> > 5 KSP Residual norm 4.230399264750e-08 >>>>> > >>>>> > Looks like the "selfp" does construct the Schur nicely. 
But does >>>>> "full" really construct the full block preconditioner? >>>>> > >>>>> > Giang >>>>> > P/S: I'm also generating a smaller size of the previous problem for >>>>> checking again. >>>>> > >>>>> > >>>>> > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley >>>>> wrote: >>>>> > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui >>>>> wrote: >>>>> > >>>>> > It could be taking time in the MatMatMult() here if that matrix is >>>>> dense. Is there any reason to >>>>> > believe that is a good preconditioner for your problem? >>>>> > >>>>> > This is the first approach to the problem, so I chose the most >>>>> simple setting. Do you have any other recommendation? >>>>> > >>>>> > This is in no way the simplest PC. We need to make it simpler first. >>>>> > >>>>> > 1) Run on only 1 proc >>>>> > >>>>> > 2) Use -pc_fieldsplit_schur_fact_type full >>>>> > >>>>> > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_tru >>>>> e_residual >>>>> > >>>>> > This should converge in 1 outer iteration, but we will see how good >>>>> your Schur complement preconditioner >>>>> > is for this problem. >>>>> > >>>>> > You need to start out from something you understand and then start >>>>> making approximations. >>>>> > >>>>> > Matt >>>>> > >>>>> > For any solver question, please send us the output of >>>>> > >>>>> > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>>> > >>>>> > >>>>> > I sent here the full output (after changed to fgmres), again it >>>>> takes long at the first iteration but after that, it does not converge >>>>> > >>>>> > -ksp_type fgmres >>>>> > -ksp_max_it 300 >>>>> > -ksp_gmres_restart 300 >>>>> > -ksp_gmres_modifiedgramschmidt >>>>> > -pc_fieldsplit_type schur >>>>> > -pc_fieldsplit_schur_fact_type diag >>>>> > -pc_fieldsplit_schur_precondition selfp >>>>> > -pc_fieldsplit_detect_saddle_point >>>>> > -fieldsplit_u_ksp_type preonly >>>>> > -fieldsplit_u_pc_type lu >>>>> > -fieldsplit_u_pc_factor_mat_solver_package mumps >>>>> > -fieldsplit_lu_ksp_type preonly >>>>> > -fieldsplit_lu_pc_type lu >>>>> > -fieldsplit_lu_pc_factor_mat_solver_package mumps >>>>> > >>>>> > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid >>>>> norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 >>>>> > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid >>>>> norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 >>>>> > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid >>>>> norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 >>>>> > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid >>>>> norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 >>>>> > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid >>>>> norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 >>>>> > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid >>>>> norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 >>>>> > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid >>>>> norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 >>>>> > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid >>>>> norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 >>>>> > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid >>>>> norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 >>>>> > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid >>>>> norm 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 >>>>> > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid >>>>> 
norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 >>>>> > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid >>>>> norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 >>>>> > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid >>>>> norm 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 >>>>> > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid >>>>> norm 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 >>>>> > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid >>>>> norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 >>>>> > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid >>>>> norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 >>>>> > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid >>>>> norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 >>>>> > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid >>>>> norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 >>>>> > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid >>>>> norm 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 >>>>> > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid >>>>> norm 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 >>>>> > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid >>>>> norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 >>>>> > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid >>>>> norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 >>>>> > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid >>>>> norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 >>>>> > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid >>>>> norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 >>>>> > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid >>>>> norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 >>>>> > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid >>>>> norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 >>>>> > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid >>>>> norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 >>>>> > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid >>>>> norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 >>>>> > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid >>>>> norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 >>>>> > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid >>>>> norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 >>>>> > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid >>>>> norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 >>>>> > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid >>>>> norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 >>>>> > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid >>>>> norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 >>>>> > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid >>>>> norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 >>>>> > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid >>>>> norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 >>>>> > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid >>>>> norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 >>>>> > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true 
resid >>>>> norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 >>>>> > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid >>>>> norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 >>>>> > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid >>>>> norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 >>>>> > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid >>>>> norm 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 >>>>> > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid >>>>> norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 >>>>> > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid >>>>> norm 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 >>>>> > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid >>>>> norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 >>>>> > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid >>>>> norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 >>>>> > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid >>>>> norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 >>>>> > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid >>>>> norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 >>>>> > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid >>>>> norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 >>>>> > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid >>>>> norm 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 >>>>> > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid >>>>> norm 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 >>>>> > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid >>>>> norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 >>>>> > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid >>>>> norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 >>>>> > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid >>>>> norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 >>>>> > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid >>>>> norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 >>>>> > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid >>>>> norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 >>>>> > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid >>>>> norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 >>>>> > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid >>>>> norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 >>>>> > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid >>>>> norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 >>>>> > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid >>>>> norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 >>>>> > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid >>>>> norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 >>>>> > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid >>>>> norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 >>>>> > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid >>>>> norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 >>>>> > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid >>>>> norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 >>>>> > 62 KSP unpreconditioned resid norm 
1.629449737049e+02 true resid >>>>> norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 >>>>> > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid >>>>> norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 >>>>> > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid >>>>> norm 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 >>>>> > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid >>>>> norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 >>>>> > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid >>>>> norm 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 >>>>> > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid >>>>> norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 >>>>> > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid >>>>> norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 >>>>> > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid >>>>> norm 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 >>>>> > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid >>>>> norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 >>>>> > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid >>>>> norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 >>>>> > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid >>>>> norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 >>>>> > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid >>>>> norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 >>>>> > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid >>>>> norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 >>>>> > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid >>>>> norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 >>>>> > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid >>>>> norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 >>>>> > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid >>>>> norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 >>>>> > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid >>>>> norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 >>>>> > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid >>>>> norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 >>>>> > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid >>>>> norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 >>>>> > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid >>>>> norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 >>>>> > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid >>>>> norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 >>>>> > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid >>>>> norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 >>>>> > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid >>>>> norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 >>>>> > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid >>>>> norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 >>>>> > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid >>>>> norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 >>>>> > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid >>>>> norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 >>>>> > 88 KSP unpreconditioned 
resid norm 1.475729830724e+02 true resid >>>>> norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 >>>>> > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid >>>>> norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 >>>>> > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid >>>>> norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 >>>>> > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid >>>>> norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 >>>>> > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid >>>>> norm 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 >>>>> > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid >>>>> norm 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 >>>>> > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid >>>>> norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 >>>>> > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid >>>>> norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 >>>>> > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid >>>>> norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 >>>>> > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid >>>>> norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 >>>>> > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid >>>>> norm 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 >>>>> > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid >>>>> norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 >>>>> > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid >>>>> norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 >>>>> > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid >>>>> norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 >>>>> > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid >>>>> norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 >>>>> > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid >>>>> norm 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 >>>>> > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid >>>>> norm 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 >>>>> > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid >>>>> norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 >>>>> > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid >>>>> norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 >>>>> > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid >>>>> norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 >>>>> > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid >>>>> norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 >>>>> > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid >>>>> norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 >>>>> > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid >>>>> norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 >>>>> > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid >>>>> norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 >>>>> > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid >>>>> norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 >>>>> > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid >>>>> norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 >>>>> > 
114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid >>>>> norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 >>>>> > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid >>>>> norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 >>>>> > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid >>>>> norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 >>>>> > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid >>>>> norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 >>>>> > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid >>>>> norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 >>>>> > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid >>>>> norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 >>>>> > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid >>>>> norm 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 >>>>> > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid >>>>> norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 >>>>> > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid >>>>> norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 >>>>> > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid >>>>> norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 >>>>> > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid >>>>> norm 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 >>>>> > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid >>>>> norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 >>>>> > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid >>>>> norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 >>>>> > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid >>>>> norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 >>>>> > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid >>>>> norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 >>>>> > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid >>>>> norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 >>>>> > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid >>>>> norm 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 >>>>> > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid >>>>> norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 >>>>> > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid >>>>> norm 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 >>>>> > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid >>>>> norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 >>>>> > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid >>>>> norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 >>>>> > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid >>>>> norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 >>>>> > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid >>>>> norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 >>>>> > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid >>>>> norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 >>>>> > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid >>>>> norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 >>>>> > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid >>>>> norm 1.495962731374e+02 
||r(i)||/||b|| 4.924538470600e-05 >>>>> > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid >>>>> norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 >>>>> > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid >>>>> norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 >>>>> > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid >>>>> norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 >>>>> > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid >>>>> norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 >>>>> > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid >>>>> norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 >>>>> > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid >>>>> norm 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 >>>>> > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid >>>>> norm 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 >>>>> > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid >>>>> norm 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 >>>>> > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid >>>>> norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 >>>>> > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid >>>>> norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 >>>>> > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid >>>>> norm 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 >>>>> > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid >>>>> norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 >>>>> > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid >>>>> norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 >>>>> > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid >>>>> norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 >>>>> > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid >>>>> norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 >>>>> > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid >>>>> norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 >>>>> > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid >>>>> norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 >>>>> > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid >>>>> norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 >>>>> > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid >>>>> norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 >>>>> > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid >>>>> norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 >>>>> > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid >>>>> norm 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 >>>>> > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid >>>>> norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 >>>>> > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid >>>>> norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 >>>>> > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid >>>>> norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 >>>>> > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid >>>>> norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 >>>>> > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true 
resid >>>>> norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 >>>>> > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid >>>>> norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 >>>>> > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid >>>>> norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 >>>>> > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid >>>>> norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 >>>>> > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid >>>>> norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 >>>>> > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid >>>>> norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 >>>>> > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid >>>>> norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 >>>>> > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid >>>>> norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 >>>>> > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid >>>>> norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 >>>>> > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid >>>>> norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 >>>>> > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid >>>>> norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 >>>>> > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid >>>>> norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 >>>>> > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid >>>>> norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 >>>>> > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid >>>>> norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 >>>>> > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid >>>>> norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 >>>>> > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid >>>>> norm 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 >>>>> > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid >>>>> norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 >>>>> > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid >>>>> norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 >>>>> > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid >>>>> norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 >>>>> > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid >>>>> norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 >>>>> > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid >>>>> norm 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 >>>>> > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid >>>>> norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 >>>>> > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid >>>>> norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 >>>>> > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid >>>>> norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 >>>>> > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid >>>>> norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 >>>>> > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid >>>>> norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 >>>>> > 191 KSP unpreconditioned 
resid norm 1.358335705250e+02 true resid >>>>> norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 >>>>> > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid >>>>> norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 >>>>> > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid >>>>> norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 >>>>> > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid >>>>> norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 >>>>> > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid >>>>> norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 >>>>> > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid >>>>> norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 >>>>> > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid >>>>> norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 >>>>> > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid >>>>> norm 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 >>>>> > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid >>>>> norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 >>>>> > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid >>>>> norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 >>>>> > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid >>>>> norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 >>>>> > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid >>>>> norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 >>>>> > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid >>>>> norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 >>>>> > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid >>>>> norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 >>>>> > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid >>>>> norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 >>>>> > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid >>>>> norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 >>>>> > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid >>>>> norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 >>>>> > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid >>>>> norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 >>>>> > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid >>>>> norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 >>>>> > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid >>>>> norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 >>>>> > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid >>>>> norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 >>>>> > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid >>>>> norm 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 >>>>> > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid >>>>> norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 >>>>> > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid >>>>> norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 >>>>> > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid >>>>> norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 >>>>> > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid >>>>> norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 
>>>>> > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid >>>>> norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 >>>>> > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid >>>>> norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 >>>>> > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid >>>>> norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 >>>>> > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid >>>>> norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 >>>>> > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid >>>>> norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 >>>>> > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid >>>>> norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 >>>>> > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid >>>>> norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 >>>>> > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid >>>>> norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 >>>>> > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid >>>>> norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 >>>>> > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid >>>>> norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 >>>>> > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid >>>>> norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 >>>>> > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid >>>>> norm 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 >>>>> > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid >>>>> norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 >>>>> > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid >>>>> norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 >>>>> > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid >>>>> norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 >>>>> > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid >>>>> norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 >>>>> > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid >>>>> norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 >>>>> > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid >>>>> norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 >>>>> > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid >>>>> norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 >>>>> > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid >>>>> norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 >>>>> > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid >>>>> norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 >>>>> > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid >>>>> norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 >>>>> > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid >>>>> norm 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 >>>>> > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid >>>>> norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 >>>>> > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid >>>>> norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 >>>>> > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid >>>>> norm 7.685939322732e+01 
||r(i)||/||b|| 2.530123450517e-05 >>>>> > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid >>>>> norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 >>>>> > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid >>>>> norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 >>>>> > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid >>>>> norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 >>>>> > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid >>>>> norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 >>>>> > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid >>>>> norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 >>>>> > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid >>>>> norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 >>>>> > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid >>>>> norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 >>>>> > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid >>>>> norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 >>>>> > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid >>>>> norm 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 >>>>> > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid >>>>> norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 >>>>> > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid >>>>> norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 >>>>> > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid >>>>> norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 >>>>> > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid >>>>> norm 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 >>>>> > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid >>>>> norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 >>>>> > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid >>>>> norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 >>>>> > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid >>>>> norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 >>>>> > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid >>>>> norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 >>>>> > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid >>>>> norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 >>>>> > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid >>>>> norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 >>>>> > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid >>>>> norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 >>>>> > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid >>>>> norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 >>>>> > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid >>>>> norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 >>>>> > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid >>>>> norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 >>>>> > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid >>>>> norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 >>>>> > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid >>>>> norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 >>>>> > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true 
resid >>>>> norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 >>>>> > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid >>>>> norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 >>>>> > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid >>>>> norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 >>>>> > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid >>>>> norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 >>>>> > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid >>>>> norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 >>>>> > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid >>>>> norm 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 >>>>> > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid >>>>> norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 >>>>> > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid >>>>> norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 >>>>> > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid >>>>> norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 >>>>> > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid >>>>> norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 >>>>> > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid >>>>> norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 >>>>> > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid >>>>> norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 >>>>> > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid >>>>> norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 >>>>> > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid >>>>> norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 >>>>> > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid >>>>> norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 >>>>> > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid >>>>> norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 >>>>> > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid >>>>> norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 >>>>> > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid >>>>> norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 >>>>> > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid >>>>> norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 >>>>> > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid >>>>> norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 >>>>> > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid >>>>> norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 >>>>> > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid >>>>> norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 >>>>> > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid >>>>> norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 >>>>> > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid >>>>> norm 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 >>>>> > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid >>>>> norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 >>>>> > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid >>>>> norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 >>>>> > 294 KSP unpreconditioned 
resid norm 1.085570944646e+00 true resid >>>>> norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 >>>>> > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid >>>>> norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 >>>>> > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid >>>>> norm 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 >>>>> > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid >>>>> norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 >>>>> > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid >>>>> norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 >>>>> > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid >>>>> norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 >>>>> > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid >>>>> norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 >>>>> > Linear solve did not converge due to DIVERGED_ITS iterations 300 >>>>> > KSP Object: 2 MPI processes >>>>> > type: fgmres >>>>> > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization >>>>> > GMRES: happy breakdown tolerance 1e-30 >>>>> > maximum iterations=300, initial guess is zero >>>>> > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 >>>>> > right preconditioning >>>>> > using UNPRECONDITIONED norm type for convergence test >>>>> > PC Object: 2 MPI processes >>>>> > type: fieldsplit >>>>> > FieldSplit with Schur preconditioner, factorization DIAG >>>>> > Preconditioner for the Schur complement formed from Sp, an >>>>> assembled approximation to S, which uses (lumped, if requested) A00's >>>>> diagonal's inverse >>>>> > Split info: >>>>> > Split number 0 Defined by IS >>>>> > Split number 1 Defined by IS >>>>> > KSP solver for A00 block >>>>> > KSP Object: (fieldsplit_u_) 2 MPI processes >>>>> > type: preonly >>>>> > maximum iterations=10000, initial guess is zero >>>>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>>>> > left preconditioning >>>>> > using NONE norm type for convergence test >>>>> > PC Object: (fieldsplit_u_) 2 MPI processes >>>>> > type: lu >>>>> > LU: out-of-place factorization >>>>> > tolerance for zero pivot 2.22045e-14 >>>>> > matrix ordering: natural >>>>> > factor fill ratio given 0, needed 0 >>>>> > Factored matrix follows: >>>>> > Mat Object: 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=184326, cols=184326 >>>>> > package used to perform factorization: mumps >>>>> > total: nonzeros=4.03041e+08, allocated >>>>> nonzeros=4.03041e+08 >>>>> > total number of mallocs used during MatSetValues >>>>> calls =0 >>>>> > MUMPS run parameters: >>>>> > SYM (matrix type): 0 >>>>> > PAR (host participation): 1 >>>>> > ICNTL(1) (output for error): 6 >>>>> > ICNTL(2) (output of diagnostic msg): 0 >>>>> > ICNTL(3) (output for global info): 0 >>>>> > ICNTL(4) (level of printing): 0 >>>>> > ICNTL(5) (input mat struct): 0 >>>>> > ICNTL(6) (matrix prescaling): 7 >>>>> > ICNTL(7) (sequentia matrix ordering):7 >>>>> > ICNTL(8) (scalling strategy): 77 >>>>> > ICNTL(10) (max num of refinements): 0 >>>>> > ICNTL(11) (error analysis): 0 >>>>> > ICNTL(12) (efficiency control): >>>>> 1 >>>>> > ICNTL(13) (efficiency control): >>>>> 0 >>>>> > ICNTL(14) (percentage of estimated workspace >>>>> increase): 20 >>>>> > ICNTL(18) (input mat struct): >>>>> 3 >>>>> > ICNTL(19) (Shur complement info): >>>>> 0 >>>>> > ICNTL(20) (rhs sparse pattern): >>>>> 0 >>>>> > ICNTL(21) (solution struct): >>>>> 1 >>>>> > ICNTL(22) 
(in-core/out-of-core facility): >>>>> 0 >>>>> > ICNTL(23) (max size of memory can be allocated >>>>> locally):0 >>>>> > ICNTL(24) (detection of null pivot rows): >>>>> 0 >>>>> > ICNTL(25) (computation of a null space basis): >>>>> 0 >>>>> > ICNTL(26) (Schur options for rhs or solution): >>>>> 0 >>>>> > ICNTL(27) (experimental parameter): >>>>> -24 >>>>> > ICNTL(28) (use parallel or sequential >>>>> ordering): 1 >>>>> > ICNTL(29) (parallel ordering): >>>>> 0 >>>>> > ICNTL(30) (user-specified set of entries in >>>>> inv(A)): 0 >>>>> > ICNTL(31) (factors is discarded in the solve >>>>> phase): 0 >>>>> > ICNTL(33) (compute determinant): >>>>> 0 >>>>> > CNTL(1) (relative pivoting threshold): 0.01 >>>>> > CNTL(2) (stopping criterion of refinement): >>>>> 1.49012e-08 >>>>> > CNTL(3) (absolute pivoting threshold): 0 >>>>> > CNTL(4) (value of static pivoting): -1 >>>>> > CNTL(5) (fixation for null pivots): 0 >>>>> > RINFO(1) (local estimated flops for the >>>>> elimination after analysis): >>>>> > [0] 5.59214e+11 >>>>> > [1] 5.35237e+11 >>>>> > RINFO(2) (local estimated flops for the assembly >>>>> after factorization): >>>>> > [0] 4.2839e+08 >>>>> > [1] 3.799e+08 >>>>> > RINFO(3) (local estimated flops for the >>>>> elimination after factorization): >>>>> > [0] 5.59214e+11 >>>>> > [1] 5.35237e+11 >>>>> > INFO(15) (estimated size of (in MB) MUMPS >>>>> internal data for running numerical factorization): >>>>> > [0] 2621 >>>>> > [1] 2649 >>>>> > INFO(16) (size of (in MB) MUMPS internal data >>>>> used during numerical factorization): >>>>> > [0] 2621 >>>>> > [1] 2649 >>>>> > INFO(23) (num of pivots eliminated on this >>>>> processor after factorization): >>>>> > [0] 90423 >>>>> > [1] 93903 >>>>> > RINFOG(1) (global estimated flops for the >>>>> elimination after analysis): 1.09445e+12 >>>>> > RINFOG(2) (global estimated flops for the >>>>> assembly after factorization): 8.0829e+08 >>>>> > RINFOG(3) (global estimated flops for the >>>>> elimination after factorization): 1.09445e+12 >>>>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) >>>>> (determinant): (0,0)*(2^0) >>>>> > INFOG(3) (estimated real workspace for factors >>>>> on all processors after analysis): 403041366 >>>>> > INFOG(4) (estimated integer workspace for >>>>> factors on all processors after analysis): 2265748 >>>>> > INFOG(5) (estimated maximum front size in the >>>>> complete tree): 6663 >>>>> > INFOG(6) (number of nodes in the complete tree): >>>>> 2812 >>>>> > INFOG(7) (ordering option effectively use after >>>>> analysis): 5 >>>>> > INFOG(8) (structural symmetry in percent of the >>>>> permuted matrix after analysis): 100 >>>>> > INFOG(9) (total real/complex workspace to store >>>>> the matrix factors after factorization): 403041366 >>>>> > INFOG(10) (total integer space store the matrix >>>>> factors after factorization): 2265766 >>>>> > INFOG(11) (order of largest frontal matrix after >>>>> factorization): 6663 >>>>> > INFOG(12) (number of off-diagonal pivots): 0 >>>>> > INFOG(13) (number of delayed pivots after >>>>> factorization): 0 >>>>> > INFOG(14) (number of memory compress after >>>>> factorization): 0 >>>>> > INFOG(15) (number of steps of iterative >>>>> refinement after solution): 0 >>>>> > INFOG(16) (estimated size (in MB) of all MUMPS >>>>> internal data for factorization after analysis: value on the most memory >>>>> consuming processor): 2649 >>>>> > INFOG(17) (estimated size of all MUMPS internal >>>>> data for factorization after analysis: sum over all processors): 5270 >>>>> > INFOG(18) (size of all MUMPS internal data 
>>>>> allocated during factorization: value on the most memory consuming >>>>> processor): 2649 >>>>> > INFOG(19) (size of all MUMPS internal data >>>>> allocated during factorization: sum over all processors): 5270 >>>>> > INFOG(20) (estimated number of entries in the >>>>> factors): 403041366 >>>>> > INFOG(21) (size in MB of memory effectively used >>>>> during factorization - value on the most memory consuming processor): 2121 >>>>> > INFOG(22) (size in MB of memory effectively used >>>>> during factorization - sum over all processors): 4174 >>>>> > INFOG(23) (after analysis: value of ICNTL(6) >>>>> effectively used): 0 >>>>> > INFOG(24) (after analysis: value of ICNTL(12) >>>>> effectively used): 1 >>>>> > INFOG(25) (after factorization: number of pivots >>>>> modified by static pivoting): 0 >>>>> > INFOG(28) (after factorization: number of null >>>>> pivots encountered): 0 >>>>> > INFOG(29) (after factorization: effective number >>>>> of entries in the factors (sum over all processors)): 403041366 >>>>> > INFOG(30, 31) (after solution: size in Mbytes of >>>>> memory used during solution phase): 2467, 4922 >>>>> > INFOG(32) (after analysis: type of analysis >>>>> done): 1 >>>>> > INFOG(33) (value used for ICNTL(8)): 7 >>>>> > INFOG(34) (exponent of the determinant if >>>>> determinant is requested): 0 >>>>> > linear system matrix = precond matrix: >>>>> > Mat Object: (fieldsplit_u_) 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=184326, cols=184326, bs=3 >>>>> > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 >>>>> > total number of mallocs used during MatSetValues calls =0 >>>>> > using I-node (on process 0) routines: found 26829 nodes, >>>>> limit used is 5 >>>>> > KSP solver for S = A11 - A10 inv(A00) A01 >>>>> > KSP Object: (fieldsplit_lu_) 2 MPI processes >>>>> > type: preonly >>>>> > maximum iterations=10000, initial guess is zero >>>>> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>>>> > left preconditioning >>>>> > using NONE norm type for convergence test >>>>> > PC Object: (fieldsplit_lu_) 2 MPI processes >>>>> > type: lu >>>>> > LU: out-of-place factorization >>>>> > tolerance for zero pivot 2.22045e-14 >>>>> > matrix ordering: natural >>>>> > factor fill ratio given 0, needed 0 >>>>> > Factored matrix follows: >>>>> > Mat Object: 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=2583, cols=2583 >>>>> > package used to perform factorization: mumps >>>>> > total: nonzeros=2.17621e+06, allocated >>>>> nonzeros=2.17621e+06 >>>>> > total number of mallocs used during MatSetValues >>>>> calls =0 >>>>> > MUMPS run parameters: >>>>> > SYM (matrix type): 0 >>>>> > PAR (host participation): 1 >>>>> > ICNTL(1) (output for error): 6 >>>>> > ICNTL(2) (output of diagnostic msg): 0 >>>>> > ICNTL(3) (output for global info): 0 >>>>> > ICNTL(4) (level of printing): 0 >>>>> > ICNTL(5) (input mat struct): 0 >>>>> > ICNTL(6) (matrix prescaling): 7 >>>>> > ICNTL(7) (sequentia matrix ordering):7 >>>>> > ICNTL(8) (scalling strategy): 77 >>>>> > ICNTL(10) (max num of refinements): 0 >>>>> > ICNTL(11) (error analysis): 0 >>>>> > ICNTL(12) (efficiency control): >>>>> 1 >>>>> > ICNTL(13) (efficiency control): >>>>> 0 >>>>> > ICNTL(14) (percentage of estimated workspace >>>>> increase): 20 >>>>> > ICNTL(18) (input mat struct): >>>>> 3 >>>>> > ICNTL(19) (Shur complement info): >>>>> 0 >>>>> > ICNTL(20) (rhs sparse pattern): >>>>> 0 >>>>> > ICNTL(21) (solution struct): >>>>> 1 >>>>> > ICNTL(22) (in-core/out-of-core facility): >>>>> 0 >>>>> > ICNTL(23) (max size of memory 
can be allocated >>>>> locally):0 >>>>> > ICNTL(24) (detection of null pivot rows): >>>>> 0 >>>>> > ICNTL(25) (computation of a null space basis): >>>>> 0 >>>>> > ICNTL(26) (Schur options for rhs or solution): >>>>> 0 >>>>> > ICNTL(27) (experimental parameter): >>>>> -24 >>>>> > ICNTL(28) (use parallel or sequential >>>>> ordering): 1 >>>>> > ICNTL(29) (parallel ordering): >>>>> 0 >>>>> > ICNTL(30) (user-specified set of entries in >>>>> inv(A)): 0 >>>>> > ICNTL(31) (factors is discarded in the solve >>>>> phase): 0 >>>>> > ICNTL(33) (compute determinant): >>>>> 0 >>>>> > CNTL(1) (relative pivoting threshold): 0.01 >>>>> > CNTL(2) (stopping criterion of refinement): >>>>> 1.49012e-08 >>>>> > CNTL(3) (absolute pivoting threshold): 0 >>>>> > CNTL(4) (value of static pivoting): -1 >>>>> > CNTL(5) (fixation for null pivots): 0 >>>>> > RINFO(1) (local estimated flops for the >>>>> elimination after analysis): >>>>> > [0] 5.12794e+08 >>>>> > [1] 5.02142e+08 >>>>> > RINFO(2) (local estimated flops for the assembly >>>>> after factorization): >>>>> > [0] 815031 >>>>> > [1] 745263 >>>>> > RINFO(3) (local estimated flops for the >>>>> elimination after factorization): >>>>> > [0] 5.12794e+08 >>>>> > [1] 5.02142e+08 >>>>> > INFO(15) (estimated size of (in MB) MUMPS >>>>> internal data for running numerical factorization): >>>>> > [0] 34 >>>>> > [1] 34 >>>>> > INFO(16) (size of (in MB) MUMPS internal data >>>>> used during numerical factorization): >>>>> > [0] 34 >>>>> > [1] 34 >>>>> > INFO(23) (num of pivots eliminated on this >>>>> processor after factorization): >>>>> > [0] 1158 >>>>> > [1] 1425 >>>>> > RINFOG(1) (global estimated flops for the >>>>> elimination after analysis): 1.01494e+09 >>>>> > RINFOG(2) (global estimated flops for the >>>>> assembly after factorization): 1.56029e+06 >>>>> > RINFOG(3) (global estimated flops for the >>>>> elimination after factorization): 1.01494e+09 >>>>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) >>>>> (determinant): (0,0)*(2^0) >>>>> > INFOG(3) (estimated real workspace for factors >>>>> on all processors after analysis): 2176209 >>>>> > INFOG(4) (estimated integer workspace for >>>>> factors on all processors after analysis): 14427 >>>>> > INFOG(5) (estimated maximum front size in the >>>>> complete tree): 699 >>>>> > INFOG(6) (number of nodes in the complete tree): >>>>> 15 >>>>> > INFOG(7) (ordering option effectively use after >>>>> analysis): 2 >>>>> > INFOG(8) (structural symmetry in percent of the >>>>> permuted matrix after analysis): 100 >>>>> > INFOG(9) (total real/complex workspace to store >>>>> the matrix factors after factorization): 2176209 >>>>> > INFOG(10) (total integer space store the matrix >>>>> factors after factorization): 14427 >>>>> > INFOG(11) (order of largest frontal matrix after >>>>> factorization): 699 >>>>> > INFOG(12) (number of off-diagonal pivots): 0 >>>>> > INFOG(13) (number of delayed pivots after >>>>> factorization): 0 >>>>> > INFOG(14) (number of memory compress after >>>>> factorization): 0 >>>>> > INFOG(15) (number of steps of iterative >>>>> refinement after solution): 0 >>>>> > INFOG(16) (estimated size (in MB) of all MUMPS >>>>> internal data for factorization after analysis: value on the most memory >>>>> consuming processor): 34 >>>>> > INFOG(17) (estimated size of all MUMPS internal >>>>> data for factorization after analysis: sum over all processors): 68 >>>>> > INFOG(18) (size of all MUMPS internal data >>>>> allocated during factorization: value on the most memory consuming >>>>> processor): 34 >>>>> > INFOG(19) 
(size of all MUMPS internal data >>>>> allocated during factorization: sum over all processors): 68 >>>>> > INFOG(20) (estimated number of entries in the >>>>> factors): 2176209 >>>>> > INFOG(21) (size in MB of memory effectively used >>>>> during factorization - value on the most memory consuming processor): 30 >>>>> > INFOG(22) (size in MB of memory effectively used >>>>> during factorization - sum over all processors): 59 >>>>> > INFOG(23) (after analysis: value of ICNTL(6) >>>>> effectively used): 0 >>>>> > INFOG(24) (after analysis: value of ICNTL(12) >>>>> effectively used): 1 >>>>> > INFOG(25) (after factorization: number of pivots >>>>> modified by static pivoting): 0 >>>>> > INFOG(28) (after factorization: number of null >>>>> pivots encountered): 0 >>>>> > INFOG(29) (after factorization: effective number >>>>> of entries in the factors (sum over all processors)): 2176209 >>>>> > INFOG(30, 31) (after solution: size in Mbytes of >>>>> memory used during solution phase): 16, 32 >>>>> > INFOG(32) (after analysis: type of analysis >>>>> done): 1 >>>>> > INFOG(33) (value used for ICNTL(8)): 7 >>>>> > INFOG(34) (exponent of the determinant if >>>>> determinant is requested): 0 >>>>> > linear system matrix followed by preconditioner matrix: >>>>> > Mat Object: (fieldsplit_lu_) 2 MPI processes >>>>> > type: schurcomplement >>>>> > rows=2583, cols=2583 >>>>> > Schur complement A11 - A10 inv(A00) A01 >>>>> > A11 >>>>> > Mat Object: (fieldsplit_lu_) >>>>> 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=2583, cols=2583, bs=3 >>>>> > total: nonzeros=117369, allocated nonzeros=117369 >>>>> > total number of mallocs used during MatSetValues >>>>> calls =0 >>>>> > not using I-node (on process 0) routines >>>>> > A10 >>>>> > Mat Object: 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=2583, cols=184326, rbs=3, cbs = 1 >>>>> > total: nonzeros=292770, allocated nonzeros=292770 >>>>> > total number of mallocs used during MatSetValues >>>>> calls =0 >>>>> > not using I-node (on process 0) routines >>>>> > KSP of A00 >>>>> > KSP Object: (fieldsplit_u_) >>>>> 2 MPI processes >>>>> > type: preonly >>>>> > maximum iterations=10000, initial guess is zero >>>>> > tolerances: relative=1e-05, absolute=1e-50, >>>>> divergence=10000 >>>>> > left preconditioning >>>>> > using NONE norm type for convergence test >>>>> > PC Object: (fieldsplit_u_) >>>>> 2 MPI processes >>>>> > type: lu >>>>> > LU: out-of-place factorization >>>>> > tolerance for zero pivot 2.22045e-14 >>>>> > matrix ordering: natural >>>>> > factor fill ratio given 0, needed 0 >>>>> > Factored matrix follows: >>>>> > Mat Object: 2 MPI >>>>> processes >>>>> > type: mpiaij >>>>> > rows=184326, cols=184326 >>>>> > package used to perform factorization: mumps >>>>> > total: nonzeros=4.03041e+08, allocated >>>>> nonzeros=4.03041e+08 >>>>> > total number of mallocs used during >>>>> MatSetValues calls =0 >>>>> > MUMPS run parameters: >>>>> > SYM (matrix type): 0 >>>>> > PAR (host participation): 1 >>>>> > ICNTL(1) (output for error): 6 >>>>> > ICNTL(2) (output of diagnostic msg): 0 >>>>> > ICNTL(3) (output for global info): 0 >>>>> > ICNTL(4) (level of printing): 0 >>>>> > ICNTL(5) (input mat struct): 0 >>>>> > ICNTL(6) (matrix prescaling): 7 >>>>> > ICNTL(7) (sequentia matrix ordering):7 >>>>> > ICNTL(8) (scalling strategy): 77 >>>>> > ICNTL(10) (max num of refinements): 0 >>>>> > ICNTL(11) (error analysis): 0 >>>>> > ICNTL(12) (efficiency control): >>>>> 1 >>>>> > ICNTL(13) (efficiency control): >>>>> 0 >>>>> > ICNTL(14) (percentage of estimated 
>>>>> workspace increase): 20 >>>>> > ICNTL(18) (input mat struct): >>>>> 3 >>>>> > ICNTL(19) (Shur complement info): >>>>> 0 >>>>> > ICNTL(20) (rhs sparse pattern): >>>>> 0 >>>>> > ICNTL(21) (solution struct): >>>>> 1 >>>>> > ICNTL(22) (in-core/out-of-core >>>>> facility): 0 >>>>> > ICNTL(23) (max size of memory can be >>>>> allocated locally):0 >>>>> > ICNTL(24) (detection of null pivot >>>>> rows): 0 >>>>> > ICNTL(25) (computation of a null space >>>>> basis): 0 >>>>> > ICNTL(26) (Schur options for rhs or >>>>> solution): 0 >>>>> > ICNTL(27) (experimental parameter): >>>>> -24 >>>>> > ICNTL(28) (use parallel or sequential >>>>> ordering): 1 >>>>> > ICNTL(29) (parallel ordering): >>>>> 0 >>>>> > ICNTL(30) (user-specified set of entries >>>>> in inv(A)): 0 >>>>> > ICNTL(31) (factors is discarded in the >>>>> solve phase): 0 >>>>> > ICNTL(33) (compute determinant): >>>>> 0 >>>>> > CNTL(1) (relative pivoting threshold): >>>>> 0.01 >>>>> > CNTL(2) (stopping criterion of >>>>> refinement): 1.49012e-08 >>>>> > CNTL(3) (absolute pivoting threshold): >>>>> 0 >>>>> > CNTL(4) (value of static pivoting): >>>>> -1 >>>>> > CNTL(5) (fixation for null pivots): >>>>> 0 >>>>> > RINFO(1) (local estimated flops for the >>>>> elimination after analysis): >>>>> > [0] 5.59214e+11 >>>>> > [1] 5.35237e+11 >>>>> > RINFO(2) (local estimated flops for the >>>>> assembly after factorization): >>>>> > [0] 4.2839e+08 >>>>> > [1] 3.799e+08 >>>>> > RINFO(3) (local estimated flops for the >>>>> elimination after factorization): >>>>> > [0] 5.59214e+11 >>>>> > [1] 5.35237e+11 >>>>> > INFO(15) (estimated size of (in MB) >>>>> MUMPS internal data for running numerical factorization): >>>>> > [0] 2621 >>>>> > [1] 2649 >>>>> > INFO(16) (size of (in MB) MUMPS internal >>>>> data used during numerical factorization): >>>>> > [0] 2621 >>>>> > [1] 2649 >>>>> > INFO(23) (num of pivots eliminated on >>>>> this processor after factorization): >>>>> > [0] 90423 >>>>> > [1] 93903 >>>>> > RINFOG(1) (global estimated flops for >>>>> the elimination after analysis): 1.09445e+12 >>>>> > RINFOG(2) (global estimated flops for >>>>> the assembly after factorization): 8.0829e+08 >>>>> > RINFOG(3) (global estimated flops for >>>>> the elimination after factorization): 1.09445e+12 >>>>> > (RINFOG(12) RINFOG(13))*2^INFOG(34) >>>>> (determinant): (0,0)*(2^0) >>>>> > INFOG(3) (estimated real workspace for >>>>> factors on all processors after analysis): 403041366 >>>>> > INFOG(4) (estimated integer workspace >>>>> for factors on all processors after analysis): 2265748 >>>>> > INFOG(5) (estimated maximum front size >>>>> in the complete tree): 6663 >>>>> > INFOG(6) (number of nodes in the >>>>> complete tree): 2812 >>>>> > INFOG(7) (ordering option effectively >>>>> use after analysis): 5 >>>>> > INFOG(8) (structural symmetry in percent >>>>> of the permuted matrix after analysis): 100 >>>>> > INFOG(9) (total real/complex workspace >>>>> to store the matrix factors after factorization): 403041366 >>>>> > INFOG(10) (total integer space store the >>>>> matrix factors after factorization): 2265766 >>>>> > INFOG(11) (order of largest frontal >>>>> matrix after factorization): 6663 >>>>> > INFOG(12) (number of off-diagonal >>>>> pivots): 0 >>>>> > INFOG(13) (number of delayed pivots >>>>> after factorization): 0 >>>>> > INFOG(14) (number of memory compress >>>>> after factorization): 0 >>>>> > INFOG(15) (number of steps of iterative >>>>> refinement after solution): 0 >>>>> > INFOG(16) (estimated size (in MB) of all >>>>> MUMPS internal data for 
factorization after analysis: value on the most >>>>> memory consuming processor): 2649 >>>>> > INFOG(17) (estimated size of all MUMPS >>>>> internal data for factorization after analysis: sum over all processors): >>>>> 5270 >>>>> > INFOG(18) (size of all MUMPS internal >>>>> data allocated during factorization: value on the most memory consuming >>>>> processor): 2649 >>>>> > INFOG(19) (size of all MUMPS internal >>>>> data allocated during factorization: sum over all processors): 5270 >>>>> > INFOG(20) (estimated number of entries >>>>> in the factors): 403041366 >>>>> > INFOG(21) (size in MB of memory >>>>> effectively used during factorization - value on the most memory consuming >>>>> processor): 2121 >>>>> > INFOG(22) (size in MB of memory >>>>> effectively used during factorization - sum over all processors): 4174 >>>>> > INFOG(23) (after analysis: value of >>>>> ICNTL(6) effectively used): 0 >>>>> > INFOG(24) (after analysis: value of >>>>> ICNTL(12) effectively used): 1 >>>>> > INFOG(25) (after factorization: number >>>>> of pivots modified by static pivoting): 0 >>>>> > INFOG(28) (after factorization: number >>>>> of null pivots encountered): 0 >>>>> > INFOG(29) (after factorization: >>>>> effective number of entries in the factors (sum over all processors)): >>>>> 403041366 >>>>> > INFOG(30, 31) (after solution: size in >>>>> Mbytes of memory used during solution phase): 2467, 4922 >>>>> > INFOG(32) (after analysis: type of >>>>> analysis done): 1 >>>>> > INFOG(33) (value used for ICNTL(8)): 7 >>>>> > INFOG(34) (exponent of the determinant >>>>> if determinant is requested): 0 >>>>> > linear system matrix = precond matrix: >>>>> > Mat Object: (fieldsplit_u_) >>>>> 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=184326, cols=184326, bs=3 >>>>> > total: nonzeros=3.32649e+07, allocated >>>>> nonzeros=3.32649e+07 >>>>> > total number of mallocs used during MatSetValues >>>>> calls =0 >>>>> > using I-node (on process 0) routines: found >>>>> 26829 nodes, limit used is 5 >>>>> > A01 >>>>> > Mat Object: 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=184326, cols=2583, rbs=3, cbs = 1 >>>>> > total: nonzeros=292770, allocated nonzeros=292770 >>>>> > total number of mallocs used during MatSetValues >>>>> calls =0 >>>>> > using I-node (on process 0) routines: found 16098 >>>>> nodes, limit used is 5 >>>>> > Mat Object: 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=2583, cols=2583, rbs=3, cbs = 1 >>>>> > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 >>>>> > total number of mallocs used during MatSetValues calls =0 >>>>> > not using I-node (on process 0) routines >>>>> > linear system matrix = precond matrix: >>>>> > Mat Object: 2 MPI processes >>>>> > type: mpiaij >>>>> > rows=186909, cols=186909 >>>>> > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 >>>>> > total number of mallocs used during MatSetValues calls =0 >>>>> > using I-node (on process 0) routines: found 26829 nodes, limit >>>>> used is 5 >>>>> > KSPSolve completed >>>>> > >>>>> > >>>>> > Giang >>>>> > >>>>> > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley >>>>> wrote: >>>>> > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui >>>>> wrote: >>>>> > Hello >>>>> > >>>>> > I'm solving an indefinite problem arising from mesh tying/contact >>>>> using Lagrange multiplier, the matrix has the form >>>>> > >>>>> > K = [A P^T >>>>> > P 0] >>>>> > >>>>> > I used the FIELDSPLIT preconditioner with one field is the main >>>>> variable (displacement) and the other field for dual variable (Lagrange >>>>> 
multiplier). The block size for each field is 3. According to the manual, I >>>>> first chose the preconditioner based on Schur complement to treat this >>>>> problem. >>>>> > >>>>> > >>>>> > For any solver question, please send us the output of >>>>> > >>>>> > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>>> > >>>>> > >>>>> > However, I will comment below >>>>> > >>>>> > The parameters used for the solve is >>>>> > -ksp_type gmres >>>>> > >>>>> > You need 'fgmres' here with the options you have below. >>>>> > >>>>> > -ksp_max_it 300 >>>>> > -ksp_gmres_restart 300 >>>>> > -ksp_gmres_modifiedgramschmidt >>>>> > -pc_fieldsplit_type schur >>>>> > -pc_fieldsplit_schur_fact_type diag >>>>> > -pc_fieldsplit_schur_precondition selfp >>>>> > >>>>> > >>>>> > >>>>> > It could be taking time in the MatMatMult() here if that matrix is >>>>> dense. Is there any reason to >>>>> > believe that is a good preconditioner for your problem? >>>>> > >>>>> > >>>>> > -pc_fieldsplit_detect_saddle_point >>>>> > -fieldsplit_u_pc_type hypre >>>>> > >>>>> > I would just use MUMPS here to start, especially if it works on the >>>>> whole problem. Same with the one below. >>>>> > >>>>> > Matt >>>>> > >>>>> > -fieldsplit_u_pc_hypre_type boomeramg >>>>> > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >>>>> > -fieldsplit_lu_pc_type hypre >>>>> > -fieldsplit_lu_pc_hypre_type boomeramg >>>>> > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS >>>>> > >>>>> > For the test case, a small problem is solved on 2 processes. Due to >>>>> the decomposition, the contact only happens in 1 proc, so the size of >>>>> Lagrange multiplier dofs on proc 0 is 0. >>>>> > >>>>> > 0: mIndexU.size(): 80490 >>>>> > 0: mIndexLU.size(): 0 >>>>> > 1: mIndexU.size(): 103836 >>>>> > 1: mIndexLU.size(): 2583 >>>>> > >>>>> > However, with this setup the solver takes very long at KSPSolve >>>>> before going to iteration, and the first iteration seems forever so I have >>>>> to stop the calculation. I guessed that the solver takes time to compute >>>>> the Schur complement, but according to the manual only the diagonal of A is >>>>> used to approximate the Schur complement, so it should not take long to >>>>> compute this. >>>>> > >>>>> > Note that I ran the same problem with direct solver (MUMPS) and it's >>>>> able to produce the valid results. The parameter for the solve is pretty >>>>> standard >>>>> > -ksp_type preonly >>>>> > -pc_type lu >>>>> > -pc_factor_mat_solver_package mumps >>>>> > >>>>> > Hence the matrix/rhs must not have any problem here. Do you have any >>>>> idea or suggestion for this case? >>>>> > >>>>> > >>>>> > Giang >>>>> > >>>>> > >>>>> > >>>>> > -- >>>>> > What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> > -- Norbert Wiener >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > -- >>>>> > What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> > -- Norbert Wiener >>>>> > >>>>> >>>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aurelien.ponte at ifremer.fr Fri Sep 16 09:29:47 2016
From: aurelien.ponte at ifremer.fr (Aurelien PONTE)
Date: Fri, 16 Sep 2016 16:29:47 +0200
Subject: [petsc-users] 2D vector in 3D dmda
Message-ID: <57DC01DB.6080603@ifremer.fr>

Hi,

I've started using petsc4py in order to solve a 3D problem (inversion of
elliptic operator). I would like to store 2D metric terms describing the
grid I am working on, but I don't know how to do that given that my
domain is decomposed in all three directions:

self.da = PETSc.DMDA().create([self.grid.Nx, self.grid.Ny, self.grid.Nz],
                              stencil_width=2)

I create my 3D vectors with, for example:

self.Q = self.da.createGlobalVec()

What am I supposed to do for a 2D vector?
Is it a bad idea?

thanks

aurelien

From gotofd at gmail.com Fri Sep 16 11:03:54 2016
From: gotofd at gmail.com (Ji Zhang)
Date: Sat, 17 Sep 2016 00:03:54 +0800
Subject: [petsc-users] How to create a local to global mapping and construct matrix correctly
Message-ID: 

Dear all,

I have a number of small 'mpidense' matrices mij, and I want to assemble
them into a big 'mpidense' matrix M like this:

        [ m11  m12  m13 ]
M   =   | m21  m22  m23 | ,
        [ m31  m32  m33 ]

A short demo is below. I'm using Python, but the syntax is similar.

import numpy as np
from petsc4py import PETSc
import sys, petsc4py

petsc4py.init(sys.argv)
mSizes = (2, 2)
mij = []

# create the sub-matrices mij
for i in range(len(mSizes)):
    for j in range(len(mSizes)):
        temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
        temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j])))
        temp_m.setType('mpidense')
        temp_m.setFromOptions()
        temp_m.setUp()
        temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j]))
        temp_m.assemble()
        temp_m.view()
        mij.append(temp_m)

# Now we have four sub-matrices. I would like to assemble them into a big matrix M.
M = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes))))
M.setType('mpidense')
M.setFromOptions()
M.setUp()
mLocations = np.insert(np.cumsum(mSizes), 0, 0)  # mLocations = [0, mSizes]
for i in range(len(mSizes)):
    for j in range(len(mSizes)):
        temp_m = mij[i*len(mSizes)+j].getDenseArray()
        for k in range(temp_m.shape[0]):
            M.setValues(mLocations[i]+k, np.arange(mLocations[j], mLocations[j+1], dtype='int32'), temp_m[k, :])
M.assemble()
M.view()

The code works well on a single CPU, but it gives the wrong answer on 2 or
more cores.
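My best guess so far is that the indexing goes wrong in parallel:
getDenseArray() returns only the locally owned rows of each sub-matrix, so
the row index mLocations[i]+k is only correct on the rank that owns row 0
of that block. Below is a sketch of the offset I think is missing, reusing
the names from the demo above (only a guess based on getOwnershipRange();
the helper names sub/rstart/rend are mine and I have not verified this):

for i in range(len(mSizes)):
    for j in range(len(mSizes)):
        sub = mij[i*len(mSizes)+j]
        rstart, rend = sub.getOwnershipRange()  # global rows of this sub-matrix owned by this rank
        temp_m = sub.getDenseArray()            # local rows only: shape (rend-rstart, mSizes[j])
        for k in range(temp_m.shape[0]):
            # global row in M = offset of block-row i + global row (rstart + k) inside the block
            M.setValues(mLocations[i] + rstart + k,
                        np.arange(mLocations[j], mLocations[j+1], dtype='int32'),
                        temp_m[k, :])
M.assemble()

Is this the right way to think about it, or is an ISLocalToGlobalMapping
the intended tool for this kind of block layout?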
Thanks.

2016-09-17
Best, 
Regards,
Zhang Ji
Beijing Computational Science Research Center
E-mail: gotofd at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dmaldona at hawk.iit.edu Fri Sep 16 11:36:36 2016
From: dmaldona at hawk.iit.edu (Adrian Maldonado)
Date: Fri, 16 Sep 2016 11:36:36 -0500
Subject: [petsc-users] Question about PETScSF usage in DMNetwork/DMPlex
Message-ID: 

Hi,

I am trying to understand some of the data structures DMPlex/DMNetwork
creates and the relationship among them.

As an example, I have a small test circuit
(/src/ksp/ksp/examples/tutorials/network/ex1.c). This is a graph that
consists of 6 edges and 4 vertices, each of them having one degree of
freedom. When run with two processes, each rank will own 3 edges. Rank 0
will own one vertex (plus 3 ghosts) and rank 1 will own 3 vertices.

These are some data structures for this problem. I am getting these data
structures inside DMNetworkDistribute:

DM Object: Parallel Mesh 2 MPI processes
  type: plex
Parallel Mesh in 1 dimensions:
  0-cells: 4 3
  1-cells: 3 3
Labels:
  depth: 2 strata of sizes (4, 3)

This, as I understand it, is printing a tree with all the vertices and
edges on each process (owned and ghost).

PetscSection Object: 2 MPI processes
  type not yet set
Process 0:
  ( 0) dim 1 offset 0
  ( 1) dim 1 offset 1
  ( 2) dim 1 offset 2
  ( 3) dim 1 offset 3
  ( 4) dim -2 offset -8
  ( 5) dim -2 offset -9
  ( 6) dim -2 offset -10
Process 1:
  ( 0) dim 1 offset 4
  ( 1) dim 1 offset 5
  ( 2) dim 1 offset 6
  ( 3) dim 1 offset 7
  ( 4) dim 1 offset 8
  ( 5) dim 1 offset 9

This is a global PetscSection that gives me the global numbering for the
owned points and (garbage?) negative values for the ghosts. Up to this
point everything is good.

But then I print the PetscSF that is created by 'DMPlexDistribute'. This
I do not understand:

PetscSF Object: Migration SF 2 MPI processes
  type: basic
    sort=rank-order
  [0] Number of roots=10, leaves=7, remote ranks=1
  [0] 0 <- (0,0)
  [0] 1 <- (0,1)
  [0] 2 <- (0,3)
  [0] 3 <- (0,6)
  [0] 4 <- (0,7)
  [0] 5 <- (0,8)
  [0] 6 <- (0,9)
  [1] Number of roots=0, leaves=6, remote ranks=1
  [1] 0 <- (0,2)
  [1] 1 <- (0,4)
  [1] 2 <- (0,5)
  [1] 3 <- (0,7)
  [1] 4 <- (0,8)
  [1] 5 <- (0,9)
  [0] Roots referenced by my leaves, by rank
  [0] 0: 7 edges
  [0]    0 <- 0
  [0]    1 <- 1
  [0]    2 <- 3
  [0]    3 <- 6
  [0]    4 <- 7
  [0]    5 <- 8
  [0]    6 <- 9
  [1] Roots referenced by my leaves, by rank
  [1] 0: 6 edges
  [1]    0 <- 2
  [1]    1 <- 4
  [1]    2 <- 5
  [1]    3 <- 7
  [1]    4 <- 8
  [1]    5 <- 9

I understand that an SF is a data structure that stores references to
pieces of data that are not owned by the process
(https://arxiv.org/pdf/1506.06194v1.pdf, page 4). Since the only ghost
points appear on rank 0 (three ghost vertices), I would expect something
like:

*rank 0:*
4 - (1, 3)   (to read: point 4 is owned by rank 1 and is rank 1's point 3)
etc...
*rank 1:*
nothing

Is my intuition correct? If so, what does the star forest that I get from
DMPlexDistribute mean? Am I printing the wrong thing?

Thank you

-- 
D. Adrian Maldonado, PhD Candidate
Electrical & Computer Engineering Dept.
Illinois Institute of Technology
3301 S. Dearborn Street, Chicago, IL 60616
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bsmith at mcs.anl.gov Fri Sep 16 12:43:46 2016
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 16 Sep 2016 12:43:46 -0500
Subject: [petsc-users] 2D vector in 3D dmda
In-Reply-To: <57DC01DB.6080603@ifremer.fr>
References: <57DC01DB.6080603@ifremer.fr>
Message-ID: <48F7A9AE-1659-4C4C-AF5B-00D532C26F5F@mcs.anl.gov>


> On Sep 16, 2016, at 9:29 AM, Aurelien PONTE wrote:
> 
> Hi,
> 
> I've started using petsc4py in order to solve a 3D problem (inversion of elliptic operator).
> I would like to store 2D metric terms describing the grid

   What do you mean by 2D metric terms describing the grid? Do you want to
store something like a little dense 2d array for each grid point? If so,
create another 3D DA with dof = the product of the dimensions of the
little dense 2d array and then store the little dense 2d arrays in a
global vector obtained from that DA.

   Or is the grid uniform in one dimension and not uniform in the other
two, so that you want to store the information about the non-uniformity
in only a 2d array and not "waste" the redundant information in the third
direction? Then I recommend just "wasting" the redundant information in
the third dimension; it is trivial compared to all the data you need to
solve the problem.
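In petsc4py (which the question uses) the first option might look roughly
like the sketch below. This is only an illustration, not code from this
thread: dof=4 and the names da_metric/metric are made up here, assuming
the metric terms form a 2x2 tensor per grid point.

# a second DMDA on the same 3D grid; dof = number of metric entries per grid point
self.da_metric = PETSc.DMDA().create([self.grid.Nx, self.grid.Ny, self.grid.Nz],
                                     dof=4, stencil_width=2)
self.metric = self.da_metric.createGlobalVec()

# fill the locally owned part; the trailing index runs over the 4 entries
m = self.da_metric.getVecArray(self.metric)
(xs, xe), (ys, ye), (zs, ze) = self.da_metric.getRanges()
for i in range(xs, xe):
    for j in range(ys, ye):
        for k in range(zs, ze):
            m[i, j, k, :] = 0.0  # replace with the actual metric entries at (i, j, k)

Created this way, with the same global sizes and default decomposition,
the metric vector should end up distributed the same way as the solution
vectors obtained from self.da.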
Or do you mean something else? Barry > I am working on but don't know > how to do that given my domain is tiled in 3D directions: > > self.da = PETSc.DMDA().create([self.grid.Nx, self.grid.Ny, self.grid.Nz], > stencil_width=2) > > I create my 3D vectors with, for example: > > self.Q = self.da.createGlobalVec() > > What am I supposed to do for a 2D vector? > Is it a bad idea? > > thanks > > aurelien > From hengjiew at uci.edu Fri Sep 16 12:53:26 2016 From: hengjiew at uci.edu (Hengjie Wang) Date: Fri, 16 Sep 2016 10:53:26 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> Message-ID: Hi Dave, I add both options and test it by solving the poisson eqn in a 1024 cube with 32^3 cores. This test used to give the OOM error. Now it runs well. I attach the ksp_view and log_view's output in case you want to know. I also test my original code with those petsc options by simulating a decaying turbulence in a 1024 cube. It also works. I am going to test the code on a larger scale. If there is any problem then, I will let you know. This really helps a lot. Thank you so much. Regards, Frank On 9/15/2016 3:35 AM, Dave May wrote: > HI all, > > I the only unexpected memory usage I can see is associated with the > call to MatPtAP(). > Here is something you can try immediately. > Run your code with the additional options > -matrap 0 -matptap_scalable > > I didn't realize this before, but the default behaviour of MatPtAP in > parallel is actually to to explicitly form the transpose of P (e.g. > assemble R = P^T) and then compute R.A.P. > You don't want to do this. The option -matrap 0 resolves this issue. > > The implementation of P^T.A.P has two variants. > The scalable implementation (with respect to memory usage) is selected > via the second option -matptap_scalable. > > Try it out - I see a significant memory reduction using these options > for particular mesh sizes / partitions. > > I've attached a cleaned up version of the code you sent me. > There were a number of memory leaks and other issues. > The main points being > * You should call DMDAVecGetArrayF90() before VecAssembly{Begin,End} > * You should call PetscFinalize(), otherwise the option -log_summary > (-log_view) will not display anything once the program has completed. > > > Thanks, > Dave > > > On 15 September 2016 at 08:03, Hengjie Wang > wrote: > > Hi Dave, > > Sorry, I should have put more comment to explain the code. > The number of process in each dimension is the same: Px = Py=Pz=P. > So is the domain size. > So if the you want to run the code for a 512^3 grid points on > 16^3 cores, you need to set "-N 512 -P 16" in the command line. > I add more comments and also fix an error in the attached code. ( > The error only effects the accuracy of solution but not the memory > usage. ) > > Thank you. > Frank > > > On 9/14/2016 9:05 PM, Dave May wrote: >> >> >> On Thursday, 15 September 2016, Dave May > > wrote: >> >> >> >> On Thursday, 15 September 2016, frank wrote: >> >> Hi, >> >> I write a simple code to re-produce the error. I hope >> this can help to diagnose the problem. >> The code just solves a 3d poisson equation. 
>> >> >> Why is the stencil width a runtime parameter?? And why is the >> default value 2? For 7-pnt FD Laplace, you only need >> a stencil width of 1. >> >> Was this choice made to mimic something in the >> real application code? >> >> >> Please ignore - I misunderstood your usage of the param set by -P >> >> >> I run the code on a 1024^3 mesh. The process partition is >> 32 * 32 * 32. That's when I re-produce the OOM error. >> Each core has about 2G memory. >> I also run the code on a 512^3 mesh with 16 * 16 * 16 >> processes. The ksp solver works fine. >> I attached the code, ksp_view_pre's output and my petsc >> option file. >> >> Thank you. >> Frank >> >> On 09/09/2016 06:38 PM, Hengjie Wang wrote: >>> Hi Barry, >>> >>> I checked. On the supercomputer, I had the option >>> "-ksp_view_pre" but it is not in file I sent you. I am >>> sorry for the confusion. >>> >>> Regards, >>> Frank >>> >>> On Friday, September 9, 2016, Barry Smith >>> wrote: >>> >>> >>> > On Sep 9, 2016, at 3:11 PM, frank >>> wrote: >>> > >>> > Hi Barry, >>> > >>> > I think the first KSP view output is from >>> -ksp_view_pre. Before I submitted the test, I was >>> not sure whether there would be OOM error or not. So >>> I added both -ksp_view_pre and -ksp_view. >>> >>> But the options file you sent specifically does >>> NOT list the -ksp_view_pre so how could it be from that? >>> >>> Sorry to be pedantic but I've spent too much time >>> in the past trying to debug from incorrect >>> information and want to make sure that the >>> information I have is correct before thinking. >>> Please recheck exactly what happened. Rerun with the >>> exact input file you emailed if that is needed. >>> >>> Barry >>> >>> > >>> > Frank >>> > >>> > >>> > On 09/09/2016 12:38 PM, Barry Smith wrote: >>> >> Why does ksp_view2.txt have two KSP views in it >>> while ksp_view1.txt has only one KSPView in it? Did >>> you run two different solves in the 2 case but not >>> the one? >>> >> >>> >> Barry >>> >> >>> >> >>> >> >>> >>> On Sep 9, 2016, at 10:56 AM, frank >>> wrote: >>> >>> >>> >>> Hi, >>> >>> >>> >>> I want to continue digging into the memory >>> problem here. >>> >>> I did find a work around in the past, which is >>> to use less cores per node so that each core has 8G >>> memory. However this is deficient and expensive. I >>> hope to locate the place that uses the most memory. >>> >>> >>> >>> Here is a brief summary of the tests I did in past: >>> >>>> Test1: Mesh 1536*128*384 | Process Mesh 48*4*12 >>> >>> Maximum (over computational time) process >>> memory: total 7.0727e+08 >>> >>> Current process memory: total >>> 7.0727e+08 >>> >>> Maximum (over computational time) space >>> PetscMalloc()ed: total 6.3908e+11 >>> >>> Current space PetscMalloc()ed: >>> total 1.8275e+09 >>> >>> >>> >>>> Test2: Mesh 1536*128*384 | Process Mesh >>> 96*8*24 >>> >>> Maximum (over computational time) process >>> memory: total 5.9431e+09 >>> >>> Current process memory: total >>> 5.9431e+09 >>> >>> Maximum (over computational time) space >>> PetscMalloc()ed: total 5.3202e+12 >>> >>> Current space PetscMalloc()ed: >>> total 5.4844e+09 >>> >>> >>> >>>> Test3: Mesh 3072*256*768 | Process Mesh >>> 96*8*24 >>> >>> OOM( Out Of Memory ) killer of the >>> supercomputer terminated the job during "KSPSolve". >>> >>> >>> >>> I attached the output of ksp_view( the third >>> test's output is from ksp_view_pre ), memory_view >>> and also the petsc options. >>> >>> >>> >>> In all the tests, each core can access about 2G >>> memory. 
In test3, there are 4223139840 non-zeros in >>> the matrix. This will consume about 1.74M, using >>> double precision. Considering some extra memory used >>> to store integer index, 2G memory should still be >>> way enough. >>> >>> >>> >>> Is there a way to find out which part of >>> KSPSolve uses the most memory? >>> >>> Thank you so much. >>> >>> >>> >>> BTW, there are 4 options remains unused and I >>> don't understand why they are omitted: >>> >>> -mg_coarse_telescope_mg_coarse_ksp_type value: >>> preonly >>> >>> -mg_coarse_telescope_mg_coarse_pc_type value: >>> bjacobi >>> >>> -mg_coarse_telescope_mg_levels_ksp_max_it value: 1 >>> >>> -mg_coarse_telescope_mg_levels_ksp_type value: >>> richardson >>> >>> >>> >>> >>> >>> Regards, >>> >>> Frank >>> >>> >>> >>> On 07/13/2016 05:47 PM, Dave May wrote: >>> >>>> >>> >>>> On 14 July 2016 at 01:07, frank >>> wrote: >>> >>>> Hi Dave, >>> >>>> >>> >>>> Sorry for the late reply. >>> >>>> Thank you so much for your detailed reply. >>> >>>> >>> >>>> I have a question about the estimation of the >>> memory usage. There are 4223139840 allocated >>> non-zeros and 18432 MPI processes. Double precision >>> is used. So the memory per process is: >>> >>>> 4223139840 * 8bytes / 18432 / 1024 / 1024 = >>> 1.74M ? >>> >>>> Did I do sth wrong here? Because this seems too >>> small. >>> >>>> >>> >>>> No - I totally f***ed it up. You are correct. >>> That'll teach me for fumbling around with my iphone >>> calculator and not using my brain. (Note that to >>> convert to MB just divide by 1e6, not 1024^2 - >>> although I apparently cannot convert between units >>> correctly....) >>> >>>> >>> >>>> From the PETSc objects associated with the >>> solver, It looks like it _should_ run with 2GB per >>> MPI rank. Sorry for my mistake. Possibilities are: >>> somewhere in your usage of PETSc you've introduced a >>> memory leak; PETSc is doing a huge over allocation >>> (e.g. as per our discussion of MatPtAP); or in your >>> application code there are other objects you have >>> forgotten to log the memory for. >>> >>>> >>> >>>> >>> >>>> >>> >>>> I am running this job on Bluewater >>> >>>> I am using the 7 points FD stencil in 3D. >>> >>>> >>> >>>> I thought so on both counts. >>> >>>> >>> >>>> I apologize that I made a stupid mistake in >>> computing the memory per core. My settings render >>> each core can access only 2G memory on average >>> instead of 8G which I mentioned in previous email. I >>> re-run the job with 8G memory per core on average >>> and there is no "Out Of Memory" error. I would do >>> more test to see if there is still some memory issue. >>> >>>> >>> >>>> Ok. I'd still like to know where the memory was >>> being used since my estimates were off. >>> >>>> >>> >>>> >>> >>>> Thanks, >>> >>>> Dave >>> >>>> >>> >>>> Regards, >>> >>>> Frank >>> >>>> >>> >>>> >>> >>>> >>> >>>> On 07/11/2016 01:18 PM, Dave May wrote: >>> >>>>> Hi Frank, >>> >>>>> >>> >>>>> >>> >>>>> On 11 July 2016 at 19:14, frank >>> wrote: >>> >>>>> Hi Dave, >>> >>>>> >>> >>>>> I re-run the test using bjacobi as the >>> preconditioner on the coarse mesh of telescope. The >>> Grid is 3072*256*768 and process mesh is 96*8*24. >>> The petsc option file is attached. >>> >>>>> I still got the "Out Of Memory" error. The >>> error occurred before the linear solver finished one >>> step. So I don't have the full info from ksp_view. >>> The info from ksp_view_pre is attached. 
>>> >>>>> >>> >>>>> Okay - that is essentially useless (sorry) >>> >>>>> >>> >>>>> It seems to me that the error occurred when >>> the decomposition was going to be changed. >>> >>>>> >>> >>>>> Based on what information? >>> >>>>> Running with -info would give us more clues, >>> but will create a ton of output. >>> >>>>> Please try running the case which failed with >>> -info >>> >>>>> I had another test with a grid of >>> 1536*128*384 and the same process mesh as above. >>> There was no error. The ksp_view info is attached >>> for comparison. >>> >>>>> Thank you. >>> >>>>> >>> >>>>> >>> >>>>> [3] Here is my crude estimate of your memory >>> usage. >>> >>>>> I'll target the biggest memory hogs only to >>> get an order of magnitude estimate >>> >>>>> >>> >>>>> * The Fine grid operator contains 4223139840 >>> non-zeros --> 1.8 GB per MPI rank assuming double >>> precision. >>> >>>>> The indices for the AIJ could amount to >>> another 0.3 GB (assuming 32 bit integers) >>> >>>>> >>> >>>>> * You use 5 levels of coarsening, so the other >>> operators should represent (collectively) >>> >>>>> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 >>> MB per MPI rank on the communicator with 18432 ranks. >>> >>>>> The coarse grid should consume ~ 0.5 MB per >>> MPI rank on the communicator with 18432 ranks. >>> >>>>> >>> >>>>> * You use a reduction factor of 64, making the >>> new communicator with 288 MPI ranks. >>> >>>>> PCTelescope will first gather a temporary >>> matrix associated with your coarse level operator >>> assuming a comm size of 288 living on the comm with >>> size 18432. >>> >>>>> This matrix will require approximately 0.5 * >>> 64 = 32 MB per core on the 288 ranks. >>> >>>>> This matrix is then used to form a new MPIAIJ >>> matrix on the subcomm, thus require another 32 MB >>> per rank. >>> >>>>> The temporary matrix is now destroyed. >>> >>>>> >>> >>>>> * Because a DMDA is detected, a permutation >>> matrix is assembled. >>> >>>>> This requires 2 doubles per point in the DMDA. >>> >>>>> Your coarse DMDA contains 92 x 16 x 48 points. >>> >>>>> Thus the permutation matrix will require < 1 >>> MB per MPI rank on the sub-comm. >>> >>>>> >>> >>>>> * Lastly, the matrix is permuted. This uses >>> MatPtAP(), but the resulting operator will have the >>> same memory footprint as the unpermuted matrix (32 >>> MB). At any stage in PCTelescope, only 2 operators >>> of size 32 MB are held in memory when the DMDA is >>> provided. >>> >>>>> >>> >>>>> From my rough estimates, the worst case memory >>> foot print for any given core, given your options is >>> approximately >>> >>>>> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >>> >>>>> This is way below 8 GB. >>> >>>>> >>> >>>>> Note this estimate completely ignores: >>> >>>>> (1) the memory required for the restriction >>> operator, >>> >>>>> (2) the potential growth in the number of >>> non-zeros per row due to Galerkin coarsening (I >>> wished -ksp_view_pre reported the output from >>> MatView so we could see the number of non-zeros >>> required by the coarse level operators) >>> >>>>> (3) all temporary vectors required by the CG >>> solver, and those required by the smoothers. >>> >>>>> (4) internal memory allocated by MatPtAP >>> >>>>> (5) memory associated with IS's used within >>> PCTelescope >>> >>>>> >>> >>>>> So either I am completely off in my estimates, >>> or you have not carefully estimated the memory usage >>> of your application code. 
Hopefully others might >>> examine/correct my rough estimates >>> >>>>> >>> >>>>> Since I don't have your code I cannot access >>> the latter. >>> >>>>> Since I don't have access to the same machine >>> you are running on, I think we need to take a step back. >>> >>>>> >>> >>>>> [1] What machine are you running on? Send me a >>> URL if its available >>> >>>>> >>> >>>>> [2] What discretization are you using? (I am >>> guessing a scalar 7 point FD stencil) >>> >>>>> If it's a 7 point FD stencil, we should be >>> able to examine the memory usage of your solver >>> configuration using a standard, light weight >>> existing PETSc example, run on your machine at the >>> same scale. >>> >>>>> This would hopefully enable us to correctly >>> evaluate the actual memory usage required by the >>> solver configuration you are using. >>> >>>>> >>> >>>>> Thanks, >>> >>>>> Dave >>> >>>>> >>> >>>>> >>> >>>>> Frank >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> On 07/08/2016 10:38 PM, Dave May wrote: >>> >>>>>> >>> >>>>>> On Saturday, 9 July 2016, frank >>> wrote: >>> >>>>>> Hi Barry and Dave, >>> >>>>>> >>> >>>>>> Thank both of you for the advice. >>> >>>>>> >>> >>>>>> @Barry >>> >>>>>> I made a mistake in the file names in last >>> email. I attached the correct files this time. >>> >>>>>> For all the three tests, 'Telescope' is used >>> as the coarse preconditioner. >>> >>>>>> >>> >>>>>> == Test1: Grid: 1536*128*384, Process >>> Mesh: 48*4*12 >>> >>>>>> Part of the memory usage: Vector 125 124 >>> 3971904 0. >>> >>>>>> Matrix 101 101 9462372 0 >>> >>>>>> >>> >>>>>> == Test2: Grid: 1536*128*384, Process Mesh: >>> 96*8*24 >>> >>>>>> Part of the memory usage: Vector 125 124 >>> 681672 0. >>> >>>>>> Matrix 101 101 1462180 0. >>> >>>>>> >>> >>>>>> In theory, the memory usage in Test1 should >>> be 8 times of Test2. In my case, it is about 6 times. >>> >>>>>> >>> >>>>>> == Test3: Grid: 3072*256*768, Process Mesh: >>> 96*8*24. Sub-domain per process: 32*32*32 >>> >>>>>> Here I get the out of memory error. >>> >>>>>> >>> >>>>>> I tried to use -mg_coarse jacobi. In this >>> way, I don't need to set -mg_coarse_ksp_type and >>> -mg_coarse_pc_type explicitly, right? >>> >>>>>> The linear solver didn't work in this case. >>> Petsc output some errors. >>> >>>>>> >>> >>>>>> @Dave >>> >>>>>> In test3, I use only one instance of >>> 'Telescope'. On the coarse mesh of 'Telescope', I >>> used LU as the preconditioner instead of SVD. >>> >>>>>> If my set the levels correctly, then on the >>> last coarse mesh of MG where it calls 'Telescope', >>> the sub-domain per process is 2*2*2. >>> >>>>>> On the last coarse mesh of 'Telescope', there >>> is only one grid point per process. >>> >>>>>> I still got the OOM error. The detailed petsc >>> option file is attached. >>> >>>>>> >>> >>>>>> Do you understand the expected memory usage >>> for the particular parallel LU implementation you >>> are using? I don't (seriously). Replace LU with >>> bjacobi and re-run this test. My point about solver >>> debugging is still valid. >>> >>>>>> >>> >>>>>> And please send the result of KSPView so we >>> can see what is actually used in the computations >>> >>>>>> >>> >>>>>> Thanks >>> >>>>>> Dave >>> >>>>>> >>> >>>>>> >>> >>>>>> Thank you so much. >>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>> >>>>>> On Jul 6, 2016, at 4:19 PM, frank >>> wrote: >>> >>>>>> >>> >>>>>> Hi Barry, >>> >>>>>> >>> >>>>>> Thank you for you advice. >>> >>>>>> I tried three test. 
In the 1st test, the grid >>> is 3072*256*768 and the process mesh is 96*8*24. >>> >>>>>> The linear solver is 'cg' the preconditioner >>> is 'mg' and 'telescope' is used as the >>> preconditioner at the coarse mesh. >>> >>>>>> The system gives me the "Out of Memory" error >>> before the linear system is completely solved. >>> >>>>>> The info from '-ksp_view_pre' is attached. I >>> seems to me that the error occurs when it reaches >>> the coarse mesh. >>> >>>>>> >>> >>>>>> The 2nd test uses a grid of 1536*128*384 and >>> process mesh is 96*8*24. The 3rd test uses the same >>> grid but a different process mesh 48*4*12. >>> >>>>>> Are you sure this is right? The total >>> matrix and vector memory usage goes from 2nd test >>> >>>>>> Vector 384 383 8,193,712 0. >>> >>>>>> Matrix 103 103 11,508,688 0. >>> >>>>>> to 3rd test >>> >>>>>> Vector 384 383 1,590,520 0. >>> >>>>>> Matrix 103 103 3,508,664 0. >>> >>>>>> that is the memory usage got smaller but if >>> you have only 1/8th the processes and the same grid >>> it should have gotten about 8 times bigger. Did you >>> maybe cut the grid by a factor of 8 also? If so that >>> still doesn't explain it because the memory usage >>> changed by a factor of 5 something for the vectors >>> and 3 something for the matrices. >>> >>>>>> >>> >>>>>> >>> >>>>>> The linear solver and petsc options in 2nd >>> and 3rd tests are the same in 1st test. The linear >>> solver works fine in both test. >>> >>>>>> I attached the memory usage of the 2nd and >>> 3rd tests. The memory info is from the option >>> '-log_summary'. I tried to use '-momery_info' as you >>> suggested, but in my case petsc treated it as an >>> unused option. It output nothing about the memory. >>> Do I need to add sth to my code so I can use >>> '-memory_info'? >>> >>>>>> Sorry, my mistake the option is -memory_view >>> >>>>>> >>> >>>>>> Can you run the one case with -memory_view >>> and -mg_coarse jacobi -ksp_max_it 1 (just so it >>> doesn't iterate forever) to see how much memory is >>> used without the telescope? Also run case 2 the same >>> way. >>> >>>>>> >>> >>>>>> Barry >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> In both tests the memory usage is not large. >>> >>>>>> >>> >>>>>> It seems to me that it might be the >>> 'telescope' preconditioner that allocated a lot of >>> memory and caused the error in the 1st test. >>> >>>>>> Is there is a way to show how much memory it >>> allocated? >>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> >>>>>> Frank, >>> >>>>>> >>> >>>>>> You can run with -ksp_view_pre to have >>> it "view" the KSP before the solve so hopefully it >>> gets that far. >>> >>>>>> >>> >>>>>> Please run the problem that does fit >>> with -memory_info when the problem completes it will >>> show the "high water mark" for PETSc allocated >>> memory and total memory used. We first want to look >>> at these numbers to see if it is using more memory >>> than you expect. You could also run with say half >>> the grid spacing to see how the memory usage scaled >>> with the increase in grid points. Make the runs also >>> with -log_view and send all the output from these >>> options. >>> >>>>>> >>> >>>>>> Barry >>> >>>>>> >>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank >>> wrote: >>> >>>>>> >>> >>>>>> Hi, >>> >>>>>> >>> >>>>>> I am using the CG ksp solver and Multigrid >>> preconditioner to solve a linear system in parallel. >>> >>>>>> I chose to use the 'Telescope' as the >>> preconditioner on the coarse mesh for its good >>> performance. 
>>> >>>>>> The petsc options file is attached. >>> >>>>>> >>> >>>>>> The domain is a 3d box. >>> >>>>>> It works well when the grid is 1536*128*384 >>> and the process mesh is 96*8*24. When I double the >>> size of grid and keep >>> the same process mesh and petsc options, I get an >>> "out of memory" error from the super-cluster I am using. >>> >>>>>> Each process has access to at least 8G >>> memory, which should be more than enough for my >>> application. I am sure that all the other parts of >>> my code( except the linear solver ) do not use much >>> memory. So I doubt if there is something wrong with >>> the linear solver. >>> >>>>>> The error occurs before the linear system is >>> completely solved so I don't have the info from ksp >>> view. I am not able to re-produce the error with a >>> smaller problem either. >>> >>>>>> In addition, I tried to use the block jacobi >>> as the preconditioner with the same grid and same >>> decomposition. The linear solver runs extremely slow >>> but there is no memory error. >>> >>>>>> >>> >>>>>> How can I diagnose what exactly cause the error? >>> >>>>>> Thank you so much. >>> >>>>>> >>> >>>>>> Frank >>> >>>>>> >>> >>>>>> >>> >>> >>>>>> >>> >>>>> >>> >>>> >>> >>> >>> >>> > >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- KSP Object: 32768 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 32768 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=5 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 32768 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 32768 MPI processes type: telescope Telescope: parent comm size reduction factor = 64 Telescope: comm_size = 32768 , subcomm_size = 512 Telescope: subcomm type: interlaced Telescope: DMDA detected DMDA Object: (mg_coarse_telescope_repart_) 512 MPI processes M 64 N 64 P 64 m 8 n 8 p 8 dof 1 overlap 1 KSP Object: (mg_coarse_telescope_) 512 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_) 512 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_telescope_mg_coarse_) 512 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_coarse_) 512 MPI processes type: redundant Redundant preconditioner: First (color=0) of 512 PCs follows linear system matrix = precond matrix: Mat Object: 512 MPI processes type: mpiaij rows=4096, cols=4096 total: nonzeros=110592, allocated nonzeros=110592 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 2 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_1_) 512 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_1_) 512 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 512 MPI processes type: mpiaij rows=32768, cols=32768 total: nonzeros=884736, allocated nonzeros=884736 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_2_) 512 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_2_) 512 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 512 MPI processes type: mpiaij rows=262144, cols=262144 total: nonzeros=7077888, allocated nonzeros=7077888 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 512 MPI processes type: mpiaij rows=262144, cols=262144 total: nonzeros=7077888, allocated nonzeros=7077888 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines KSP Object: (mg_coarse_telescope_mg_coarse_redundant_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_coarse_redundant_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_coarse_telescope_mg_coarse_redundant_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_coarse_redundant_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=4096, cols=4096 package used to perform factorization: petsc total: nonzeros=110592, allocated nonzeros=110592 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=4096, cols=4096 total: nonzeros=110592, allocated nonzeros=110592 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=4096, cols=4096 total: nonzeros=110592, allocated nonzeros=110592 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 32768 MPI processes type: mpiaij rows=262144, cols=262144 total: nonzeros=7077888, allocated nonzeros=7077888 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 2 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 32768 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 32768 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 32768 MPI processes type: mpiaij rows=2097152, cols=2097152 total: nonzeros=56623104, allocated nonzeros=56623104 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 32768 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 32768 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 32768 MPI processes type: mpiaij rows=16777216, cols=16777216 total: nonzeros=452984832, allocated nonzeros=452984832 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 32768 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_3_) 32768 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 32768 MPI processes type: mpiaij rows=134217728, cols=134217728 total: nonzeros=3623878656, allocated nonzeros=3623878656 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (mg_levels_4_) 32768 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_4_) 32768 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 32768 MPI processes type: mpiaij rows=1073741824, cols=1073741824 total: nonzeros=7516192768, allocated nonzeros=7516192768 total number of mallocs used during MatSetValues calls =0 has attached null space Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 32768 MPI processes type: mpiaij rows=1073741824, cols=1073741824 total: nonzeros=7516192768, allocated nonzeros=7516192768 total number of mallocs used during MatSetValues calls =0 has attached null space -------------- next part -------------- 32768 processors, by hengjie Fri Sep 16 04:29:10 2016 Using Petsc Development GIT revision: v3.7.3-1056-geeb1ceb GIT Date: 2016-08-02 10:00:58 -0500 Max Max/Min Avg Total Time (sec): 3.595e+01 1.00092 3.595e+01 Objects: 4.240e+02 1.61217 2.655e+02 Flops: 7.348e+07 1.09866 6.699e+07 2.195e+12 Flops/sec: 2.044e+06 1.09875 1.863e+06 6.106e+10 Memory: 1.110e+09 1.00000 3.636e+13 MPI Messages: 5.004e+04 11.27696 4.668e+03 1.530e+08 MPI Message Lengths: 4.805e+06 1.27794 8.088e+02 1.237e+11 MPI Reductions: 2.296e+03 1.48994 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 3.5947e+01 100.0% 2.1951e+12 100.0% 1.530e+08 100.0% 8.088e+02 100.0% 1.551e+03 67.5% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! 
# # # # This code was compiled with a debugging option, # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecTDot 30 1.0 1.9905e-01 1.4 1.97e+06 1.0 0.0e+00 0.0e+00 6.0e+01 1 3 0 0 3 1 3 0 0 4 323650 VecNorm 16 1.0 3.9425e-01 3.5 1.05e+06 1.0 0.0e+00 0.0e+00 3.2e+01 1 2 0 0 1 1 2 0 0 2 87152 VecScale 75 1.7 2.3286e-02 2.0 4.52e+04 1.3 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 50363 VecCopy 17 1.0 3.8621e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 442 1.7 9.8095e-03 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 60 1.0 3.5868e-02 1.3 3.93e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 3592294 VecAYPX 119 1.3 1.7319e-02 1.3 1.98e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 3728684 VecAssemblyBegin 1 1.0 1.0757e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 1 1.0 2.7490e-04 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 471 1.5 5.8588e-02 3.4 0.00e+00 0.0 1.2e+08 8.1e+02 0.0e+00 0 0 81 81 0 0 0 81 81 0 0 VecScatterEnd 471 1.5 1.2934e+00 6.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 MatMult 135 1.3 2.8880e-01 1.4 2.33e+07 1.0 5.0e+07 1.7e+03 0.0e+00 1 34 32 66 0 1 34 32 66 0 2597254 MatMultAdd 90 1.5 1.1149e-01 2.9 3.85e+06 1.0 1.4e+07 3.2e+02 0.0e+00 0 6 9 4 0 0 6 9 4 0 1114404 MatMultTranspose 111 1.4 3.0435e-01 1.3 4.11e+06 1.0 1.7e+07 2.8e+02 8.0e+01 1 6 11 4 3 1 6 11 4 5 435479 MatSolve 15 0.0 2.0206e-02 0.0 3.26e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 82513 MatSOR 180 1.5 9.9816e-01 1.3 2.32e+07 1.0 3.9e+07 2.4e+02 1.2e+00 2 33 25 8 0 2 33 25 8 0 727846 MatLUFactorNum 1 0.0 2.4225e-02 0.0 1.60e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33762 MatILUFactorSym 1 0.0 2.5048e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 1 0.0 7.5793e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatResidual 90 1.5 3.7126e-01 1.2 1.11e+07 1.0 4.2e+07 8.0e+02 6.0e+01 1 16 27 27 3 1 16 27 27 4 942007 MatAssemblyBegin 33 1.4 7.2762e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.8e+01 2 0 0 0 2 2 0 0 0 3 0 MatAssemblyEnd 33 1.4 1.4643e+00 1.1 0.00e+00 0.0 1.1e+07 1.2e+02 2.5e+02 4 0 7 1 11 4 0 7 1 16 0 MatGetRowIJ 1 0.0 1.5974e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 2 2.0 3.4627e-01 3.7 0.00e+00 0.0 1.6e+05 5.4e+02 6.1e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 0.0 1.9929e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 13 2.2 1.0639e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 1 0 MatPtAP 7 1.4 5.2281e+00 1.0 5.25e+06 1.0 2.4e+07 8.8e+02 2.1e+02 14 8 15 17 9 14 8 15 17 14 31939 MatPtAPSymbolic 7 1.4 4.0818e+00 1.0 0.00e+00 0.0 1.4e+07 1.1e+03 7.5e+01 11 0 9 12 3 11 0 9 12 5 0 MatPtAPNumeric 7 1.4 1.1755e+00 1.0 5.25e+06 1.0 9.6e+06 5.7e+02 1.4e+02 3 8 6 4 6 3 8 6 4 9 142046 MatRedundantMat 1 0.0 1.3647e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.8e-02 0 0 0 0 0 0 0 0 0 0 0 MatMPIConcateSeq 1 0.0 2.7197e-01 0.0 0.00e+00 0.0 2.7e+04 4.0e+01 6.1e-01 0 0 0 0 0 0 0 0 0 0 0 
MatGetLocalMat 7 1.4 1.3259e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 7 1.4 6.9566e-02 2.8 0.00e+00 0.0 1.1e+07 1.1e+03 0.0e+00 0 0 7 10 0 0 0 7 10 0 0 MatGetSymTrans 14 1.4 2.2139e-02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMCoarsen 6 1.5 3.3237e-01 1.1 0.00e+00 0.0 1.6e+06 1.7e+02 2.1e+02 1 0 1 0 9 1 0 1 0 13 0 DMCreateInterp 6 1.5 7.6958e-01 1.1 2.57e+05 1.0 2.8e+06 1.6e+02 2.0e+02 2 0 2 0 9 2 0 2 0 13 10763 KSPSetUp 12 2.0 1.1138e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.5e+01 0 0 0 0 2 0 0 0 0 2 0 KSPSolve 1 1.0 1.2628e+01 1.0 7.35e+07 1.1 1.5e+08 8.0e+02 1.4e+03 35100 99 99 59 35100 99 99 87 173826 PCSetUp 3 3.0 9.2140e+00 1.1 7.10e+06 1.3 2.9e+07 7.4e+02 7.9e+02 23 8 19 18 34 23 8 19 18 51 19110 PCSetUpOnBlocks 15 0.0 2.8822e-02 0.0 1.60e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 28377 PCApply 15 1.0 3.5384e+00 1.0 5.58e+07 1.1 1.2e+08 6.3e+02 3.7e+02 10 74 79 62 16 10 74 79 62 24 457052 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Vector 197 197 4396000 0. Vector Scatter 27 27 333392 0. Matrix 66 66 14132608 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 8 40832 0. Star Forest Bipartite Graph 16 16 13568 0. Discrete System 8 8 7008 0. Index Set 60 60 341672 0. IS L to G Mapping 8 8 195776 0. Krylov Solver 12 12 14760 0. DMKSP interface 6 6 3888 0. Preconditioner 12 12 11928 0. Viewer 3 2 1664 0. ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 0.000146198 Average time for zero size MPI_Send(): 3.66852e-06 From dmaldona at hawk.iit.edu Fri Sep 16 12:54:02 2016 From: dmaldona at hawk.iit.edu (Adrian Maldonado) Date: Fri, 16 Sep 2016 12:54:02 -0500 Subject: [petsc-users] Question about PETScSF usage in DMNetwork/DMPlex In-Reply-To: References: Message-ID: Just one addition about one thing I've noticed. The section: PetscSection Object: 2 MPI processes type not yet set Process 0: ( 0) dim 1 offset 0 ( 1) dim 1 offset 1 ( 2) dim 1 offset 2 ( 3) dim 1 offset 3 ( 4) dim -2 offset -8 ( 5) dim -2 offset -9 ( 6) dim -2 offset -10 Process 1: ( 0) dim 1 offset 4 ( 1) dim 1 offset 5 ( 2) dim 1 offset 6 ( 3) dim 1 offset 7 ( 4) dim 1 offset 8 ( 5) dim 1 offset 9 For the ghost values 4, 5, 6... is encoding the ghost values as rank = -(-2 + 1) and offset = -(-8 + 1) ? On Fri, Sep 16, 2016 at 11:36 AM, Adrian Maldonado wrote: > Hi, > > I am trying to understand some of the data structures DMPlex/DMNetwork > creates and the relationship among them. > > As an example, I have an small test circuit (/src/ksp/ksp/examples/ > tutorials/network/ex1.c). > > This is a graph that consists on 6 edges and 4 vertices, each one of those > having one degree of freedom. When ran with two processors, each rank will > own 3 edges. Rank 0 will own one vertex (3 ghost) and Rank 1 will own 3 > vertices. > > These are some data structures for this problem. 
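(Regarding the dim/offset question just above: a small C sketch of how the negative entries in a global PetscSection can be decoded. This assumes the usual PetscSectionCreateGlobalSection convention, namely that a point owned by another rank stores -(dof+1) and -(offset+1), where dof/offset are the values held by the owning rank; the owning rank itself is not stored in the section. The function and variable names are placeholders.)

    #include <petscdmplex.h>

    PetscErrorCode DecodeGlobalSectionPoint(PetscSection gsec, PetscInt p)
    {
      PetscInt       dof, off;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = PetscSectionGetDof(gsec, p, &dof);CHKERRQ(ierr);
      ierr = PetscSectionGetOffset(gsec, p, &off);CHKERRQ(ierr);
      if (dof < 0) {
        /* ghost point: recover the owner's dof count and global offset */
        PetscInt remoteDof = -(dof + 1);   /* e.g. dim -2    -> 1 dof            */
        PetscInt remoteOff = -(off + 1);   /* e.g. offset -8 -> global offset 7  */
        ierr = PetscPrintf(PETSC_COMM_SELF, "point %D: ghost, %D dof at global offset %D\n",
                           p, remoteDof, remoteOff);CHKERRQ(ierr);
      } else {
        ierr = PetscPrintf(PETSC_COMM_SELF, "point %D: owned, %D dof at global offset %D\n",
                           p, dof, off);CHKERRQ(ierr);
      }
      PetscFunctionReturn(0);
    }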
I am getting these data > structures inside DMNetworkDistribute > > > DM Object: Parallel Mesh 2 MPI processes > type: plex > Parallel Mesh in 1 dimensions: > 0-cells: 4 3 > 1-cells: 3 3 > Labels: > depth: 2 strata of sizes (4, 3) > > This, as I understand, is printing a tree with all the vertices and edges > in each processor (owned and ghost). > > PetscSection Object: 2 MPI processes > type not yet set > Process 0: > ( 0) dim 1 offset 0 > ( 1) dim 1 offset 1 > ( 2) dim 1 offset 2 > ( 3) dim 1 offset 3 > ( 4) dim -2 offset -8 > ( 5) dim -2 offset -9 > ( 6) dim -2 offset -10 > Process 1: > ( 0) dim 1 offset 4 > ( 1) dim 1 offset 5 > ( 2) dim 1 offset 6 > ( 3) dim 1 offset 7 > ( 4) dim 1 offset 8 > ( 5) dim 1 offset 9 > > This is a global PETSc section that gives me the global numbering for the > owned points and (garbage?) negative values for ghost. > > Until here everything is good. But then I print the PetscSF that is > created by 'DMPlexDistribute'. This I do not understand: > > PetscSF Object: Migration SF 2 MPI processes > type: basic > sort=rank-order > [0] Number of roots=10, leaves=7, remote ranks=1 > [0] 0 <- (0,0) > [0] 1 <- (0,1) > [0] 2 <- (0,3) > [0] 3 <- (0,6) > [0] 4 <- (0,7) > [0] 5 <- (0,8) > [0] 6 <- (0,9) > [1] Number of roots=0, leaves=6, remote ranks=1 > [1] 0 <- (0,2) > [1] 1 <- (0,4) > [1] 2 <- (0,5) > [1] 3 <- (0,7) > [1] 4 <- (0,8) > [1] 5 <- (0,9) > [0] Roots referenced by my leaves, by rank > [0] 0: 7 edges > [0] 0 <- 0 > [0] 1 <- 1 > [0] 2 <- 3 > [0] 3 <- 6 > [0] 4 <- 7 > [0] 5 <- 8 > [0] 6 <- 9 > [1] Roots referenced by my leaves, by rank > [1] 0: 6 edges > [1] 0 <- 2 > [1] 1 <- 4 > [1] 2 <- 5 > [1] 3 <- 7 > [1] 4 <- 8 > [1] 5 <- 9 > > I understand that SF is a data structure that saves references to pieces > of data that are now owned by the process (https://arxiv.org/pdf/1506. > 06194v1.pdf, page 4). > > Since the only ghost nodes appear in rank 0 (three ghost vertices) I would > expect something like: > *rank 0:* > 4 - (1, 3) (to read: point 4 is owned by rank 1 and is rank's 1 point 3) > etc... > *rank 1:* > nothing > > Is my intuition correct? If so, what does the star forest that I get from > DMPlexDistribute mean? I am printing the wrong thing? > > Thank you > > -- > D. Adrian Maldonado, PhD Candidate > Electrical & Computer Engineering Dept. > Illinois Institute of Technology > 3301 S. Dearborn Street, Chicago, IL 60616 > -- D. Adrian Maldonado, PhD Candidate Electrical & Computer Engineering Dept. Illinois Institute of Technology 3301 S. Dearborn Street, Chicago, IL 60616 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 16 13:24:46 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 Sep 2016 13:24:46 -0500 Subject: [petsc-users] How to create a local to global mapping and construct matrix correctly In-Reply-To: References: Message-ID: <7B805207-FBC7-4286-8D0D-331BBE26C6D5@mcs.anl.gov> "Gives wrong answers" is not very informative. What answer do you expect and what answer do you get? Note that each process is looping over mSizes? for i in range(len(mSizes)): for j in range(len(mSizes)): Is this what you want? It doesn't seem likely that you want all processes to generate all information in the matrix. Each process should be doing a subset of the generation. 
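(A minimal C sketch of the pattern described above, i.e. each rank generating and inserting only the rows it owns instead of every rank looping over the whole matrix. This is not the poster's petsc4py code; the size N and the placeholder values are made up for illustration.)

    #include <petscmat.h>

    int main(int argc, char **argv)
    {
      Mat            M;
      PetscInt       N = 4, rStart, rEnd, i, j;
      PetscScalar    v;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
      ierr = MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, N, NULL, &M);CHKERRQ(ierr);
      ierr = MatGetOwnershipRange(M, &rStart, &rEnd);CHKERRQ(ierr);
      for (i = rStart; i < rEnd; i++) {      /* only the rows owned by this rank */
        for (j = 0; j < N; j++) {
          v    = 1.0*i + 0.01*j;             /* placeholder entry */
          ierr = MatSetValue(M, i, j, v, INSERT_VALUES);CHKERRQ(ierr);
        }
      }
      ierr = MatAssemblyBegin(M, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(M, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatView(M, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
      ierr = MatDestroy(&M);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }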
Barry > On Sep 16, 2016, at 11:03 AM, Ji Zhang wrote: > > Dear all, > > I have a number of small 'mpidense' matrices mij, and I want to construct them to a big 'mpidense' matrix M like this: > [ m11 m12 m13 ] > M = | m21 m22 m23 | , > [ m31 m32 m33 ] > > And a short demo is below. I'm using python, but their grammar are similar. > import numpy as np > from petsc4py import PETSc > import sys, petsc4py > > > petsc4py.init(sys.argv) > mSizes = (2, 2) > mij = [] > > # create sub-matrices mij > for i in range(len(mSizes)): > for j in range(len(mSizes)): > temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) > temp_m.setType('mpidense') > temp_m.setFromOptions() > temp_m.setUp() > temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) > temp_m.assemble() > temp_m.view() > mij.append(temp_m) > > # Now we have four sub-matrices. I would like to construct them into a big matrix M. > M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) > M.setType('mpidense') > M.setFromOptions() > M.setUp() > mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, mSizes] > for i in range(len(mSizes)): > for j in range(len(mSizes)): > temp_m = mij[i*len(mSizes)+j].getDenseArray() > for k in range(temp_m.shape[0]): > M.setValues(mLocations[i]+k, np.arange(mLocations[j],mLocations[j+1],dtype='int32'), temp_m[k, :]) > M.assemble() > M.view() > The code works well in a single cup, but give wrong answer for 2 and more cores. > > Thanks. > 2016-09-17 > Best, > Regards, > Zhang Ji > Beijing Computational Science Research Center > E-mail: gotofd at gmail.com > > From bsmith at mcs.anl.gov Fri Sep 16 13:31:22 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 Sep 2016 13:31:22 -0500 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: Message-ID: <19EEF686-0334-46CD-A25D-4DFCA2B5D94B@mcs.anl.gov> Why is your C matrix an MPIAIJ matrix on one process? In general we recommend creating a SeqAIJ matrix for one process and MPIAIJ for multiple. You can use MatCreateAIJ() and it will always create the correct one. We could change the code as you suggest but I want to make sure that is the best solution in your case. Barry > On Sep 16, 2016, at 3:31 AM, Hoang Giang Bui wrote: > > Hi Matt > > I believed at line 523, src/ksp/ksp/utils/schurm.c > > ierr = MatMatMult(C, AinvB, MAT_INITIAL_MATRIX, fill, S);CHKERRQ(ierr); > > in my test case C is MPIAIJ and AinvB is SEQAIJ, hence it throws the error. > > In fact I guess there are two issues with it > line 521, ierr = MatConvert(AinvBd, MATAIJ, MAT_INITIAL_MATRIX, &AinvB);CHKERRQ(ierr); > shall we convert this to type of C matrix to ensure compatibility ? > > line 552, if(norm > PETSC_MACHINE_EPSILON) SETERRQ(PetscObjectComm((PetscObject) M), PETSC_ERR_SUP, "Not yet implemented for Schur complements with non-vanishing D"); > with this the Schur complement with A11!=0 will be aborted > > Giang > > On Thu, Sep 15, 2016 at 4:28 PM, Matthew Knepley wrote: > On Thu, Sep 15, 2016 at 9:07 AM, Hoang Giang Bui wrote: > Hi Matt > > Thanks for the comment. After looking carefully into the manual again, the key take away is that with selfp there is no option to compute the exact Schur, there are only two options to approximate the inv(A00) for selfp, which are lump and diag (diag by default). I misunderstood this previously. 
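(For reference, a small C sketch of selecting the Schur-complement preconditioning variants discussed here from code rather than from the command line; it is the programmatic equivalent of -pc_fieldsplit_schur_precondition {selfp,a11,full,...}. The "ksp" argument is assumed to be an already-created KSP; nothing here is taken from the poster's code.)

    #include <petscksp.h>

    PetscErrorCode ConfigureSchur(KSP ksp)
    {
      PC             pc;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
      ierr = PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR);CHKERRQ(ierr);
      ierr = PCFieldSplitSetSchurFactType(pc, PC_FIELDSPLIT_SCHUR_FACT_FULL);CHKERRQ(ierr);
      /* SELFP preconditions S with A11 - A10 inv(diag(A00)) A01 (lumping is the
         other option); PC_FIELDSPLIT_SCHUR_PRE_FULL instead forms the explicit
         Schur complement, which is only practical for small test problems. */
      ierr = PCFieldSplitSetSchurPre(pc, PC_FIELDSPLIT_SCHUR_PRE_SELFP, NULL);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }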
> > There is online manual entry mentioned about PC_FIELDSPLIT_SCHUR_PRE_FULL, which is not documented elsewhere in the offline manual. I tried to access that by setting > -pc_fieldsplit_schur_precondition full > > Yep, I wrote that specifically for testing, but its very slow so I did not document it to prevent people from complaining. > > but it gives the error > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Arguments are incompatible > [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with B, seqaij > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui Thu Sep 15 15:46:56 2016 > [0]PETSC ERROR: Configure options --with-shared-libraries --with-debugging=0 --with-pic --download-fblaslapack=yes --download-suitesparse --download-ptscotch=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes --download-mumps=yes --download-hypre=yes --download-ml=yes --download-pastix=yes --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 > [0]PETSC ERROR: #1 MatMatMult() line 9514 in /home/hbui/sw/petsc-3.7.3/src/mat/interface/matrix.c > [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line 526 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c > [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > Please excuse me to insist on forming the exact Schur complement, but as you said, I would like to track down what creates problem in my code by starting from a very exact but ineffective solution. > > Sure, I understand. I do not understand how A can be MPI and B can be Seq. Do you know how that happens? > > Thanks, > > Matt > > Giang > > On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley wrote: > On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui wrote: > Dear Barry > > Thanks for the clarification. I got exactly what you said if the code changed to > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > Residual norms for stokes_ solve. > 0 KSP Residual norm 1.327791371202e-02 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > 1 KSP Residual norm 3.997711925708e-17 > > but I guess we solve a different problem if B is used for the linear system. > > in addition, changed to > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > also works but inner iteration converged not in one iteration > > Residual norms for stokes_ solve. > 0 KSP Residual norm 1.327791371202e-02 > Residual norms for stokes_fieldsplit_p_ solve. 
> 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 > 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 > 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 > 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 > 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 > 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 > 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 > 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 > 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 > 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 > 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 > 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 > 1 KSP Residual norm 1.445282647127e-16 > > > Seem like zero pivot does not happen, but why the solver for Schur takes 13 steps if the preconditioner is direct solver? > > Look at the -ksp_view. I will bet that the default is to shift (add a multiple of the identity) the matrix instead of failing. This > gives an inexact PC, but as you see it can converge. > > Thanks, > > Matt > > > I also so tried another problem which I known does have a nonsingular Schur (at least A11 != 0) and it also have the same problem: 1 step outer convergence but multiple step inner convergence. > > Any ideas? > > Giang > > On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: > > Normally you'd be absolutely correct to expect convergence in one iteration. However in this example note the call > > ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); > > It is solving the linear system defined by A but building the preconditioner (i.e. the entire fieldsplit process) from a different matrix B. Since A is not B you should not expect convergence in one iteration. If you change the code to > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > you will see exactly what you expect, convergence in one iteration. > > Sorry about this, the example is lacking clarity and documentation its author obviously knew too well what he was doing that he didn't realize everyone else in the world would need more comments in the code. If you change the code to > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > it will stop without being able to build the preconditioner because LU factorization of the Sp matrix will result in a zero pivot. This is why this "auxiliary" matrix B is used to define the preconditioner instead of A. > > Barry > > > > > > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui wrote: > > > > Sorry I slept quite a while in this thread. Now I start to look at it again. In the last try, the previous setting doesn't work either (in fact diverge). 
So I would speculate if the Schur complement in my case is actually not invertible. It's also possible that the code is wrong somewhere. However, before looking at that, I want to understand thoroughly the settings for Schur complement > > > > I experimented ex42 with the settings: > > mpirun -np 1 ex42 \ > > -stokes_ksp_monitor \ > > -stokes_ksp_type fgmres \ > > -stokes_pc_type fieldsplit \ > > -stokes_pc_fieldsplit_type schur \ > > -stokes_pc_fieldsplit_schur_fact_type full \ > > -stokes_pc_fieldsplit_schur_precondition selfp \ > > -stokes_fieldsplit_u_ksp_type preonly \ > > -stokes_fieldsplit_u_pc_type lu \ > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > -stokes_fieldsplit_p_ksp_type gmres \ > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > -stokes_fieldsplit_p_pc_type lu \ > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > > > In my understanding, the solver should converge in 1 (outer) step. Execution gives: > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > 1 KSP Residual norm 7.656238881621e-04 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > > 2 KSP Residual norm 9.895645456398e-06 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > > 3 KSP Residual norm 1.608206702571e-06 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > > 4 KSP Residual norm 2.453618999882e-07 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > > 5 KSP Residual norm 4.230399264750e-08 > > > > Looks like the "selfp" does construct the Schur nicely. But does "full" really construct the full block preconditioner? > > > > Giang > > P/S: I'm also generating a smaller size of the previous problem for checking again. > > > > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley wrote: > > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui wrote: > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > believe that is a good preconditioner for your problem? > > > > This is the first approach to the problem, so I chose the most simple setting. Do you have any other recommendation? > > > > This is in no way the simplest PC. We need to make it simpler first. 
> > > > 1) Run on only 1 proc > > > > 2) Use -pc_fieldsplit_schur_fact_type full > > > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_true_residual > > > > This should converge in 1 outer iteration, but we will see how good your Schur complement preconditioner > > is for this problem. > > > > You need to start out from something you understand and then start making approximations. > > > > Matt > > > > For any solver question, please send us the output of > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > I sent here the full output (after changed to fgmres), again it takes long at the first iteration but after that, it does not converge > > > > -ksp_type fgmres > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type diag > > -pc_fieldsplit_schur_precondition selfp > > -pc_fieldsplit_detect_saddle_point > > -fieldsplit_u_ksp_type preonly > > -fieldsplit_u_pc_type lu > > -fieldsplit_u_pc_factor_mat_solver_package mumps > > -fieldsplit_lu_ksp_type preonly > > -fieldsplit_lu_pc_type lu > > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 > > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 > > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm 
1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 > > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 > > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 > > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid 
norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 > > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 > > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true 
resid norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 > > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 > > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 > > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > > 102 KSP unpreconditioned resid norm 1.468333201903e+02 
true resid norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 > > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > > 130 KSP unpreconditioned 
resid norm 1.446690766798e+02 true resid norm 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 > > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 > > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 > > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 > > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 
> > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 > > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 > > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm 1.441752187090e+02 
||r(i)||/||b|| 4.746083549740e-05 > > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 > > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 > > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true 
resid norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 > > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > > 241 KSP unpreconditioned resid 
norm 3.841473653930e+00 true resid norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 > > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 > > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > > 
269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 > > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 > > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm 7.003219621882e+01 
||r(i)||/||b|| 2.305379921754e-05 > > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > > Linear solve did not converge due to DIVERGED_ITS iterations 300 > > KSP Object: 2 MPI processes > > type: fgmres > > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=300, initial guess is zero > > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > > right preconditioning > > using UNPRECONDITIONED norm type for convergence test > > PC Object: 2 MPI processes > > type: fieldsplit > > FieldSplit with Schur preconditioner, factorization DIAG > > Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse > > Split info: > > Split number 0 Defined by IS > > Split number 1 Defined by IS > > KSP solver for A00 block > > KSP Object: (fieldsplit_u_) 2 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_u_) 2 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326 > > package used to perform factorization: mumps > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > total number of mallocs used during MatSetValues calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): 1 > > ICNTL(13) (efficiency control): 0 > > ICNTL(14) (percentage of estimated workspace increase): 20 > > ICNTL(18) (input mat struct): 3 > > ICNTL(19) (Shur complement info): 0 > > ICNTL(20) (rhs sparse pattern): 0 > > ICNTL(21) (solution struct): 1 > > ICNTL(22) (in-core/out-of-core facility): 0 > > ICNTL(23) (max size of memory can be allocated locally):0 > > ICNTL(24) (detection of null pivot rows): 0 > > ICNTL(25) (computation of a null space basis): 0 > > ICNTL(26) (Schur options for rhs or solution): 0 > > ICNTL(27) (experimental parameter): -24 > > ICNTL(28) (use parallel or sequential ordering): 1 > > ICNTL(29) (parallel ordering): 0 > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve phase): 0 > > ICNTL(33) (compute determinant): 0 > > CNTL(1) (relative pivoting threshold): 0.01 > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 
0 > > CNTL(4) (value of static pivoting): -1 > > CNTL(5) (fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the elimination after analysis): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > RINFO(2) (local estimated flops for the assembly after factorization): > > [0] 4.2839e+08 > > [1] 3.799e+08 > > RINFO(3) (local estimated flops for the elimination after factorization): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(23) (num of pivots eliminated on this processor after factorization): > > [0] 90423 > > [1] 93903 > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > INFOG(6) (number of nodes in the complete tree): 2812 > > INFOG(7) (ordering option effectively use after analysis): 5 > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after factorization): 0 > > INFOG(14) (number of memory compress after factorization): 0 > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > INFOG(20) (estimated number of entries in the factors): 403041366 > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > INFOG(32) (after analysis: type of analysis done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) 
(exponent of the determinant if determinant is requested): 0 > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_u_) 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326, bs=3 > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > KSP solver for S = A11 - A10 inv(A00) A01 > > KSP Object: (fieldsplit_lu_) 2 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_lu_) 2 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=2583 > > package used to perform factorization: mumps > > total: nonzeros=2.17621e+06, allocated nonzeros=2.17621e+06 > > total number of mallocs used during MatSetValues calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): 1 > > ICNTL(13) (efficiency control): 0 > > ICNTL(14) (percentage of estimated workspace increase): 20 > > ICNTL(18) (input mat struct): 3 > > ICNTL(19) (Shur complement info): 0 > > ICNTL(20) (rhs sparse pattern): 0 > > ICNTL(21) (solution struct): 1 > > ICNTL(22) (in-core/out-of-core facility): 0 > > ICNTL(23) (max size of memory can be allocated locally):0 > > ICNTL(24) (detection of null pivot rows): 0 > > ICNTL(25) (computation of a null space basis): 0 > > ICNTL(26) (Schur options for rhs or solution): 0 > > ICNTL(27) (experimental parameter): -24 > > ICNTL(28) (use parallel or sequential ordering): 1 > > ICNTL(29) (parallel ordering): 0 > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve phase): 0 > > ICNTL(33) (compute determinant): 0 > > CNTL(1) (relative pivoting threshold): 0.01 > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 0 > > CNTL(4) (value of static pivoting): -1 > > CNTL(5) (fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the elimination after analysis): > > [0] 5.12794e+08 > > [1] 5.02142e+08 > > RINFO(2) (local estimated flops for the assembly after factorization): > > [0] 815031 > > [1] 745263 > > RINFO(3) (local estimated flops for the elimination after factorization): > > [0] 5.12794e+08 > > [1] 5.02142e+08 > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > [0] 34 > > [1] 34 > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > [0] 34 > > [1] 34 > > INFO(23) (num of pivots eliminated on this processor after factorization): > > [0] 1158 > > [1] 1425 > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.01494e+09 > > RINFOG(2) (global estimated flops for the assembly 
after factorization): 1.56029e+06 > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.01494e+09 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 2176209 > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 14427 > > INFOG(5) (estimated maximum front size in the complete tree): 699 > > INFOG(6) (number of nodes in the complete tree): 15 > > INFOG(7) (ordering option effectively use after analysis): 2 > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 2176209 > > INFOG(10) (total integer space store the matrix factors after factorization): 14427 > > INFOG(11) (order of largest frontal matrix after factorization): 699 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after factorization): 0 > > INFOG(14) (number of memory compress after factorization): 0 > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 34 > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 68 > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 34 > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 68 > > INFOG(20) (estimated number of entries in the factors): 2176209 > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 30 > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 59 > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 2176209 > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 16, 32 > > INFOG(32) (after analysis: type of analysis done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > linear system matrix followed by preconditioner matrix: > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > type: schurcomplement > > rows=2583, cols=2583 > > Schur complement A11 - A10 inv(A00) A01 > > A11 > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > type: mpiaij > > rows=2583, cols=2583, bs=3 > > total: nonzeros=117369, allocated nonzeros=117369 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > A10 > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=184326, rbs=3, cbs = 1 > > total: nonzeros=292770, allocated nonzeros=292770 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > KSP of A00 > > KSP Object: (fieldsplit_u_) 2 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: 
relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_u_) 2 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: natural > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326 > > package used to perform factorization: mumps > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > total number of mallocs used during MatSetValues calls =0 > > MUMPS run parameters: > > SYM (matrix type): 0 > > PAR (host participation): 1 > > ICNTL(1) (output for error): 6 > > ICNTL(2) (output of diagnostic msg): 0 > > ICNTL(3) (output for global info): 0 > > ICNTL(4) (level of printing): 0 > > ICNTL(5) (input mat struct): 0 > > ICNTL(6) (matrix prescaling): 7 > > ICNTL(7) (sequentia matrix ordering):7 > > ICNTL(8) (scalling strategy): 77 > > ICNTL(10) (max num of refinements): 0 > > ICNTL(11) (error analysis): 0 > > ICNTL(12) (efficiency control): 1 > > ICNTL(13) (efficiency control): 0 > > ICNTL(14) (percentage of estimated workspace increase): 20 > > ICNTL(18) (input mat struct): 3 > > ICNTL(19) (Shur complement info): 0 > > ICNTL(20) (rhs sparse pattern): 0 > > ICNTL(21) (solution struct): 1 > > ICNTL(22) (in-core/out-of-core facility): 0 > > ICNTL(23) (max size of memory can be allocated locally):0 > > ICNTL(24) (detection of null pivot rows): 0 > > ICNTL(25) (computation of a null space basis): 0 > > ICNTL(26) (Schur options for rhs or solution): 0 > > ICNTL(27) (experimental parameter): -24 > > ICNTL(28) (use parallel or sequential ordering): 1 > > ICNTL(29) (parallel ordering): 0 > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > ICNTL(31) (factors is discarded in the solve phase): 0 > > ICNTL(33) (compute determinant): 0 > > CNTL(1) (relative pivoting threshold): 0.01 > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > CNTL(3) (absolute pivoting threshold): 0 > > CNTL(4) (value of static pivoting): -1 > > CNTL(5) (fixation for null pivots): 0 > > RINFO(1) (local estimated flops for the elimination after analysis): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > RINFO(2) (local estimated flops for the assembly after factorization): > > [0] 4.2839e+08 > > [1] 3.799e+08 > > RINFO(3) (local estimated flops for the elimination after factorization): > > [0] 5.59214e+11 > > [1] 5.35237e+11 > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > [0] 2621 > > [1] 2649 > > INFO(23) (num of pivots eliminated on this processor after factorization): > > [0] 90423 > > [1] 93903 > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > INFOG(6) (number of nodes in the complete tree): 2812 > > INFOG(7) (ordering option effectively use after 
analysis): 5 > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > INFOG(12) (number of off-diagonal pivots): 0 > > INFOG(13) (number of delayed pivots after factorization): 0 > > INFOG(14) (number of memory compress after factorization): 0 > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > INFOG(20) (estimated number of entries in the factors): 403041366 > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > INFOG(32) (after analysis: type of analysis done): 1 > > INFOG(33) (value used for ICNTL(8)): 7 > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_u_) 2 MPI processes > > type: mpiaij > > rows=184326, cols=184326, bs=3 > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > A01 > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=184326, cols=2583, rbs=3, cbs = 1 > > total: nonzeros=292770, allocated nonzeros=292770 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 16098 nodes, limit used is 5 > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=2583, cols=2583, rbs=3, cbs = 1 > > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node (on process 0) routines > > linear system matrix = precond matrix: > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=186909, cols=186909 > > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > KSPSolve completed > > > > > > Giang > > > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley wrote: > > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui wrote: > > Hello > > > > I'm solving an 
indefinite problem arising from mesh tying/contact using Lagrange multiplier, the matrix has the form > > > > K = [A P^T > > P 0] > > > > I used the FIELDSPLIT preconditioner with one field is the main variable (displacement) and the other field for dual variable (Lagrange multiplier). The block size for each field is 3. According to the manual, I first chose the preconditioner based on Schur complement to treat this problem. > > > > > > For any solver question, please send us the output of > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > However, I will comment below > > > > The parameters used for the solve is > > -ksp_type gmres > > > > You need 'fgmres' here with the options you have below. > > > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_fieldsplit_type schur > > -pc_fieldsplit_schur_fact_type diag > > -pc_fieldsplit_schur_precondition selfp > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > believe that is a good preconditioner for your problem? > > > > > > -pc_fieldsplit_detect_saddle_point > > -fieldsplit_u_pc_type hypre > > > > I would just use MUMPS here to start, especially if it works on the whole problem. Same with the one below. > > > > Matt > > > > -fieldsplit_u_pc_hypre_type boomeramg > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_lu_pc_type hypre > > -fieldsplit_lu_pc_hypre_type boomeramg > > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > > > For the test case, a small problem is solved on 2 processes. Due to the decomposition, the contact only happens in 1 proc, so the size of Lagrange multiplier dofs on proc 0 is 0. > > > > 0: mIndexU.size(): 80490 > > 0: mIndexLU.size(): 0 > > 1: mIndexU.size(): 103836 > > 1: mIndexLU.size(): 2583 > > > > However, with this setup the solver takes very long at KSPSolve before going to iteration, and the first iteration seems forever so I have to stop the calculation. I guessed that the solver takes time to compute the Schur complement, but according to the manual only the diagonal of A is used to approximate the Schur complement, so it should not take long to compute this. > > > > Note that I ran the same problem with direct solver (MUMPS) and it's able to produce the valid results. The parameter for the solve is pretty standard > > -ksp_type preonly > > -pc_type lu > > -pc_factor_mat_solver_package mumps > > > > Hence the matrix/rhs must not have any problem here. Do you have any idea or suggestion for this case? > > > > > > Giang > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener
>

From jed at jedbrown.org  Fri Sep 16 14:29:29 2016
From: jed at jedbrown.org (Jed Brown)
Date: Fri, 16 Sep 2016 13:29:29 -0600
Subject: [petsc-users] Question about memory usage in Multigrid preconditioner
In-Reply-To: <2D96AD50-C582-414C-963F-B231F4445BCD@mcs.anl.gov>
References: <577C337B.60909@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> <5959F823-EDE5-4B34-84C2-271076977368@mcs.anl.gov> <0CFDEA05-2C49-4127-9F13-2B2DB71ADA77@mcs.anl.gov> <27f4756a-3c58-5c56-fd5b-000aac881a5b@uci.edu> <535EFF3A-8BF9-4A95-8FBA-5AC1BE798659@mcs.anl.gov> <CAJ98EDq1VWkTD4N-m-z+yfqmzm0RNYQtg50wMyempwQfY_xkQw@mail.gmail.com> <2D96AD50-C582-414C-963F-B231F4445BCD@mcs.anl.gov>
Message-ID: <87k2ebprs6.fsf@jedbrown.org>

Barry Smith writes:

>> On Sep 15, 2016, at 1:10 PM, Dave May wrote:
>>
>> On Thursday, 15 September 2016, Barry Smith wrote:
>>
>>    Should we have some simple selection of default algorithms based on problem size/number of processes? For example if using more than 1000 processes then use scalable version etc? How would we decide on the parameter values?
>>
>> I don't like the idea of having "smart" selection by default as it's terribly annoying for the user when they try and understand the performance characteristics of a given method when they do a strong/weak scaling test. If such a smart selection strategy was adopted, the details of it should be made abundantly clear to the user.
>>
>> These algs are dependent on many factors, thus making a smart selection for all use cases hard / impossible.
>>
>> I would be happy with unifying the three implementations with three different options AND having these implementation options documented in the man page. Maybe even the man page should advise users which to use in particular circumstances (I think there is something similar on the VecScatter page).
>>
>> I have these as suggestions for unifying the option names using bools
>>
>> -matptap_explicit_transpose
>> -matptap_symbolic_transpose_dense
>> -matptap_symbolic_transpose
>>
>> Or maybe enums are clearer
>> -matptap_impl {explicit_pt,symbolic_pt_dense,symbolic_pt}
>>
>> which are equivalent to these options
>> 1) the current default
>> 2) -matrap 0
>> 3) -matrap 0 -matptap_scalable
>>
>> Maybe there could be a fourth option
>> -matptap_dynamic_selection
>> which chooses the most appropriate alg given machine info, problem size, partition size, .... At least if the user explicitly chooses the dynamic_selection mode, they wouldn't be surprised if there were any bumps appearing in any scaling study they conducted.
>
>   I like the idea of enum types with the final enum type being "dynamically select one for me".

I also like enums and "-matptap_impl auto" (which could be the default).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 800 bytes
Desc: not available
URL: 

From hgbk2008 at gmail.com  Fri Sep 16 18:09:25 2016
From: hgbk2008 at gmail.com (Hoang Giang Bui)
Date: Sat, 17 Sep 2016 01:09:25 +0200
Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix
In-Reply-To: <19EEF686-0334-46CD-A25D-4DFCA2B5D94B@mcs.anl.gov>
References: <19EEF686-0334-46CD-A25D-4DFCA2B5D94B@mcs.anl.gov>
Message-ID: 

Hi Barry

You are right, using MatCreateAIJ() eliminates the first issue.
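For reference, creating a block the way Barry suggests (quoted further below) looks roughly like the following sketch. The local sizes and preallocation counts are placeholders, not values from this problem; the point is only that MatCreateAIJ() gives a SeqAIJ matrix on a one-process communicator and an MPIAIJ matrix otherwise, so the C block ends up type-compatible with the other blocks:

PetscInt m = 100, n = 100;     /* placeholder local row/column sizes           */
PetscInt d_nz = 10, o_nz = 5;  /* placeholder per-row preallocation estimates  */
Mat      C;
ierr = MatCreateAIJ(PETSC_COMM_WORLD, m, n, PETSC_DETERMINE, PETSC_DETERMINE,
                    d_nz, NULL, o_nz, NULL, &C);CHKERRQ(ierr);
/* on 1 process this creates a SEQAIJ matrix, on more than 1 process an MPIAIJ matrix */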
Previously I ran the MPI code with one process, so A, B, C, D were all MPIAIJ.

And how about the second issue: will this error always be thrown if A11 is nonzero, which is my case?

Nevertheless, I would like to report my simple finding: I changed the part around line 552 to

if (D) {
  ierr = MatAXPY(*S, -1.0, D, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr);
}

and I could get ex42 to work with

ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr);

parameters:

mpirun -np 1 ex42 \
-stokes_ksp_monitor \
-stokes_ksp_type fgmres \
-stokes_pc_type fieldsplit \
-stokes_pc_fieldsplit_type schur \
-stokes_pc_fieldsplit_schur_fact_type full \
-stokes_pc_fieldsplit_schur_precondition full \
-stokes_fieldsplit_u_ksp_type preonly \
-stokes_fieldsplit_u_pc_type lu \
-stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \
-stokes_fieldsplit_p_ksp_type gmres \
-stokes_fieldsplit_p_ksp_monitor_true_residual \
-stokes_fieldsplit_p_ksp_max_it 300 \
-stokes_fieldsplit_p_ksp_rtol 1.0e-12 \
-stokes_fieldsplit_p_ksp_gmres_restart 300 \
-stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \
-stokes_fieldsplit_p_pc_type lu \
-stokes_fieldsplit_p_pc_factor_mat_solver_package mumps

Output:

Residual norms for stokes_ solve.
0 KSP Residual norm 1.327791371202e-02
Residual norms for stokes_fieldsplit_p_ solve.
0 KSP preconditioned resid norm 1.651372938841e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.172753353368e+00 true resid norm 2.072348962892e-05 ||r(i)||/||b|| 3.588013522487e-04
2 KSP preconditioned resid norm 3.931379526610e-13 true resid norm 1.878299731917e-16 ||r(i)||/||b|| 3.252041503665e-15
1 KSP Residual norm 3.385960118582e-17

The inner convergence is much better, although it still takes 2 iterations (:-( ??

I also obtain the same convergence behavior for the problem with A11 != 0.

Please suggest if this makes sense, or if I did something wrong.

Giang

On Fri, Sep 16, 2016 at 8:31 PM, Barry Smith wrote:

>    Why is your C matrix an MPIAIJ matrix on one process? In general we recommend creating a SeqAIJ matrix for one process and MPIAIJ for multiple. You can use MatCreateAIJ() and it will always create the correct one.
>
>    We could change the code as you suggest but I want to make sure that is the best solution in your case.
>
>    Barry
>
>
> > On Sep 16, 2016, at 3:31 AM, Hoang Giang Bui wrote:
> >
> > Hi Matt
> >
> > I believe at line 523, src/ksp/ksp/utils/schurm.c
> >
> > ierr = MatMatMult(C, AinvB, MAT_INITIAL_MATRIX, fill, S);CHKERRQ(ierr);
> >
> > in my test case C is MPIAIJ and AinvB is SEQAIJ, hence it throws the error.
> >
> > In fact I guess there are two issues with it:
> > line 521, ierr = MatConvert(AinvBd, MATAIJ, MAT_INITIAL_MATRIX, &AinvB);CHKERRQ(ierr);
> > shall we convert this to the type of the C matrix to ensure compatibility?
> >
> > line 552, if(norm > PETSC_MACHINE_EPSILON) SETERRQ(PetscObjectComm((PetscObject) M), PETSC_ERR_SUP, "Not yet implemented for Schur complements with non-vanishing D");
> > with this, the Schur complement with A11 != 0 will be aborted.
> >
> > Giang
> >
> > On Thu, Sep 15, 2016 at 4:28 PM, Matthew Knepley wrote:
> > On Thu, Sep 15, 2016 at 9:07 AM, Hoang Giang Bui wrote:
> > Hi Matt
> >
> > Thanks for the comment. After looking carefully into the manual again, the key takeaway is that with selfp there is no option to compute the exact Schur complement; there are only two options to approximate inv(A00) for selfp, which are lump and diag (diag by default). I misunderstood this previously.
> > > > There is online manual entry mentioned about > PC_FIELDSPLIT_SCHUR_PRE_FULL, which is not documented elsewhere in the > offline manual. I tried to access that by setting > > -pc_fieldsplit_schur_precondition full > > > > Yep, I wrote that specifically for testing, but its very slow so I did > not document it to prevent people from complaining. > > > > but it gives the error > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Arguments are incompatible > > [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with B, > seqaij > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > > [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui Thu > Sep 15 15:46:56 2016 > > [0]PETSC ERROR: Configure options --with-shared-libraries > --with-debugging=0 --with-pic --download-fblaslapack=yes > --download-suitesparse --download-ptscotch=yes --download-metis=yes > --download-parmetis=yes --download-scalapack=yes --download-mumps=yes > --download-hypre=yes --download-ml=yes --download-pastix=yes > --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 > > [0]PETSC ERROR: #1 MatMatMult() line 9514 in > /home/hbui/sw/petsc-3.7.3/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line 526 > in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c > > [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in > /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ > ksp/pc/interface/precon.c > > [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/hbui/sw/petsc-3.7.3/src/ > ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/hbui/sw/petsc-3.7.3/src/ > ksp/ksp/interface/itfunc.c > > > > Please excuse me to insist on forming the exact Schur complement, but as > you said, I would like to track down what creates problem in my code by > starting from a very exact but ineffective solution. > > > > Sure, I understand. I do not understand how A can be MPI and B can be > Seq. Do you know how that happens? > > > > Thanks, > > > > Matt > > > > Giang > > > > On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley > wrote: > > On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui > wrote: > > Dear Barry > > > > Thanks for the clarification. I got exactly what you said if the code > changed to > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm > 0.000000000000e+00 ||r(i)||/||b|| -nan > > 1 KSP Residual norm 3.997711925708e-17 > > > > but I guess we solve a different problem if B is used for the linear > system. > > > > in addition, changed to > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > also works but inner iteration converged not in one iteration > > > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. 
> > 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm > 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm > 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 > > 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm > 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 > > 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm > 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 > > 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm > 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 > > 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm > 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 > > 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm > 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 > > 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm > 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 > > 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm > 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 > > 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm > 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 > > 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm > 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 > > 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm > 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 > > 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm > 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 > > 1 KSP Residual norm 1.445282647127e-16 > > > > > > Seem like zero pivot does not happen, but why the solver for Schur takes > 13 steps if the preconditioner is direct solver? > > > > Look at the -ksp_view. I will bet that the default is to shift (add a > multiple of the identity) the matrix instead of failing. This > > gives an inexact PC, but as you see it can converge. > > > > Thanks, > > > > Matt > > > > > > I also so tried another problem which I known does have a nonsingular > Schur (at least A11 != 0) and it also have the same problem: 1 step outer > convergence but multiple step inner convergence. > > > > Any ideas? > > > > Giang > > > > On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: > > > > Normally you'd be absolutely correct to expect convergence in one > iteration. However in this example note the call > > > > ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); > > > > It is solving the linear system defined by A but building the > preconditioner (i.e. the entire fieldsplit process) from a different matrix > B. Since A is not B you should not expect convergence in one iteration. If > you change the code to > > > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > > > you will see exactly what you expect, convergence in one iteration. > > > > Sorry about this, the example is lacking clarity and documentation its > author obviously knew too well what he was doing that he didn't realize > everyone else in the world would need more comments in the code. If you > change the code to > > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > > it will stop without being able to build the preconditioner because LU > factorization of the Sp matrix will result in a zero pivot. This is why > this "auxiliary" matrix B is used to define the preconditioner instead of A. 
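To keep the three variants discussed here side by side, a short sketch (ksp_S, A and B are assumed to already exist as in ex42; only the operator/preconditioner pair passed to KSPSetOperators changes):

/* solve the system defined by A, build the fieldsplit preconditioner from B
   (what ex42 does): one-iteration convergence should not be expected         */
ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr);

/* solve and precondition with B: one outer iteration, but a different system */
ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr);

/* solve and precondition with A: exact in principle, but the LU factorization
   of the Sp matrix built from A may hit a zero pivot, as explained above      */
ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr);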
> > > > Barry > > > > > > > > > > > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui > wrote: > > > > > > Sorry I slept quite a while in this thread. Now I start to look at it > again. In the last try, the previous setting doesn't work either (in fact > diverge). So I would speculate if the Schur complement in my case is > actually not invertible. It's also possible that the code is wrong > somewhere. However, before looking at that, I want to understand thoroughly > the settings for Schur complement > > > > > > I experimented ex42 with the settings: > > > mpirun -np 1 ex42 \ > > > -stokes_ksp_monitor \ > > > -stokes_ksp_type fgmres \ > > > -stokes_pc_type fieldsplit \ > > > -stokes_pc_fieldsplit_type schur \ > > > -stokes_pc_fieldsplit_schur_fact_type full \ > > > -stokes_pc_fieldsplit_schur_precondition selfp \ > > > -stokes_fieldsplit_u_ksp_type preonly \ > > > -stokes_fieldsplit_u_pc_type lu \ > > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > > -stokes_fieldsplit_p_ksp_type gmres \ > > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > > -stokes_fieldsplit_p_pc_type lu \ > > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > > > > > In my understanding, the solver should converge in 1 (outer) step. > Execution gives: > > > Residual norms for stokes_ solve. > > > 0 KSP Residual norm 1.327791371202e-02 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm > 0.000000000000e+00 ||r(i)||/||b|| -nan > > > 1 KSP Residual norm 7.656238881621e-04 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm > 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > > > 2 KSP Residual norm 9.895645456398e-06 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm > 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > > > 3 KSP Residual norm 1.608206702571e-06 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm > 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > > > 4 KSP Residual norm 2.453618999882e-07 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm > 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > > > 5 KSP Residual norm 4.230399264750e-08 > > > > > > Looks like the "selfp" does construct the Schur nicely. But does > "full" really construct the full block preconditioner? > > > > > > Giang > > > P/S: I'm also generating a smaller size of the previous problem for > checking again. 
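As an aside, the -pc_fieldsplit_* options used in these runs have a programmatic counterpart. The sketch below is only an illustration: ksp is assumed to be the outer solver, the index sets is_u and is_lu for the displacement and Lagrange-multiplier dofs are assumed to be built elsewhere (in the runs above the splits are instead found with -pc_fieldsplit_detect_saddle_point), and the split names are chosen to match the fieldsplit_u_ / fieldsplit_lu_ prefixes appearing in this thread:

PC pc;
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
ierr = PCFieldSplitSetIS(pc,"u",is_u);CHKERRQ(ierr);   /* displacement block  -> prefix fieldsplit_u_  */
ierr = PCFieldSplitSetIS(pc,"lu",is_lu);CHKERRQ(ierr); /* multiplier block    -> prefix fieldsplit_lu_ */
ierr = PCFieldSplitSetType(pc,PC_COMPOSITE_SCHUR);CHKERRQ(ierr);
ierr = PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_FULL);CHKERRQ(ierr);
/* selfp assembles Sp = A11 - A10 inv(diag(A00)) A01 as the Schur preconditioning matrix;
   PC_FIELDSPLIT_SCHUR_PRE_FULL instead forms the exact Schur complement (slow, for testing) */
ierr = PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_SELFP,NULL);CHKERRQ(ierr);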
> > > > > > > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley > wrote: > > > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui > wrote: > > > > > > It could be taking time in the MatMatMult() here if that matrix is > dense. Is there any reason to > > > believe that is a good preconditioner for your problem? > > > > > > This is the first approach to the problem, so I chose the most simple > setting. Do you have any other recommendation? > > > > > > This is in no way the simplest PC. We need to make it simpler first. > > > > > > 1) Run on only 1 proc > > > > > > 2) Use -pc_fieldsplit_schur_fact_type full > > > > > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_ > true_residual > > > > > > This should converge in 1 outer iteration, but we will see how good > your Schur complement preconditioner > > > is for this problem. > > > > > > You need to start out from something you understand and then start > making approximations. > > > > > > Matt > > > > > > For any solver question, please send us the output of > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > I sent here the full output (after changed to fgmres), again it takes > long at the first iteration but after that, it does not converge > > > > > > -ksp_type fgmres > > > -ksp_max_it 300 > > > -ksp_gmres_restart 300 > > > -ksp_gmres_modifiedgramschmidt > > > -pc_fieldsplit_type schur > > > -pc_fieldsplit_schur_fact_type diag > > > -pc_fieldsplit_schur_precondition selfp > > > -pc_fieldsplit_detect_saddle_point > > > -fieldsplit_u_ksp_type preonly > > > -fieldsplit_u_pc_type lu > > > -fieldsplit_u_pc_factor_mat_solver_package mumps > > > -fieldsplit_lu_ksp_type preonly > > > -fieldsplit_lu_pc_type lu > > > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > > > > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm > 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm > 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > > > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm > 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > > > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm > 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > > > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm > 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > > > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm > 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > > > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm > 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > > > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm > 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > > > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm > 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > > > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm > 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > > > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm > 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > > > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm > 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > > > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm > 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 > > > 13 KSP unpreconditioned resid norm 1.891414702834e+02 
true resid norm > 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 > > > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm > 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > > > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm > 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > > > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm > 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > > > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm > 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > > > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm > 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > > > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm > 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > > > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm > 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > > > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm > 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > > > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm > 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > > > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm > 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > > > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm > 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > > > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm > 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > > > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm > 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > > > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm > 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > > > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm > 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > > > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm > 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > > > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm > 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > > > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm > 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > > > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm > 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > > > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm > 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > > > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm > 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > > > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm > 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > > > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm > 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > > > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm > 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > > > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm > 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > > > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm > 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 > > > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm > 
1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > > > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm > 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 > > > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm > 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > > > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm > 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > > > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm > 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > > > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm > 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 > > > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm > 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > > > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm > 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 > > > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm > 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > > > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm > 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > > > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm > 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > > > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm > 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > > > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm > 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > > > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm > 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > > > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm > 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > > > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm > 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > > > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm > 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > > > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm > 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > > > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm > 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > > > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm > 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > > > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm > 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > > > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm > 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > > > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm > 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > > > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm > 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > > > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm > 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > > > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm > 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > > > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm > 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 > > > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm > 
1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > > > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm > 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > > > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm > 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > > > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm > 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > > > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm > 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > > > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm > 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > > > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm > 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > > > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm > 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > > > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm > 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > > > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm > 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > > > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm > 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > > > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm > 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > > > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm > 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > > > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm > 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > > > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm > 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > > > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm > 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > > > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm > 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > > > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm > 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > > > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm > 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > > > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm > 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > > > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm > 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > > > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm > 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > > > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm > 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > > > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm > 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > > > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm > 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > > > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm > 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 > > > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm > 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 > > > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm > 
1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > > > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm > 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > > > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm > 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > > > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm > 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > > > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm > 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > > > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm > 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > > > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm > 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 > > > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm > 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > > > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm > 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > > > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm > 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > > > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm > 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > > > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm > 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > > > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm > 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > > > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm > 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > > > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm > 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > > > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm > 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > > > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm > 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > > > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm > 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > > > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm > 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > > > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm > 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > > > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm > 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > > > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm > 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > > > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm > 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > > > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm > 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > > > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm > 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > > > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm > 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > > > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm > 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 > > > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid 
norm > 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > > > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm > 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > > > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm > 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > > > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm > 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > > > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm > 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > > > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm > 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > > > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm > 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > > > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm > 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > > > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm > 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > > > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm > 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 > > > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm > 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > > > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm > 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 > > > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm > 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > > > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm > 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > > > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm > 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > > > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm > 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > > > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm > 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > > > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm > 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > > > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm > 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > > > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm > 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > > > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm > 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > > > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm > 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > > > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm > 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > > > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm > 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > > > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm > 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 > > > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm > 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 > > > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm > 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > > > 148 KSP unpreconditioned resid norm 1.430056153071e+02 
true resid norm > 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > > > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm > 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > > > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm > 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > > > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm > 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > > > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm > 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > > > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm > 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > > > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm > 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > > > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm > 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > > > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm > 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > > > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm > 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 > > > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm > 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > > > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm > 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > > > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm > 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 > > > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm > 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > > > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm > 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > > > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm > 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > > > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm > 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > > > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm > 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > > > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm > 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > > > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm > 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > > > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm > 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > > > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm > 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > > > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm > 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > > > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm > 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > > > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm > 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > > > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm > 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > > > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm > 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > > > 175 KSP unpreconditioned resid norm 
1.400258865388e+02 true resid norm > 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > > > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm > 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > > > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm > 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > > > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm > 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > > > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm > 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > > > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm > 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 > > > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm > 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > > > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm > 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > > > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm > 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > > > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm > 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > > > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm > 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 > > > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm > 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > > > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm > 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > > > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm > 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > > > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm > 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > > > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm > 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > > > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm > 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > > > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm > 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > > > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm > 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > > > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm > 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > > > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm > 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > > > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm > 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > > > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm > 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > > > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm > 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 > > > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm > 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > > > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm > 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > > > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm > 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > > > 202 KSP unpreconditioned 
resid norm 1.243190164198e+02 true resid norm > 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > > > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm > 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > > > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm > 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > > > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm > 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > > > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm > 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > > > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm > 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > > > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm > 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > > > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm > 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 > > > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm > 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > > > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm > 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > > > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm > 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > > > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm > 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > > > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm > 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > > > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm > 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > > > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm > 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > > > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm > 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > > > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm > 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > > > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm > 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > > > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm > 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > > > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm > 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > > > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm > 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > > > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm > 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > > > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm > 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > > > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm > 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > > > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm > 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > > > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm > 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > > > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm > 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > > > 229 KSP 
unpreconditioned resid norm 1.975927240058e+01 true resid norm > 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > > > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm > 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > > > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm > 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > > > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm > 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > > > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm > 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > > > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm > 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > > > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm > 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > > > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm > 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > > > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm > 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > > > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm > 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > > > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm > 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 > > > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm > 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > > > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm > 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > > > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm > 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > > > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm > 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > > > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm > 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > > > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm > 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > > > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm > 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > > > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm > 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > > > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm > 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > > > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm > 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > > > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm > 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > > > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm > 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 > > > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm > 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > > > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm > 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > > > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm > 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > > > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm > 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 > > 
> 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm > 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > > > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm > 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > > > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm > 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > > > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm > 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > > > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm > 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > > > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm > 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > > > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm > 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > > > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm > 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > > > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm > 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > > > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm > 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > > > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm > 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > > > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm > 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > > > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm > 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > > > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm > 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > > > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm > 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > > > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm > 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > > > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm > 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > > > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm > 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > > > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm > 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > > > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm > 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > > > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm > 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > > > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm > 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > > > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm > 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > > > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm > 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > > > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm > 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > > > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm > 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > > > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm > 7.763784632467e+01 ||r(i)||/||b|| 
2.555749237477e-05 > > > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm > 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > > > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm > 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > > > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm > 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > > > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm > 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > > > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm > 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > > > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm > 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > > > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm > 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > > > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm > 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > > > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm > 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 > > > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm > 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > > > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm > 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > > > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm > 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > > > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm > 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > > > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm > 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 > > > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm > 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > > > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm > 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > > > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm > 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > > > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm > 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > > > Linear solve did not converge due to DIVERGED_ITS iterations 300 > > > KSP Object: 2 MPI processes > > > type: fgmres > > > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > > > GMRES: happy breakdown tolerance 1e-30 > > > maximum iterations=300, initial guess is zero > > > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > > > right preconditioning > > > using UNPRECONDITIONED norm type for convergence test > > > PC Object: 2 MPI processes > > > type: fieldsplit > > > FieldSplit with Schur preconditioner, factorization DIAG > > > Preconditioner for the Schur complement formed from Sp, an > assembled approximation to S, which uses (lumped, if requested) A00's > diagonal's inverse > > > Split info: > > > Split number 0 Defined by IS > > > Split number 1 Defined by IS > > > KSP solver for A00 block > > > KSP Object: (fieldsplit_u_) 2 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_u_) 2 MPI processes 
> > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326 > > > package used to perform factorization: mumps > > > total: nonzeros=4.03041e+08, allocated > nonzeros=4.03041e+08 > > > total number of mallocs used during MatSetValues calls > =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): > 1 > > > ICNTL(13) (efficiency control): > 0 > > > ICNTL(14) (percentage of estimated workspace > increase): 20 > > > ICNTL(18) (input mat struct): > 3 > > > ICNTL(19) (Shur complement info): > 0 > > > ICNTL(20) (rhs sparse pattern): > 0 > > > ICNTL(21) (solution struct): > 1 > > > ICNTL(22) (in-core/out-of-core facility): > 0 > > > ICNTL(23) (max size of memory can be allocated > locally):0 > > > ICNTL(24) (detection of null pivot rows): > 0 > > > ICNTL(25) (computation of a null space basis): > 0 > > > ICNTL(26) (Schur options for rhs or solution): > 0 > > > ICNTL(27) (experimental parameter): > -24 > > > ICNTL(28) (use parallel or sequential ordering): > 1 > > > ICNTL(29) (parallel ordering): > 0 > > > ICNTL(30) (user-specified set of entries in > inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve > phase): 0 > > > ICNTL(33) (compute determinant): > 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): > 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0 > > > CNTL(4) (value of static pivoting): -1 > > > CNTL(5) (fixation for null pivots): 0 > > > RINFO(1) (local estimated flops for the > elimination after analysis): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > RINFO(2) (local estimated flops for the assembly > after factorization): > > > [0] 4.2839e+08 > > > [1] 3.799e+08 > > > RINFO(3) (local estimated flops for the > elimination after factorization): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > INFO(15) (estimated size of (in MB) MUMPS internal > data for running numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(16) (size of (in MB) MUMPS internal data used > during numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(23) (num of pivots eliminated on this > processor after factorization): > > > [0] 90423 > > > [1] 93903 > > > RINFOG(1) (global estimated flops for the > elimination after analysis): 1.09445e+12 > > > RINFOG(2) (global estimated flops for the assembly > after factorization): 8.0829e+08 > > > RINFOG(3) (global estimated flops for the > elimination after factorization): 1.09445e+12 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for factors on > all processors after analysis): 403041366 > > > INFOG(4) (estimated integer workspace for factors > on all processors after analysis): 2265748 > > > INFOG(5) (estimated maximum front size in the > complete tree): 6663 > > > INFOG(6) (number of nodes in the complete tree): > 2812 > > 
> INFOG(7) (ordering option effectively use after > analysis): 5 > > > INFOG(8) (structural symmetry in percent of the > permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store > the matrix factors after factorization): 403041366 > > > INFOG(10) (total integer space store the matrix > factors after factorization): 2265766 > > > INFOG(11) (order of largest frontal matrix after > factorization): 6663 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after > factorization): 0 > > > INFOG(14) (number of memory compress after > factorization): 0 > > > INFOG(15) (number of steps of iterative refinement > after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS > internal data for factorization after analysis: value on the most memory > consuming processor): 2649 > > > INFOG(17) (estimated size of all MUMPS internal > data for factorization after analysis: sum over all processors): 5270 > > > INFOG(18) (size of all MUMPS internal data > allocated during factorization: value on the most memory consuming > processor): 2649 > > > INFOG(19) (size of all MUMPS internal data > allocated during factorization: sum over all processors): 5270 > > > INFOG(20) (estimated number of entries in the > factors): 403041366 > > > INFOG(21) (size in MB of memory effectively used > during factorization - value on the most memory consuming processor): 2121 > > > INFOG(22) (size in MB of memory effectively used > during factorization - sum over all processors): 4174 > > > INFOG(23) (after analysis: value of ICNTL(6) > effectively used): 0 > > > INFOG(24) (after analysis: value of ICNTL(12) > effectively used): 1 > > > INFOG(25) (after factorization: number of pivots > modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null > pivots encountered): 0 > > > INFOG(29) (after factorization: effective number > of entries in the factors (sum over all processors)): 403041366 > > > INFOG(30, 31) (after solution: size in Mbytes of > memory used during solution phase): 2467, 4922 > > > INFOG(32) (after analysis: type of analysis done): > 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if > determinant is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_u_) 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326, bs=3 > > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 26829 nodes, > limit used is 5 > > > KSP solver for S = A11 - A10 inv(A00) A01 > > > KSP Object: (fieldsplit_lu_) 2 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_lu_) 2 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583 > > > package used to perform factorization: mumps > > > total: nonzeros=2.17621e+06, allocated > nonzeros=2.17621e+06 > > > total number of mallocs used during MatSetValues calls > =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host 
participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): > 1 > > > ICNTL(13) (efficiency control): > 0 > > > ICNTL(14) (percentage of estimated workspace > increase): 20 > > > ICNTL(18) (input mat struct): > 3 > > > ICNTL(19) (Shur complement info): > 0 > > > ICNTL(20) (rhs sparse pattern): > 0 > > > ICNTL(21) (solution struct): > 1 > > > ICNTL(22) (in-core/out-of-core facility): > 0 > > > ICNTL(23) (max size of memory can be allocated > locally):0 > > > ICNTL(24) (detection of null pivot rows): > 0 > > > ICNTL(25) (computation of a null space basis): > 0 > > > ICNTL(26) (Schur options for rhs or solution): > 0 > > > ICNTL(27) (experimental parameter): > -24 > > > ICNTL(28) (use parallel or sequential ordering): > 1 > > > ICNTL(29) (parallel ordering): > 0 > > > ICNTL(30) (user-specified set of entries in > inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve > phase): 0 > > > ICNTL(33) (compute determinant): > 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): > 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0 > > > CNTL(4) (value of static pivoting): -1 > > > CNTL(5) (fixation for null pivots): 0 > > > RINFO(1) (local estimated flops for the > elimination after analysis): > > > [0] 5.12794e+08 > > > [1] 5.02142e+08 > > > RINFO(2) (local estimated flops for the assembly > after factorization): > > > [0] 815031 > > > [1] 745263 > > > RINFO(3) (local estimated flops for the > elimination after factorization): > > > [0] 5.12794e+08 > > > [1] 5.02142e+08 > > > INFO(15) (estimated size of (in MB) MUMPS internal > data for running numerical factorization): > > > [0] 34 > > > [1] 34 > > > INFO(16) (size of (in MB) MUMPS internal data used > during numerical factorization): > > > [0] 34 > > > [1] 34 > > > INFO(23) (num of pivots eliminated on this > processor after factorization): > > > [0] 1158 > > > [1] 1425 > > > RINFOG(1) (global estimated flops for the > elimination after analysis): 1.01494e+09 > > > RINFOG(2) (global estimated flops for the assembly > after factorization): 1.56029e+06 > > > RINFOG(3) (global estimated flops for the > elimination after factorization): 1.01494e+09 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for factors on > all processors after analysis): 2176209 > > > INFOG(4) (estimated integer workspace for factors > on all processors after analysis): 14427 > > > INFOG(5) (estimated maximum front size in the > complete tree): 699 > > > INFOG(6) (number of nodes in the complete tree): 15 > > > INFOG(7) (ordering option effectively use after > analysis): 2 > > > INFOG(8) (structural symmetry in percent of the > permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store > the matrix factors after factorization): 2176209 > > > INFOG(10) (total integer space store the matrix > factors after factorization): 14427 > > > INFOG(11) (order of largest frontal matrix after > factorization): 699 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after > factorization): 0 > > > 
INFOG(14) (number of memory compress after > factorization): 0 > > > INFOG(15) (number of steps of iterative refinement > after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS > internal data for factorization after analysis: value on the most memory > consuming processor): 34 > > > INFOG(17) (estimated size of all MUMPS internal > data for factorization after analysis: sum over all processors): 68 > > > INFOG(18) (size of all MUMPS internal data > allocated during factorization: value on the most memory consuming > processor): 34 > > > INFOG(19) (size of all MUMPS internal data > allocated during factorization: sum over all processors): 68 > > > INFOG(20) (estimated number of entries in the > factors): 2176209 > > > INFOG(21) (size in MB of memory effectively used > during factorization - value on the most memory consuming processor): 30 > > > INFOG(22) (size in MB of memory effectively used > during factorization - sum over all processors): 59 > > > INFOG(23) (after analysis: value of ICNTL(6) > effectively used): 0 > > > INFOG(24) (after analysis: value of ICNTL(12) > effectively used): 1 > > > INFOG(25) (after factorization: number of pivots > modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null > pivots encountered): 0 > > > INFOG(29) (after factorization: effective number > of entries in the factors (sum over all processors)): 2176209 > > > INFOG(30, 31) (after solution: size in Mbytes of > memory used during solution phase): 16, 32 > > > INFOG(32) (after analysis: type of analysis done): > 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if > determinant is requested): 0 > > > linear system matrix followed by preconditioner matrix: > > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > > type: schurcomplement > > > rows=2583, cols=2583 > > > Schur complement A11 - A10 inv(A00) A01 > > > A11 > > > Mat Object: (fieldsplit_lu_) > 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583, bs=3 > > > total: nonzeros=117369, allocated nonzeros=117369 > > > total number of mallocs used during MatSetValues calls > =0 > > > not using I-node (on process 0) routines > > > A10 > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=184326, rbs=3, cbs = 1 > > > total: nonzeros=292770, allocated nonzeros=292770 > > > total number of mallocs used during MatSetValues calls > =0 > > > not using I-node (on process 0) routines > > > KSP of A00 > > > KSP Object: (fieldsplit_u_) 2 > MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_u_) 2 > MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326 > > > package used to perform factorization: mumps > > > total: nonzeros=4.03041e+08, allocated > nonzeros=4.03041e+08 > > > total number of mallocs used during > MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input 
mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): > 1 > > > ICNTL(13) (efficiency control): > 0 > > > ICNTL(14) (percentage of estimated > workspace increase): 20 > > > ICNTL(18) (input mat struct): > 3 > > > ICNTL(19) (Shur complement info): > 0 > > > ICNTL(20) (rhs sparse pattern): > 0 > > > ICNTL(21) (solution struct): > 1 > > > ICNTL(22) (in-core/out-of-core facility): > 0 > > > ICNTL(23) (max size of memory can be > allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): > 0 > > > ICNTL(25) (computation of a null space > basis): 0 > > > ICNTL(26) (Schur options for rhs or > solution): 0 > > > ICNTL(27) (experimental parameter): > -24 > > > ICNTL(28) (use parallel or sequential > ordering): 1 > > > ICNTL(29) (parallel ordering): > 0 > > > ICNTL(30) (user-specified set of entries > in inv(A)): 0 > > > ICNTL(31) (factors is discarded in the > solve phase): 0 > > > ICNTL(33) (compute determinant): > 0 > > > CNTL(1) (relative pivoting threshold): > 0.01 > > > CNTL(2) (stopping criterion of > refinement): 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): > 0 > > > CNTL(4) (value of static pivoting): > -1 > > > CNTL(5) (fixation for null pivots): > 0 > > > RINFO(1) (local estimated flops for the > elimination after analysis): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > RINFO(2) (local estimated flops for the > assembly after factorization): > > > [0] 4.2839e+08 > > > [1] 3.799e+08 > > > RINFO(3) (local estimated flops for the > elimination after factorization): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > INFO(15) (estimated size of (in MB) MUMPS > internal data for running numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(16) (size of (in MB) MUMPS internal > data used during numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(23) (num of pivots eliminated on this > processor after factorization): > > > [0] 90423 > > > [1] 93903 > > > RINFOG(1) (global estimated flops for the > elimination after analysis): 1.09445e+12 > > > RINFOG(2) (global estimated flops for the > assembly after factorization): 8.0829e+08 > > > RINFOG(3) (global estimated flops for the > elimination after factorization): 1.09445e+12 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) > (determinant): (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for > factors on all processors after analysis): 403041366 > > > INFOG(4) (estimated integer workspace for > factors on all processors after analysis): 2265748 > > > INFOG(5) (estimated maximum front size in > the complete tree): 6663 > > > INFOG(6) (number of nodes in the complete > tree): 2812 > > > INFOG(7) (ordering option effectively use > after analysis): 5 > > > INFOG(8) (structural symmetry in percent > of the permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to > store the matrix factors after factorization): 403041366 > > > INFOG(10) (total integer space store the > matrix factors after factorization): 2265766 > > > INFOG(11) (order of largest frontal matrix > after factorization): 6663 > > > INFOG(12) (number of off-diagonal pivots): > 0 > > > INFOG(13) (number of delayed pivots after > factorization): 0 > > > INFOG(14) (number of memory compress after > factorization): 0 > > > INFOG(15) (number of steps of iterative > refinement after solution): 0 > > > INFOG(16) (estimated 
size (in MB) of all > MUMPS internal data for factorization after analysis: value on the most > memory consuming processor): 2649 > > > INFOG(17) (estimated size of all MUMPS > internal data for factorization after analysis: sum over all processors): > 5270 > > > INFOG(18) (size of all MUMPS internal data > allocated during factorization: value on the most memory consuming > processor): 2649 > > > INFOG(19) (size of all MUMPS internal data > allocated during factorization: sum over all processors): 5270 > > > INFOG(20) (estimated number of entries in > the factors): 403041366 > > > INFOG(21) (size in MB of memory > effectively used during factorization - value on the most memory consuming > processor): 2121 > > > INFOG(22) (size in MB of memory > effectively used during factorization - sum over all processors): 4174 > > > INFOG(23) (after analysis: value of > ICNTL(6) effectively used): 0 > > > INFOG(24) (after analysis: value of > ICNTL(12) effectively used): 1 > > > INFOG(25) (after factorization: number of > pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of > null pivots encountered): 0 > > > INFOG(29) (after factorization: effective > number of entries in the factors (sum over all processors)): 403041366 > > > INFOG(30, 31) (after solution: size in > Mbytes of memory used during solution phase): 2467, 4922 > > > INFOG(32) (after analysis: type of > analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if > determinant is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_u_) > 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326, bs=3 > > > total: nonzeros=3.32649e+07, allocated > nonzeros=3.32649e+07 > > > total number of mallocs used during MatSetValues > calls =0 > > > using I-node (on process 0) routines: found 26829 > nodes, limit used is 5 > > > A01 > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=2583, rbs=3, cbs = 1 > > > total: nonzeros=292770, allocated nonzeros=292770 > > > total number of mallocs used during MatSetValues calls > =0 > > > using I-node (on process 0) routines: found 16098 > nodes, limit used is 5 > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583, rbs=3, cbs = 1 > > > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node (on process 0) routines > > > linear system matrix = precond matrix: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=186909, cols=186909 > > > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 26829 nodes, limit > used is 5 > > > KSPSolve completed > > > > > > > > > Giang > > > > > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley > wrote: > > > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui > wrote: > > > Hello > > > > > > I'm solving an indefinite problem arising from mesh tying/contact > using Lagrange multiplier, the matrix has the form > > > > > > K = [A P^T > > > P 0] > > > > > > I used the FIELDSPLIT preconditioner with one field is the main > variable (displacement) and the other field for dual variable (Lagrange > multiplier). The block size for each field is 3. According to the manual, I > first chose the preconditioner based on Schur complement to treat this > problem. 
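(As a rough sketch of the saddle-point setup described above, written against PETSc's C API instead of runtime options: the matrix K, right-hand side b, solution x, the two index sets isU/isLM for the displacement and Lagrange-multiplier dofs, and the PetscErrorCode ierr are assumed to be provided by the application; the sub-solver choices are left to the options database, as in the rest of this thread.)

/* Minimal sketch (not the poster's actual code): PCFIELDSPLIT with a
   Schur-complement split for K = [A P^T; P 0]. */
KSP ksp;
PC  pc;
ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
ierr = KSPSetOperators(ksp, K, K);CHKERRQ(ierr);
ierr = KSPSetType(ksp, KSPFGMRES);CHKERRQ(ierr);          /* flexible outer Krylov */
ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
ierr = PCFieldSplitSetIS(pc, "u",  isU);CHKERRQ(ierr);    /* field 0: displacement (prefix fieldsplit_u_)  */
ierr = PCFieldSplitSetIS(pc, "lu", isLM);CHKERRQ(ierr);   /* field 1: multipliers  (prefix fieldsplit_lu_) */
ierr = PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR);CHKERRQ(ierr);
ierr = PCFieldSplitSetSchurFactType(pc, PC_FIELDSPLIT_SCHUR_FACT_FULL);CHKERRQ(ierr);
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);              /* sub-solvers picked at runtime */
ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

With this in place, the MUMPS or hypre choices discussed in the thread are selected purely through -fieldsplit_u_* and -fieldsplit_lu_* options.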
> > > > > > > > > For any solver question, please send us the output of > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > However, I will comment below > > > > > > The parameters used for the solve is > > > -ksp_type gmres > > > > > > You need 'fgmres' here with the options you have below. > > > > > > -ksp_max_it 300 > > > -ksp_gmres_restart 300 > > > -ksp_gmres_modifiedgramschmidt > > > -pc_fieldsplit_type schur > > > -pc_fieldsplit_schur_fact_type diag > > > -pc_fieldsplit_schur_precondition selfp > > > > > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is > dense. Is there any reason to > > > believe that is a good preconditioner for your problem? > > > > > > > > > -pc_fieldsplit_detect_saddle_point > > > -fieldsplit_u_pc_type hypre > > > > > > I would just use MUMPS here to start, especially if it works on the > whole problem. Same with the one below. > > > > > > Matt > > > > > > -fieldsplit_u_pc_hypre_type boomeramg > > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > > -fieldsplit_lu_pc_type hypre > > > -fieldsplit_lu_pc_hypre_type boomeramg > > > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > > > > > For the test case, a small problem is solved on 2 processes. Due to > the decomposition, the contact only happens in 1 proc, so the size of > Lagrange multiplier dofs on proc 0 is 0. > > > > > > 0: mIndexU.size(): 80490 > > > 0: mIndexLU.size(): 0 > > > 1: mIndexU.size(): 103836 > > > 1: mIndexLU.size(): 2583 > > > > > > However, with this setup the solver takes very long at KSPSolve before > going to iteration, and the first iteration seems forever so I have to stop > the calculation. I guessed that the solver takes time to compute the Schur > complement, but according to the manual only the diagonal of A is used to > approximate the Schur complement, so it should not take long to compute > this. > > > > > > Note that I ran the same problem with direct solver (MUMPS) and it's > able to produce the valid results. The parameter for the solve is pretty > standard > > > -ksp_type preonly > > > -pc_type lu > > > -pc_factor_mat_solver_package mumps > > > > > > Hence the matrix/rhs must not have any problem here. Do you have any > idea or suggestion for this case? > > > > > > > > > Giang > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
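(For reference, one possible way to combine the diagnostic flags requested above with the simplifications suggested later in the thread: run on one process, full Schur factorization, MUMPS LU on the A00 block, and a monitored GMRES solve for the Schur complement. Here ./solver is only a placeholder executable name, and selfp builds the assembled approximation Sp = A11 - A10 inv(diag(A00)) A01 that the -ksp_view output describes.)

mpirun -np 1 ./solver \
  -ksp_type fgmres \
  -ksp_view -ksp_monitor_true_residual -ksp_converged_reason \
  -pc_type fieldsplit \
  -pc_fieldsplit_detect_saddle_point \
  -pc_fieldsplit_type schur \
  -pc_fieldsplit_schur_fact_type full \
  -pc_fieldsplit_schur_precondition selfp \
  -fieldsplit_u_ksp_type preonly \
  -fieldsplit_u_pc_type lu \
  -fieldsplit_u_pc_factor_mat_solver_package mumps \
  -fieldsplit_lu_ksp_type gmres \
  -fieldsplit_lu_ksp_monitor_true_residual \
  -fieldsplit_lu_pc_type lu \
  -fieldsplit_lu_pc_factor_mat_solver_package mumps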
URL: From bsmith at mcs.anl.gov Fri Sep 16 18:44:21 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 Sep 2016 18:44:21 -0500 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: <19EEF686-0334-46CD-A25D-4DFCA2B5D94B@mcs.anl.gov> Message-ID: <76457EBE-4527-4A5A-B247-3D6AC8371F79@mcs.anl.gov> > On Sep 16, 2016, at 6:09 PM, Hoang Giang Bui wrote: > > Hi Barry > > You are right, using MatCreateAIJ() eliminates the first issue. Previously I ran the mpi code with one process so A,B,C,D is all MPIAIJ > > And how about the second issue, this error will always be thrown if A11 is nonzero, which is my case? > > Nevertheless, I would like to report my simple finding: I changed the part around line 552 to I'm sorry what file are you talking about? What version of PETSc? What other lines of code are around 552? I can't figure out where you are doing this. Barry > > if (D) { > ierr = MatAXPY(*S, -1.0, D, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr); > } > > I could get ex42 works with > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > parameters: > mpirun -np 1 ex42 \ > -stokes_ksp_monitor \ > -stokes_ksp_type fgmres \ > -stokes_pc_type fieldsplit \ > -stokes_pc_fieldsplit_type schur \ > -stokes_pc_fieldsplit_schur_fact_type full \ > -stokes_pc_fieldsplit_schur_precondition full \ > -stokes_fieldsplit_u_ksp_type preonly \ > -stokes_fieldsplit_u_pc_type lu \ > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > -stokes_fieldsplit_p_ksp_type gmres \ > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > -stokes_fieldsplit_p_ksp_max_it 300 \ > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > -stokes_fieldsplit_p_pc_type lu \ > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps \ > > Output: > Residual norms for stokes_ solve. > 0 KSP Residual norm 1.327791371202e-02 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 1.651372938841e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.172753353368e+00 true resid norm 2.072348962892e-05 ||r(i)||/||b|| 3.588013522487e-04 > 2 KSP preconditioned resid norm 3.931379526610e-13 true resid norm 1.878299731917e-16 ||r(i)||/||b|| 3.252041503665e-15 > 1 KSP Residual norm 3.385960118582e-17 > > inner convergence is much better although 2 iterations (:-( ?? > > I also obtain the same convergence behavior for the problem with A11!=0 > > Please suggest if this makes sense, or I did something wrong. > > Giang > > On Fri, Sep 16, 2016 at 8:31 PM, Barry Smith wrote: > > Why is your C matrix an MPIAIJ matrix on one process? In general we recommend creating a SeqAIJ matrix for one process and MPIAIJ for multiple. You can use MatCreateAIJ() and it will always create the correct one. > > We could change the code as you suggest but I want to make sure that is the best solution in your case. > > Barry > > > > > On Sep 16, 2016, at 3:31 AM, Hoang Giang Bui wrote: > > > > Hi Matt > > > > I believed at line 523, src/ksp/ksp/utils/schurm.c > > > > ierr = MatMatMult(C, AinvB, MAT_INITIAL_MATRIX, fill, S);CHKERRQ(ierr); > > > > in my test case C is MPIAIJ and AinvB is SEQAIJ, hence it throws the error. > > > > In fact I guess there are two issues with it > > line 521, ierr = MatConvert(AinvBd, MATAIJ, MAT_INITIAL_MATRIX, &AinvB);CHKERRQ(ierr); > > shall we convert this to type of C matrix to ensure compatibility ? 
> > > > line 552, if(norm > PETSC_MACHINE_EPSILON) SETERRQ(PetscObjectComm((PetscObject) M), PETSC_ERR_SUP, "Not yet implemented for Schur complements with non-vanishing D"); > > with this the Schur complement with A11!=0 will be aborted > > > > Giang > > > > On Thu, Sep 15, 2016 at 4:28 PM, Matthew Knepley wrote: > > On Thu, Sep 15, 2016 at 9:07 AM, Hoang Giang Bui wrote: > > Hi Matt > > > > Thanks for the comment. After looking carefully into the manual again, the key take away is that with selfp there is no option to compute the exact Schur, there are only two options to approximate the inv(A00) for selfp, which are lump and diag (diag by default). I misunderstood this previously. > > > > There is online manual entry mentioned about PC_FIELDSPLIT_SCHUR_PRE_FULL, which is not documented elsewhere in the offline manual. I tried to access that by setting > > -pc_fieldsplit_schur_precondition full > > > > Yep, I wrote that specifically for testing, but its very slow so I did not document it to prevent people from complaining. > > > > but it gives the error > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Arguments are incompatible > > [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with B, seqaij > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > > [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui Thu Sep 15 15:46:56 2016 > > [0]PETSC ERROR: Configure options --with-shared-libraries --with-debugging=0 --with-pic --download-fblaslapack=yes --download-suitesparse --download-ptscotch=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes --download-mumps=yes --download-hypre=yes --download-ml=yes --download-pastix=yes --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 > > [0]PETSC ERROR: #1 MatMatMult() line 9514 in /home/hbui/sw/petsc-3.7.3/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line 526 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c > > [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > > > Please excuse me to insist on forming the exact Schur complement, but as you said, I would like to track down what creates problem in my code by starting from a very exact but ineffective solution. > > > > Sure, I understand. I do not understand how A can be MPI and B can be Seq. Do you know how that happens? > > > > Thanks, > > > > Matt > > > > Giang > > > > On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley wrote: > > On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui wrote: > > Dear Barry > > > > Thanks for the clarification. I got exactly what you said if the code changed to > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. 
> > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > 1 KSP Residual norm 3.997711925708e-17 > > > > but I guess we solve a different problem if B is used for the linear system. > > > > in addition, changed to > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > also works but inner iteration converged not in one iteration > > > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 > > 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 > > 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 > > 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 > > 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 > > 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 > > 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 > > 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 > > 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 > > 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 > > 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 > > 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 > > 1 KSP Residual norm 1.445282647127e-16 > > > > > > Seem like zero pivot does not happen, but why the solver for Schur takes 13 steps if the preconditioner is direct solver? > > > > Look at the -ksp_view. I will bet that the default is to shift (add a multiple of the identity) the matrix instead of failing. This > > gives an inexact PC, but as you see it can converge. > > > > Thanks, > > > > Matt > > > > > > I also so tried another problem which I known does have a nonsingular Schur (at least A11 != 0) and it also have the same problem: 1 step outer convergence but multiple step inner convergence. > > > > Any ideas? > > > > Giang > > > > On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: > > > > Normally you'd be absolutely correct to expect convergence in one iteration. However in this example note the call > > > > ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); > > > > It is solving the linear system defined by A but building the preconditioner (i.e. the entire fieldsplit process) from a different matrix B. Since A is not B you should not expect convergence in one iteration. If you change the code to > > > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > > > you will see exactly what you expect, convergence in one iteration. 
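(A compact restatement of the point Barry makes here, as a sketch that reuses the names A, B, and ksp_S quoted from ex42; error checking follows the style of the thread.)

ierr = KSPSetOperators(ksp_S, A, B);CHKERRQ(ierr);
/* A : the operator applied in the Krylov iteration, so the residual   */
/*     that is reported is for the system A x = b.                     */
/* B : the matrix the preconditioner (here the whole fieldsplit        */
/*     hierarchy) is assembled from. With A != B, even an "exact"      */
/*     preconditioner of B is only approximate for A, so convergence   */
/*     in one outer iteration cannot be expected.                      */

ierr = KSPSetOperators(ksp_S, B, B);CHKERRQ(ierr);
/* Operator and preconditioner now agree: the LU-based full Schur      */
/* factorization is exact and the outer solve converges in one         */
/* iteration -- but it is now the system B x = b that is being solved. */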
> > > > Sorry about this, the example is lacking clarity and documentation its author obviously knew too well what he was doing that he didn't realize everyone else in the world would need more comments in the code. If you change the code to > > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > > it will stop without being able to build the preconditioner because LU factorization of the Sp matrix will result in a zero pivot. This is why this "auxiliary" matrix B is used to define the preconditioner instead of A. > > > > Barry > > > > > > > > > > > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui wrote: > > > > > > Sorry I slept quite a while in this thread. Now I start to look at it again. In the last try, the previous setting doesn't work either (in fact diverge). So I would speculate if the Schur complement in my case is actually not invertible. It's also possible that the code is wrong somewhere. However, before looking at that, I want to understand thoroughly the settings for Schur complement > > > > > > I experimented ex42 with the settings: > > > mpirun -np 1 ex42 \ > > > -stokes_ksp_monitor \ > > > -stokes_ksp_type fgmres \ > > > -stokes_pc_type fieldsplit \ > > > -stokes_pc_fieldsplit_type schur \ > > > -stokes_pc_fieldsplit_schur_fact_type full \ > > > -stokes_pc_fieldsplit_schur_precondition selfp \ > > > -stokes_fieldsplit_u_ksp_type preonly \ > > > -stokes_fieldsplit_u_pc_type lu \ > > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > > -stokes_fieldsplit_p_ksp_type gmres \ > > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > > -stokes_fieldsplit_p_pc_type lu \ > > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > > > > > In my understanding, the solver should converge in 1 (outer) step. Execution gives: > > > Residual norms for stokes_ solve. > > > 0 KSP Residual norm 1.327791371202e-02 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > > 1 KSP Residual norm 7.656238881621e-04 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > > > 2 KSP Residual norm 9.895645456398e-06 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > > > 3 KSP Residual norm 1.608206702571e-06 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > > > 4 KSP Residual norm 2.453618999882e-07 > > > Residual norms for stokes_fieldsplit_p_ solve. 
> > > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > > > 5 KSP Residual norm 4.230399264750e-08 > > > > > > Looks like the "selfp" does construct the Schur nicely. But does "full" really construct the full block preconditioner? > > > > > > Giang > > > P/S: I'm also generating a smaller size of the previous problem for checking again. > > > > > > > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley wrote: > > > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui wrote: > > > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > > believe that is a good preconditioner for your problem? > > > > > > This is the first approach to the problem, so I chose the most simple setting. Do you have any other recommendation? > > > > > > This is in no way the simplest PC. We need to make it simpler first. > > > > > > 1) Run on only 1 proc > > > > > > 2) Use -pc_fieldsplit_schur_fact_type full > > > > > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_true_residual > > > > > > This should converge in 1 outer iteration, but we will see how good your Schur complement preconditioner > > > is for this problem. > > > > > > You need to start out from something you understand and then start making approximations. > > > > > > Matt > > > > > > For any solver question, please send us the output of > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > I sent here the full output (after changed to fgmres), again it takes long at the first iteration but after that, it does not converge > > > > > > -ksp_type fgmres > > > -ksp_max_it 300 > > > -ksp_gmres_restart 300 > > > -ksp_gmres_modifiedgramschmidt > > > -pc_fieldsplit_type schur > > > -pc_fieldsplit_schur_fact_type diag > > > -pc_fieldsplit_schur_precondition selfp > > > -pc_fieldsplit_detect_saddle_point > > > -fieldsplit_u_ksp_type preonly > > > -fieldsplit_u_pc_type lu > > > -fieldsplit_u_pc_factor_mat_solver_package mumps > > > -fieldsplit_lu_ksp_type preonly > > > -fieldsplit_lu_pc_type lu > > > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > > > > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > > > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > > > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > > > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > > > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > > > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > > > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > > > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > > > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm 
1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > > > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > > > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > > > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 > > > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 > > > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > > > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > > > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > > > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > > > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > > > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > > > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > > > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > > > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > > > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > > > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > > > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > > > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > > > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > > > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > > > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > > > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > > > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > > > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > > > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > > > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > > > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > > > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > > > 37 KSP 
unpreconditioned resid norm 1.804298150360e+02 true resid norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > > > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > > > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 > > > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > > > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 > > > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > > > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > > > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > > > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 > > > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > > > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 > > > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > > > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > > > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > > > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > > > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > > > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > > > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > > > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > > > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > > > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > > > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > > > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > > > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > > > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > > > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > > > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > > > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm 
1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > > > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > > > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 > > > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > > > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > > > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > > > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > > > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > > > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > > > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > > > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > > > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > > > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > > > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > > > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > > > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > > > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > > > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > > > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > > > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > > > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > > > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > > > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > > > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > > > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > > > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > > > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > > > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > > > 92 KSP 
unpreconditioned resid norm 1.469110799022e+02 true resid norm 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 > > > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 > > > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > > > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > > > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > > > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > > > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > > > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > > > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 > > > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > > > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > > > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > > > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > > > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > > > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > > > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > > > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > > > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > > > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > > > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > > > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > > > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > > > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > > > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > > > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > > > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > > > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > > > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true 
resid norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > > > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 > > > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > > > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > > > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > > > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > > > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > > > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > > > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > > > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > > > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > > > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 > > > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > > > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 > > > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > > > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > > > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > > > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > > > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > > > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > > > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > > > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > > > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > > > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > > > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > > > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > > > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 > > > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm 1.481025707968e+02 ||r(i)||/||b|| 
4.875367495378e-05 > > > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > > > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > > > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > > > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > > > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > > > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > > > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > > > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > > > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > > > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > > > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 > > > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > > > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > > > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 > > > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > > > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > > > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > > > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > > > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > > > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > > > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > > > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > > > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > > > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > > > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > > > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > > > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > > > 174 KSP 
unpreconditioned resid norm 1.400573078320e+02 true resid norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > > > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > > > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > > > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > > > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > > > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > > > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 > > > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > > > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > > > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > > > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > > > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 > > > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > > > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > > > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > > > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > > > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > > > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > > > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > > > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > > > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > > > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > > > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > > > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > > > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 > > > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > > > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > > > 201 KSP unpreconditioned resid norm 
1.263813235463e+02 true resid norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > > > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > > > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > > > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > > > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > > > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > > > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > > > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > > > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 > > > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > > > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > > > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > > > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > > > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > > > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > > > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > > > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > > > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > > > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > > > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > > > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > > > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > > > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > > > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > > > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > > > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > > > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > > > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm 
7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > > > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > > > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > > > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > > > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > > > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > > > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > > > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > > > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > > > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > > > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > > > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 > > > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > > > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > > > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > > > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > > > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > > > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > > > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > > > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > > > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > > > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > > > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > > > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 > > > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > > > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > > > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > > > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm 7.094899926215e+01 ||r(i)||/||b|| 
2.335560030938e-05 > > > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > > > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > > > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > > > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > > > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > > > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > > > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > > > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > > > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > > > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > > > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > > > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > > > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > > > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > > > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > > > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > > > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > > > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > > > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > > > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > > > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > > > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > > > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > > > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > > > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > > > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > > > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 > > > 283 KSP 
unpreconditioned resid norm 1.099379061087e+00 true resid norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > > > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > > > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > > > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > > > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > > > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > > > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > > > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > > > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 > > > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > > > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > > > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > > > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > > > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 > > > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > > > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > > > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > > > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > > > Linear solve did not converge due to DIVERGED_ITS iterations 300 > > > KSP Object: 2 MPI processes > > > type: fgmres > > > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > > > GMRES: happy breakdown tolerance 1e-30 > > > maximum iterations=300, initial guess is zero > > > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > > > right preconditioning > > > using UNPRECONDITIONED norm type for convergence test > > > PC Object: 2 MPI processes > > > type: fieldsplit > > > FieldSplit with Schur preconditioner, factorization DIAG > > > Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse > > > Split info: > > > Split number 0 Defined by IS > > > Split number 1 Defined by IS > > > KSP solver for A00 block > > > KSP Object: (fieldsplit_u_) 2 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_u_) 2 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for 
zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326 > > > package used to perform factorization: mumps > > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > > total number of mallocs used during MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): 1 > > > ICNTL(13) (efficiency control): 0 > > > ICNTL(14) (percentage of estimated workspace increase): 20 > > > ICNTL(18) (input mat struct): 3 > > > ICNTL(19) (Shur complement info): 0 > > > ICNTL(20) (rhs sparse pattern): 0 > > > ICNTL(21) (solution struct): 1 > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): 0 > > > ICNTL(25) (computation of a null space basis): 0 > > > ICNTL(26) (Schur options for rhs or solution): 0 > > > ICNTL(27) (experimental parameter): -24 > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > ICNTL(29) (parallel ordering): 0 > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > ICNTL(33) (compute determinant): 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0 > > > CNTL(4) (value of static pivoting): -1 > > > CNTL(5) (fixation for null pivots): 0 > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > [0] 4.2839e+08 > > > [1] 3.799e+08 > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > [0] 90423 > > > [1] 93903 > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > > INFOG(6) (number of nodes in the complete tree): 2812 > > > INFOG(7) (ordering option effectively use after analysis): 5 > > > INFOG(8) (structural symmetry in percent of the permuted matrix after 
analysis): 100 > > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > INFOG(14) (number of memory compress after factorization): 0 > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > > INFOG(20) (estimated number of entries in the factors): 403041366 > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > > INFOG(32) (after analysis: type of analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_u_) 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326, bs=3 > > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > KSP solver for S = A11 - A10 inv(A00) A01 > > > KSP Object: (fieldsplit_lu_) 2 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_lu_) 2 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583 > > > package used to perform factorization: mumps > > > total: nonzeros=2.17621e+06, allocated nonzeros=2.17621e+06 > > > total number of mallocs used during MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat 
struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): 1 > > > ICNTL(13) (efficiency control): 0 > > > ICNTL(14) (percentage of estimated workspace increase): 20 > > > ICNTL(18) (input mat struct): 3 > > > ICNTL(19) (Shur complement info): 0 > > > ICNTL(20) (rhs sparse pattern): 0 > > > ICNTL(21) (solution struct): 1 > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): 0 > > > ICNTL(25) (computation of a null space basis): 0 > > > ICNTL(26) (Schur options for rhs or solution): 0 > > > ICNTL(27) (experimental parameter): -24 > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > ICNTL(29) (parallel ordering): 0 > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > ICNTL(33) (compute determinant): 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0 > > > CNTL(4) (value of static pivoting): -1 > > > CNTL(5) (fixation for null pivots): 0 > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > [0] 5.12794e+08 > > > [1] 5.02142e+08 > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > [0] 815031 > > > [1] 745263 > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > [0] 5.12794e+08 > > > [1] 5.02142e+08 > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > > [0] 34 > > > [1] 34 > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > [0] 34 > > > [1] 34 > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > [0] 1158 > > > [1] 1425 > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.01494e+09 > > > RINFOG(2) (global estimated flops for the assembly after factorization): 1.56029e+06 > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.01494e+09 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 2176209 > > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 14427 > > > INFOG(5) (estimated maximum front size in the complete tree): 699 > > > INFOG(6) (number of nodes in the complete tree): 15 > > > INFOG(7) (ordering option effectively use after analysis): 2 > > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 2176209 > > > INFOG(10) (total integer space store the matrix factors after factorization): 14427 > > > INFOG(11) (order of largest frontal matrix after factorization): 699 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > INFOG(14) (number of memory compress after factorization): 0 > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 34 > 
> > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 68 > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 34 > > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 68 > > > INFOG(20) (estimated number of entries in the factors): 2176209 > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 30 > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 59 > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 2176209 > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 16, 32 > > > INFOG(32) (after analysis: type of analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > linear system matrix followed by preconditioner matrix: > > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > > type: schurcomplement > > > rows=2583, cols=2583 > > > Schur complement A11 - A10 inv(A00) A01 > > > A11 > > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583, bs=3 > > > total: nonzeros=117369, allocated nonzeros=117369 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node (on process 0) routines > > > A10 > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=184326, rbs=3, cbs = 1 > > > total: nonzeros=292770, allocated nonzeros=292770 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node (on process 0) routines > > > KSP of A00 > > > KSP Object: (fieldsplit_u_) 2 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_u_) 2 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326 > > > package used to perform factorization: mumps > > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > > total number of mallocs used during MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): 1 > > > ICNTL(13) (efficiency control): 0 > > > ICNTL(14) (percentage of estimated 
workspace increase): 20 > > > ICNTL(18) (input mat struct): 3 > > > ICNTL(19) (Shur complement info): 0 > > > ICNTL(20) (rhs sparse pattern): 0 > > > ICNTL(21) (solution struct): 1 > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): 0 > > > ICNTL(25) (computation of a null space basis): 0 > > > ICNTL(26) (Schur options for rhs or solution): 0 > > > ICNTL(27) (experimental parameter): -24 > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > ICNTL(29) (parallel ordering): 0 > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > ICNTL(33) (compute determinant): 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0 > > > CNTL(4) (value of static pivoting): -1 > > > CNTL(5) (fixation for null pivots): 0 > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > [0] 4.2839e+08 > > > [1] 3.799e+08 > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > [0] 90423 > > > [1] 93903 > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > > INFOG(6) (number of nodes in the complete tree): 2812 > > > INFOG(7) (ordering option effectively use after analysis): 5 > > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > INFOG(14) (number of memory compress after factorization): 0 > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > > INFOG(19) (size of all MUMPS internal data 
allocated during factorization: sum over all processors): 5270 > > > INFOG(20) (estimated number of entries in the factors): 403041366 > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > > INFOG(32) (after analysis: type of analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_u_) 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326, bs=3 > > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > A01 > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=2583, rbs=3, cbs = 1 > > > total: nonzeros=292770, allocated nonzeros=292770 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 16098 nodes, limit used is 5 > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583, rbs=3, cbs = 1 > > > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node (on process 0) routines > > > linear system matrix = precond matrix: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=186909, cols=186909 > > > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > KSPSolve completed > > > > > > > > > Giang > > > > > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley wrote: > > > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui wrote: > > > Hello > > > > > > I'm solving an indefinite problem arising from mesh tying/contact using Lagrange multiplier, the matrix has the form > > > > > > K = [A P^T > > > P 0] > > > > > > I used the FIELDSPLIT preconditioner with one field is the main variable (displacement) and the other field for dual variable (Lagrange multiplier). The block size for each field is 3. According to the manual, I first chose the preconditioner based on Schur complement to treat this problem. > > > > > > > > > For any solver question, please send us the output of > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > However, I will comment below > > > > > > The parameters used for the solve is > > > -ksp_type gmres > > > > > > You need 'fgmres' here with the options you have below. 
> > > > > > -ksp_max_it 300 > > > -ksp_gmres_restart 300 > > > -ksp_gmres_modifiedgramschmidt > > > -pc_fieldsplit_type schur > > > -pc_fieldsplit_schur_fact_type diag > > > -pc_fieldsplit_schur_precondition selfp > > > > > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > > believe that is a good preconditioner for your problem? > > > > > > > > > -pc_fieldsplit_detect_saddle_point > > > -fieldsplit_u_pc_type hypre > > > > > > I would just use MUMPS here to start, especially if it works on the whole problem. Same with the one below. > > > > > > Matt > > > > > > -fieldsplit_u_pc_hypre_type boomeramg > > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > > -fieldsplit_lu_pc_type hypre > > > -fieldsplit_lu_pc_hypre_type boomeramg > > > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > > > > > For the test case, a small problem is solved on 2 processes. Due to the decomposition, the contact only happens in 1 proc, so the size of Lagrange multiplier dofs on proc 0 is 0. > > > > > > 0: mIndexU.size(): 80490 > > > 0: mIndexLU.size(): 0 > > > 1: mIndexU.size(): 103836 > > > 1: mIndexLU.size(): 2583 > > > > > > However, with this setup the solver takes very long at KSPSolve before going to iteration, and the first iteration seems forever so I have to stop the calculation. I guessed that the solver takes time to compute the Schur complement, but according to the manual only the diagonal of A is used to approximate the Schur complement, so it should not take long to compute this. > > > > > > Note that I ran the same problem with direct solver (MUMPS) and it's able to produce the valid results. The parameter for the solve is pretty standard > > > -ksp_type preonly > > > -pc_type lu > > > -pc_factor_mat_solver_package mumps > > > > > > Hence the matrix/rhs must not have any problem here. Do you have any idea or suggestion for this case? > > > > > > > > > Giang > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > From gotofd at gmail.com Fri Sep 16 19:52:14 2016 From: gotofd at gmail.com (Ji Zhang) Date: Sat, 17 Sep 2016 08:52:14 +0800 Subject: [petsc-users] How to create a local to global mapping and construct matrix correctly In-Reply-To: <7B805207-FBC7-4286-8D0D-331BBE26C6D5@mcs.anl.gov> References: <7B805207-FBC7-4286-8D0D-331BBE26C6D5@mcs.anl.gov> Message-ID: Sorry. What I mean is that, for example, I have a matrix [a1, a2, a3] mij = [b1, b2, b3] , [c1, c2, c3] and using 3 cups. Thus, mij in cpu 2 is mij_2 = [b1, b2, b3] . The local index of element b1 is (1, 1) and it's global index is (2, 1). How can I get the global index from the local index, and local index from global index? Thanks. 
2016-09-17 Best, Regards, Zhang Ji Beijing Computational Science Research Center E-mail: gotofd at gmail.com Wayne On Sat, Sep 17, 2016 at 2:24 AM, Barry Smith wrote: > > "Gives wrong answers" is not very informative. What answer do you > expect and what answer do you get? > > Note that each process is looping over mSizes? > > for i in range(len(mSizes)): > for j in range(len(mSizes)): > > Is this what you want? It doesn't seem likely that you want all > processes to generate all information in the matrix. Each process should be > doing a subset of the generation. > > Barry > > > On Sep 16, 2016, at 11:03 AM, Ji Zhang wrote: > > > > Dear all, > > > > I have a number of small 'mpidense' matrices mij, and I want to > construct them to a big 'mpidense' matrix M like this: > > [ m11 m12 m13 ] > > M = | m21 m22 m23 | , > > [ m31 m32 m33 ] > > > > And a short demo is below. I'm using python, but their grammar are > similar. > > import numpy as np > > from petsc4py import PETSc > > import sys, petsc4py > > > > > > petsc4py.init(sys.argv) > > mSizes = (2, 2) > > mij = [] > > > > # create sub-matrices mij > > for i in range(len(mSizes)): > > for j in range(len(mSizes)): > > temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) > > temp_m.setType('mpidense') > > temp_m.setFromOptions() > > temp_m.setUp() > > temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) > > temp_m.assemble() > > temp_m.view() > > mij.append(temp_m) > > > > # Now we have four sub-matrices. I would like to construct them into a > big matrix M. > > M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) > > M.setType('mpidense') > > M.setFromOptions() > > M.setUp() > > mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, > mSizes] > > for i in range(len(mSizes)): > > for j in range(len(mSizes)): > > temp_m = mij[i*len(mSizes)+j].getDenseArray() > > for k in range(temp_m.shape[0]): > > M.setValues(mLocations[i]+k, np.arange(mLocations[j], > mLocations[j+1],dtype='int32'), temp_m[k, :]) > > M.assemble() > > M.view() > > The code works well in a single cup, but give wrong answer for 2 and > more cores. > > > > Thanks. > > 2016-09-17 > > Best, > > Regards, > > Zhang Ji > > Beijing Computational Science Research Center > > E-mail: gotofd at gmail.com > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 16 20:00:42 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 Sep 2016 20:00:42 -0500 Subject: [petsc-users] How to create a local to global mapping and construct matrix correctly In-Reply-To: References: <7B805207-FBC7-4286-8D0D-331BBE26C6D5@mcs.anl.gov> Message-ID: <7E4B3E07-AD8A-407D-92F3-81D265833C81@mcs.anl.gov> > On Sep 16, 2016, at 7:52 PM, Ji Zhang wrote: > > Sorry. What I mean is that, for example, I have a matrix > [a1, a2, a3] > mij = [b1, b2, b3] , > [c1, c2, c3] > and using 3 cups. Thus, mij in cpu 2 is > mij_2 = [b1, b2, b3] . > > The local index of element b1 is (1, 1) and it's global index is (2, 1). How can I get the global index from the local index, and local index from global index? That is something your code needs to generate and deal with, it is not something PETSc can do for you directly. You are defining the little m's and the big M and deciding where to put the little m's into the big ends. 
PETSc/we have no idea what the little m's represent in terms of the big M and where they would belong, that is completely the business of your application. Barry > > Thanks. > 2016-09-17 > Best, > Regards, > Zhang Ji > Beijing Computational Science Research Center > E-mail: gotofd at gmail.com > > > > > Wayne > > On Sat, Sep 17, 2016 at 2:24 AM, Barry Smith wrote: > > "Gives wrong answers" is not very informative. What answer do you expect and what answer do you get? > > Note that each process is looping over mSizes? > > for i in range(len(mSizes)): > for j in range(len(mSizes)): > > Is this what you want? It doesn't seem likely that you want all processes to generate all information in the matrix. Each process should be doing a subset of the generation. > > Barry > > > On Sep 16, 2016, at 11:03 AM, Ji Zhang wrote: > > > > Dear all, > > > > I have a number of small 'mpidense' matrices mij, and I want to construct them to a big 'mpidense' matrix M like this: > > [ m11 m12 m13 ] > > M = | m21 m22 m23 | , > > [ m31 m32 m33 ] > > > > And a short demo is below. I'm using python, but their grammar are similar. > > import numpy as np > > from petsc4py import PETSc > > import sys, petsc4py > > > > > > petsc4py.init(sys.argv) > > mSizes = (2, 2) > > mij = [] > > > > # create sub-matrices mij > > for i in range(len(mSizes)): > > for j in range(len(mSizes)): > > temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) > > temp_m.setType('mpidense') > > temp_m.setFromOptions() > > temp_m.setUp() > > temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) > > temp_m.assemble() > > temp_m.view() > > mij.append(temp_m) > > > > # Now we have four sub-matrices. I would like to construct them into a big matrix M. > > M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) > > M.setType('mpidense') > > M.setFromOptions() > > M.setUp() > > mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, mSizes] > > for i in range(len(mSizes)): > > for j in range(len(mSizes)): > > temp_m = mij[i*len(mSizes)+j].getDenseArray() > > for k in range(temp_m.shape[0]): > > M.setValues(mLocations[i]+k, np.arange(mLocations[j],mLocations[j+1],dtype='int32'), temp_m[k, :]) > > M.assemble() > > M.view() > > The code works well in a single cup, but give wrong answer for 2 and more cores. > > > > Thanks. > > 2016-09-17 > > Best, > > Regards, > > Zhang Ji > > Beijing Computational Science Research Center > > E-mail: gotofd at gmail.com > > > > > > From gotofd at gmail.com Fri Sep 16 21:00:09 2016 From: gotofd at gmail.com (Ji Zhang) Date: Sat, 17 Sep 2016 10:00:09 +0800 Subject: [petsc-users] How to create a local to global mapping and construct matrix correctly In-Reply-To: <7E4B3E07-AD8A-407D-92F3-81D265833C81@mcs.anl.gov> References: <7B805207-FBC7-4286-8D0D-331BBE26C6D5@mcs.anl.gov> <7E4B3E07-AD8A-407D-92F3-81D265833C81@mcs.anl.gov> Message-ID: Thanks for your previous suggestion and the construction from little m to big M have accomplished. For a MPI program, a arbitrary matrix is been shorted in different cups (i.e. 3), and each cup only contain part of then. So I think the matrix have two kinds of indexes, a local one indicate the location of values at the corresponding cup, and the global one indicate the location at the whole matrix. I would like to know the relation between them and find the way to shift the index from one to another. 
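To make that local/global bookkeeping concrete: for a row-distributed PETSc matrix (such as the 'mpidense' M in the demo quoted above) each process owns one contiguous block of global rows, and getOwnershipRange() returns exactly that block, so translating between a local row index and a global row index is just an offset by the start of the owned range. Below is a minimal petsc4py sketch under that assumption; the sizes and the helper functions local_to_global_row / global_to_local_row are illustrative only, not PETSc API. Note that each process generates and sets only the rows it owns, which is also the fix for the "wrong answer for 2 and more cores" symptom mentioned above.

import sys, petsc4py
petsc4py.init(sys.argv)
import numpy as np
from petsc4py import PETSc

nrows, ncols = 4, 3
M = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
M.setSizes(((None, nrows), (None, ncols)))
M.setType('mpidense')
M.setUp()

rstart, rend = M.getOwnershipRange()   # global rows [rstart, rend) live on this process

def local_to_global_row(i_local):
    # local row i_local on this process is global row rstart + i_local
    return rstart + i_local

def global_to_local_row(i_global):
    # inverse map; returns None if this process does not own that global row
    return i_global - rstart if rstart <= i_global < rend else None

# each process fills only its own rows, addressing them by global index
for i_global in range(rstart, rend):
    row = np.full(ncols, float(i_global))
    M.setValues(i_global, np.arange(ncols, dtype='int32'), row)
M.assemble()
M.view()

For 'mpidense' each process stores every column of the rows it owns, so only the row index needs translating; for the general case PETSc also offers ISLocalToGlobalMapping / MatSetLocalToGlobalMapping if you prefer to work entirely in local indices.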
I have one more question, the function VecGetArray() only return a pointer to the local data array. What should I do if I need a pointer to the whole data array? Wayne On Sat, Sep 17, 2016 at 9:00 AM, Barry Smith wrote: > > > On Sep 16, 2016, at 7:52 PM, Ji Zhang wrote: > > > > Sorry. What I mean is that, for example, I have a matrix > > [a1, a2, a3] > > mij = [b1, b2, b3] , > > [c1, c2, c3] > > and using 3 cups. Thus, mij in cpu 2 is > > mij_2 = [b1, b2, b3] . > > > > The local index of element b1 is (1, 1) and it's global index is (2, 1). > How can I get the global index from the local index, and local index from > global index? > > That is something your code needs to generate and deal with, it is not > something PETSc can do for you directly. You are defining the little m's > and the big M and deciding where to put the little m's into the big ends. > PETSc/we have no idea what the little m's represent in terms of the big M > and where they would belong, that is completely the business of your > application. > > Barry > > > > > > > Thanks. > > 2016-09-17 > > Best, > > Regards, > > Zhang Ji > > Beijing Computational Science Research Center > > E-mail: gotofd at gmail.com > > > > > > > > > > Wayne > > > > On Sat, Sep 17, 2016 at 2:24 AM, Barry Smith wrote: > > > > "Gives wrong answers" is not very informative. What answer do you > expect and what answer do you get? > > > > Note that each process is looping over mSizes? > > > > for i in range(len(mSizes)): > > for j in range(len(mSizes)): > > > > Is this what you want? It doesn't seem likely that you want all > processes to generate all information in the matrix. Each process should be > doing a subset of the generation. > > > > Barry > > > > > On Sep 16, 2016, at 11:03 AM, Ji Zhang wrote: > > > > > > Dear all, > > > > > > I have a number of small 'mpidense' matrices mij, and I want to > construct them to a big 'mpidense' matrix M like this: > > > [ m11 m12 m13 ] > > > M = | m21 m22 m23 | , > > > [ m31 m32 m33 ] > > > > > > And a short demo is below. I'm using python, but their grammar are > similar. > > > import numpy as np > > > from petsc4py import PETSc > > > import sys, petsc4py > > > > > > > > > petsc4py.init(sys.argv) > > > mSizes = (2, 2) > > > mij = [] > > > > > > # create sub-matrices mij > > > for i in range(len(mSizes)): > > > for j in range(len(mSizes)): > > > temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > > temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) > > > temp_m.setType('mpidense') > > > temp_m.setFromOptions() > > > temp_m.setUp() > > > temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) > > > temp_m.assemble() > > > temp_m.view() > > > mij.append(temp_m) > > > > > > # Now we have four sub-matrices. I would like to construct them into a > big matrix M. > > > M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > > M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) > > > M.setType('mpidense') > > > M.setFromOptions() > > > M.setUp() > > > mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, > mSizes] > > > for i in range(len(mSizes)): > > > for j in range(len(mSizes)): > > > temp_m = mij[i*len(mSizes)+j].getDenseArray() > > > for k in range(temp_m.shape[0]): > > > M.setValues(mLocations[i]+k, np.arange(mLocations[j], > mLocations[j+1],dtype='int32'), temp_m[k, :]) > > > M.assemble() > > > M.view() > > > The code works well in a single cup, but give wrong answer for 2 and > more cores. > > > > > > Thanks. 
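Since the VecGetArray() question at the top of this message comes up often: VecGetArray() is local by design, and the reply below points to VecScatterCreateToAll(). A rough petsc4py sketch of that route follows; it assumes petsc4py exposes the call as PETSc.Scatter.toAll() (check the Scatter class of your petsc4py version for the exact spelling), and, as the reply warns, gathering the whole vector onto every process does not scale to large problems.

import sys, petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

comm = PETSc.COMM_WORLD
v = PETSc.Vec().createMPI(8, comm=comm)   # a distributed vector, 8 global entries
rstart, rend = v.getOwnershipRange()
v.setValues(list(range(rstart, rend)), [float(i) for i in range(rstart, rend)])
v.assemblyBegin()
v.assemblyEnd()

# gather a full copy of v onto every process (VecScatterCreateToAll in C)
scatter, vfull = PETSc.Scatter.toAll(v)
scatter.scatter(v, vfull, PETSc.InsertMode.INSERT_VALUES, PETSc.ScatterMode.FORWARD)
full_array = vfull.getArray()             # every rank now holds a copy of all entries
print(comm.getRank(), full_array)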
> > > 2016-09-17 > > > Best, > > > Regards, > > > Zhang Ji > > > Beijing Computational Science Research Center > > > E-mail: gotofd at gmail.com > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 16 21:28:17 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 Sep 2016 21:28:17 -0500 Subject: [petsc-users] How to create a local to global mapping and construct matrix correctly In-Reply-To: References: <7B805207-FBC7-4286-8D0D-331BBE26C6D5@mcs.anl.gov> <7E4B3E07-AD8A-407D-92F3-81D265833C81@mcs.anl.gov> Message-ID: <3F3AD1DF-FE74-480A-AC90-DB15B8C2CC2E@mcs.anl.gov> > On Sep 16, 2016, at 9:00 PM, Ji Zhang wrote: > > Thanks for your previous suggestion and the construction from little m to big M have accomplished. > > For a MPI program, a arbitrary matrix is been shorted in different cups (i.e. 3), and each cup only contain part of then. So I think the matrix have two kinds of indexes, a local one indicate the location of values at the corresponding cup, and the global one indicate the location at the whole matrix. I would like to know the relation between them and find the way to shift the index from one to another. This depends on what the little matrices are that you are putting into the large matrix. For PDE type problems you can look at the PETSc KSP and SNES tutorial examples but for your problem I don't know. > > I have one more question, the function VecGetArray() only return a pointer to the local data array. What should I do if I need a pointer to the whole data array? VecScatterCreateToAll() but note this is not scalable for very large problems > > Wayne > > On Sat, Sep 17, 2016 at 9:00 AM, Barry Smith wrote: > > > On Sep 16, 2016, at 7:52 PM, Ji Zhang wrote: > > > > Sorry. What I mean is that, for example, I have a matrix > > [a1, a2, a3] > > mij = [b1, b2, b3] , > > [c1, c2, c3] > > and using 3 cups. Thus, mij in cpu 2 is > > mij_2 = [b1, b2, b3] . > > > > The local index of element b1 is (1, 1) and it's global index is (2, 1). How can I get the global index from the local index, and local index from global index? > > That is something your code needs to generate and deal with, it is not something PETSc can do for you directly. You are defining the little m's and the big M and deciding where to put the little m's into the big ends. PETSc/we have no idea what the little m's represent in terms of the big M and where they would belong, that is completely the business of your application. > > Barry > > > > > > > Thanks. > > 2016-09-17 > > Best, > > Regards, > > Zhang Ji > > Beijing Computational Science Research Center > > E-mail: gotofd at gmail.com > > > > > > > > > > Wayne > > > > On Sat, Sep 17, 2016 at 2:24 AM, Barry Smith wrote: > > > > "Gives wrong answers" is not very informative. What answer do you expect and what answer do you get? > > > > Note that each process is looping over mSizes? > > > > for i in range(len(mSizes)): > > for j in range(len(mSizes)): > > > > Is this what you want? It doesn't seem likely that you want all processes to generate all information in the matrix. Each process should be doing a subset of the generation. > > > > Barry > > > > > On Sep 16, 2016, at 11:03 AM, Ji Zhang wrote: > > > > > > Dear all, > > > > > > I have a number of small 'mpidense' matrices mij, and I want to construct them to a big 'mpidense' matrix M like this: > > > [ m11 m12 m13 ] > > > M = | m21 m22 m23 | , > > > [ m31 m32 m33 ] > > > > > > And a short demo is below. 
I'm using python, but their grammar are similar. > > > import numpy as np > > > from petsc4py import PETSc > > > import sys, petsc4py > > > > > > > > > petsc4py.init(sys.argv) > > > mSizes = (2, 2) > > > mij = [] > > > > > > # create sub-matrices mij > > > for i in range(len(mSizes)): > > > for j in range(len(mSizes)): > > > temp_m = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > > temp_m.setSizes(((None, mSizes[i]), (None, mSizes[j]))) > > > temp_m.setType('mpidense') > > > temp_m.setFromOptions() > > > temp_m.setUp() > > > temp_m[:, :] = np.random.random_sample((mSizes[i], mSizes[j])) > > > temp_m.assemble() > > > temp_m.view() > > > mij.append(temp_m) > > > > > > # Now we have four sub-matrices. I would like to construct them into a big matrix M. > > > M = PETSc.Mat().create(comm=PETSc.COMM_WORLD) > > > M.setSizes(((None, np.sum(mSizes)), (None, np.sum(mSizes)))) > > > M.setType('mpidense') > > > M.setFromOptions() > > > M.setUp() > > > mLocations = np.insert(np.cumsum(mSizes), 0, 0) # mLocations = [0, mSizes] > > > for i in range(len(mSizes)): > > > for j in range(len(mSizes)): > > > temp_m = mij[i*len(mSizes)+j].getDenseArray() > > > for k in range(temp_m.shape[0]): > > > M.setValues(mLocations[i]+k, np.arange(mLocations[j],mLocations[j+1],dtype='int32'), temp_m[k, :]) > > > M.assemble() > > > M.view() > > > The code works well in a single cup, but give wrong answer for 2 and more cores. > > > > > > Thanks. > > > 2016-09-17 > > > Best, > > > Regards, > > > Zhang Ji > > > Beijing Computational Science Research Center > > > E-mail: gotofd at gmail.com > > > > > > > > > > > > From hgbk2008 at gmail.com Sat Sep 17 01:49:57 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Sat, 17 Sep 2016 08:49:57 +0200 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: <76457EBE-4527-4A5A-B247-3D6AC8371F79@mcs.anl.gov> References: <19EEF686-0334-46CD-A25D-4DFCA2B5D94B@mcs.anl.gov> <76457EBE-4527-4A5A-B247-3D6AC8371F79@mcs.anl.gov> Message-ID: I'm specifically looking into src/ksp/ksp/utils/schurm.c, petsc 3.7.3 The link is: https://bitbucket.org/petsc/petsc/src/2077e624e7fbbda0ee00455afb91c6183e71919a/src/ksp/ksp/utils/schurm.c?at=v3.7.3&fileviewer=file-view-default#L548-557 Giang On Sat, Sep 17, 2016 at 1:44 AM, Barry Smith wrote: > > > On Sep 16, 2016, at 6:09 PM, Hoang Giang Bui wrote: > > > > Hi Barry > > > > You are right, using MatCreateAIJ() eliminates the first issue. > Previously I ran the mpi code with one process so A,B,C,D is all MPIAIJ > > > > And how about the second issue, this error will always be thrown if A11 > is nonzero, which is my case? > > > > Nevertheless, I would like to report my simple finding: I changed the > part around line 552 to > > I'm sorry what file are you talking about? What version of PETSc? What > other lines of code are around 552? I can't figure out where you are doing > this. 
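On the MatCreateAIJ() point above: the generic MATAIJ type does the sequential/parallel dispatch automatically, which is the reason for the recommendation quoted below to let MatCreateAIJ() pick the concrete class from the communicator size. A small sketch of the petsc4py-side equivalent, assuming a square matrix of global size n with placeholder diagonal entries:

import sys, petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

n = 10
A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
A.setSizes(((None, n), (None, n)))
A.setType('aij')            # resolves to 'seqaij' on 1 process, 'mpiaij' on more
A.setUp()

rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):
    A.setValue(i, i, 2.0)   # simple diagonal entries, just to have something to assemble
A.assemble()
print(A.getType())          # 'seqaij' under mpirun -np 1, 'mpiaij' under -np 2 and up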
> > Barry > > > > > if (D) { > > ierr = MatAXPY(*S, -1.0, D, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr); > > } > > > > I could get ex42 works with > > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > > parameters: > > mpirun -np 1 ex42 \ > > -stokes_ksp_monitor \ > > -stokes_ksp_type fgmres \ > > -stokes_pc_type fieldsplit \ > > -stokes_pc_fieldsplit_type schur \ > > -stokes_pc_fieldsplit_schur_fact_type full \ > > -stokes_pc_fieldsplit_schur_precondition full \ > > -stokes_fieldsplit_u_ksp_type preonly \ > > -stokes_fieldsplit_u_pc_type lu \ > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > -stokes_fieldsplit_p_ksp_type gmres \ > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > -stokes_fieldsplit_p_pc_type lu \ > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps \ > > > > Output: > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 1.651372938841e+02 true resid norm > 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.172753353368e+00 true resid norm > 2.072348962892e-05 ||r(i)||/||b|| 3.588013522487e-04 > > 2 KSP preconditioned resid norm 3.931379526610e-13 true resid norm > 1.878299731917e-16 ||r(i)||/||b|| 3.252041503665e-15 > > 1 KSP Residual norm 3.385960118582e-17 > > > > inner convergence is much better although 2 iterations (:-( ?? > > > > I also obtain the same convergence behavior for the problem with A11!=0 > > > > Please suggest if this makes sense, or I did something wrong. > > > > Giang > > > > On Fri, Sep 16, 2016 at 8:31 PM, Barry Smith wrote: > > > > Why is your C matrix an MPIAIJ matrix on one process? In general we > recommend creating a SeqAIJ matrix for one process and MPIAIJ for multiple. > You can use MatCreateAIJ() and it will always create the correct one. > > > > We could change the code as you suggest but I want to make sure that > is the best solution in your case. > > > > Barry > > > > > > > > > On Sep 16, 2016, at 3:31 AM, Hoang Giang Bui > wrote: > > > > > > Hi Matt > > > > > > I believed at line 523, src/ksp/ksp/utils/schurm.c > > > > > > ierr = MatMatMult(C, AinvB, MAT_INITIAL_MATRIX, fill, S);CHKERRQ(ierr); > > > > > > in my test case C is MPIAIJ and AinvB is SEQAIJ, hence it throws the > error. > > > > > > In fact I guess there are two issues with it > > > line 521, ierr = MatConvert(AinvBd, MATAIJ, MAT_INITIAL_MATRIX, > &AinvB);CHKERRQ(ierr); > > > shall we convert this to type of C matrix to ensure compatibility ? > > > > > > line 552, if(norm > PETSC_MACHINE_EPSILON) SETERRQ(PetscObjectComm((PetscObject) > M), PETSC_ERR_SUP, "Not yet implemented for Schur complements with > non-vanishing D"); > > > with this the Schur complement with A11!=0 will be aborted > > > > > > Giang > > > > > > On Thu, Sep 15, 2016 at 4:28 PM, Matthew Knepley > wrote: > > > On Thu, Sep 15, 2016 at 9:07 AM, Hoang Giang Bui > wrote: > > > Hi Matt > > > > > > Thanks for the comment. After looking carefully into the manual again, > the key take away is that with selfp there is no option to compute the > exact Schur, there are only two options to approximate the inv(A00) for > selfp, which are lump and diag (diag by default). I misunderstood this > previously. 
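For reference, the "key take away" above can be written out explicitly. With -pc_fieldsplit_schur_precondition selfp PETSc does not assemble the true Schur complement; it assembles an approximation Sp built from the (optionally lumped) diagonal of A00, exactly as the -ksp_view output earlier in this thread states ("an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse"):

    S  = A11 - A10 * inv(A00)       * A01    (exact Schur complement, never assembled)
    Sp = A11 - A10 * inv(diag(A00)) * A01    (selfp, default 'diag')
    Sp = A11 - A10 * inv(lump(A00)) * A01    (selfp with lumping; lump(A00) puts the row sums of A00 on the diagonal)

So the quality of selfp rests entirely on how well diag(A00) represents A00; for an elasticity-type A00 block that can be a poor approximation, which is consistent with the slow inner convergence reported in this thread.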
> > > > > > There is online manual entry mentioned about > PC_FIELDSPLIT_SCHUR_PRE_FULL, which is not documented elsewhere in the > offline manual. I tried to access that by setting > > > -pc_fieldsplit_schur_precondition full > > > > > > Yep, I wrote that specifically for testing, but its very slow so I did > not document it to prevent people from complaining. > > > > > > but it gives the error > > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Arguments are incompatible > > > [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with > B, seqaij > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > > > [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui > Thu Sep 15 15:46:56 2016 > > > [0]PETSC ERROR: Configure options --with-shared-libraries > --with-debugging=0 --with-pic --download-fblaslapack=yes > --download-suitesparse --download-ptscotch=yes --download-metis=yes > --download-parmetis=yes --download-scalapack=yes --download-mumps=yes > --download-hypre=yes --download-ml=yes --download-pastix=yes > --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 > > > [0]PETSC ERROR: #1 MatMatMult() line 9514 in > /home/hbui/sw/petsc-3.7.3/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line > 526 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c > > > [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in > /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > > > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ > ksp/pc/interface/precon.c > > > [0]PETSC ERROR: #5 KSPSetUp() line 390 in > /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: #6 KSPSolve() line 599 in > /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > > > > > Please excuse me to insist on forming the exact Schur complement, but > as you said, I would like to track down what creates problem in my code by > starting from a very exact but ineffective solution. > > > > > > Sure, I understand. I do not understand how A can be MPI and B can be > Seq. Do you know how that happens? > > > > > > Thanks, > > > > > > Matt > > > > > > Giang > > > > > > On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley > wrote: > > > On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui > wrote: > > > Dear Barry > > > > > > Thanks for the clarification. I got exactly what you said if the code > changed to > > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > > Residual norms for stokes_ solve. > > > 0 KSP Residual norm 1.327791371202e-02 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm > 0.000000000000e+00 ||r(i)||/||b|| -nan > > > 1 KSP Residual norm 3.997711925708e-17 > > > > > > but I guess we solve a different problem if B is used for the linear > system. > > > > > > in addition, changed to > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > also works but inner iteration converged not in one iteration > > > > > > Residual norms for stokes_ solve. > > > 0 KSP Residual norm 1.327791371202e-02 > > > Residual norms for stokes_fieldsplit_p_ solve. 
> > > 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm > 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm > 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 > > > 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm > 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 > > > 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm > 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 > > > 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm > 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 > > > 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm > 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 > > > 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm > 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 > > > 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm > 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 > > > 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm > 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 > > > 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm > 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 > > > 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm > 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 > > > 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm > 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 > > > 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm > 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 > > > 1 KSP Residual norm 1.445282647127e-16 > > > > > > > > > Seem like zero pivot does not happen, but why the solver for Schur > takes 13 steps if the preconditioner is direct solver? > > > > > > Look at the -ksp_view. I will bet that the default is to shift (add a > multiple of the identity) the matrix instead of failing. This > > > gives an inexact PC, but as you see it can converge. > > > > > > Thanks, > > > > > > Matt > > > > > > > > > I also so tried another problem which I known does have a nonsingular > Schur (at least A11 != 0) and it also have the same problem: 1 step outer > convergence but multiple step inner convergence. > > > > > > Any ideas? > > > > > > Giang > > > > > > On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith > wrote: > > > > > > Normally you'd be absolutely correct to expect convergence in one > iteration. However in this example note the call > > > > > > ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); > > > > > > It is solving the linear system defined by A but building the > preconditioner (i.e. the entire fieldsplit process) from a different matrix > B. Since A is not B you should not expect convergence in one iteration. If > you change the code to > > > > > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > > > > > you will see exactly what you expect, convergence in one iteration. > > > > > > Sorry about this, the example is lacking clarity and documentation > its author obviously knew too well what he was doing that he didn't realize > everyone else in the world would need more comments in the code. If you > change the code to > > > > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > > > > it will stop without being able to build the preconditioner because LU > factorization of the Sp matrix will result in a zero pivot. This is why > this "auxiliary" matrix B is used to define the preconditioner instead of A. 
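The Amat/Pmat distinction described here is easy to reproduce in a few lines. A minimal petsc4py sketch follows (ksp.setOperators(A, B) mirrors KSPSetOperators(ksp, A, B) in C); the matrix size n, the tridiagonal A and the diagonally shifted B are made up purely for illustration. The Krylov method applies A while the preconditioner is built from B, so one-iteration convergence is only expected when the two coincide and the preconditioner is exact.

import sys, petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

n = 8
A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
A.setSizes(((None, n), (None, n)))
A.setType('aij')
A.setUp()
rstart, rend = A.getOwnershipRange()
for i in range(rstart, rend):
    A.setValue(i, i, 2.0)                  # simple 1D Laplacian stencil
    if i + 1 < n:
        A.setValue(i, i + 1, -1.0)
    if i - 1 >= 0:
        A.setValue(i, i - 1, -1.0)
A.assemble()

B = A.copy()                               # an "auxiliary" preconditioning matrix
for i in range(rstart, rend):
    B.setValue(i, i, 2.5)                  # e.g. a shifted diagonal, so B != A
B.assemble()

x, b = A.createVecs()
b.set(1.0)

ksp = PETSc.KSP().create(comm=PETSc.COMM_WORLD)
ksp.setOperators(A, B)                     # solve with A, build the preconditioner from B
ksp.setType('gmres')
ksp.getPC().setType('lu' if PETSc.COMM_WORLD.getSize() == 1 else 'bjacobi')
ksp.setFromOptions()
ksp.solve(b, x)
print('iterations:', ksp.getIterationNumber())   # more than 1, since the PC is exact for B, not A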
> > > > > > Barry > > > > > > > > > > > > > > > > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui > wrote: > > > > > > > > Sorry I slept quite a while in this thread. Now I start to look at > it again. In the last try, the previous setting doesn't work either (in > fact diverge). So I would speculate if the Schur complement in my case is > actually not invertible. It's also possible that the code is wrong > somewhere. However, before looking at that, I want to understand thoroughly > the settings for Schur complement > > > > > > > > I experimented ex42 with the settings: > > > > mpirun -np 1 ex42 \ > > > > -stokes_ksp_monitor \ > > > > -stokes_ksp_type fgmres \ > > > > -stokes_pc_type fieldsplit \ > > > > -stokes_pc_fieldsplit_type schur \ > > > > -stokes_pc_fieldsplit_schur_fact_type full \ > > > > -stokes_pc_fieldsplit_schur_precondition selfp \ > > > > -stokes_fieldsplit_u_ksp_type preonly \ > > > > -stokes_fieldsplit_u_pc_type lu \ > > > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > > > -stokes_fieldsplit_p_ksp_type gmres \ > > > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > > > -stokes_fieldsplit_p_pc_type lu \ > > > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > > > > > > > In my understanding, the solver should converge in 1 (outer) step. > Execution gives: > > > > Residual norms for stokes_ solve. > > > > 0 KSP Residual norm 1.327791371202e-02 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid > norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > > > 1 KSP Residual norm 7.656238881621e-04 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid > norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid > norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > > > > 2 KSP Residual norm 9.895645456398e-06 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid > norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid > norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > > > > 3 KSP Residual norm 1.608206702571e-06 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid > norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid > norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > > > > 4 KSP Residual norm 2.453618999882e-07 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid > norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid > norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > > > > 5 KSP Residual norm 4.230399264750e-08 > > > > > > > > Looks like the "selfp" does construct the Schur nicely. But does > "full" really construct the full block preconditioner? > > > > > > > > Giang > > > > P/S: I'm also generating a smaller size of the previous problem for > checking again. 
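On the "full" question just above: -pc_fieldsplit_schur_fact_type controls how much of the block LDU factorization of the saddle-point matrix is applied, not how S itself is approximated (that is what -pc_fieldsplit_schur_precondition controls). Written out in the same notation as the -ksp_view output, the factorization is

    K = [ A00  A01 ]  =  [ I             0 ] [ A00  0 ] [ I  inv(A00)*A01 ]
        [ A10  A11 ]     [ A10*inv(A00)  I ] [ 0    S ] [ 0  I            ]

    with  S = A11 - A10 * inv(A00) * A01.

fact_type full applies all three factors, so with exact inner solves the outer Krylov method is expected to converge in one iteration (as stated elsewhere in this thread), while fact_type diag keeps only the block-diagonal middle factor (PETSc uses -S there, see the PCFieldSplit manual page), in which case several outer iterations are normal even with exact blocks.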
> > > > > > > > > > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley > wrote: > > > > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui > wrote: > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is > dense. Is there any reason to > > > > believe that is a good preconditioner for your problem? > > > > > > > > This is the first approach to the problem, so I chose the most > simple setting. Do you have any other recommendation? > > > > > > > > This is in no way the simplest PC. We need to make it simpler first. > > > > > > > > 1) Run on only 1 proc > > > > > > > > 2) Use -pc_fieldsplit_schur_fact_type full > > > > > > > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_ > true_residual > > > > > > > > This should converge in 1 outer iteration, but we will see how good > your Schur complement preconditioner > > > > is for this problem. > > > > > > > > You need to start out from something you understand and then start > making approximations. > > > > > > > > Matt > > > > > > > > For any solver question, please send us the output of > > > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > > > > I sent here the full output (after changed to fgmres), again it > takes long at the first iteration but after that, it does not converge > > > > > > > > -ksp_type fgmres > > > > -ksp_max_it 300 > > > > -ksp_gmres_restart 300 > > > > -ksp_gmres_modifiedgramschmidt > > > > -pc_fieldsplit_type schur > > > > -pc_fieldsplit_schur_fact_type diag > > > > -pc_fieldsplit_schur_precondition selfp > > > > -pc_fieldsplit_detect_saddle_point > > > > -fieldsplit_u_ksp_type preonly > > > > -fieldsplit_u_pc_type lu > > > > -fieldsplit_u_pc_factor_mat_solver_package mumps > > > > -fieldsplit_lu_ksp_type preonly > > > > -fieldsplit_lu_pc_type lu > > > > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > > > > > > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid > norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid > norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > > > > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid > norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > > > > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid > norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > > > > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid > norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > > > > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid > norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > > > > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid > norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > > > > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid > norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > > > > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid > norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > > > > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid > norm 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > > > > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid > norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > > > > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid > norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > > > > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid > 
norm 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 > > > > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid > norm 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 > > > > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid > norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > > > > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid > norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > > > > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid > norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > > > > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid > norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > > > > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid > norm 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > > > > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid > norm 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > > > > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid > norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > > > > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid > norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > > > > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid > norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > > > > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid > norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > > > > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid > norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > > > > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid > norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > > > > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid > norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > > > > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid > norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > > > > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid > norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > > > > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid > norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > > > > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid > norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > > > > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid > norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > > > > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid > norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > > > > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid > norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > > > > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid > norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > > > > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid > norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > > > > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid > norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > > > > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid > norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > > > > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid > norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > > > > 39 KSP unpreconditioned resid 
norm 1.793156052097e+02 true resid > norm 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 > > > > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid > norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > > > > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid > norm 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 > > > > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid > norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > > > > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid > norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > > > > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid > norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > > > > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid > norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 > > > > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid > norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > > > > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid > norm 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 > > > > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid > norm 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > > > > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid > norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > > > > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid > norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > > > > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid > norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > > > > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid > norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > > > > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid > norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > > > > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid > norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > > > > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid > norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > > > > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid > norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > > > > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid > norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > > > > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid > norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > > > > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid > norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > > > > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid > norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > > > > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid > norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > > > > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid > norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > > > > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid > norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > > > > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid > norm 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > > > > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid > norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > 
> > > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid > norm 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 > > > > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid > norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > > > > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid > norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > > > > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid > norm 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > > > > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid > norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > > > > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid > norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > > > > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid > norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > > > > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid > norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > > > > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid > norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > > > > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid > norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > > > > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid > norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > > > > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid > norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > > > > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid > norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > > > > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid > norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > > > > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid > norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > > > > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid > norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > > > > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid > norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > > > > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid > norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > > > > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid > norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > > > > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid > norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > > > > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid > norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > > > > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid > norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > > > > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid > norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > > > > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid > norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > > > > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid > norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > > > > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid > norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > > > > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid > norm 1.509842019134e+02 
||r(i)||/||b|| 4.970227500870e-05 > > > > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid > norm 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 > > > > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid > norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > > > > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid > norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > > > > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid > norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > > > > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid > norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > > > > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid > norm 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > > > > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid > norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > > > > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid > norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 > > > > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid > norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > > > > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid > norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > > > > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid > norm 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > > > > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid > norm 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > > > > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid > norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > > > > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid > norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > > > > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid > norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > > > > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid > norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > > > > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid > norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > > > > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid > norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > > > > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid > norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > > > > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid > norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > > > > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid > norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > > > > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid > norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > > > > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid > norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > > > > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid > norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > > > > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid > norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > > > > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid > norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > > > > 119 KSP unpreconditioned resid norm 
1.463859793112e+02 true resid > norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > > > > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid > norm 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 > > > > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid > norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > > > > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid > norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > > > > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid > norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > > > > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid > norm 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > > > > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid > norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > > > > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid > norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > > > > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid > norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > > > > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid > norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > > > > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid > norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > > > > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid > norm 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 > > > > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid > norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > > > > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid > norm 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 > > > > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid > norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > > > > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid > norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > > > > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid > norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > > > > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid > norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > > > > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid > norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > > > > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid > norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > > > > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid > norm 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > > > > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid > norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > > > > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid > norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > > > > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid > norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > > > > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid > norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > > > > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid > norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > > > > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid > norm 1.471733741230e+02 ||r(i)||/||b|| 
4.844779402032e-05 > > > > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid > norm 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 > > > > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid > norm 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > > > > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid > norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > > > > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid > norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > > > > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid > norm 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > > > > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid > norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > > > > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid > norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > > > > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid > norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > > > > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid > norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > > > > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid > norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > > > > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid > norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > > > > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid > norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 > > > > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid > norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > > > > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid > norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > > > > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid > norm 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 > > > > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid > norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > > > > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid > norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > > > > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid > norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > > > > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid > norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > > > > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid > norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > > > > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid > norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > > > > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid > norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > > > > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid > norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > > > > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid > norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > > > > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid > norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > > > > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid > norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > > > > 172 KSP unpreconditioned resid norm 
1.402361094414e+02 true resid > norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > > > > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid > norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > > > > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid > norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > > > > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid > norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > > > > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid > norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > > > > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid > norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > > > > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid > norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > > > > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid > norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > > > > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid > norm 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 > > > > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid > norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > > > > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid > norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > > > > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid > norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > > > > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid > norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > > > > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid > norm 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 > > > > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid > norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > > > > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid > norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > > > > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid > norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > > > > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid > norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > > > > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid > norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > > > > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid > norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > > > > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid > norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > > > > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid > norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > > > > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid > norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > > > > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid > norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > > > > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid > norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > > > > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid > norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > > > > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid > norm 1.380390703259e+02 ||r(i)||/||b|| 
4.544088552537e-05 > > > > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid > norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > > > > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid > norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > > > > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid > norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > > > > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid > norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > > > > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid > norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > > > > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid > norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > > > > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid > norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > > > > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid > norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > > > > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid > norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > > > > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid > norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > > > > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid > norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 > > > > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid > norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > > > > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid > norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > > > > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid > norm 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > > > > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid > norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > > > > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid > norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > > > > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid > norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > > > > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid > norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > > > > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid > norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > > > > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid > norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > > > > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid > norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > > > > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid > norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > > > > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid > norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > > > > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid > norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > > > > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid > norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > > > > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid > norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > > > > 225 KSP unpreconditioned resid norm 
3.525592897743e+01 true resid > norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > > > > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid > norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > > > > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid > norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > > > > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid > norm 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > > > > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid > norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > > > > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid > norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > > > > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid > norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > > > > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid > norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > > > > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid > norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > > > > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid > norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > > > > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid > norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > > > > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid > norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > > > > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid > norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > > > > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid > norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > > > > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid > norm 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 > > > > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid > norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > > > > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid > norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > > > > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid > norm 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > > > > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid > norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > > > > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid > norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > > > > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid > norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > > > > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid > norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > > > > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid > norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > > > > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid > norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > > > > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid > norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > > > > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid > norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > > > > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid > norm 6.951401359923e+01 ||r(i)||/||b|| 
2.288321941689e-05 > > > > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid > norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > > > > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid > norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > > > > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid > norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > > > > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid > norm 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 > > > > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid > norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > > > > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid > norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > > > > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid > norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > > > > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid > norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > > > > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid > norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > > > > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid > norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > > > > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid > norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > > > > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid > norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > > > > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid > norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > > > > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid > norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > > > > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid > norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > > > > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid > norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > > > > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid > norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > > > > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid > norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > > > > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid > norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > > > > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid > norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > > > > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid > norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > > > > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid > norm 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > > > > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid > norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > > > > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid > norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > > > > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid > norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > > > > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid > norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > > > > 278 KSP unpreconditioned resid norm 
1.102576754158e+00 true resid > norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > > > > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid > norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > > > > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid > norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > > > > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid > norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > > > > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid > norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 > > > > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid > norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > > > > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid > norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > > > > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid > norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > > > > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid > norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > > > > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid > norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > > > > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid > norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > > > > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid > norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > > > > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid > norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > > > > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid > norm 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 > > > > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid > norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > > > > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid > norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > > > > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid > norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > > > > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid > norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > > > > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid > norm 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 > > > > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid > norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > > > > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid > norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > > > > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid > norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > > > > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid > norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > > > > Linear solve did not converge due to DIVERGED_ITS iterations 300 > > > > KSP Object: 2 MPI processes > > > > type: fgmres > > > > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > > > > GMRES: happy breakdown tolerance 1e-30 > > > > maximum iterations=300, initial guess is zero > > > > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > > > > right preconditioning > > > > using UNPRECONDITIONED norm type for convergence test > > > > PC Object: 2 MPI processes > > > > type: 
fieldsplit > > > > FieldSplit with Schur preconditioner, factorization DIAG > > > > Preconditioner for the Schur complement formed from Sp, an > assembled approximation to S, which uses (lumped, if requested) A00's > diagonal's inverse > > > > Split info: > > > > Split number 0 Defined by IS > > > > Split number 1 Defined by IS > > > > KSP solver for A00 block > > > > KSP Object: (fieldsplit_u_) 2 MPI processes > > > > type: preonly > > > > maximum iterations=10000, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (fieldsplit_u_) 2 MPI processes > > > > type: lu > > > > LU: out-of-place factorization > > > > tolerance for zero pivot 2.22045e-14 > > > > matrix ordering: natural > > > > factor fill ratio given 0, needed 0 > > > > Factored matrix follows: > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=184326 > > > > package used to perform factorization: mumps > > > > total: nonzeros=4.03041e+08, allocated > nonzeros=4.03041e+08 > > > > total number of mallocs used during MatSetValues > calls =0 > > > > MUMPS run parameters: > > > > SYM (matrix type): 0 > > > > PAR (host participation): 1 > > > > ICNTL(1) (output for error): 6 > > > > ICNTL(2) (output of diagnostic msg): 0 > > > > ICNTL(3) (output for global info): 0 > > > > ICNTL(4) (level of printing): 0 > > > > ICNTL(5) (input mat struct): 0 > > > > ICNTL(6) (matrix prescaling): 7 > > > > ICNTL(7) (sequentia matrix ordering):7 > > > > ICNTL(8) (scalling strategy): 77 > > > > ICNTL(10) (max num of refinements): 0 > > > > ICNTL(11) (error analysis): 0 > > > > ICNTL(12) (efficiency control): > 1 > > > > ICNTL(13) (efficiency control): > 0 > > > > ICNTL(14) (percentage of estimated workspace > increase): 20 > > > > ICNTL(18) (input mat struct): > 3 > > > > ICNTL(19) (Shur complement info): > 0 > > > > ICNTL(20) (rhs sparse pattern): > 0 > > > > ICNTL(21) (solution struct): > 1 > > > > ICNTL(22) (in-core/out-of-core facility): > 0 > > > > ICNTL(23) (max size of memory can be allocated > locally):0 > > > > ICNTL(24) (detection of null pivot rows): > 0 > > > > ICNTL(25) (computation of a null space basis): > 0 > > > > ICNTL(26) (Schur options for rhs or solution): > 0 > > > > ICNTL(27) (experimental parameter): > -24 > > > > ICNTL(28) (use parallel or sequential > ordering): 1 > > > > ICNTL(29) (parallel ordering): > 0 > > > > ICNTL(30) (user-specified set of entries in > inv(A)): 0 > > > > ICNTL(31) (factors is discarded in the solve > phase): 0 > > > > ICNTL(33) (compute determinant): > 0 > > > > CNTL(1) (relative pivoting threshold): 0.01 > > > > CNTL(2) (stopping criterion of refinement): > 1.49012e-08 > > > > CNTL(3) (absolute pivoting threshold): 0 > > > > CNTL(4) (value of static pivoting): -1 > > > > CNTL(5) (fixation for null pivots): 0 > > > > RINFO(1) (local estimated flops for the > elimination after analysis): > > > > [0] 5.59214e+11 > > > > [1] 5.35237e+11 > > > > RINFO(2) (local estimated flops for the assembly > after factorization): > > > > [0] 4.2839e+08 > > > > [1] 3.799e+08 > > > > RINFO(3) (local estimated flops for the > elimination after factorization): > > > > [0] 5.59214e+11 > > > > [1] 5.35237e+11 > > > > INFO(15) (estimated size of (in MB) MUMPS > internal data for running numerical factorization): > > > > [0] 2621 > > > > [1] 2649 > > > > INFO(16) (size of (in MB) MUMPS internal data > used during numerical factorization): > > > > [0] 2621 > > > 
> [1] 2649 > > > > INFO(23) (num of pivots eliminated on this > processor after factorization): > > > > [0] 90423 > > > > [1] 93903 > > > > RINFOG(1) (global estimated flops for the > elimination after analysis): 1.09445e+12 > > > > RINFOG(2) (global estimated flops for the > assembly after factorization): 8.0829e+08 > > > > RINFOG(3) (global estimated flops for the > elimination after factorization): 1.09445e+12 > > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) > (determinant): (0,0)*(2^0) > > > > INFOG(3) (estimated real workspace for factors > on all processors after analysis): 403041366 > > > > INFOG(4) (estimated integer workspace for > factors on all processors after analysis): 2265748 > > > > INFOG(5) (estimated maximum front size in the > complete tree): 6663 > > > > INFOG(6) (number of nodes in the complete tree): > 2812 > > > > INFOG(7) (ordering option effectively use after > analysis): 5 > > > > INFOG(8) (structural symmetry in percent of the > permuted matrix after analysis): 100 > > > > INFOG(9) (total real/complex workspace to store > the matrix factors after factorization): 403041366 > > > > INFOG(10) (total integer space store the matrix > factors after factorization): 2265766 > > > > INFOG(11) (order of largest frontal matrix after > factorization): 6663 > > > > INFOG(12) (number of off-diagonal pivots): 0 > > > > INFOG(13) (number of delayed pivots after > factorization): 0 > > > > INFOG(14) (number of memory compress after > factorization): 0 > > > > INFOG(15) (number of steps of iterative > refinement after solution): 0 > > > > INFOG(16) (estimated size (in MB) of all MUMPS > internal data for factorization after analysis: value on the most memory > consuming processor): 2649 > > > > INFOG(17) (estimated size of all MUMPS internal > data for factorization after analysis: sum over all processors): 5270 > > > > INFOG(18) (size of all MUMPS internal data > allocated during factorization: value on the most memory consuming > processor): 2649 > > > > INFOG(19) (size of all MUMPS internal data > allocated during factorization: sum over all processors): 5270 > > > > INFOG(20) (estimated number of entries in the > factors): 403041366 > > > > INFOG(21) (size in MB of memory effectively used > during factorization - value on the most memory consuming processor): 2121 > > > > INFOG(22) (size in MB of memory effectively used > during factorization - sum over all processors): 4174 > > > > INFOG(23) (after analysis: value of ICNTL(6) > effectively used): 0 > > > > INFOG(24) (after analysis: value of ICNTL(12) > effectively used): 1 > > > > INFOG(25) (after factorization: number of pivots > modified by static pivoting): 0 > > > > INFOG(28) (after factorization: number of null > pivots encountered): 0 > > > > INFOG(29) (after factorization: effective number > of entries in the factors (sum over all processors)): 403041366 > > > > INFOG(30, 31) (after solution: size in Mbytes of > memory used during solution phase): 2467, 4922 > > > > INFOG(32) (after analysis: type of analysis > done): 1 > > > > INFOG(33) (value used for ICNTL(8)): 7 > > > > INFOG(34) (exponent of the determinant if > determinant is requested): 0 > > > > linear system matrix = precond matrix: > > > > Mat Object: (fieldsplit_u_) 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=184326, bs=3 > > > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > > > total number of mallocs used during MatSetValues calls =0 > > > > using I-node (on process 0) routines: found 26829 nodes, > limit used is 5 > > > > 
KSP solver for S = A11 - A10 inv(A00) A01 > > > > KSP Object: (fieldsplit_lu_) 2 MPI processes > > > > type: preonly > > > > maximum iterations=10000, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (fieldsplit_lu_) 2 MPI processes > > > > type: lu > > > > LU: out-of-place factorization > > > > tolerance for zero pivot 2.22045e-14 > > > > matrix ordering: natural > > > > factor fill ratio given 0, needed 0 > > > > Factored matrix follows: > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=2583, cols=2583 > > > > package used to perform factorization: mumps > > > > total: nonzeros=2.17621e+06, allocated > nonzeros=2.17621e+06 > > > > total number of mallocs used during MatSetValues > calls =0 > > > > MUMPS run parameters: > > > > SYM (matrix type): 0 > > > > PAR (host participation): 1 > > > > ICNTL(1) (output for error): 6 > > > > ICNTL(2) (output of diagnostic msg): 0 > > > > ICNTL(3) (output for global info): 0 > > > > ICNTL(4) (level of printing): 0 > > > > ICNTL(5) (input mat struct): 0 > > > > ICNTL(6) (matrix prescaling): 7 > > > > ICNTL(7) (sequentia matrix ordering):7 > > > > ICNTL(8) (scalling strategy): 77 > > > > ICNTL(10) (max num of refinements): 0 > > > > ICNTL(11) (error analysis): 0 > > > > ICNTL(12) (efficiency control): > 1 > > > > ICNTL(13) (efficiency control): > 0 > > > > ICNTL(14) (percentage of estimated workspace > increase): 20 > > > > ICNTL(18) (input mat struct): > 3 > > > > ICNTL(19) (Shur complement info): > 0 > > > > ICNTL(20) (rhs sparse pattern): > 0 > > > > ICNTL(21) (solution struct): > 1 > > > > ICNTL(22) (in-core/out-of-core facility): > 0 > > > > ICNTL(23) (max size of memory can be allocated > locally):0 > > > > ICNTL(24) (detection of null pivot rows): > 0 > > > > ICNTL(25) (computation of a null space basis): > 0 > > > > ICNTL(26) (Schur options for rhs or solution): > 0 > > > > ICNTL(27) (experimental parameter): > -24 > > > > ICNTL(28) (use parallel or sequential > ordering): 1 > > > > ICNTL(29) (parallel ordering): > 0 > > > > ICNTL(30) (user-specified set of entries in > inv(A)): 0 > > > > ICNTL(31) (factors is discarded in the solve > phase): 0 > > > > ICNTL(33) (compute determinant): > 0 > > > > CNTL(1) (relative pivoting threshold): 0.01 > > > > CNTL(2) (stopping criterion of refinement): > 1.49012e-08 > > > > CNTL(3) (absolute pivoting threshold): 0 > > > > CNTL(4) (value of static pivoting): -1 > > > > CNTL(5) (fixation for null pivots): 0 > > > > RINFO(1) (local estimated flops for the > elimination after analysis): > > > > [0] 5.12794e+08 > > > > [1] 5.02142e+08 > > > > RINFO(2) (local estimated flops for the assembly > after factorization): > > > > [0] 815031 > > > > [1] 745263 > > > > RINFO(3) (local estimated flops for the > elimination after factorization): > > > > [0] 5.12794e+08 > > > > [1] 5.02142e+08 > > > > INFO(15) (estimated size of (in MB) MUMPS > internal data for running numerical factorization): > > > > [0] 34 > > > > [1] 34 > > > > INFO(16) (size of (in MB) MUMPS internal data > used during numerical factorization): > > > > [0] 34 > > > > [1] 34 > > > > INFO(23) (num of pivots eliminated on this > processor after factorization): > > > > [0] 1158 > > > > [1] 1425 > > > > RINFOG(1) (global estimated flops for the > elimination after analysis): 1.01494e+09 > > > > RINFOG(2) (global estimated flops for the > assembly after factorization): 1.56029e+06 > > > > RINFOG(3) 
(global estimated flops for the > elimination after factorization): 1.01494e+09 > > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) > (determinant): (0,0)*(2^0) > > > > INFOG(3) (estimated real workspace for factors > on all processors after analysis): 2176209 > > > > INFOG(4) (estimated integer workspace for > factors on all processors after analysis): 14427 > > > > INFOG(5) (estimated maximum front size in the > complete tree): 699 > > > > INFOG(6) (number of nodes in the complete tree): > 15 > > > > INFOG(7) (ordering option effectively use after > analysis): 2 > > > > INFOG(8) (structural symmetry in percent of the > permuted matrix after analysis): 100 > > > > INFOG(9) (total real/complex workspace to store > the matrix factors after factorization): 2176209 > > > > INFOG(10) (total integer space store the matrix > factors after factorization): 14427 > > > > INFOG(11) (order of largest frontal matrix after > factorization): 699 > > > > INFOG(12) (number of off-diagonal pivots): 0 > > > > INFOG(13) (number of delayed pivots after > factorization): 0 > > > > INFOG(14) (number of memory compress after > factorization): 0 > > > > INFOG(15) (number of steps of iterative > refinement after solution): 0 > > > > INFOG(16) (estimated size (in MB) of all MUMPS > internal data for factorization after analysis: value on the most memory > consuming processor): 34 > > > > INFOG(17) (estimated size of all MUMPS internal > data for factorization after analysis: sum over all processors): 68 > > > > INFOG(18) (size of all MUMPS internal data > allocated during factorization: value on the most memory consuming > processor): 34 > > > > INFOG(19) (size of all MUMPS internal data > allocated during factorization: sum over all processors): 68 > > > > INFOG(20) (estimated number of entries in the > factors): 2176209 > > > > INFOG(21) (size in MB of memory effectively used > during factorization - value on the most memory consuming processor): 30 > > > > INFOG(22) (size in MB of memory effectively used > during factorization - sum over all processors): 59 > > > > INFOG(23) (after analysis: value of ICNTL(6) > effectively used): 0 > > > > INFOG(24) (after analysis: value of ICNTL(12) > effectively used): 1 > > > > INFOG(25) (after factorization: number of pivots > modified by static pivoting): 0 > > > > INFOG(28) (after factorization: number of null > pivots encountered): 0 > > > > INFOG(29) (after factorization: effective number > of entries in the factors (sum over all processors)): 2176209 > > > > INFOG(30, 31) (after solution: size in Mbytes of > memory used during solution phase): 16, 32 > > > > INFOG(32) (after analysis: type of analysis > done): 1 > > > > INFOG(33) (value used for ICNTL(8)): 7 > > > > INFOG(34) (exponent of the determinant if > determinant is requested): 0 > > > > linear system matrix followed by preconditioner matrix: > > > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > > > type: schurcomplement > > > > rows=2583, cols=2583 > > > > Schur complement A11 - A10 inv(A00) A01 > > > > A11 > > > > Mat Object: (fieldsplit_lu_) > 2 MPI processes > > > > type: mpiaij > > > > rows=2583, cols=2583, bs=3 > > > > total: nonzeros=117369, allocated nonzeros=117369 > > > > total number of mallocs used during MatSetValues > calls =0 > > > > not using I-node (on process 0) routines > > > > A10 > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=2583, cols=184326, rbs=3, cbs = 1 > > > > total: nonzeros=292770, allocated nonzeros=292770 > > > > total number of mallocs used during MatSetValues 
> calls =0 > > > > not using I-node (on process 0) routines > > > > KSP of A00 > > > > KSP Object: (fieldsplit_u_) > 2 MPI processes > > > > type: preonly > > > > maximum iterations=10000, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (fieldsplit_u_) > 2 MPI processes > > > > type: lu > > > > LU: out-of-place factorization > > > > tolerance for zero pivot 2.22045e-14 > > > > matrix ordering: natural > > > > factor fill ratio given 0, needed 0 > > > > Factored matrix follows: > > > > Mat Object: 2 MPI > processes > > > > type: mpiaij > > > > rows=184326, cols=184326 > > > > package used to perform factorization: mumps > > > > total: nonzeros=4.03041e+08, allocated > nonzeros=4.03041e+08 > > > > total number of mallocs used during > MatSetValues calls =0 > > > > MUMPS run parameters: > > > > SYM (matrix type): 0 > > > > PAR (host participation): 1 > > > > ICNTL(1) (output for error): 6 > > > > ICNTL(2) (output of diagnostic msg): 0 > > > > ICNTL(3) (output for global info): 0 > > > > ICNTL(4) (level of printing): 0 > > > > ICNTL(5) (input mat struct): 0 > > > > ICNTL(6) (matrix prescaling): 7 > > > > ICNTL(7) (sequentia matrix ordering):7 > > > > ICNTL(8) (scalling strategy): 77 > > > > ICNTL(10) (max num of refinements): 0 > > > > ICNTL(11) (error analysis): 0 > > > > ICNTL(12) (efficiency control): > 1 > > > > ICNTL(13) (efficiency control): > 0 > > > > ICNTL(14) (percentage of estimated > workspace increase): 20 > > > > ICNTL(18) (input mat struct): > 3 > > > > ICNTL(19) (Shur complement info): > 0 > > > > ICNTL(20) (rhs sparse pattern): > 0 > > > > ICNTL(21) (solution struct): > 1 > > > > ICNTL(22) (in-core/out-of-core > facility): 0 > > > > ICNTL(23) (max size of memory can be > allocated locally):0 > > > > ICNTL(24) (detection of null pivot > rows): 0 > > > > ICNTL(25) (computation of a null space > basis): 0 > > > > ICNTL(26) (Schur options for rhs or > solution): 0 > > > > ICNTL(27) (experimental parameter): > -24 > > > > ICNTL(28) (use parallel or sequential > ordering): 1 > > > > ICNTL(29) (parallel ordering): > 0 > > > > ICNTL(30) (user-specified set of entries > in inv(A)): 0 > > > > ICNTL(31) (factors is discarded in the > solve phase): 0 > > > > ICNTL(33) (compute determinant): > 0 > > > > CNTL(1) (relative pivoting threshold): > 0.01 > > > > CNTL(2) (stopping criterion of > refinement): 1.49012e-08 > > > > CNTL(3) (absolute pivoting threshold): > 0 > > > > CNTL(4) (value of static pivoting): > -1 > > > > CNTL(5) (fixation for null pivots): > 0 > > > > RINFO(1) (local estimated flops for the > elimination after analysis): > > > > [0] 5.59214e+11 > > > > [1] 5.35237e+11 > > > > RINFO(2) (local estimated flops for the > assembly after factorization): > > > > [0] 4.2839e+08 > > > > [1] 3.799e+08 > > > > RINFO(3) (local estimated flops for the > elimination after factorization): > > > > [0] 5.59214e+11 > > > > [1] 5.35237e+11 > > > > INFO(15) (estimated size of (in MB) > MUMPS internal data for running numerical factorization): > > > > [0] 2621 > > > > [1] 2649 > > > > INFO(16) (size of (in MB) MUMPS internal > data used during numerical factorization): > > > > [0] 2621 > > > > [1] 2649 > > > > INFO(23) (num of pivots eliminated on > this processor after factorization): > > > > [0] 90423 > > > > [1] 93903 > > > > RINFOG(1) (global estimated flops for > the elimination after analysis): 1.09445e+12 > > > > RINFOG(2) (global estimated flops 
for > the assembly after factorization): 8.0829e+08 > > > > RINFOG(3) (global estimated flops for > the elimination after factorization): 1.09445e+12 > > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) > (determinant): (0,0)*(2^0) > > > > INFOG(3) (estimated real workspace for > factors on all processors after analysis): 403041366 > > > > INFOG(4) (estimated integer workspace > for factors on all processors after analysis): 2265748 > > > > INFOG(5) (estimated maximum front size > in the complete tree): 6663 > > > > INFOG(6) (number of nodes in the > complete tree): 2812 > > > > INFOG(7) (ordering option effectively > use after analysis): 5 > > > > INFOG(8) (structural symmetry in percent > of the permuted matrix after analysis): 100 > > > > INFOG(9) (total real/complex workspace > to store the matrix factors after factorization): 403041366 > > > > INFOG(10) (total integer space store the > matrix factors after factorization): 2265766 > > > > INFOG(11) (order of largest frontal > matrix after factorization): 6663 > > > > INFOG(12) (number of off-diagonal > pivots): 0 > > > > INFOG(13) (number of delayed pivots > after factorization): 0 > > > > INFOG(14) (number of memory compress > after factorization): 0 > > > > INFOG(15) (number of steps of iterative > refinement after solution): 0 > > > > INFOG(16) (estimated size (in MB) of all > MUMPS internal data for factorization after analysis: value on the most > memory consuming processor): 2649 > > > > INFOG(17) (estimated size of all MUMPS > internal data for factorization after analysis: sum over all processors): > 5270 > > > > INFOG(18) (size of all MUMPS internal > data allocated during factorization: value on the most memory consuming > processor): 2649 > > > > INFOG(19) (size of all MUMPS internal > data allocated during factorization: sum over all processors): 5270 > > > > INFOG(20) (estimated number of entries > in the factors): 403041366 > > > > INFOG(21) (size in MB of memory > effectively used during factorization - value on the most memory consuming > processor): 2121 > > > > INFOG(22) (size in MB of memory > effectively used during factorization - sum over all processors): 4174 > > > > INFOG(23) (after analysis: value of > ICNTL(6) effectively used): 0 > > > > INFOG(24) (after analysis: value of > ICNTL(12) effectively used): 1 > > > > INFOG(25) (after factorization: number > of pivots modified by static pivoting): 0 > > > > INFOG(28) (after factorization: number > of null pivots encountered): 0 > > > > INFOG(29) (after factorization: > effective number of entries in the factors (sum over all processors)): > 403041366 > > > > INFOG(30, 31) (after solution: size in > Mbytes of memory used during solution phase): 2467, 4922 > > > > INFOG(32) (after analysis: type of > analysis done): 1 > > > > INFOG(33) (value used for ICNTL(8)): 7 > > > > INFOG(34) (exponent of the determinant > if determinant is requested): 0 > > > > linear system matrix = precond matrix: > > > > Mat Object: (fieldsplit_u_) > 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=184326, bs=3 > > > > total: nonzeros=3.32649e+07, allocated > nonzeros=3.32649e+07 > > > > total number of mallocs used during MatSetValues > calls =0 > > > > using I-node (on process 0) routines: found > 26829 nodes, limit used is 5 > > > > A01 > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=2583, rbs=3, cbs = 1 > > > > total: nonzeros=292770, allocated nonzeros=292770 > > > > total number of mallocs used during MatSetValues > calls =0 > > > > using 
I-node (on process 0) routines: found 16098 > nodes, limit used is 5 > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=2583, cols=2583, rbs=3, cbs = 1 > > > > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node (on process 0) routines > > > > linear system matrix = precond matrix: > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=186909, cols=186909 > > > > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > > > > total number of mallocs used during MatSetValues calls =0 > > > > using I-node (on process 0) routines: found 26829 nodes, limit > used is 5 > > > > KSPSolve completed > > > > > > > > > > > > Giang > > > > > > > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley > wrote: > > > > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui > wrote: > > > > Hello > > > > > > > > I'm solving an indefinite problem arising from mesh tying/contact > using Lagrange multiplier, the matrix has the form > > > > > > > > K = [A P^T > > > > P 0] > > > > > > > > I used the FIELDSPLIT preconditioner with one field is the main > variable (displacement) and the other field for dual variable (Lagrange > multiplier). The block size for each field is 3. According to the manual, I > first chose the preconditioner based on Schur complement to treat this > problem. > > > > > > > > > > > > For any solver question, please send us the output of > > > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > > > > However, I will comment below > > > > > > > > The parameters used for the solve is > > > > -ksp_type gmres > > > > > > > > You need 'fgmres' here with the options you have below. > > > > > > > > -ksp_max_it 300 > > > > -ksp_gmres_restart 300 > > > > -ksp_gmres_modifiedgramschmidt > > > > -pc_fieldsplit_type schur > > > > -pc_fieldsplit_schur_fact_type diag > > > > -pc_fieldsplit_schur_precondition selfp > > > > > > > > > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is > dense. Is there any reason to > > > > believe that is a good preconditioner for your problem? > > > > > > > > > > > > -pc_fieldsplit_detect_saddle_point > > > > -fieldsplit_u_pc_type hypre > > > > > > > > I would just use MUMPS here to start, especially if it works on the > whole problem. Same with the one below. > > > > > > > > Matt > > > > > > > > -fieldsplit_u_pc_hypre_type boomeramg > > > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > > > -fieldsplit_lu_pc_type hypre > > > > -fieldsplit_lu_pc_hypre_type boomeramg > > > > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > > > > > > > For the test case, a small problem is solved on 2 processes. Due to > the decomposition, the contact only happens in 1 proc, so the size of > Lagrange multiplier dofs on proc 0 is 0. > > > > > > > > 0: mIndexU.size(): 80490 > > > > 0: mIndexLU.size(): 0 > > > > 1: mIndexU.size(): 103836 > > > > 1: mIndexLU.size(): 2583 > > > > > > > > However, with this setup the solver takes very long at KSPSolve > before going to iteration, and the first iteration seems forever so I have > to stop the calculation. I guessed that the solver takes time to compute > the Schur complement, but according to the manual only the diagonal of A is > used to approximate the Schur complement, so it should not take long to > compute this. > > > > > > > > Note that I ran the same problem with direct solver (MUMPS) and it's > able to produce the valid results. 
The parameter for the solve is pretty > standard > > > > -ksp_type preonly > > > > -pc_type lu > > > > -pc_factor_mat_solver_package mumps > > > > > > > > Hence the matrix/rhs must not have any problem here. Do you have any > idea or suggestion for this case? > > > > > > > > > > > > Giang > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aurelien.ponte at ifremer.fr Sat Sep 17 02:58:47 2016 From: aurelien.ponte at ifremer.fr (Aurelien Ponte) Date: Sat, 17 Sep 2016 09:58:47 +0200 Subject: [petsc-users] 2D vector in 3D dmda In-Reply-To: <48F7A9AE-1659-4C4C-AF5B-00D532C26F5F@mcs.anl.gov> References: <57DC01DB.6080603@ifremer.fr> <48F7A9AE-1659-4C4C-AF5B-00D532C26F5F@mcs.anl.gov> Message-ID: <69f68ff8-6a54-a377-8aed-65f4044b1ed0@ifremer.fr> Thanks Barry for your answer ! I guess my concern is of the second type: By grid metric terms I meant essentially grid spacings which look like: dx(i,j), dy(i,j) and dz(k) where (i,j,k) are indices running along the 3 dimensions of the grid. Storing dx(i,j) into a 3D array seemed like a bit waste of memory to me but I must be wrong. The elliptic problem I am solving for is close to a poisson equation btw. I guess I can at least store dx and dy into a single dxy 3D array. Thanks again, Aurelien Le 16/09/16 ? 19:43, Barry Smith a ?crit : >> On Sep 16, 2016, at 9:29 AM, Aurelien PONTE wrote: >> >> Hi, >> >> I've started using petsc4py in order to solve a 3D problem (inversion of elliptic operator). >> I would like to store 2D metric terms describing the grid > > What do you mean by 2D metric terms describing the grid? > > Do you want to store something like a little dense 2d array for each grid point? If so create another 3D DA with a dof = the product of the dimensions of the little dense 2d array and then store the little dense 2d arrays at in a global vector obtained from that DA. > > Or is the grid uniform in one dimension and not uniform in the other two and hence you want to store the information about the non-uniformness in only a 2d array so as to not "waste" the redundant information in the third direction? Then I recommend just "waste" the redundant information in the third dimension; it is trivial compared to all the data you need to solve the problem. > > Or do you mean something else? 
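A minimal petsc4py sketch of that first option (an extra 3D DA whose dof is the product of the little dense array's dimensions); the grid sizes and the m-by-n block shape below are placeholders, not values from the original code:

    from petsc4py import PETSc

    Nx, Ny, Nz = 64, 64, 32   # placeholder grid sizes
    m, n = 2, 2               # shape of the little dense 2d array at each grid point

    # a second DMDA with the same layout, carrying m*n components per point
    tensor_da = PETSc.DMDA().create([Nx, Ny, Nz], dof=m*n, stencil_width=2)
    T = tensor_da.createGlobalVec()   # holds the flattened m*n entries at every (i,j,k)
    # fill it like any DMDA vector, e.g. through tensor_da.getVecArray(T)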
> > Barry > >> I am working on but don't know >> how to do that given my domain is tiled in 3D directions: >> >> self.da = PETSc.DMDA().create([self.grid.Nx, self.grid.Ny, self.grid.Nz], >> stencil_width=2) >> >> I create my 3D vectors with, for example: >> >> self.Q = self.da.createGlobalVec() >> >> What am I supposed to do for a 2D vector? >> Is it a bad idea? >> >> thanks >> >> aurelien >> -- Aur?lien Ponte Tel: (+33) 2 98 22 40 73 Fax: (+33) 2 98 22 44 96 UMR 6523, IFREMER ZI de la Pointe du Diable CS 10070 29280 Plouzan? From bsmith at mcs.anl.gov Sat Sep 17 13:24:42 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 17 Sep 2016 13:24:42 -0500 Subject: [petsc-users] 2D vector in 3D dmda In-Reply-To: <69f68ff8-6a54-a377-8aed-65f4044b1ed0@ifremer.fr> References: <57DC01DB.6080603@ifremer.fr> <48F7A9AE-1659-4C4C-AF5B-00D532C26F5F@mcs.anl.gov> <69f68ff8-6a54-a377-8aed-65f4044b1ed0@ifremer.fr> Message-ID: <70CD8A67-892B-4332-90DD-919C4BC221DB@mcs.anl.gov> Understood. We don't have any direct way with DA for storing this information directly associated with you 3D DA. You need to figure out how to store it so that each process has access to the parts of the data that it needs; which may not be completely trivial. You could possibly use some 2d DMDA on sub communicators with suitable layouts to be accessible, you need to figure out the details., the dx[k] probably can be just stored on every process since it is 1d. Barry > On Sep 17, 2016, at 2:58 AM, Aurelien Ponte wrote: > > Thanks Barry for your answer ! > > I guess my concern is of the second type: > By grid metric terms I meant essentially grid spacings which look like: > dx(i,j), dy(i,j) and dz(k) where (i,j,k) are indices running along the 3 dimensions of the grid. > Storing dx(i,j) into a 3D array seemed like a bit waste of memory to me but I must > be wrong. The elliptic problem I am solving for is close to a poisson equation btw. > I guess I can at least store dx and dy into a single dxy 3D array. > > Thanks again, > > Aurelien > > > > Le 16/09/16 ? 19:43, Barry Smith a ?crit : >>> On Sep 16, 2016, at 9:29 AM, Aurelien PONTE wrote: >>> >>> Hi, >>> >>> I've started using petsc4py in order to solve a 3D problem (inversion of elliptic operator). >>> I would like to store 2D metric terms describing the grid >> >> What do you mean by 2D metric terms describing the grid? >> >> Do you want to store something like a little dense 2d array for each grid point? If so create another 3D DA with a dof = the product of the dimensions of the little dense 2d array and then store the little dense 2d arrays at in a global vector obtained from that DA. >> >> Or is the grid uniform in one dimension and not uniform in the other two and hence you want to store the information about the non-uniformness in only a 2d array so as to not "waste" the redundant information in the third direction? Then I recommend just "waste" the redundant information in the third dimension; it is trivial compared to all the data you need to solve the problem. >> >> Or do you mean something else? >> >> Barry >> >>> I am working on but don't know >>> how to do that given my domain is tiled in 3D directions: >>> >>> self.da = PETSc.DMDA().create([self.grid.Nx, self.grid.Ny, self.grid.Nz], >>> stencil_width=2) >>> >>> I create my 3D vectors with, for example: >>> >>> self.Q = self.da.createGlobalVec() >>> >>> What am I supposed to do for a 2D vector? >>> Is it a bad idea? 
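Putting the advice above together for the dx(i,j), dy(i,j), dz(k) case, a minimal petsc4py sketch (the grid sizes and the spacing formulas are placeholders; the point is the layout, not the values):

    from petsc4py import PETSc
    import numpy as np

    Nx, Ny, Nz = 64, 64, 32                  # placeholder grid sizes

    # dof=2 field on the same 3D layout: component 0 holds dx(i,j), component 1
    # holds dy(i,j); the values are simply repeated for every k, i.e. the
    # "wasted" third dimension accepted above
    metric_da = PETSc.DMDA().create([Nx, Ny, Nz], dof=2, stencil_width=2)
    dxy = metric_da.createGlobalVec()

    arr = metric_da.getVecArray(dxy)
    (xs, xe), (ys, ye), (zs, ze) = metric_da.getRanges()
    for k in range(zs, ze):
        for j in range(ys, ye):
            for i in range(xs, xe):
                arr[i, j, k, 0] = 1.0 + 0.01 * i   # dx(i,j), placeholder formula
                arr[i, j, k, 1] = 1.0 + 0.01 * j   # dy(i,j), placeholder formula

    # dz(k) is 1d and small, so it can simply be replicated on every process
    dz = np.full(Nz, 0.5)                    # placeholder values

Because the metric DMDA uses the same sizes and stencil width as the solution DMDA, each process owns (and can ghost-update) exactly the spacings it needs for its part of the operator, at the price of repeating dx and dy along the third direction.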
>>> >>> thanks >>> >>> aurelien >>> > > > -- > Aur?lien Ponte > Tel: (+33) 2 98 22 40 73 > Fax: (+33) 2 98 22 44 96 > UMR 6523, IFREMER > ZI de la Pointe du Diable > CS 10070 > 29280 Plouzan? > From bsmith at mcs.anl.gov Sat Sep 17 13:28:49 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 17 Sep 2016 13:28:49 -0500 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: <19EEF686-0334-46CD-A25D-4DFCA2B5D94B@mcs.anl.gov> <76457EBE-4527-4A5A-B247-3D6AC8371F79@mcs.anl.gov> Message-ID: Found it. What you did seems reasonable to me. I will update the code in master. If it is suppose to explicitly return the Schur complement then it should explicitly return it. Note that explicitly computing the Schur complement for small problems is reasonable for testing things but likely never or almost never makes sense for large problems; it is just too expensive to compute and is dense so requires a lot of memory if it is large. Barry > On Sep 17, 2016, at 1:49 AM, Hoang Giang Bui wrote: > > I'm specifically looking into src/ksp/ksp/utils/schurm.c, petsc 3.7.3 > > The link is: > https://bitbucket.org/petsc/petsc/src/2077e624e7fbbda0ee00455afb91c6183e71919a/src/ksp/ksp/utils/schurm.c?at=v3.7.3&fileviewer=file-view-default#L548-557 > > Giang > > On Sat, Sep 17, 2016 at 1:44 AM, Barry Smith wrote: > > > On Sep 16, 2016, at 6:09 PM, Hoang Giang Bui wrote: > > > > Hi Barry > > > > You are right, using MatCreateAIJ() eliminates the first issue. Previously I ran the mpi code with one process so A,B,C,D is all MPIAIJ > > > > And how about the second issue, this error will always be thrown if A11 is nonzero, which is my case? > > > > Nevertheless, I would like to report my simple finding: I changed the part around line 552 to > > I'm sorry what file are you talking about? What version of PETSc? What other lines of code are around 552? I can't figure out where you are doing this. > > Barry > > > > > if (D) { > > ierr = MatAXPY(*S, -1.0, D, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr); > > } > > > > I could get ex42 works with > > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > > parameters: > > mpirun -np 1 ex42 \ > > -stokes_ksp_monitor \ > > -stokes_ksp_type fgmres \ > > -stokes_pc_type fieldsplit \ > > -stokes_pc_fieldsplit_type schur \ > > -stokes_pc_fieldsplit_schur_fact_type full \ > > -stokes_pc_fieldsplit_schur_precondition full \ > > -stokes_fieldsplit_u_ksp_type preonly \ > > -stokes_fieldsplit_u_pc_type lu \ > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > -stokes_fieldsplit_p_ksp_type gmres \ > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > -stokes_fieldsplit_p_pc_type lu \ > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps \ > > > > Output: > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. 
> > 0 KSP preconditioned resid norm 1.651372938841e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.172753353368e+00 true resid norm 2.072348962892e-05 ||r(i)||/||b|| 3.588013522487e-04 > > 2 KSP preconditioned resid norm 3.931379526610e-13 true resid norm 1.878299731917e-16 ||r(i)||/||b|| 3.252041503665e-15 > > 1 KSP Residual norm 3.385960118582e-17 > > > > inner convergence is much better although 2 iterations (:-( ?? > > > > I also obtain the same convergence behavior for the problem with A11!=0 > > > > Please suggest if this makes sense, or I did something wrong. > > > > Giang > > > > On Fri, Sep 16, 2016 at 8:31 PM, Barry Smith wrote: > > > > Why is your C matrix an MPIAIJ matrix on one process? In general we recommend creating a SeqAIJ matrix for one process and MPIAIJ for multiple. You can use MatCreateAIJ() and it will always create the correct one. > > > > We could change the code as you suggest but I want to make sure that is the best solution in your case. > > > > Barry > > > > > > > > > On Sep 16, 2016, at 3:31 AM, Hoang Giang Bui wrote: > > > > > > Hi Matt > > > > > > I believed at line 523, src/ksp/ksp/utils/schurm.c > > > > > > ierr = MatMatMult(C, AinvB, MAT_INITIAL_MATRIX, fill, S);CHKERRQ(ierr); > > > > > > in my test case C is MPIAIJ and AinvB is SEQAIJ, hence it throws the error. > > > > > > In fact I guess there are two issues with it > > > line 521, ierr = MatConvert(AinvBd, MATAIJ, MAT_INITIAL_MATRIX, &AinvB);CHKERRQ(ierr); > > > shall we convert this to type of C matrix to ensure compatibility ? > > > > > > line 552, if(norm > PETSC_MACHINE_EPSILON) SETERRQ(PetscObjectComm((PetscObject) M), PETSC_ERR_SUP, "Not yet implemented for Schur complements with non-vanishing D"); > > > with this the Schur complement with A11!=0 will be aborted > > > > > > Giang > > > > > > On Thu, Sep 15, 2016 at 4:28 PM, Matthew Knepley wrote: > > > On Thu, Sep 15, 2016 at 9:07 AM, Hoang Giang Bui wrote: > > > Hi Matt > > > > > > Thanks for the comment. After looking carefully into the manual again, the key take away is that with selfp there is no option to compute the exact Schur, there are only two options to approximate the inv(A00) for selfp, which are lump and diag (diag by default). I misunderstood this previously. > > > > > > There is online manual entry mentioned about PC_FIELDSPLIT_SCHUR_PRE_FULL, which is not documented elsewhere in the offline manual. I tried to access that by setting > > > -pc_fieldsplit_schur_precondition full > > > > > > Yep, I wrote that specifically for testing, but its very slow so I did not document it to prevent people from complaining. > > > > > > but it gives the error > > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Arguments are incompatible > > > [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with B, seqaij > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > > > [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui Thu Sep 15 15:46:56 2016 > > > [0]PETSC ERROR: Configure options --with-shared-libraries --with-debugging=0 --with-pic --download-fblaslapack=yes --download-suitesparse --download-ptscotch=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes --download-mumps=yes --download-hypre=yes --download-ml=yes --download-pastix=yes --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 > > > [0]PETSC ERROR: #1 MatMatMult() line 9514 in /home/hbui/sw/petsc-3.7.3/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line 526 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c > > > [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > > > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > > [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > > > > > Please excuse me to insist on forming the exact Schur complement, but as you said, I would like to track down what creates problem in my code by starting from a very exact but ineffective solution. > > > > > > Sure, I understand. I do not understand how A can be MPI and B can be Seq. Do you know how that happens? > > > > > > Thanks, > > > > > > Matt > > > > > > Giang > > > > > > On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley wrote: > > > On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui wrote: > > > Dear Barry > > > > > > Thanks for the clarification. I got exactly what you said if the code changed to > > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > > Residual norms for stokes_ solve. > > > 0 KSP Residual norm 1.327791371202e-02 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > > 1 KSP Residual norm 3.997711925708e-17 > > > > > > but I guess we solve a different problem if B is used for the linear system. > > > > > > in addition, changed to > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > also works but inner iteration converged not in one iteration > > > > > > Residual norms for stokes_ solve. > > > 0 KSP Residual norm 1.327791371202e-02 > > > Residual norms for stokes_fieldsplit_p_ solve. 
> > > 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 > > > 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 > > > 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 > > > 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 > > > 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 > > > 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 > > > 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 > > > 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 > > > 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 > > > 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 > > > 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 > > > 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 > > > 1 KSP Residual norm 1.445282647127e-16 > > > > > > > > > Seem like zero pivot does not happen, but why the solver for Schur takes 13 steps if the preconditioner is direct solver? > > > > > > Look at the -ksp_view. I will bet that the default is to shift (add a multiple of the identity) the matrix instead of failing. This > > > gives an inexact PC, but as you see it can converge. > > > > > > Thanks, > > > > > > Matt > > > > > > > > > I also so tried another problem which I known does have a nonsingular Schur (at least A11 != 0) and it also have the same problem: 1 step outer convergence but multiple step inner convergence. > > > > > > Any ideas? > > > > > > Giang > > > > > > On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: > > > > > > Normally you'd be absolutely correct to expect convergence in one iteration. However in this example note the call > > > > > > ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); > > > > > > It is solving the linear system defined by A but building the preconditioner (i.e. the entire fieldsplit process) from a different matrix B. Since A is not B you should not expect convergence in one iteration. If you change the code to > > > > > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > > > > > you will see exactly what you expect, convergence in one iteration. > > > > > > Sorry about this, the example is lacking clarity and documentation its author obviously knew too well what he was doing that he didn't realize everyone else in the world would need more comments in the code. If you change the code to > > > > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > > > > it will stop without being able to build the preconditioner because LU factorization of the Sp matrix will result in a zero pivot. This is why this "auxiliary" matrix B is used to define the preconditioner instead of A. 
> > > > > > Barry > > > > > > > > > > > > > > > > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui wrote: > > > > > > > > Sorry I slept quite a while in this thread. Now I start to look at it again. In the last try, the previous setting doesn't work either (in fact diverge). So I would speculate if the Schur complement in my case is actually not invertible. It's also possible that the code is wrong somewhere. However, before looking at that, I want to understand thoroughly the settings for Schur complement > > > > > > > > I experimented ex42 with the settings: > > > > mpirun -np 1 ex42 \ > > > > -stokes_ksp_monitor \ > > > > -stokes_ksp_type fgmres \ > > > > -stokes_pc_type fieldsplit \ > > > > -stokes_pc_fieldsplit_type schur \ > > > > -stokes_pc_fieldsplit_schur_fact_type full \ > > > > -stokes_pc_fieldsplit_schur_precondition selfp \ > > > > -stokes_fieldsplit_u_ksp_type preonly \ > > > > -stokes_fieldsplit_u_pc_type lu \ > > > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > > > -stokes_fieldsplit_p_ksp_type gmres \ > > > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > > > -stokes_fieldsplit_p_pc_type lu \ > > > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > > > > > > > In my understanding, the solver should converge in 1 (outer) step. Execution gives: > > > > Residual norms for stokes_ solve. > > > > 0 KSP Residual norm 1.327791371202e-02 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > > > 1 KSP Residual norm 7.656238881621e-04 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > > > > 2 KSP Residual norm 9.895645456398e-06 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > > > > 3 KSP Residual norm 1.608206702571e-06 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > > > > 4 KSP Residual norm 2.453618999882e-07 > > > > Residual norms for stokes_fieldsplit_p_ solve. > > > > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > > > > 5 KSP Residual norm 4.230399264750e-08 > > > > > > > > Looks like the "selfp" does construct the Schur nicely. But does "full" really construct the full block preconditioner? > > > > > > > > Giang > > > > P/S: I'm also generating a smaller size of the previous problem for checking again. 
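For anyone driving the same kind of solve from petsc4py rather than the command line, the option set quoted above can also be loaded into the options database before the solver's setFromOptions() is called; a sketch, assuming the KSP that owns the saddle-point matrix has been given the 'stokes_' prefix with setOptionsPrefix('stokes_'):

    from petsc4py import PETSc

    opts = PETSc.Options()
    # same flags as the mpirun line above, just set programmatically
    opts['stokes_ksp_type'] = 'fgmres'
    opts['stokes_pc_type'] = 'fieldsplit'
    opts['stokes_pc_fieldsplit_type'] = 'schur'
    opts['stokes_pc_fieldsplit_schur_fact_type'] = 'full'
    # 'selfp' assembles an approximate Sp = A11 - A10 inv(diag(A00)) A01,
    # i.e. only the diagonal of A00 is used, as discussed in this thread
    opts['stokes_pc_fieldsplit_schur_precondition'] = 'selfp'
    opts['stokes_fieldsplit_u_ksp_type'] = 'preonly'
    opts['stokes_fieldsplit_u_pc_type'] = 'lu'
    opts['stokes_fieldsplit_u_pc_factor_mat_solver_package'] = 'mumps'
    opts['stokes_fieldsplit_p_ksp_type'] = 'gmres'
    opts['stokes_fieldsplit_p_ksp_rtol'] = '1.0e-12'
    opts['stokes_fieldsplit_p_pc_type'] = 'lu'
    opts['stokes_fieldsplit_p_pc_factor_mat_solver_package'] = 'mumps'
    # then, on the KSP built around the stokes matrix:
    #   ksp.setOptionsPrefix('stokes_')
    #   ksp.setFromOptions()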
> > > > > > > > > > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley wrote: > > > > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui wrote: > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > > > believe that is a good preconditioner for your problem? > > > > > > > > This is the first approach to the problem, so I chose the most simple setting. Do you have any other recommendation? > > > > > > > > This is in no way the simplest PC. We need to make it simpler first. > > > > > > > > 1) Run on only 1 proc > > > > > > > > 2) Use -pc_fieldsplit_schur_fact_type full > > > > > > > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_true_residual > > > > > > > > This should converge in 1 outer iteration, but we will see how good your Schur complement preconditioner > > > > is for this problem. > > > > > > > > You need to start out from something you understand and then start making approximations. > > > > > > > > Matt > > > > > > > > For any solver question, please send us the output of > > > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > > > > I sent here the full output (after changed to fgmres), again it takes long at the first iteration but after that, it does not converge > > > > > > > > -ksp_type fgmres > > > > -ksp_max_it 300 > > > > -ksp_gmres_restart 300 > > > > -ksp_gmres_modifiedgramschmidt > > > > -pc_fieldsplit_type schur > > > > -pc_fieldsplit_schur_fact_type diag > > > > -pc_fieldsplit_schur_precondition selfp > > > > -pc_fieldsplit_detect_saddle_point > > > > -fieldsplit_u_ksp_type preonly > > > > -fieldsplit_u_pc_type lu > > > > -fieldsplit_u_pc_factor_mat_solver_package mumps > > > > -fieldsplit_lu_ksp_type preonly > > > > -fieldsplit_lu_pc_type lu > > > > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > > > > > > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > > > > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > > > > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > > > > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > > > > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > > > > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > > > > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > > > > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > > > > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > > > > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > > > > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > > > > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > > > > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm 1.827923591605e+02 ||r(i)||/||b|| 
6.017315712076e-05 > > > > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 > > > > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > > > > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > > > > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > > > > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > > > > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > > > > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > > > > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > > > > 21 KSP unpreconditioned resid norm 1.885794485272e+02 true resid norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > > > > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > > > > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > > > > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > > > > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > > > > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > > > > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > > > > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > > > > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > > > > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > > > > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > > > > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > > > > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > > > > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > > > > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > > > > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > > > > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > > > > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > > > > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm 1.747985717965e+02 ||r(i)||/||b|| 
5.754169361071e-05 > > > > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > > > > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 > > > > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > > > > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > > > > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > > > > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 > > > > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > > > > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 > > > > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm 1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > > > > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > > > > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > > > > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > > > > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > > > > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > > > > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > > > > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > > > > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > > > > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > > > > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > > > > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > > > > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > > > > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > > > > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > > > > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > > > > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > > > > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > > > > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm 1.602225820054e+02 ||r(i)||/||b|| 
5.274344423136e-05 > > > > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > > > > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > > > > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > > > > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > > > > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > > > > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > > > > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > > > > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > > > > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > > > > 76 KSP unpreconditioned resid norm 1.481010335860e+02 true resid norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > > > > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > > > > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > > > > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > > > > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > > > > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > > > > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > > > > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > > > > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > > > > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > > > > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > > > > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > > > > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > > > > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > > > > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > > > > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > > > > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 > > > > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm 1.501105195969e+02 ||r(i)||/||b|| 
4.941466876770e-05 > > > > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > > > > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > > > > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > > > > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > > > > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > > > > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > > > > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 > > > > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > > > > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > > > > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm 1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > > > > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > > > > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > > > > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > > > > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > > > > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > > > > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > > > > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > > > > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > > > > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > > > > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > > > > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > > > > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > > > > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > > > > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > > > > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > > > > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > > > > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm 1.526513923221e+02 
||r(i)||/||b|| 5.025109505170e-05 > > > > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > > > > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > > > > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > > > > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > > > > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > > > > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > > > > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > > > > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > > > > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > > > > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm 1.495018999638e+02 ||r(i)||/||b|| 4.921431813499e-05 > > > > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > > > > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 > > > > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > > > > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > > > > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > > > > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > > > > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > > > > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > > > > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > > > > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > > > > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > > > > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > > > > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > > > > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > > > > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 > > > > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 > > > > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm 
1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > > > > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > > > > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > > > > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > > > > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > > > > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > > > > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > > > > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > > > > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > > > > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > > > > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 > > > > 158 KSP unpreconditioned resid norm 1.418061469023e+02 true resid norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > > > > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > > > > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 > > > > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > > > > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > > > > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > > > > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > > > > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > > > > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > > > > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > > > > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > > > > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > > > > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > > > > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > > > > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > > > > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > > > > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true 
resid norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > > > > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > > > > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > > > > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > > > > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > > > > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > > > > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 > > > > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > > > > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > > > > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > > > > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > > > > 185 KSP unpreconditioned resid norm 1.377904214978e+02 true resid norm 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 > > > > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > > > > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > > > > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > > > > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > > > > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > > > > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > > > > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > > > > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > > > > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > > > > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > > > > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > > > > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > > > > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 > > > > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > > > > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > > > > 201 KSP unpreconditioned resid norm 
1.263813235463e+02 true resid norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > > > > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > > > > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > > > > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > > > > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > > > > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > > > > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > > > > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > > > > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 > > > > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > > > > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > > > > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm 1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > > > > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > > > > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > > > > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > > > > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > > > > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > > > > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > > > > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > > > > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > > > > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > > > > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > > > > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > > > > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > > > > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > > > > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > > > > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > > > > 228 KSP unpreconditioned 
resid norm 1.988460252721e+01 true resid norm 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > > > > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > > > > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > > > > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > > > > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > > > > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > > > > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > > > > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > > > > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > > > > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > > > > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > > > > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm 7.104259266677e+01 ||r(i)||/||b|| 2.338641018933e-05 > > > > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > > > > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > > > > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > > > > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > > > > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > > > > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > > > > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > > > > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > > > > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > > > > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > > > > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > > > > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 > > > > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > > > > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > > > > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > > > > 255 KSP 
unpreconditioned resid norm 1.248246596771e+00 true resid norm 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 > > > > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > > > > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > > > > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > > > > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > > > > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > > > > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > > > > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > > > > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > > > > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > > > > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > > > > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > > > > 267 KSP unpreconditioned resid norm 1.131176695314e+00 true resid norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > > > > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > > > > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > > > > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > > > > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > > > > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > > > > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > > > > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > > > > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > > > > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > > > > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > > > > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > > > > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > > > > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > > > > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > > > 
> 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 > > > > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > > > > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > > > > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > > > > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > > > > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > > > > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > > > > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > > > > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > > > > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 > > > > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > > > > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > > > > 294 KSP unpreconditioned resid norm 1.085570944646e+00 true resid norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > > > > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > > > > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 > > > > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > > > > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > > > > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > > > > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > > > > Linear solve did not converge due to DIVERGED_ITS iterations 300 > > > > KSP Object: 2 MPI processes > > > > type: fgmres > > > > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > > > > GMRES: happy breakdown tolerance 1e-30 > > > > maximum iterations=300, initial guess is zero > > > > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > > > > right preconditioning > > > > using UNPRECONDITIONED norm type for convergence test > > > > PC Object: 2 MPI processes > > > > type: fieldsplit > > > > FieldSplit with Schur preconditioner, factorization DIAG > > > > Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse > > > > Split info: > > > > Split number 0 Defined by IS > > > > Split number 1 Defined by IS > > > > KSP solver for A00 block > > > > KSP Object: (fieldsplit_u_) 2 MPI processes > > > > type: preonly > > > > maximum iterations=10000, initial guess is zero > > > > tolerances: relative=1e-05, 
absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (fieldsplit_u_) 2 MPI processes > > > > type: lu > > > > LU: out-of-place factorization > > > > tolerance for zero pivot 2.22045e-14 > > > > matrix ordering: natural > > > > factor fill ratio given 0, needed 0 > > > > Factored matrix follows: > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=184326 > > > > package used to perform factorization: mumps > > > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > > > total number of mallocs used during MatSetValues calls =0 > > > > MUMPS run parameters: > > > > SYM (matrix type): 0 > > > > PAR (host participation): 1 > > > > ICNTL(1) (output for error): 6 > > > > ICNTL(2) (output of diagnostic msg): 0 > > > > ICNTL(3) (output for global info): 0 > > > > ICNTL(4) (level of printing): 0 > > > > ICNTL(5) (input mat struct): 0 > > > > ICNTL(6) (matrix prescaling): 7 > > > > ICNTL(7) (sequentia matrix ordering):7 > > > > ICNTL(8) (scalling strategy): 77 > > > > ICNTL(10) (max num of refinements): 0 > > > > ICNTL(11) (error analysis): 0 > > > > ICNTL(12) (efficiency control): 1 > > > > ICNTL(13) (efficiency control): 0 > > > > ICNTL(14) (percentage of estimated workspace increase): 20 > > > > ICNTL(18) (input mat struct): 3 > > > > ICNTL(19) (Shur complement info): 0 > > > > ICNTL(20) (rhs sparse pattern): 0 > > > > ICNTL(21) (solution struct): 1 > > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > > ICNTL(24) (detection of null pivot rows): 0 > > > > ICNTL(25) (computation of a null space basis): 0 > > > > ICNTL(26) (Schur options for rhs or solution): 0 > > > > ICNTL(27) (experimental parameter): -24 > > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > > ICNTL(29) (parallel ordering): 0 > > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > > ICNTL(33) (compute determinant): 0 > > > > CNTL(1) (relative pivoting threshold): 0.01 > > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > > CNTL(3) (absolute pivoting threshold): 0 > > > > CNTL(4) (value of static pivoting): -1 > > > > CNTL(5) (fixation for null pivots): 0 > > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > > [0] 5.59214e+11 > > > > [1] 5.35237e+11 > > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > > [0] 4.2839e+08 > > > > [1] 3.799e+08 > > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > > [0] 5.59214e+11 > > > > [1] 5.35237e+11 > > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > > > [0] 2621 > > > > [1] 2649 > > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > > [0] 2621 > > > > [1] 2649 > > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > > [0] 90423 > > > > [1] 93903 > > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > 
> > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > > > INFOG(6) (number of nodes in the complete tree): 2812 > > > > INFOG(7) (ordering option effectively use after analysis): 5 > > > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > > > INFOG(12) (number of off-diagonal pivots): 0 > > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > > INFOG(14) (number of memory compress after factorization): 0 > > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > > > INFOG(20) (estimated number of entries in the factors): 403041366 > > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > > > INFOG(32) (after analysis: type of analysis done): 1 > > > > INFOG(33) (value used for ICNTL(8)): 7 > > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > > linear system matrix = precond matrix: > > > > Mat Object: (fieldsplit_u_) 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=184326, bs=3 > > > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > > > total number of mallocs used during MatSetValues calls =0 > > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > > KSP solver for S = A11 - A10 inv(A00) A01 > > > > KSP Object: (fieldsplit_lu_) 2 MPI processes > > > > type: preonly > > > > maximum iterations=10000, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (fieldsplit_lu_) 2 MPI processes > > > > type: lu > > > > LU: out-of-place factorization > > > > tolerance for zero pivot 2.22045e-14 > > > > matrix ordering: natural > > > > factor fill ratio given 0, needed 0 > > > > Factored matrix follows: > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > 
rows=2583, cols=2583 > > > > package used to perform factorization: mumps > > > > total: nonzeros=2.17621e+06, allocated nonzeros=2.17621e+06 > > > > total number of mallocs used during MatSetValues calls =0 > > > > MUMPS run parameters: > > > > SYM (matrix type): 0 > > > > PAR (host participation): 1 > > > > ICNTL(1) (output for error): 6 > > > > ICNTL(2) (output of diagnostic msg): 0 > > > > ICNTL(3) (output for global info): 0 > > > > ICNTL(4) (level of printing): 0 > > > > ICNTL(5) (input mat struct): 0 > > > > ICNTL(6) (matrix prescaling): 7 > > > > ICNTL(7) (sequentia matrix ordering):7 > > > > ICNTL(8) (scalling strategy): 77 > > > > ICNTL(10) (max num of refinements): 0 > > > > ICNTL(11) (error analysis): 0 > > > > ICNTL(12) (efficiency control): 1 > > > > ICNTL(13) (efficiency control): 0 > > > > ICNTL(14) (percentage of estimated workspace increase): 20 > > > > ICNTL(18) (input mat struct): 3 > > > > ICNTL(19) (Shur complement info): 0 > > > > ICNTL(20) (rhs sparse pattern): 0 > > > > ICNTL(21) (solution struct): 1 > > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > > ICNTL(24) (detection of null pivot rows): 0 > > > > ICNTL(25) (computation of a null space basis): 0 > > > > ICNTL(26) (Schur options for rhs or solution): 0 > > > > ICNTL(27) (experimental parameter): -24 > > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > > ICNTL(29) (parallel ordering): 0 > > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > > ICNTL(33) (compute determinant): 0 > > > > CNTL(1) (relative pivoting threshold): 0.01 > > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > > CNTL(3) (absolute pivoting threshold): 0 > > > > CNTL(4) (value of static pivoting): -1 > > > > CNTL(5) (fixation for null pivots): 0 > > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > > [0] 5.12794e+08 > > > > [1] 5.02142e+08 > > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > > [0] 815031 > > > > [1] 745263 > > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > > [0] 5.12794e+08 > > > > [1] 5.02142e+08 > > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > > > [0] 34 > > > > [1] 34 > > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > > [0] 34 > > > > [1] 34 > > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > > [0] 1158 > > > > [1] 1425 > > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.01494e+09 > > > > RINFOG(2) (global estimated flops for the assembly after factorization): 1.56029e+06 > > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.01494e+09 > > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 2176209 > > > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 14427 > > > > INFOG(5) (estimated maximum front size in the complete tree): 699 > > > > INFOG(6) (number of nodes in the complete tree): 15 > > > > INFOG(7) (ordering option effectively use after analysis): 2 > > > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > > > INFOG(9) (total real/complex workspace to store the matrix 
factors after factorization): 2176209 > > > > INFOG(10) (total integer space store the matrix factors after factorization): 14427 > > > > INFOG(11) (order of largest frontal matrix after factorization): 699 > > > > INFOG(12) (number of off-diagonal pivots): 0 > > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > > INFOG(14) (number of memory compress after factorization): 0 > > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 34 > > > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 68 > > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 34 > > > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 68 > > > > INFOG(20) (estimated number of entries in the factors): 2176209 > > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 30 > > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 59 > > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 2176209 > > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 16, 32 > > > > INFOG(32) (after analysis: type of analysis done): 1 > > > > INFOG(33) (value used for ICNTL(8)): 7 > > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > > linear system matrix followed by preconditioner matrix: > > > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > > > type: schurcomplement > > > > rows=2583, cols=2583 > > > > Schur complement A11 - A10 inv(A00) A01 > > > > A11 > > > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > > > type: mpiaij > > > > rows=2583, cols=2583, bs=3 > > > > total: nonzeros=117369, allocated nonzeros=117369 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node (on process 0) routines > > > > A10 > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=2583, cols=184326, rbs=3, cbs = 1 > > > > total: nonzeros=292770, allocated nonzeros=292770 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node (on process 0) routines > > > > KSP of A00 > > > > KSP Object: (fieldsplit_u_) 2 MPI processes > > > > type: preonly > > > > maximum iterations=10000, initial guess is zero > > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > > left preconditioning > > > > using NONE norm type for convergence test > > > > PC Object: (fieldsplit_u_) 2 MPI processes > > > > type: lu > > > > LU: out-of-place factorization > > > > tolerance for zero pivot 2.22045e-14 > > > > matrix ordering: natural > > > > factor fill ratio given 0, needed 0 > > > > Factored matrix follows: > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=184326 > > > > package used to perform factorization: mumps > > > > total: 
nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > > > total number of mallocs used during MatSetValues calls =0 > > > > MUMPS run parameters: > > > > SYM (matrix type): 0 > > > > PAR (host participation): 1 > > > > ICNTL(1) (output for error): 6 > > > > ICNTL(2) (output of diagnostic msg): 0 > > > > ICNTL(3) (output for global info): 0 > > > > ICNTL(4) (level of printing): 0 > > > > ICNTL(5) (input mat struct): 0 > > > > ICNTL(6) (matrix prescaling): 7 > > > > ICNTL(7) (sequentia matrix ordering):7 > > > > ICNTL(8) (scalling strategy): 77 > > > > ICNTL(10) (max num of refinements): 0 > > > > ICNTL(11) (error analysis): 0 > > > > ICNTL(12) (efficiency control): 1 > > > > ICNTL(13) (efficiency control): 0 > > > > ICNTL(14) (percentage of estimated workspace increase): 20 > > > > ICNTL(18) (input mat struct): 3 > > > > ICNTL(19) (Shur complement info): 0 > > > > ICNTL(20) (rhs sparse pattern): 0 > > > > ICNTL(21) (solution struct): 1 > > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > > ICNTL(24) (detection of null pivot rows): 0 > > > > ICNTL(25) (computation of a null space basis): 0 > > > > ICNTL(26) (Schur options for rhs or solution): 0 > > > > ICNTL(27) (experimental parameter): -24 > > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > > ICNTL(29) (parallel ordering): 0 > > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > > ICNTL(33) (compute determinant): 0 > > > > CNTL(1) (relative pivoting threshold): 0.01 > > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > > CNTL(3) (absolute pivoting threshold): 0 > > > > CNTL(4) (value of static pivoting): -1 > > > > CNTL(5) (fixation for null pivots): 0 > > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > > [0] 5.59214e+11 > > > > [1] 5.35237e+11 > > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > > [0] 4.2839e+08 > > > > [1] 3.799e+08 > > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > > [0] 5.59214e+11 > > > > [1] 5.35237e+11 > > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > > > [0] 2621 > > > > [1] 2649 > > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > > [0] 2621 > > > > [1] 2649 > > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > > [0] 90423 > > > > [1] 93903 > > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > > > INFOG(6) (number of nodes in the complete tree): 2812 > > > > INFOG(7) (ordering option effectively use after analysis): 5 > > > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > > > INFOG(10) (total 
integer space store the matrix factors after factorization): 2265766 > > > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > > > INFOG(12) (number of off-diagonal pivots): 0 > > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > > INFOG(14) (number of memory compress after factorization): 0 > > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > > > INFOG(20) (estimated number of entries in the factors): 403041366 > > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > > > INFOG(32) (after analysis: type of analysis done): 1 > > > > INFOG(33) (value used for ICNTL(8)): 7 > > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > > linear system matrix = precond matrix: > > > > Mat Object: (fieldsplit_u_) 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=184326, bs=3 > > > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > > > total number of mallocs used during MatSetValues calls =0 > > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > > A01 > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=184326, cols=2583, rbs=3, cbs = 1 > > > > total: nonzeros=292770, allocated nonzeros=292770 > > > > total number of mallocs used during MatSetValues calls =0 > > > > using I-node (on process 0) routines: found 16098 nodes, limit used is 5 > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=2583, cols=2583, rbs=3, cbs = 1 > > > > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 > > > > total number of mallocs used during MatSetValues calls =0 > > > > not using I-node (on process 0) routines > > > > linear system matrix = precond matrix: > > > > Mat Object: 2 MPI processes > > > > type: mpiaij > > > > rows=186909, cols=186909 > > > > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > > > > total number of mallocs used during MatSetValues calls =0 > > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > > KSPSolve completed > > > > > > > > > > > > Giang > > > > > > > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley wrote: > > > > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui wrote: > > > > Hello > > > > > > > > I'm 
solving an indefinite problem arising from mesh tying/contact using Lagrange multiplier, the matrix has the form > > > > > > > > K = [A P^T > > > > P 0] > > > > > > > > I used the FIELDSPLIT preconditioner with one field is the main variable (displacement) and the other field for dual variable (Lagrange multiplier). The block size for each field is 3. According to the manual, I first chose the preconditioner based on Schur complement to treat this problem. > > > > > > > > > > > > For any solver question, please send us the output of > > > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > > > > However, I will comment below > > > > > > > > The parameters used for the solve is > > > > -ksp_type gmres > > > > > > > > You need 'fgmres' here with the options you have below. > > > > > > > > -ksp_max_it 300 > > > > -ksp_gmres_restart 300 > > > > -ksp_gmres_modifiedgramschmidt > > > > -pc_fieldsplit_type schur > > > > -pc_fieldsplit_schur_fact_type diag > > > > -pc_fieldsplit_schur_precondition selfp > > > > > > > > > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > > > believe that is a good preconditioner for your problem? > > > > > > > > > > > > -pc_fieldsplit_detect_saddle_point > > > > -fieldsplit_u_pc_type hypre > > > > > > > > I would just use MUMPS here to start, especially if it works on the whole problem. Same with the one below. > > > > > > > > Matt > > > > > > > > -fieldsplit_u_pc_hypre_type boomeramg > > > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > > > -fieldsplit_lu_pc_type hypre > > > > -fieldsplit_lu_pc_hypre_type boomeramg > > > > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > > > > > > > For the test case, a small problem is solved on 2 processes. Due to the decomposition, the contact only happens in 1 proc, so the size of Lagrange multiplier dofs on proc 0 is 0. > > > > > > > > 0: mIndexU.size(): 80490 > > > > 0: mIndexLU.size(): 0 > > > > 1: mIndexU.size(): 103836 > > > > 1: mIndexLU.size(): 2583 > > > > > > > > However, with this setup the solver takes very long at KSPSolve before going to iteration, and the first iteration seems forever so I have to stop the calculation. I guessed that the solver takes time to compute the Schur complement, but according to the manual only the diagonal of A is used to approximate the Schur complement, so it should not take long to compute this. > > > > > > > > Note that I ran the same problem with direct solver (MUMPS) and it's able to produce the valid results. The parameter for the solve is pretty standard > > > > -ksp_type preonly > > > > -pc_type lu > > > > -pc_factor_mat_solver_package mumps > > > > > > > > Hence the matrix/rhs must not have any problem here. Do you have any idea or suggestion for this case? > > > > > > > > > > > > Giang > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > From aurelien.ponte at ifremer.fr Sun Sep 18 13:36:59 2016 From: aurelien.ponte at ifremer.fr (Aurelien Ponte) Date: Sun, 18 Sep 2016 20:36:59 +0200 Subject: [petsc-users] 2D vector in 3D dmda In-Reply-To: <70CD8A67-892B-4332-90DD-919C4BC221DB@mcs.anl.gov> References: <57DC01DB.6080603@ifremer.fr> <48F7A9AE-1659-4C4C-AF5B-00D532C26F5F@mcs.anl.gov> <69f68ff8-6a54-a377-8aed-65f4044b1ed0@ifremer.fr> <70CD8A67-892B-4332-90DD-919C4BC221DB@mcs.anl.gov> Message-ID: Ok, thanks again. I am still a fairly new petsc user and, even though I am familiar with MPI, I won't risk attempting something like that just yet. I'll stick with the easy but less memory-efficient method for now. cheers aurelien On 17/09/16 at 20:24, Barry Smith wrote: > Understood. We don't have any direct way with DA for storing this information directly associated with your 3D DA. > > You need to figure out how to store it so that each process has access to the parts of the data that it needs, which may not be completely trivial. You could possibly use some 2d DMDAs on sub-communicators with suitable layouts so the data is accessible; you need to figure out the details. The dz[k] can probably just be stored on every process since it is 1d. > > Barry > >> On Sep 17, 2016, at 2:58 AM, Aurelien Ponte wrote: >> >> Thanks Barry for your answer ! >> >> I guess my concern is of the second type: >> By grid metric terms I meant essentially grid spacings, which look like: >> dx(i,j), dy(i,j) and dz(k), where (i,j,k) are indices running along the 3 dimensions of the grid. >> Storing dx(i,j) into a 3D array seemed like a bit of a waste of memory to me, but I must >> be wrong. The elliptic problem I am solving is close to a Poisson equation, btw. >> I guess I can at least store dx and dy into a single dxy 3D array. >> >> Thanks again, >> >> Aurelien >> >> >> >> On 16/09/16 at 19:43, Barry Smith wrote: >>>> On Sep 16, 2016, at 9:29 AM, Aurelien PONTE wrote: >>>> >>>> Hi, >>>> >>>> I've started using petsc4py in order to solve a 3D problem (inversion of an elliptic operator). >>>> I would like to store 2D metric terms describing the grid >>> What do you mean by 2D metric terms describing the grid? >>> >>> Do you want to store something like a little dense 2d array for each grid point? If so, create another 3D DA with a dof = the product of the dimensions of the little dense 2d array and then store the little dense 2d arrays in a global vector obtained from that DA. >>> >>> Or is the grid uniform in one dimension and not uniform in the other two, and hence you want to store the information about the non-uniformity in only a 2d array so as not to "waste" the redundant information in the third direction? Then I recommend just "wasting" the redundant information in the third dimension; it is trivial compared to all the data you need to solve the problem. >>> >>> Or do you mean something else? >>> >>> Barry >>> >>>> I am working on but don't know >>>> how to do that given that my domain is tiled in the 3 directions: >>>> >>>> self.da = PETSc.DMDA().create([self.grid.Nx, self.grid.Ny, self.grid.Nz], >>>> stencil_width=2) >>>> >>>> I create my 3D vectors with, for example: >>>> >>>> self.Q = self.da.createGlobalVec() >>>> >>>> What am I supposed to do for a 2D vector? >>>> Is it a bad idea?
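A minimal petsc4py sketch of the "redundant storage" approach discussed in this thread, i.e. keeping the 2D grid spacings in a second 3D DMDA and simply replicating them along the third direction. The grid sizes and the dx2d/dy2d/dz1d arrays below are illustrative placeholders, not taken from the original code:

from petsc4py import PETSc
import numpy as np

# Illustrative grid sizes and metric arrays (stand-ins for the user's grid object)
Nx, Ny, Nz = 16, 12, 8
dx2d = np.ones((Nx, Ny))   # dx(i,j)
dy2d = np.ones((Nx, Ny))   # dy(i,j)
dz1d = np.ones(Nz)         # dz(k) is 1d and small, so a copy can live on every rank

# Solution DA: one dof per grid point, stencil width 2, as in the original setup
da = PETSc.DMDA().create([Nx, Ny, Nz], dof=1, stencil_width=2)
Q = da.createGlobalVec()

# Metric DA: same 3D layout but dof=2, so each point carries (dx, dy); the 2D
# metrics are replicated along k, which wastes some memory but keeps the data
# aligned with the decomposition of the solution vector
da_metric = PETSc.DMDA().create([Nx, Ny, Nz], dof=2, stencil_width=2)
metrics = da_metric.createGlobalVec()

marr = da_metric.getVecArray(metrics)
(xs, xe), (ys, ye), (zs, ze) = da_metric.getRanges()
for k in range(zs, ze):
    for j in range(ys, ye):
        for i in range(xs, xe):
            marr[i, j, k] = (dx2d[i, j], dy2d[i, j])

The replication costs a factor of Nz on two 2D arrays, which is usually negligible next to the storage for the elliptic operator itself.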
>>>> >>>> thanks >>>> >>>> aurelien >>>> >> >> -- >> Aurélien Ponte >> Tel: (+33) 2 98 22 40 73 >> Fax: (+33) 2 98 22 44 96 >> UMR 6523, IFREMER >> ZI de la Pointe du Diable >> CS 10070 >> 29280 Plouzané >> -- Aurélien Ponte Tel: (+33) 2 98 22 40 73 Fax: (+33) 2 98 22 44 96 UMR 6523, IFREMER ZI de la Pointe du Diable CS 10070 29280 Plouzané From bsmith at mcs.anl.gov Sun Sep 18 18:24:50 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 18 Sep 2016 18:24:50 -0500 Subject: [petsc-users] fieldsplit preconditioner for indefinite matrix In-Reply-To: References: <19EEF686-0334-46CD-A25D-4DFCA2B5D94B@mcs.anl.gov> Message-ID: > On Sep 16, 2016, at 6:09 PM, Hoang Giang Bui wrote: > > Hi Barry > > You are right, using MatCreateAIJ() eliminates the first issue. Previously I ran the mpi code with one process, so A,B,C,D are all MPIAIJ > > And how about the second issue? This error will always be thrown if A11 is nonzero, which is my case. > > Nevertheless, I would like to report my simple finding: I changed the part around line 552 to > > if (D) { > ierr = MatAXPY(*S, -1.0, D, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr); > } > > I could get ex42 to work with > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > parameters: > mpirun -np 1 ex42 \ > -stokes_ksp_monitor \ > -stokes_ksp_type fgmres \ > -stokes_pc_type fieldsplit \ > -stokes_pc_fieldsplit_type schur \ > -stokes_pc_fieldsplit_schur_fact_type full \ > -stokes_pc_fieldsplit_schur_precondition full \ > -stokes_fieldsplit_u_ksp_type preonly \ > -stokes_fieldsplit_u_pc_type lu \ > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > -stokes_fieldsplit_p_ksp_type gmres \ > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > -stokes_fieldsplit_p_ksp_max_it 300 \ > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > -stokes_fieldsplit_p_pc_type lu \ > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps \ > > Output: > Residual norms for stokes_ solve. > 0 KSP Residual norm 1.327791371202e-02 > Residual norms for stokes_fieldsplit_p_ solve. > 0 KSP preconditioned resid norm 1.651372938841e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.172753353368e+00 true resid norm 2.072348962892e-05 ||r(i)||/||b|| 3.588013522487e-04 > 2 KSP preconditioned resid norm 3.931379526610e-13 true resid norm 1.878299731917e-16 ||r(i)||/||b|| 3.252041503665e-15 > 1 KSP Residual norm 3.385960118582e-17 > > inner convergence is much better, although it takes 2 iterations (:-( If you run with -stokes_fieldsplit_p_pc_type svd it will take only a single inner iteration. It is because the factorization of the Schur complement is producing almost a zero pivot. Anyway, I have added your fix for computing the explicit Schur complement to master. Note that "full" is only for testing and understanding; it is totally inefficient for solving problems. Thanks Barry > > I also obtain the same convergence behavior for the problem with A11!=0 > > Please suggest if this makes sense, or if I did something wrong. > > Giang > > On Fri, Sep 16, 2016 at 8:31 PM, Barry Smith wrote: > > Why is your C matrix an MPIAIJ matrix on one process? In general we recommend creating a SeqAIJ matrix for one process and MPIAIJ for multiple. You can use MatCreateAIJ() and it will always create the correct one. > > We could change the code as you suggest, but I want to make sure that is the best solution in your case.
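Barry's point about MatCreateAIJ() can be illustrated with a small petsc4py sketch (the size n is arbitrary and none of this comes from the actual code): the generic AIJ constructor picks the concrete format from the communicator, so every block created this way ends up with a compatible type whether the run uses one process or many.

from petsc4py import PETSc

n = 100   # arbitrary size for illustration

# On a one-process communicator the generic AIJ type resolves to 'seqaij' ...
A_self = PETSc.Mat().createAIJ([n, n], comm=PETSc.COMM_SELF)
A_self.setUp()
print(A_self.getType())    # 'seqaij'

# ... while on COMM_WORLD it resolves to 'seqaij' on 1 rank and 'mpiaij' on more,
# without the caller hard-coding either type
A_world = PETSc.Mat().createAIJ([n, n], comm=PETSc.COMM_WORLD)
A_world.setUp()
print(A_world.getType())   # 'seqaij' or 'mpiaij' depending on the number of ranks

Hard-coding MPIAIJ for some blocks while others come out SeqAIJ is what produces the MatMatMult "mpiaij ... seqaij" incompatibility quoted below.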
> > Barry > > > > > On Sep 16, 2016, at 3:31 AM, Hoang Giang Bui wrote: > > > > Hi Matt > > > > I believed at line 523, src/ksp/ksp/utils/schurm.c > > > > ierr = MatMatMult(C, AinvB, MAT_INITIAL_MATRIX, fill, S);CHKERRQ(ierr); > > > > in my test case C is MPIAIJ and AinvB is SEQAIJ, hence it throws the error. > > > > In fact I guess there are two issues with it > > line 521, ierr = MatConvert(AinvBd, MATAIJ, MAT_INITIAL_MATRIX, &AinvB);CHKERRQ(ierr); > > shall we convert this to type of C matrix to ensure compatibility ? > > > > line 552, if(norm > PETSC_MACHINE_EPSILON) SETERRQ(PetscObjectComm((PetscObject) M), PETSC_ERR_SUP, "Not yet implemented for Schur complements with non-vanishing D"); > > with this the Schur complement with A11!=0 will be aborted > > > > Giang > > > > On Thu, Sep 15, 2016 at 4:28 PM, Matthew Knepley wrote: > > On Thu, Sep 15, 2016 at 9:07 AM, Hoang Giang Bui wrote: > > Hi Matt > > > > Thanks for the comment. After looking carefully into the manual again, the key take away is that with selfp there is no option to compute the exact Schur, there are only two options to approximate the inv(A00) for selfp, which are lump and diag (diag by default). I misunderstood this previously. > > > > There is online manual entry mentioned about PC_FIELDSPLIT_SCHUR_PRE_FULL, which is not documented elsewhere in the offline manual. I tried to access that by setting > > -pc_fieldsplit_schur_precondition full > > > > Yep, I wrote that specifically for testing, but its very slow so I did not document it to prevent people from complaining. > > > > but it gives the error > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Arguments are incompatible > > [0]PETSC ERROR: MatMatMult requires A, mpiaij, to be compatible with B, seqaij > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > > [0]PETSC ERROR: python on a arch-linux2-c-opt named bermuda by hbui Thu Sep 15 15:46:56 2016 > > [0]PETSC ERROR: Configure options --with-shared-libraries --with-debugging=0 --with-pic --download-fblaslapack=yes --download-suitesparse --download-ptscotch=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes --download-mumps=yes --download-hypre=yes --download-ml=yes --download-pastix=yes --with-mpi-dir=/opt/openmpi-1.10.1 --prefix=/home/hbui/opt/petsc-3.7.3 > > [0]PETSC ERROR: #1 MatMatMult() line 9514 in /home/hbui/sw/petsc-3.7.3/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 MatSchurComplementComputeExplicitOperator() line 526 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/utils/schurm.c > > [0]PETSC ERROR: #3 PCSetUp_FieldSplit() line 792 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/hbui/sw/petsc-3.7.3/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/hbui/sw/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > > > > Please excuse me to insist on forming the exact Schur complement, but as you said, I would like to track down what creates problem in my code by starting from a very exact but ineffective solution. > > > > Sure, I understand. I do not understand how A can be MPI and B can be Seq. Do you know how that happens? 
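To make the role of the D (A11) block concrete, here is a toy dense numpy illustration of the Schur complement as it appears in the -ksp_view output above, S = A11 - A10 inv(A00) A01. The numbers are made up; the only point is that whenever A11 is nonzero, dropping it changes the operator, which is why the explicit-Schur code path needs a fix along the lines of the MatAXPY call quoted above.

import numpy as np

# Toy blocks of a 2x2 block system [A00 A01; A10 A11]; values are illustrative only
A00 = np.array([[4.0, 1.0],
                [1.0, 3.0]])
A01 = np.array([[1.0],
                [2.0]])
A10 = A01.T
A11 = np.array([[0.5]])    # nonzero "D" block, as in the mesh-tying problem

# Schur complement as defined in the fieldsplit: S = A11 - A10 inv(A00) A01
S = A11 - A10 @ np.linalg.solve(A00, A01)

# Leaving out the A11 contribution gives a different operator
S_without_D = -A10 @ np.linalg.solve(A00, A01)

print(S, S_without_D)   # roughly [[-0.864]] versus [[-1.364]] for these toy numbers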
> > > > Thanks, > > > > Matt > > > > Giang > > > > On Thu, Sep 15, 2016 at 2:56 PM, Matthew Knepley wrote: > > On Thu, Sep 15, 2016 at 4:11 AM, Hoang Giang Bui wrote: > > Dear Barry > > > > Thanks for the clarification. I got exactly what you said if the code changed to > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > 1 KSP Residual norm 3.997711925708e-17 > > > > but I guess we solve a different problem if B is used for the linear system. > > > > in addition, changed to > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > also works but inner iteration converged not in one iteration > > > > Residual norms for stokes_ solve. > > 0 KSP Residual norm 1.327791371202e-02 > > Residual norms for stokes_fieldsplit_p_ solve. > > 0 KSP preconditioned resid norm 5.308049264070e+02 true resid norm 5.775755720828e-02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.853645192358e+02 true resid norm 1.537879609454e-02 ||r(i)||/||b|| 2.662646558801e-01 > > 2 KSP preconditioned resid norm 2.282724981527e+01 true resid norm 4.440700864158e-03 ||r(i)||/||b|| 7.688519180519e-02 > > 3 KSP preconditioned resid norm 3.114190504933e+00 true resid norm 8.474158485027e-04 ||r(i)||/||b|| 1.467194752449e-02 > > 4 KSP preconditioned resid norm 4.273258497986e-01 true resid norm 1.249911370496e-04 ||r(i)||/||b|| 2.164065502267e-03 > > 5 KSP preconditioned resid norm 2.548558490130e-02 true resid norm 8.428488734654e-06 ||r(i)||/||b|| 1.459287605301e-04 > > 6 KSP preconditioned resid norm 1.556370641259e-03 true resid norm 2.866605637380e-07 ||r(i)||/||b|| 4.963169801386e-06 > > 7 KSP preconditioned resid norm 2.324584224817e-05 true resid norm 6.975804113442e-09 ||r(i)||/||b|| 1.207773398083e-07 > > 8 KSP preconditioned resid norm 8.893330367907e-06 true resid norm 1.082096232921e-09 ||r(i)||/||b|| 1.873514541169e-08 > > 9 KSP preconditioned resid norm 6.563740470820e-07 true resid norm 2.212185528660e-10 ||r(i)||/||b|| 3.830123079274e-09 > > 10 KSP preconditioned resid norm 1.460372091709e-08 true resid norm 3.859545051902e-12 ||r(i)||/||b|| 6.682320441607e-11 > > 11 KSP preconditioned resid norm 1.041947844812e-08 true resid norm 2.364389912927e-12 ||r(i)||/||b|| 4.093645969827e-11 > > 12 KSP preconditioned resid norm 1.614713897816e-10 true resid norm 1.057061924974e-14 ||r(i)||/||b|| 1.830170762178e-13 > > 1 KSP Residual norm 1.445282647127e-16 > > > > > > Seem like zero pivot does not happen, but why the solver for Schur takes 13 steps if the preconditioner is direct solver? > > > > Look at the -ksp_view. I will bet that the default is to shift (add a multiple of the identity) the matrix instead of failing. This > > gives an inexact PC, but as you see it can converge. > > > > Thanks, > > > > Matt > > > > > > I also so tried another problem which I known does have a nonsingular Schur (at least A11 != 0) and it also have the same problem: 1 step outer convergence but multiple step inner convergence. > > > > Any ideas? > > > > Giang > > > > On Fri, Sep 9, 2016 at 1:04 AM, Barry Smith wrote: > > > > Normally you'd be absolutely correct to expect convergence in one iteration. 
However in this example note the call > > > > ierr = KSPSetOperators(ksp_S,A,B);CHKERRQ(ierr); > > > > It is solving the linear system defined by A but building the preconditioner (i.e. the entire fieldsplit process) from a different matrix B. Since A is not B you should not expect convergence in one iteration. If you change the code to > > > > ierr = KSPSetOperators(ksp_S,B,B);CHKERRQ(ierr); > > > > you will see exactly what you expect, convergence in one iteration. > > > > Sorry about this, the example is lacking clarity and documentation its author obviously knew too well what he was doing that he didn't realize everyone else in the world would need more comments in the code. If you change the code to > > > > ierr = KSPSetOperators(ksp_S,A,A);CHKERRQ(ierr); > > > > it will stop without being able to build the preconditioner because LU factorization of the Sp matrix will result in a zero pivot. This is why this "auxiliary" matrix B is used to define the preconditioner instead of A. > > > > Barry > > > > > > > > > > > On Sep 8, 2016, at 5:30 PM, Hoang Giang Bui wrote: > > > > > > Sorry I slept quite a while in this thread. Now I start to look at it again. In the last try, the previous setting doesn't work either (in fact diverge). So I would speculate if the Schur complement in my case is actually not invertible. It's also possible that the code is wrong somewhere. However, before looking at that, I want to understand thoroughly the settings for Schur complement > > > > > > I experimented ex42 with the settings: > > > mpirun -np 1 ex42 \ > > > -stokes_ksp_monitor \ > > > -stokes_ksp_type fgmres \ > > > -stokes_pc_type fieldsplit \ > > > -stokes_pc_fieldsplit_type schur \ > > > -stokes_pc_fieldsplit_schur_fact_type full \ > > > -stokes_pc_fieldsplit_schur_precondition selfp \ > > > -stokes_fieldsplit_u_ksp_type preonly \ > > > -stokes_fieldsplit_u_pc_type lu \ > > > -stokes_fieldsplit_u_pc_factor_mat_solver_package mumps \ > > > -stokes_fieldsplit_p_ksp_type gmres \ > > > -stokes_fieldsplit_p_ksp_monitor_true_residual \ > > > -stokes_fieldsplit_p_ksp_max_it 300 \ > > > -stokes_fieldsplit_p_ksp_rtol 1.0e-12 \ > > > -stokes_fieldsplit_p_ksp_gmres_restart 300 \ > > > -stokes_fieldsplit_p_ksp_gmres_modifiedgramschmidt \ > > > -stokes_fieldsplit_p_pc_type lu \ > > > -stokes_fieldsplit_p_pc_factor_mat_solver_package mumps > > > > > > In my understanding, the solver should converge in 1 (outer) step. Execution gives: > > > Residual norms for stokes_ solve. > > > 0 KSP Residual norm 1.327791371202e-02 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 0.000000000000e+00 true resid norm 0.000000000000e+00 ||r(i)||/||b|| -nan > > > 1 KSP Residual norm 7.656238881621e-04 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 1.512059266251e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 1.861905708091e-12 true resid norm 2.934589919911e-16 ||r(i)||/||b|| 2.934589919911e-16 > > > 2 KSP Residual norm 9.895645456398e-06 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 3.002531529083e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 6.388584944363e-12 true resid norm 1.961047000344e-15 ||r(i)||/||b|| 1.961047000344e-15 > > > 3 KSP Residual norm 1.608206702571e-06 > > > Residual norms for stokes_fieldsplit_p_ solve. 
> > > 0 KSP preconditioned resid norm 3.004810086026e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 3.081350863773e-12 true resid norm 7.721720636293e-16 ||r(i)||/||b|| 7.721720636293e-16 > > > 4 KSP Residual norm 2.453618999882e-07 > > > Residual norms for stokes_fieldsplit_p_ solve. > > > 0 KSP preconditioned resid norm 3.000681887478e+03 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP preconditioned resid norm 3.909717465288e-12 true resid norm 1.156131245879e-15 ||r(i)||/||b|| 1.156131245879e-15 > > > 5 KSP Residual norm 4.230399264750e-08 > > > > > > Looks like the "selfp" does construct the Schur nicely. But does "full" really construct the full block preconditioner? > > > > > > Giang > > > P/S: I'm also generating a smaller size of the previous problem for checking again. > > > > > > > > > On Sun, Apr 17, 2016 at 3:16 PM, Matthew Knepley wrote: > > > On Sun, Apr 17, 2016 at 4:25 AM, Hoang Giang Bui wrote: > > > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > > believe that is a good preconditioner for your problem? > > > > > > This is the first approach to the problem, so I chose the most simple setting. Do you have any other recommendation? > > > > > > This is in no way the simplest PC. We need to make it simpler first. > > > > > > 1) Run on only 1 proc > > > > > > 2) Use -pc_fieldsplit_schur_fact_type full > > > > > > 3) Use -fieldsplit_lu_ksp_type gmres -fieldsplit_lu_ksp_monitor_true_residual > > > > > > This should converge in 1 outer iteration, but we will see how good your Schur complement preconditioner > > > is for this problem. > > > > > > You need to start out from something you understand and then start making approximations. 
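Matt's question above about whether selfp is a good preconditioner has a concrete reading: selfp assembles Sp = A11 - A10 inv(diag(A00)) A01, i.e. only the diagonal of A00 is inverted, whereas the true Schur complement uses all of inv(A00). A toy numpy experiment (random, illustrative matrices, not the actual contact problem) compares the two; the farther the spectrum of Sp^-1 S sits from 1, the more inner iterations a Krylov solve on the Schur block will need.

import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 10    # toy block sizes, not the real problem

# Toy saddle-point blocks: A00 close to diagonal, A11 = 0 as in K = [A P^T; P 0]
A00 = np.diag(np.full(n, 4.0)) + 0.05 * rng.standard_normal((n, n))
A01 = rng.standard_normal((n, m))
A10 = A01.T
A11 = np.zeros((m, m))

# Exact Schur complement versus the selfp-style approximation
S  = A11 - A10 @ np.linalg.solve(A00, A01)              # uses inv(A00)
Sp = A11 - A10 @ (A01 / np.diag(A00)[:, None])          # uses inv(diag(A00))

# Spectrum of Sp^{-1} S: the tighter it clusters around 1, the better Sp works
# as a preconditioner for the inner solve on S
ev = np.linalg.eigvals(np.linalg.solve(Sp, S))
print(ev.real.min(), ev.real.max())

For an A00 whose diagonal is a good stand-in for the whole block, as in this toy case, the eigenvalues stay close to 1 and selfp is cheap and adequate; when the diagonal is a poor stand-in the spectrum spreads out, which is exactly what Matt's suggested test (full factorization plus a GMRES solve with a true-residual monitor on the Schur block) is meant to reveal.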
> > > > > > Matt > > > > > > For any solver question, please send us the output of > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > I sent here the full output (after changed to fgmres), again it takes long at the first iteration but after that, it does not converge > > > > > > -ksp_type fgmres > > > -ksp_max_it 300 > > > -ksp_gmres_restart 300 > > > -ksp_gmres_modifiedgramschmidt > > > -pc_fieldsplit_type schur > > > -pc_fieldsplit_schur_fact_type diag > > > -pc_fieldsplit_schur_precondition selfp > > > -pc_fieldsplit_detect_saddle_point > > > -fieldsplit_u_ksp_type preonly > > > -fieldsplit_u_pc_type lu > > > -fieldsplit_u_pc_factor_mat_solver_package mumps > > > -fieldsplit_lu_ksp_type preonly > > > -fieldsplit_lu_pc_type lu > > > -fieldsplit_lu_pc_factor_mat_solver_package mumps > > > > > > 0 KSP unpreconditioned resid norm 3.037772453815e+06 true resid norm 3.037772453815e+06 ||r(i)||/||b|| 1.000000000000e+00 > > > 1 KSP unpreconditioned resid norm 3.024368791893e+06 true resid norm 3.024368791296e+06 ||r(i)||/||b|| 9.955876673705e-01 > > > 2 KSP unpreconditioned resid norm 3.008534454663e+06 true resid norm 3.008534454904e+06 ||r(i)||/||b|| 9.903751846607e-01 > > > 3 KSP unpreconditioned resid norm 4.633282412600e+02 true resid norm 4.607539866185e+02 ||r(i)||/||b|| 1.516749505184e-04 > > > 4 KSP unpreconditioned resid norm 4.630592911836e+02 true resid norm 4.605625897903e+02 ||r(i)||/||b|| 1.516119448683e-04 > > > 5 KSP unpreconditioned resid norm 2.145735509629e+02 true resid norm 2.111697416683e+02 ||r(i)||/||b|| 6.951466736857e-05 > > > 6 KSP unpreconditioned resid norm 2.145734219762e+02 true resid norm 2.112001242378e+02 ||r(i)||/||b|| 6.952466896346e-05 > > > 7 KSP unpreconditioned resid norm 1.892914067411e+02 true resid norm 1.831020928502e+02 ||r(i)||/||b|| 6.027511791420e-05 > > > 8 KSP unpreconditioned resid norm 1.892906351597e+02 true resid norm 1.831422357767e+02 ||r(i)||/||b|| 6.028833250718e-05 > > > 9 KSP unpreconditioned resid norm 1.891426729822e+02 true resid norm 1.835600473014e+02 ||r(i)||/||b|| 6.042587128964e-05 > > > 10 KSP unpreconditioned resid norm 1.891425181679e+02 true resid norm 1.855772578041e+02 ||r(i)||/||b|| 6.108991395027e-05 > > > 11 KSP unpreconditioned resid norm 1.891417382057e+02 true resid norm 1.833302669042e+02 ||r(i)||/||b|| 6.035023020699e-05 > > > 12 KSP unpreconditioned resid norm 1.891414749001e+02 true resid norm 1.827923591605e+02 ||r(i)||/||b|| 6.017315712076e-05 > > > 13 KSP unpreconditioned resid norm 1.891414702834e+02 true resid norm 1.849895606391e+02 ||r(i)||/||b|| 6.089645075515e-05 > > > 14 KSP unpreconditioned resid norm 1.891414687385e+02 true resid norm 1.852700958573e+02 ||r(i)||/||b|| 6.098879974523e-05 > > > 15 KSP unpreconditioned resid norm 1.891399614701e+02 true resid norm 1.817034334576e+02 ||r(i)||/||b|| 5.981469521503e-05 > > > 16 KSP unpreconditioned resid norm 1.891393964580e+02 true resid norm 1.823173574739e+02 ||r(i)||/||b|| 6.001679199012e-05 > > > 17 KSP unpreconditioned resid norm 1.890868604964e+02 true resid norm 1.834754811775e+02 ||r(i)||/||b|| 6.039803308740e-05 > > > 18 KSP unpreconditioned resid norm 1.888442703508e+02 true resid norm 1.852079421560e+02 ||r(i)||/||b|| 6.096833945658e-05 > > > 19 KSP unpreconditioned resid norm 1.888131521870e+02 true resid norm 1.810111295757e+02 ||r(i)||/||b|| 5.958679668335e-05 > > > 20 KSP unpreconditioned resid norm 1.888038471618e+02 true resid norm 1.814080717355e+02 ||r(i)||/||b|| 5.971746550920e-05 > > > 21 KSP 
unpreconditioned resid norm 1.885794485272e+02 true resid norm 1.843223565278e+02 ||r(i)||/||b|| 6.067681478129e-05 > > > 22 KSP unpreconditioned resid norm 1.884898771362e+02 true resid norm 1.842766260526e+02 ||r(i)||/||b|| 6.066176083110e-05 > > > 23 KSP unpreconditioned resid norm 1.884840498049e+02 true resid norm 1.813011285152e+02 ||r(i)||/||b|| 5.968226102238e-05 > > > 24 KSP unpreconditioned resid norm 1.884105698955e+02 true resid norm 1.811513025118e+02 ||r(i)||/||b|| 5.963294001309e-05 > > > 25 KSP unpreconditioned resid norm 1.881392557375e+02 true resid norm 1.835706567649e+02 ||r(i)||/||b|| 6.042936380386e-05 > > > 26 KSP unpreconditioned resid norm 1.881234481250e+02 true resid norm 1.843633799886e+02 ||r(i)||/||b|| 6.069031923609e-05 > > > 27 KSP unpreconditioned resid norm 1.852572648925e+02 true resid norm 1.791532195358e+02 ||r(i)||/||b|| 5.897519391579e-05 > > > 28 KSP unpreconditioned resid norm 1.852177694782e+02 true resid norm 1.800935543889e+02 ||r(i)||/||b|| 5.928474141066e-05 > > > 29 KSP unpreconditioned resid norm 1.844720976468e+02 true resid norm 1.806835899755e+02 ||r(i)||/||b|| 5.947897438749e-05 > > > 30 KSP unpreconditioned resid norm 1.843525447108e+02 true resid norm 1.811351238391e+02 ||r(i)||/||b|| 5.962761417881e-05 > > > 31 KSP unpreconditioned resid norm 1.834262885149e+02 true resid norm 1.778584233423e+02 ||r(i)||/||b|| 5.854896179565e-05 > > > 32 KSP unpreconditioned resid norm 1.833523213017e+02 true resid norm 1.773290649733e+02 ||r(i)||/||b|| 5.837470306591e-05 > > > 33 KSP unpreconditioned resid norm 1.821645929344e+02 true resid norm 1.781151248933e+02 ||r(i)||/||b|| 5.863346501467e-05 > > > 34 KSP unpreconditioned resid norm 1.820831279534e+02 true resid norm 1.789778939067e+02 ||r(i)||/||b|| 5.891747872094e-05 > > > 35 KSP unpreconditioned resid norm 1.814860919375e+02 true resid norm 1.757339506869e+02 ||r(i)||/||b|| 5.784960965928e-05 > > > 36 KSP unpreconditioned resid norm 1.812512010159e+02 true resid norm 1.764086437459e+02 ||r(i)||/||b|| 5.807171090922e-05 > > > 37 KSP unpreconditioned resid norm 1.804298150360e+02 true resid norm 1.780147196442e+02 ||r(i)||/||b|| 5.860041275333e-05 > > > 38 KSP unpreconditioned resid norm 1.799675012847e+02 true resid norm 1.780554543786e+02 ||r(i)||/||b|| 5.861382216269e-05 > > > 39 KSP unpreconditioned resid norm 1.793156052097e+02 true resid norm 1.747985717965e+02 ||r(i)||/||b|| 5.754169361071e-05 > > > 40 KSP unpreconditioned resid norm 1.789109248325e+02 true resid norm 1.734086984879e+02 ||r(i)||/||b|| 5.708416319009e-05 > > > 41 KSP unpreconditioned resid norm 1.788931581371e+02 true resid norm 1.766103879126e+02 ||r(i)||/||b|| 5.813812278494e-05 > > > 42 KSP unpreconditioned resid norm 1.785522436483e+02 true resid norm 1.762597032909e+02 ||r(i)||/||b|| 5.802268141233e-05 > > > 43 KSP unpreconditioned resid norm 1.783317950582e+02 true resid norm 1.752774080448e+02 ||r(i)||/||b|| 5.769932103530e-05 > > > 44 KSP unpreconditioned resid norm 1.782832982797e+02 true resid norm 1.741667594885e+02 ||r(i)||/||b|| 5.733370821430e-05 > > > 45 KSP unpreconditioned resid norm 1.781302427969e+02 true resid norm 1.760315735899e+02 ||r(i)||/||b|| 5.794758372005e-05 > > > 46 KSP unpreconditioned resid norm 1.780557458973e+02 true resid norm 1.757279911034e+02 ||r(i)||/||b|| 5.784764783244e-05 > > > 47 KSP unpreconditioned resid norm 1.774691940686e+02 true resid norm 1.729436852773e+02 ||r(i)||/||b|| 5.693108615167e-05 > > > 48 KSP unpreconditioned resid norm 1.771436357084e+02 true resid norm 
1.734001323688e+02 ||r(i)||/||b|| 5.708134332148e-05 > > > 49 KSP unpreconditioned resid norm 1.756105727417e+02 true resid norm 1.740222172981e+02 ||r(i)||/||b|| 5.728612657594e-05 > > > 50 KSP unpreconditioned resid norm 1.756011794480e+02 true resid norm 1.736979026533e+02 ||r(i)||/||b|| 5.717936589858e-05 > > > 51 KSP unpreconditioned resid norm 1.751096154950e+02 true resid norm 1.713154407940e+02 ||r(i)||/||b|| 5.639508666256e-05 > > > 52 KSP unpreconditioned resid norm 1.712639990486e+02 true resid norm 1.684444278579e+02 ||r(i)||/||b|| 5.544998199137e-05 > > > 53 KSP unpreconditioned resid norm 1.710183053728e+02 true resid norm 1.692712952670e+02 ||r(i)||/||b|| 5.572217729951e-05 > > > 54 KSP unpreconditioned resid norm 1.655470115849e+02 true resid norm 1.631767858448e+02 ||r(i)||/||b|| 5.371593439788e-05 > > > 55 KSP unpreconditioned resid norm 1.648313805392e+02 true resid norm 1.617509396670e+02 ||r(i)||/||b|| 5.324656211951e-05 > > > 56 KSP unpreconditioned resid norm 1.643417766012e+02 true resid norm 1.614766932468e+02 ||r(i)||/||b|| 5.315628332992e-05 > > > 57 KSP unpreconditioned resid norm 1.643165564782e+02 true resid norm 1.611660297521e+02 ||r(i)||/||b|| 5.305401645527e-05 > > > 58 KSP unpreconditioned resid norm 1.639561245303e+02 true resid norm 1.616105878219e+02 ||r(i)||/||b|| 5.320035989496e-05 > > > 59 KSP unpreconditioned resid norm 1.636859175366e+02 true resid norm 1.601704798933e+02 ||r(i)||/||b|| 5.272629281109e-05 > > > 60 KSP unpreconditioned resid norm 1.633269681891e+02 true resid norm 1.603249334191e+02 ||r(i)||/||b|| 5.277713714789e-05 > > > 61 KSP unpreconditioned resid norm 1.633257086864e+02 true resid norm 1.602922744638e+02 ||r(i)||/||b|| 5.276638619280e-05 > > > 62 KSP unpreconditioned resid norm 1.629449737049e+02 true resid norm 1.605812790996e+02 ||r(i)||/||b|| 5.286152321842e-05 > > > 63 KSP unpreconditioned resid norm 1.629422151091e+02 true resid norm 1.589656479615e+02 ||r(i)||/||b|| 5.232967589850e-05 > > > 64 KSP unpreconditioned resid norm 1.624767340901e+02 true resid norm 1.601925152173e+02 ||r(i)||/||b|| 5.273354658809e-05 > > > 65 KSP unpreconditioned resid norm 1.614000473427e+02 true resid norm 1.600055285874e+02 ||r(i)||/||b|| 5.267199272497e-05 > > > 66 KSP unpreconditioned resid norm 1.599192711038e+02 true resid norm 1.602225820054e+02 ||r(i)||/||b|| 5.274344423136e-05 > > > 67 KSP unpreconditioned resid norm 1.562002802473e+02 true resid norm 1.582069452329e+02 ||r(i)||/||b|| 5.207991962471e-05 > > > 68 KSP unpreconditioned resid norm 1.552436010567e+02 true resid norm 1.584249134588e+02 ||r(i)||/||b|| 5.215167227548e-05 > > > 69 KSP unpreconditioned resid norm 1.507627069906e+02 true resid norm 1.530713322210e+02 ||r(i)||/||b|| 5.038933447066e-05 > > > 70 KSP unpreconditioned resid norm 1.503802419288e+02 true resid norm 1.526772130725e+02 ||r(i)||/||b|| 5.025959494786e-05 > > > 71 KSP unpreconditioned resid norm 1.483645684459e+02 true resid norm 1.509599328686e+02 ||r(i)||/||b|| 4.969428591633e-05 > > > 72 KSP unpreconditioned resid norm 1.481979533059e+02 true resid norm 1.535340885300e+02 ||r(i)||/||b|| 5.054166856281e-05 > > > 73 KSP unpreconditioned resid norm 1.481400704979e+02 true resid norm 1.509082933863e+02 ||r(i)||/||b|| 4.967728678847e-05 > > > 74 KSP unpreconditioned resid norm 1.481132272449e+02 true resid norm 1.513298398754e+02 ||r(i)||/||b|| 4.981605507858e-05 > > > 75 KSP unpreconditioned resid norm 1.481101708026e+02 true resid norm 1.502466334943e+02 ||r(i)||/||b|| 4.945947590828e-05 > > > 76 KSP 
unpreconditioned resid norm 1.481010335860e+02 true resid norm 1.533384206564e+02 ||r(i)||/||b|| 5.047725693339e-05 > > > 77 KSP unpreconditioned resid norm 1.480865328511e+02 true resid norm 1.508354096349e+02 ||r(i)||/||b|| 4.965329428986e-05 > > > 78 KSP unpreconditioned resid norm 1.480582653674e+02 true resid norm 1.493335938981e+02 ||r(i)||/||b|| 4.915891370027e-05 > > > 79 KSP unpreconditioned resid norm 1.480031554288e+02 true resid norm 1.505131104808e+02 ||r(i)||/||b|| 4.954719708903e-05 > > > 80 KSP unpreconditioned resid norm 1.479574822714e+02 true resid norm 1.540226621640e+02 ||r(i)||/||b|| 5.070250142355e-05 > > > 81 KSP unpreconditioned resid norm 1.479574535946e+02 true resid norm 1.498368142318e+02 ||r(i)||/||b|| 4.932456808727e-05 > > > 82 KSP unpreconditioned resid norm 1.479436001532e+02 true resid norm 1.512355315895e+02 ||r(i)||/||b|| 4.978500986785e-05 > > > 83 KSP unpreconditioned resid norm 1.479410419985e+02 true resid norm 1.513924042216e+02 ||r(i)||/||b|| 4.983665054686e-05 > > > 84 KSP unpreconditioned resid norm 1.477087197314e+02 true resid norm 1.519847216835e+02 ||r(i)||/||b|| 5.003163469095e-05 > > > 85 KSP unpreconditioned resid norm 1.477081559094e+02 true resid norm 1.507153721984e+02 ||r(i)||/||b|| 4.961377933660e-05 > > > 86 KSP unpreconditioned resid norm 1.476420890986e+02 true resid norm 1.512147907360e+02 ||r(i)||/||b|| 4.977818221576e-05 > > > 87 KSP unpreconditioned resid norm 1.476086929880e+02 true resid norm 1.508513380647e+02 ||r(i)||/||b|| 4.965853774704e-05 > > > 88 KSP unpreconditioned resid norm 1.475729830724e+02 true resid norm 1.521640656963e+02 ||r(i)||/||b|| 5.009067269183e-05 > > > 89 KSP unpreconditioned resid norm 1.472338605465e+02 true resid norm 1.506094588356e+02 ||r(i)||/||b|| 4.957891386713e-05 > > > 90 KSP unpreconditioned resid norm 1.472079944867e+02 true resid norm 1.504582871439e+02 ||r(i)||/||b|| 4.952914987262e-05 > > > 91 KSP unpreconditioned resid norm 1.469363056078e+02 true resid norm 1.506425446156e+02 ||r(i)||/||b|| 4.958980532804e-05 > > > 92 KSP unpreconditioned resid norm 1.469110799022e+02 true resid norm 1.509842019134e+02 ||r(i)||/||b|| 4.970227500870e-05 > > > 93 KSP unpreconditioned resid norm 1.468779696240e+02 true resid norm 1.501105195969e+02 ||r(i)||/||b|| 4.941466876770e-05 > > > 94 KSP unpreconditioned resid norm 1.468777757710e+02 true resid norm 1.491460779150e+02 ||r(i)||/||b|| 4.909718558007e-05 > > > 95 KSP unpreconditioned resid norm 1.468774588833e+02 true resid norm 1.519041612996e+02 ||r(i)||/||b|| 5.000511513258e-05 > > > 96 KSP unpreconditioned resid norm 1.468771672305e+02 true resid norm 1.508986277767e+02 ||r(i)||/||b|| 4.967410498018e-05 > > > 97 KSP unpreconditioned resid norm 1.468771086724e+02 true resid norm 1.500987040931e+02 ||r(i)||/||b|| 4.941077923878e-05 > > > 98 KSP unpreconditioned resid norm 1.468769529855e+02 true resid norm 1.509749203169e+02 ||r(i)||/||b|| 4.969921961314e-05 > > > 99 KSP unpreconditioned resid norm 1.468539019917e+02 true resid norm 1.505087391266e+02 ||r(i)||/||b|| 4.954575808916e-05 > > > 100 KSP unpreconditioned resid norm 1.468527260351e+02 true resid norm 1.519470484364e+02 ||r(i)||/||b|| 5.001923308823e-05 > > > 101 KSP unpreconditioned resid norm 1.468342327062e+02 true resid norm 1.489814197970e+02 ||r(i)||/||b|| 4.904298200804e-05 > > > 102 KSP unpreconditioned resid norm 1.468333201903e+02 true resid norm 1.491479405434e+02 ||r(i)||/||b|| 4.909779873608e-05 > > > 103 KSP unpreconditioned resid norm 1.468287736823e+02 true resid norm 
1.496401088908e+02 ||r(i)||/||b|| 4.925981493540e-05 > > > 104 KSP unpreconditioned resid norm 1.468269778777e+02 true resid norm 1.509676608058e+02 ||r(i)||/||b|| 4.969682986500e-05 > > > 105 KSP unpreconditioned resid norm 1.468214752527e+02 true resid norm 1.500441644659e+02 ||r(i)||/||b|| 4.939282541636e-05 > > > 106 KSP unpreconditioned resid norm 1.468208033546e+02 true resid norm 1.510964155942e+02 ||r(i)||/||b|| 4.973921447094e-05 > > > 107 KSP unpreconditioned resid norm 1.467590018852e+02 true resid norm 1.512302088409e+02 ||r(i)||/||b|| 4.978325767980e-05 > > > 108 KSP unpreconditioned resid norm 1.467588908565e+02 true resid norm 1.501053278370e+02 ||r(i)||/||b|| 4.941295969963e-05 > > > 109 KSP unpreconditioned resid norm 1.467570731153e+02 true resid norm 1.485494378220e+02 ||r(i)||/||b|| 4.890077847519e-05 > > > 110 KSP unpreconditioned resid norm 1.467399860352e+02 true resid norm 1.504418099302e+02 ||r(i)||/||b|| 4.952372576205e-05 > > > 111 KSP unpreconditioned resid norm 1.467095654863e+02 true resid norm 1.507288583410e+02 ||r(i)||/||b|| 4.961821882075e-05 > > > 112 KSP unpreconditioned resid norm 1.467065865602e+02 true resid norm 1.517786399520e+02 ||r(i)||/||b|| 4.996379493842e-05 > > > 113 KSP unpreconditioned resid norm 1.466898232510e+02 true resid norm 1.491434236258e+02 ||r(i)||/||b|| 4.909631181838e-05 > > > 114 KSP unpreconditioned resid norm 1.466897921426e+02 true resid norm 1.505605420512e+02 ||r(i)||/||b|| 4.956281102033e-05 > > > 115 KSP unpreconditioned resid norm 1.466593121787e+02 true resid norm 1.500608650677e+02 ||r(i)||/||b|| 4.939832306376e-05 > > > 116 KSP unpreconditioned resid norm 1.466590894710e+02 true resid norm 1.503102560128e+02 ||r(i)||/||b|| 4.948041971478e-05 > > > 117 KSP unpreconditioned resid norm 1.465338856917e+02 true resid norm 1.501331730933e+02 ||r(i)||/||b|| 4.942212604002e-05 > > > 118 KSP unpreconditioned resid norm 1.464192893188e+02 true resid norm 1.505131429801e+02 ||r(i)||/||b|| 4.954720778744e-05 > > > 119 KSP unpreconditioned resid norm 1.463859793112e+02 true resid norm 1.504355712014e+02 ||r(i)||/||b|| 4.952167204377e-05 > > > 120 KSP unpreconditioned resid norm 1.459254939182e+02 true resid norm 1.526513923221e+02 ||r(i)||/||b|| 5.025109505170e-05 > > > 121 KSP unpreconditioned resid norm 1.456973020864e+02 true resid norm 1.496897691500e+02 ||r(i)||/||b|| 4.927616252562e-05 > > > 122 KSP unpreconditioned resid norm 1.456904663212e+02 true resid norm 1.488752755634e+02 ||r(i)||/||b|| 4.900804053853e-05 > > > 123 KSP unpreconditioned resid norm 1.449254956591e+02 true resid norm 1.494048196254e+02 ||r(i)||/||b|| 4.918236039628e-05 > > > 124 KSP unpreconditioned resid norm 1.448408616171e+02 true resid norm 1.507801939332e+02 ||r(i)||/||b|| 4.963511791142e-05 > > > 125 KSP unpreconditioned resid norm 1.447662934870e+02 true resid norm 1.495157701445e+02 ||r(i)||/||b|| 4.921888404010e-05 > > > 126 KSP unpreconditioned resid norm 1.446934748257e+02 true resid norm 1.511098625097e+02 ||r(i)||/||b|| 4.974364104196e-05 > > > 127 KSP unpreconditioned resid norm 1.446892504333e+02 true resid norm 1.493367018275e+02 ||r(i)||/||b|| 4.915993679512e-05 > > > 128 KSP unpreconditioned resid norm 1.446838883996e+02 true resid norm 1.510097796622e+02 ||r(i)||/||b|| 4.971069491153e-05 > > > 129 KSP unpreconditioned resid norm 1.446696373784e+02 true resid norm 1.463776964101e+02 ||r(i)||/||b|| 4.818586600396e-05 > > > 130 KSP unpreconditioned resid norm 1.446690766798e+02 true resid norm 1.495018999638e+02 ||r(i)||/||b|| 
4.921431813499e-05 > > > 131 KSP unpreconditioned resid norm 1.446480744133e+02 true resid norm 1.499605592408e+02 ||r(i)||/||b|| 4.936530353102e-05 > > > 132 KSP unpreconditioned resid norm 1.446220543422e+02 true resid norm 1.498225445439e+02 ||r(i)||/||b|| 4.931987066895e-05 > > > 133 KSP unpreconditioned resid norm 1.446156526760e+02 true resid norm 1.481441673781e+02 ||r(i)||/||b|| 4.876736807329e-05 > > > 134 KSP unpreconditioned resid norm 1.446152477418e+02 true resid norm 1.501616466283e+02 ||r(i)||/||b|| 4.943149920257e-05 > > > 135 KSP unpreconditioned resid norm 1.445744489044e+02 true resid norm 1.505958339620e+02 ||r(i)||/||b|| 4.957442871432e-05 > > > 136 KSP unpreconditioned resid norm 1.445307936181e+02 true resid norm 1.502091787932e+02 ||r(i)||/||b|| 4.944714624841e-05 > > > 137 KSP unpreconditioned resid norm 1.444543817248e+02 true resid norm 1.491871661616e+02 ||r(i)||/||b|| 4.911071136162e-05 > > > 138 KSP unpreconditioned resid norm 1.444176915911e+02 true resid norm 1.478091693367e+02 ||r(i)||/||b|| 4.865709054379e-05 > > > 139 KSP unpreconditioned resid norm 1.444173719058e+02 true resid norm 1.495962731374e+02 ||r(i)||/||b|| 4.924538470600e-05 > > > 140 KSP unpreconditioned resid norm 1.444075340820e+02 true resid norm 1.515103203654e+02 ||r(i)||/||b|| 4.987546719477e-05 > > > 141 KSP unpreconditioned resid norm 1.444050342939e+02 true resid norm 1.498145746307e+02 ||r(i)||/||b|| 4.931724706454e-05 > > > 142 KSP unpreconditioned resid norm 1.443757787691e+02 true resid norm 1.492291154146e+02 ||r(i)||/||b|| 4.912452057664e-05 > > > 143 KSP unpreconditioned resid norm 1.440588930707e+02 true resid norm 1.485032724987e+02 ||r(i)||/||b|| 4.888558137795e-05 > > > 144 KSP unpreconditioned resid norm 1.438299468441e+02 true resid norm 1.506129385276e+02 ||r(i)||/||b|| 4.958005934200e-05 > > > 145 KSP unpreconditioned resid norm 1.434543079403e+02 true resid norm 1.471733741230e+02 ||r(i)||/||b|| 4.844779402032e-05 > > > 146 KSP unpreconditioned resid norm 1.433157223870e+02 true resid norm 1.481025707968e+02 ||r(i)||/||b|| 4.875367495378e-05 > > > 147 KSP unpreconditioned resid norm 1.430111913458e+02 true resid norm 1.485000481919e+02 ||r(i)||/||b|| 4.888451997299e-05 > > > 148 KSP unpreconditioned resid norm 1.430056153071e+02 true resid norm 1.496425172884e+02 ||r(i)||/||b|| 4.926060775239e-05 > > > 149 KSP unpreconditioned resid norm 1.429327762233e+02 true resid norm 1.467613264791e+02 ||r(i)||/||b|| 4.831215264157e-05 > > > 150 KSP unpreconditioned resid norm 1.424230217603e+02 true resid norm 1.460277537447e+02 ||r(i)||/||b|| 4.807066887493e-05 > > > 151 KSP unpreconditioned resid norm 1.421912821676e+02 true resid norm 1.470486188164e+02 ||r(i)||/||b|| 4.840672599809e-05 > > > 152 KSP unpreconditioned resid norm 1.420344275315e+02 true resid norm 1.481536901943e+02 ||r(i)||/||b|| 4.877050287565e-05 > > > 153 KSP unpreconditioned resid norm 1.420071178597e+02 true resid norm 1.450813684108e+02 ||r(i)||/||b|| 4.775912963085e-05 > > > 154 KSP unpreconditioned resid norm 1.419367456470e+02 true resid norm 1.472052819440e+02 ||r(i)||/||b|| 4.845829771059e-05 > > > 155 KSP unpreconditioned resid norm 1.419032748919e+02 true resid norm 1.479193155584e+02 ||r(i)||/||b|| 4.869334942209e-05 > > > 156 KSP unpreconditioned resid norm 1.418899781440e+02 true resid norm 1.478677351572e+02 ||r(i)||/||b|| 4.867636974307e-05 > > > 157 KSP unpreconditioned resid norm 1.418895621075e+02 true resid norm 1.455168237674e+02 ||r(i)||/||b|| 4.790247656128e-05 > > > 158 KSP 
unpreconditioned resid norm 1.418061469023e+02 true resid norm 1.467147028974e+02 ||r(i)||/||b|| 4.829680469093e-05 > > > 159 KSP unpreconditioned resid norm 1.417948698213e+02 true resid norm 1.478376854834e+02 ||r(i)||/||b|| 4.866647773362e-05 > > > 160 KSP unpreconditioned resid norm 1.415166832324e+02 true resid norm 1.475436433192e+02 ||r(i)||/||b|| 4.856968241116e-05 > > > 161 KSP unpreconditioned resid norm 1.414939087573e+02 true resid norm 1.468361945080e+02 ||r(i)||/||b|| 4.833679834170e-05 > > > 162 KSP unpreconditioned resid norm 1.414544622036e+02 true resid norm 1.475730757600e+02 ||r(i)||/||b|| 4.857937123456e-05 > > > 163 KSP unpreconditioned resid norm 1.413780373982e+02 true resid norm 1.463891808066e+02 ||r(i)||/||b|| 4.818964653614e-05 > > > 164 KSP unpreconditioned resid norm 1.413741853943e+02 true resid norm 1.481999741168e+02 ||r(i)||/||b|| 4.878573901436e-05 > > > 165 KSP unpreconditioned resid norm 1.413725682642e+02 true resid norm 1.458413423932e+02 ||r(i)||/||b|| 4.800930438685e-05 > > > 166 KSP unpreconditioned resid norm 1.412970845566e+02 true resid norm 1.481492296610e+02 ||r(i)||/||b|| 4.876903451901e-05 > > > 167 KSP unpreconditioned resid norm 1.410100899597e+02 true resid norm 1.468338434340e+02 ||r(i)||/||b|| 4.833602439497e-05 > > > 168 KSP unpreconditioned resid norm 1.409983320599e+02 true resid norm 1.485378957202e+02 ||r(i)||/||b|| 4.889697894709e-05 > > > 169 KSP unpreconditioned resid norm 1.407688141293e+02 true resid norm 1.461003623074e+02 ||r(i)||/||b|| 4.809457078458e-05 > > > 170 KSP unpreconditioned resid norm 1.407072771004e+02 true resid norm 1.463217409181e+02 ||r(i)||/||b|| 4.816744609502e-05 > > > 171 KSP unpreconditioned resid norm 1.407069670790e+02 true resid norm 1.464695099700e+02 ||r(i)||/||b|| 4.821608997937e-05 > > > 172 KSP unpreconditioned resid norm 1.402361094414e+02 true resid norm 1.493786053835e+02 ||r(i)||/||b|| 4.917373096721e-05 > > > 173 KSP unpreconditioned resid norm 1.400618325859e+02 true resid norm 1.465475533254e+02 ||r(i)||/||b|| 4.824178096070e-05 > > > 174 KSP unpreconditioned resid norm 1.400573078320e+02 true resid norm 1.471993735980e+02 ||r(i)||/||b|| 4.845635275056e-05 > > > 175 KSP unpreconditioned resid norm 1.400258865388e+02 true resid norm 1.479779387468e+02 ||r(i)||/||b|| 4.871264750624e-05 > > > 176 KSP unpreconditioned resid norm 1.396589283831e+02 true resid norm 1.476626943974e+02 ||r(i)||/||b|| 4.860887266654e-05 > > > 177 KSP unpreconditioned resid norm 1.395796112440e+02 true resid norm 1.443093901655e+02 ||r(i)||/||b|| 4.750500320860e-05 > > > 178 KSP unpreconditioned resid norm 1.394749154493e+02 true resid norm 1.447914005206e+02 ||r(i)||/||b|| 4.766367551289e-05 > > > 179 KSP unpreconditioned resid norm 1.394476969416e+02 true resid norm 1.455635964329e+02 ||r(i)||/||b|| 4.791787358864e-05 > > > 180 KSP unpreconditioned resid norm 1.391990722790e+02 true resid norm 1.457511594620e+02 ||r(i)||/||b|| 4.797961719582e-05 > > > 181 KSP unpreconditioned resid norm 1.391686315799e+02 true resid norm 1.460567495143e+02 ||r(i)||/||b|| 4.808021395114e-05 > > > 182 KSP unpreconditioned resid norm 1.387654475794e+02 true resid norm 1.468215388414e+02 ||r(i)||/||b|| 4.833197386362e-05 > > > 183 KSP unpreconditioned resid norm 1.384925240232e+02 true resid norm 1.456091052791e+02 ||r(i)||/||b|| 4.793285458106e-05 > > > 184 KSP unpreconditioned resid norm 1.378003249970e+02 true resid norm 1.453421051371e+02 ||r(i)||/||b|| 4.784496118351e-05 > > > 185 KSP unpreconditioned resid norm 
1.377904214978e+02 true resid norm 1.441752187090e+02 ||r(i)||/||b|| 4.746083549740e-05 > > > 186 KSP unpreconditioned resid norm 1.376670282479e+02 true resid norm 1.441674745344e+02 ||r(i)||/||b|| 4.745828620353e-05 > > > 187 KSP unpreconditioned resid norm 1.376636051755e+02 true resid norm 1.463118783906e+02 ||r(i)||/||b|| 4.816419946362e-05 > > > 188 KSP unpreconditioned resid norm 1.363148994276e+02 true resid norm 1.432997756128e+02 ||r(i)||/||b|| 4.717264962781e-05 > > > 189 KSP unpreconditioned resid norm 1.363051099558e+02 true resid norm 1.451009062639e+02 ||r(i)||/||b|| 4.776556126897e-05 > > > 190 KSP unpreconditioned resid norm 1.362538398564e+02 true resid norm 1.438957985476e+02 ||r(i)||/||b|| 4.736885357127e-05 > > > 191 KSP unpreconditioned resid norm 1.358335705250e+02 true resid norm 1.436616069458e+02 ||r(i)||/||b|| 4.729176037047e-05 > > > 192 KSP unpreconditioned resid norm 1.337424103882e+02 true resid norm 1.432816138672e+02 ||r(i)||/||b|| 4.716667098856e-05 > > > 193 KSP unpreconditioned resid norm 1.337419543121e+02 true resid norm 1.405274691954e+02 ||r(i)||/||b|| 4.626003801533e-05 > > > 194 KSP unpreconditioned resid norm 1.322568117657e+02 true resid norm 1.417123189671e+02 ||r(i)||/||b|| 4.665007702902e-05 > > > 195 KSP unpreconditioned resid norm 1.320880115122e+02 true resid norm 1.413658215058e+02 ||r(i)||/||b|| 4.653601402181e-05 > > > 196 KSP unpreconditioned resid norm 1.312526182172e+02 true resid norm 1.420574070412e+02 ||r(i)||/||b|| 4.676367608204e-05 > > > 197 KSP unpreconditioned resid norm 1.311651332692e+02 true resid norm 1.398984125128e+02 ||r(i)||/||b|| 4.605295973934e-05 > > > 198 KSP unpreconditioned resid norm 1.294482397720e+02 true resid norm 1.380390703259e+02 ||r(i)||/||b|| 4.544088552537e-05 > > > 199 KSP unpreconditioned resid norm 1.293598434732e+02 true resid norm 1.373830689903e+02 ||r(i)||/||b|| 4.522493737731e-05 > > > 200 KSP unpreconditioned resid norm 1.265165992897e+02 true resid norm 1.375015523244e+02 ||r(i)||/||b|| 4.526394073779e-05 > > > 201 KSP unpreconditioned resid norm 1.263813235463e+02 true resid norm 1.356820166419e+02 ||r(i)||/||b|| 4.466497037047e-05 > > > 202 KSP unpreconditioned resid norm 1.243190164198e+02 true resid norm 1.366420975402e+02 ||r(i)||/||b|| 4.498101803792e-05 > > > 203 KSP unpreconditioned resid norm 1.230747513665e+02 true resid norm 1.348856851681e+02 ||r(i)||/||b|| 4.440282714351e-05 > > > 204 KSP unpreconditioned resid norm 1.198014010398e+02 true resid norm 1.325188356617e+02 ||r(i)||/||b|| 4.362368731578e-05 > > > 205 KSP unpreconditioned resid norm 1.195977240348e+02 true resid norm 1.299721846860e+02 ||r(i)||/||b|| 4.278535889769e-05 > > > 206 KSP unpreconditioned resid norm 1.130620928393e+02 true resid norm 1.266961052950e+02 ||r(i)||/||b|| 4.170691097546e-05 > > > 207 KSP unpreconditioned resid norm 1.123992882530e+02 true resid norm 1.270907813369e+02 ||r(i)||/||b|| 4.183683382120e-05 > > > 208 KSP unpreconditioned resid norm 1.063236317163e+02 true resid norm 1.182163029843e+02 ||r(i)||/||b|| 3.891545689533e-05 > > > 209 KSP unpreconditioned resid norm 1.059802897214e+02 true resid norm 1.187516613498e+02 ||r(i)||/||b|| 3.909169075539e-05 > > > 210 KSP unpreconditioned resid norm 9.878733567790e+01 true resid norm 1.124812677115e+02 ||r(i)||/||b|| 3.702754877846e-05 > > > 211 KSP unpreconditioned resid norm 9.861048081032e+01 true resid norm 1.117192174341e+02 ||r(i)||/||b|| 3.677669052986e-05 > > > 212 KSP unpreconditioned resid norm 9.169383217455e+01 true resid norm 
1.102172324977e+02 ||r(i)||/||b|| 3.628225424167e-05 > > > 213 KSP unpreconditioned resid norm 9.146164223196e+01 true resid norm 1.121134424773e+02 ||r(i)||/||b|| 3.690646491198e-05 > > > 214 KSP unpreconditioned resid norm 8.692213412954e+01 true resid norm 1.056264039532e+02 ||r(i)||/||b|| 3.477100591276e-05 > > > 215 KSP unpreconditioned resid norm 8.685846611574e+01 true resid norm 1.029018845366e+02 ||r(i)||/||b|| 3.387412523521e-05 > > > 216 KSP unpreconditioned resid norm 7.808516472373e+01 true resid norm 9.749023000535e+01 ||r(i)||/||b|| 3.209267036539e-05 > > > 217 KSP unpreconditioned resid norm 7.786400257086e+01 true resid norm 1.004515546585e+02 ||r(i)||/||b|| 3.306750462244e-05 > > > 218 KSP unpreconditioned resid norm 6.646475864029e+01 true resid norm 9.429020541969e+01 ||r(i)||/||b|| 3.103925881653e-05 > > > 219 KSP unpreconditioned resid norm 6.643821996375e+01 true resid norm 8.864525788550e+01 ||r(i)||/||b|| 2.918100655438e-05 > > > 220 KSP unpreconditioned resid norm 5.625046780791e+01 true resid norm 8.410041684883e+01 ||r(i)||/||b|| 2.768489678784e-05 > > > 221 KSP unpreconditioned resid norm 5.623343238032e+01 true resid norm 8.815552919640e+01 ||r(i)||/||b|| 2.901979346270e-05 > > > 222 KSP unpreconditioned resid norm 4.491016868776e+01 true resid norm 8.557052117768e+01 ||r(i)||/||b|| 2.816883834410e-05 > > > 223 KSP unpreconditioned resid norm 4.461976108543e+01 true resid norm 7.867894425332e+01 ||r(i)||/||b|| 2.590020992340e-05 > > > 224 KSP unpreconditioned resid norm 3.535718264955e+01 true resid norm 7.609346753983e+01 ||r(i)||/||b|| 2.504910051583e-05 > > > 225 KSP unpreconditioned resid norm 3.525592897743e+01 true resid norm 7.926812413349e+01 ||r(i)||/||b|| 2.609416121143e-05 > > > 226 KSP unpreconditioned resid norm 2.633469451114e+01 true resid norm 7.883483297310e+01 ||r(i)||/||b|| 2.595152670968e-05 > > > 227 KSP unpreconditioned resid norm 2.614440577316e+01 true resid norm 7.398963634249e+01 ||r(i)||/||b|| 2.435654331172e-05 > > > 228 KSP unpreconditioned resid norm 1.988460252721e+01 true resid norm 7.147825835126e+01 ||r(i)||/||b|| 2.352982635730e-05 > > > 229 KSP unpreconditioned resid norm 1.975927240058e+01 true resid norm 7.488507147714e+01 ||r(i)||/||b|| 2.465131033205e-05 > > > 230 KSP unpreconditioned resid norm 1.505732242656e+01 true resid norm 7.888901529160e+01 ||r(i)||/||b|| 2.596936291016e-05 > > > 231 KSP unpreconditioned resid norm 1.504120870628e+01 true resid norm 7.126366562975e+01 ||r(i)||/||b|| 2.345918488406e-05 > > > 232 KSP unpreconditioned resid norm 1.163470506257e+01 true resid norm 7.142763663542e+01 ||r(i)||/||b|| 2.351316226655e-05 > > > 233 KSP unpreconditioned resid norm 1.157114340949e+01 true resid norm 7.464790352976e+01 ||r(i)||/||b|| 2.457323735226e-05 > > > 234 KSP unpreconditioned resid norm 8.702850618357e+00 true resid norm 7.798031063059e+01 ||r(i)||/||b|| 2.567022771329e-05 > > > 235 KSP unpreconditioned resid norm 8.702017371082e+00 true resid norm 7.032943782131e+01 ||r(i)||/||b|| 2.315164775854e-05 > > > 236 KSP unpreconditioned resid norm 6.422855779486e+00 true resid norm 6.800345168870e+01 ||r(i)||/||b|| 2.238595968678e-05 > > > 237 KSP unpreconditioned resid norm 6.413921210094e+00 true resid norm 7.408432731879e+01 ||r(i)||/||b|| 2.438771449973e-05 > > > 238 KSP unpreconditioned resid norm 4.949111361190e+00 true resid norm 7.744087979524e+01 ||r(i)||/||b|| 2.549265324267e-05 > > > 239 KSP unpreconditioned resid norm 4.947369357666e+00 true resid norm 7.104259266677e+01 ||r(i)||/||b|| 
2.338641018933e-05 > > > 240 KSP unpreconditioned resid norm 3.873645232239e+00 true resid norm 6.908028336929e+01 ||r(i)||/||b|| 2.274044037845e-05 > > > 241 KSP unpreconditioned resid norm 3.841473653930e+00 true resid norm 7.431718972562e+01 ||r(i)||/||b|| 2.446437014474e-05 > > > 242 KSP unpreconditioned resid norm 3.057267436362e+00 true resid norm 7.685939322732e+01 ||r(i)||/||b|| 2.530123450517e-05 > > > 243 KSP unpreconditioned resid norm 2.980906717815e+00 true resid norm 6.975661521135e+01 ||r(i)||/||b|| 2.296308109705e-05 > > > 244 KSP unpreconditioned resid norm 2.415633545154e+00 true resid norm 6.989644258184e+01 ||r(i)||/||b|| 2.300911067057e-05 > > > 245 KSP unpreconditioned resid norm 2.363923146996e+00 true resid norm 7.486631867276e+01 ||r(i)||/||b|| 2.464513712301e-05 > > > 246 KSP unpreconditioned resid norm 1.947823635306e+00 true resid norm 7.671103669547e+01 ||r(i)||/||b|| 2.525239722914e-05 > > > 247 KSP unpreconditioned resid norm 1.942156637334e+00 true resid norm 6.835715877902e+01 ||r(i)||/||b|| 2.250239602152e-05 > > > 248 KSP unpreconditioned resid norm 1.675749569790e+00 true resid norm 7.111781390782e+01 ||r(i)||/||b|| 2.341117216285e-05 > > > 249 KSP unpreconditioned resid norm 1.673819729570e+00 true resid norm 7.552508026111e+01 ||r(i)||/||b|| 2.486199391474e-05 > > > 250 KSP unpreconditioned resid norm 1.453311843294e+00 true resid norm 7.639099426865e+01 ||r(i)||/||b|| 2.514704291716e-05 > > > 251 KSP unpreconditioned resid norm 1.452846325098e+00 true resid norm 6.951401359923e+01 ||r(i)||/||b|| 2.288321941689e-05 > > > 252 KSP unpreconditioned resid norm 1.335008887441e+00 true resid norm 6.912230871414e+01 ||r(i)||/||b|| 2.275427464204e-05 > > > 253 KSP unpreconditioned resid norm 1.334477013356e+00 true resid norm 7.412281497148e+01 ||r(i)||/||b|| 2.440038419546e-05 > > > 254 KSP unpreconditioned resid norm 1.248507835050e+00 true resid norm 7.801932499175e+01 ||r(i)||/||b|| 2.568307079543e-05 > > > 255 KSP unpreconditioned resid norm 1.248246596771e+00 true resid norm 7.094899926215e+01 ||r(i)||/||b|| 2.335560030938e-05 > > > 256 KSP unpreconditioned resid norm 1.208952722414e+00 true resid norm 7.101235824005e+01 ||r(i)||/||b|| 2.337645736134e-05 > > > 257 KSP unpreconditioned resid norm 1.208780664971e+00 true resid norm 7.562936418444e+01 ||r(i)||/||b|| 2.489632299136e-05 > > > 258 KSP unpreconditioned resid norm 1.179956701653e+00 true resid norm 7.812300941072e+01 ||r(i)||/||b|| 2.571720252207e-05 > > > 259 KSP unpreconditioned resid norm 1.179219541297e+00 true resid norm 7.131201918549e+01 ||r(i)||/||b|| 2.347510232240e-05 > > > 260 KSP unpreconditioned resid norm 1.160215487467e+00 true resid norm 7.222079766175e+01 ||r(i)||/||b|| 2.377426181841e-05 > > > 261 KSP unpreconditioned resid norm 1.159115040554e+00 true resid norm 7.481372509179e+01 ||r(i)||/||b|| 2.462782391678e-05 > > > 262 KSP unpreconditioned resid norm 1.151973184765e+00 true resid norm 7.709040836137e+01 ||r(i)||/||b|| 2.537728204907e-05 > > > 263 KSP unpreconditioned resid norm 1.150882463576e+00 true resid norm 7.032588895526e+01 ||r(i)||/||b|| 2.315047951236e-05 > > > 264 KSP unpreconditioned resid norm 1.137617003277e+00 true resid norm 7.004055871264e+01 ||r(i)||/||b|| 2.305655205500e-05 > > > 265 KSP unpreconditioned resid norm 1.137134003401e+00 true resid norm 7.610459827221e+01 ||r(i)||/||b|| 2.505276462582e-05 > > > 266 KSP unpreconditioned resid norm 1.131425778253e+00 true resid norm 7.852741072990e+01 ||r(i)||/||b|| 2.585032681802e-05 > > > 267 KSP 
unpreconditioned resid norm 1.131176695314e+00 true resid norm 7.064571495865e+01 ||r(i)||/||b|| 2.325576258022e-05 > > > 268 KSP unpreconditioned resid norm 1.125420065063e+00 true resid norm 7.138837220124e+01 ||r(i)||/||b|| 2.350023686323e-05 > > > 269 KSP unpreconditioned resid norm 1.124779989266e+00 true resid norm 7.585594020759e+01 ||r(i)||/||b|| 2.497090923065e-05 > > > 270 KSP unpreconditioned resid norm 1.119805446125e+00 true resid norm 7.703631305135e+01 ||r(i)||/||b|| 2.535947449079e-05 > > > 271 KSP unpreconditioned resid norm 1.119024433863e+00 true resid norm 7.081439585094e+01 ||r(i)||/||b|| 2.331129040360e-05 > > > 272 KSP unpreconditioned resid norm 1.115694452861e+00 true resid norm 7.134872343512e+01 ||r(i)||/||b|| 2.348718494222e-05 > > > 273 KSP unpreconditioned resid norm 1.113572716158e+00 true resid norm 7.600475566242e+01 ||r(i)||/||b|| 2.501989757889e-05 > > > 274 KSP unpreconditioned resid norm 1.108711406381e+00 true resid norm 7.738835220359e+01 ||r(i)||/||b|| 2.547536175937e-05 > > > 275 KSP unpreconditioned resid norm 1.107890435549e+00 true resid norm 7.093429729336e+01 ||r(i)||/||b|| 2.335076058915e-05 > > > 276 KSP unpreconditioned resid norm 1.103340227961e+00 true resid norm 7.145267197866e+01 ||r(i)||/||b|| 2.352140361564e-05 > > > 277 KSP unpreconditioned resid norm 1.102897652964e+00 true resid norm 7.448617654625e+01 ||r(i)||/||b|| 2.451999867624e-05 > > > 278 KSP unpreconditioned resid norm 1.102576754158e+00 true resid norm 7.707165090465e+01 ||r(i)||/||b|| 2.537110730854e-05 > > > 279 KSP unpreconditioned resid norm 1.102564028537e+00 true resid norm 7.009637628868e+01 ||r(i)||/||b|| 2.307492656359e-05 > > > 280 KSP unpreconditioned resid norm 1.100828424712e+00 true resid norm 7.059832880916e+01 ||r(i)||/||b|| 2.324016360096e-05 > > > 281 KSP unpreconditioned resid norm 1.100686341559e+00 true resid norm 7.460867988528e+01 ||r(i)||/||b|| 2.456032537644e-05 > > > 282 KSP unpreconditioned resid norm 1.099417185996e+00 true resid norm 7.763784632467e+01 ||r(i)||/||b|| 2.555749237477e-05 > > > 283 KSP unpreconditioned resid norm 1.099379061087e+00 true resid norm 7.017139420999e+01 ||r(i)||/||b|| 2.309962160657e-05 > > > 284 KSP unpreconditioned resid norm 1.097928047676e+00 true resid norm 6.983706716123e+01 ||r(i)||/||b|| 2.298956496018e-05 > > > 285 KSP unpreconditioned resid norm 1.096490152934e+00 true resid norm 7.414445779601e+01 ||r(i)||/||b|| 2.440750876614e-05 > > > 286 KSP unpreconditioned resid norm 1.094691490227e+00 true resid norm 7.634526287231e+01 ||r(i)||/||b|| 2.513198866374e-05 > > > 287 KSP unpreconditioned resid norm 1.093560358328e+00 true resid norm 7.003716824146e+01 ||r(i)||/||b|| 2.305543595061e-05 > > > 288 KSP unpreconditioned resid norm 1.093357856424e+00 true resid norm 6.964715939684e+01 ||r(i)||/||b|| 2.292704949292e-05 > > > 289 KSP unpreconditioned resid norm 1.091881434739e+00 true resid norm 7.429955169250e+01 ||r(i)||/||b|| 2.445856390566e-05 > > > 290 KSP unpreconditioned resid norm 1.091817808496e+00 true resid norm 7.607892786798e+01 ||r(i)||/||b|| 2.504431422190e-05 > > > 291 KSP unpreconditioned resid norm 1.090295101202e+00 true resid norm 6.942248339413e+01 ||r(i)||/||b|| 2.285308871866e-05 > > > 292 KSP unpreconditioned resid norm 1.089995012773e+00 true resid norm 6.995557798353e+01 ||r(i)||/||b|| 2.302857736947e-05 > > > 293 KSP unpreconditioned resid norm 1.089975910578e+00 true resid norm 7.453210925277e+01 ||r(i)||/||b|| 2.453511919866e-05 > > > 294 KSP unpreconditioned resid norm 
1.085570944646e+00 true resid norm 7.629598425927e+01 ||r(i)||/||b|| 2.511576670710e-05 > > > 295 KSP unpreconditioned resid norm 1.085363565621e+00 true resid norm 7.025539955712e+01 ||r(i)||/||b|| 2.312727520749e-05 > > > 296 KSP unpreconditioned resid norm 1.083348574106e+00 true resid norm 7.003219621882e+01 ||r(i)||/||b|| 2.305379921754e-05 > > > 297 KSP unpreconditioned resid norm 1.082180374430e+00 true resid norm 7.473048827106e+01 ||r(i)||/||b|| 2.460042330597e-05 > > > 298 KSP unpreconditioned resid norm 1.081326671068e+00 true resid norm 7.660142838935e+01 ||r(i)||/||b|| 2.521631542651e-05 > > > 299 KSP unpreconditioned resid norm 1.078679751898e+00 true resid norm 7.077868424247e+01 ||r(i)||/||b|| 2.329953454992e-05 > > > 300 KSP unpreconditioned resid norm 1.078656949888e+00 true resid norm 7.074960394994e+01 ||r(i)||/||b|| 2.328996164972e-05 > > > Linear solve did not converge due to DIVERGED_ITS iterations 300 > > > KSP Object: 2 MPI processes > > > type: fgmres > > > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > > > GMRES: happy breakdown tolerance 1e-30 > > > maximum iterations=300, initial guess is zero > > > tolerances: relative=1e-09, absolute=1e-20, divergence=10000 > > > right preconditioning > > > using UNPRECONDITIONED norm type for convergence test > > > PC Object: 2 MPI processes > > > type: fieldsplit > > > FieldSplit with Schur preconditioner, factorization DIAG > > > Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse > > > Split info: > > > Split number 0 Defined by IS > > > Split number 1 Defined by IS > > > KSP solver for A00 block > > > KSP Object: (fieldsplit_u_) 2 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_u_) 2 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326 > > > package used to perform factorization: mumps > > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > > total number of mallocs used during MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): 1 > > > ICNTL(13) (efficiency control): 0 > > > ICNTL(14) (percentage of estimated workspace increase): 20 > > > ICNTL(18) (input mat struct): 3 > > > ICNTL(19) (Shur complement info): 0 > > > ICNTL(20) (rhs sparse pattern): 0 > > > ICNTL(21) (solution struct): 1 > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): 0 > > > ICNTL(25) (computation of a null space basis): 0 > > > ICNTL(26) (Schur options for rhs or solution): 0 
> > > ICNTL(27) (experimental parameter): -24 > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > ICNTL(29) (parallel ordering): 0 > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > ICNTL(33) (compute determinant): 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0 > > > CNTL(4) (value of static pivoting): -1 > > > CNTL(5) (fixation for null pivots): 0 > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > [0] 4.2839e+08 > > > [1] 3.799e+08 > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > [0] 90423 > > > [1] 93903 > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > > INFOG(6) (number of nodes in the complete tree): 2812 > > > INFOG(7) (ordering option effectively use after analysis): 5 > > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > INFOG(14) (number of memory compress after factorization): 0 > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > > INFOG(20) (estimated number of entries in the factors): 403041366 > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > 
INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > > INFOG(32) (after analysis: type of analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_u_) 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326, bs=3 > > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > KSP solver for S = A11 - A10 inv(A00) A01 > > > KSP Object: (fieldsplit_lu_) 2 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_lu_) 2 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583 > > > package used to perform factorization: mumps > > > total: nonzeros=2.17621e+06, allocated nonzeros=2.17621e+06 > > > total number of mallocs used during MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): 1 > > > ICNTL(13) (efficiency control): 0 > > > ICNTL(14) (percentage of estimated workspace increase): 20 > > > ICNTL(18) (input mat struct): 3 > > > ICNTL(19) (Shur complement info): 0 > > > ICNTL(20) (rhs sparse pattern): 0 > > > ICNTL(21) (solution struct): 1 > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): 0 > > > ICNTL(25) (computation of a null space basis): 0 > > > ICNTL(26) (Schur options for rhs or solution): 0 > > > ICNTL(27) (experimental parameter): -24 > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > ICNTL(29) (parallel ordering): 0 > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > ICNTL(33) (compute determinant): 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0 > > > CNTL(4) (value of static pivoting): -1 > > > CNTL(5) (fixation for null pivots): 0 > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > [0] 5.12794e+08 > > > [1] 
5.02142e+08 > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > [0] 815031 > > > [1] 745263 > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > [0] 5.12794e+08 > > > [1] 5.02142e+08 > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): > > > [0] 34 > > > [1] 34 > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > [0] 34 > > > [1] 34 > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > [0] 1158 > > > [1] 1425 > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.01494e+09 > > > RINFOG(2) (global estimated flops for the assembly after factorization): 1.56029e+06 > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.01494e+09 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 2176209 > > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 14427 > > > INFOG(5) (estimated maximum front size in the complete tree): 699 > > > INFOG(6) (number of nodes in the complete tree): 15 > > > INFOG(7) (ordering option effectively use after analysis): 2 > > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 2176209 > > > INFOG(10) (total integer space store the matrix factors after factorization): 14427 > > > INFOG(11) (order of largest frontal matrix after factorization): 699 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > INFOG(14) (number of memory compress after factorization): 0 > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 34 > > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 68 > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 34 > > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 68 > > > INFOG(20) (estimated number of entries in the factors): 2176209 > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 30 > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 59 > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 2176209 > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 16, 32 > > > INFOG(32) (after analysis: type of analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > linear system matrix followed by preconditioner matrix: > > > Mat Object: 
(fieldsplit_lu_) 2 MPI processes > > > type: schurcomplement > > > rows=2583, cols=2583 > > > Schur complement A11 - A10 inv(A00) A01 > > > A11 > > > Mat Object: (fieldsplit_lu_) 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583, bs=3 > > > total: nonzeros=117369, allocated nonzeros=117369 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node (on process 0) routines > > > A10 > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=184326, rbs=3, cbs = 1 > > > total: nonzeros=292770, allocated nonzeros=292770 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node (on process 0) routines > > > KSP of A00 > > > KSP Object: (fieldsplit_u_) 2 MPI processes > > > type: preonly > > > maximum iterations=10000, initial guess is zero > > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > > left preconditioning > > > using NONE norm type for convergence test > > > PC Object: (fieldsplit_u_) 2 MPI processes > > > type: lu > > > LU: out-of-place factorization > > > tolerance for zero pivot 2.22045e-14 > > > matrix ordering: natural > > > factor fill ratio given 0, needed 0 > > > Factored matrix follows: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326 > > > package used to perform factorization: mumps > > > total: nonzeros=4.03041e+08, allocated nonzeros=4.03041e+08 > > > total number of mallocs used during MatSetValues calls =0 > > > MUMPS run parameters: > > > SYM (matrix type): 0 > > > PAR (host participation): 1 > > > ICNTL(1) (output for error): 6 > > > ICNTL(2) (output of diagnostic msg): 0 > > > ICNTL(3) (output for global info): 0 > > > ICNTL(4) (level of printing): 0 > > > ICNTL(5) (input mat struct): 0 > > > ICNTL(6) (matrix prescaling): 7 > > > ICNTL(7) (sequentia matrix ordering):7 > > > ICNTL(8) (scalling strategy): 77 > > > ICNTL(10) (max num of refinements): 0 > > > ICNTL(11) (error analysis): 0 > > > ICNTL(12) (efficiency control): 1 > > > ICNTL(13) (efficiency control): 0 > > > ICNTL(14) (percentage of estimated workspace increase): 20 > > > ICNTL(18) (input mat struct): 3 > > > ICNTL(19) (Shur complement info): 0 > > > ICNTL(20) (rhs sparse pattern): 0 > > > ICNTL(21) (solution struct): 1 > > > ICNTL(22) (in-core/out-of-core facility): 0 > > > ICNTL(23) (max size of memory can be allocated locally):0 > > > ICNTL(24) (detection of null pivot rows): 0 > > > ICNTL(25) (computation of a null space basis): 0 > > > ICNTL(26) (Schur options for rhs or solution): 0 > > > ICNTL(27) (experimental parameter): -24 > > > ICNTL(28) (use parallel or sequential ordering): 1 > > > ICNTL(29) (parallel ordering): 0 > > > ICNTL(30) (user-specified set of entries in inv(A)): 0 > > > ICNTL(31) (factors is discarded in the solve phase): 0 > > > ICNTL(33) (compute determinant): 0 > > > CNTL(1) (relative pivoting threshold): 0.01 > > > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > > > CNTL(3) (absolute pivoting threshold): 0 > > > CNTL(4) (value of static pivoting): -1 > > > CNTL(5) (fixation for null pivots): 0 > > > RINFO(1) (local estimated flops for the elimination after analysis): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > RINFO(2) (local estimated flops for the assembly after factorization): > > > [0] 4.2839e+08 > > > [1] 3.799e+08 > > > RINFO(3) (local estimated flops for the elimination after factorization): > > > [0] 5.59214e+11 > > > [1] 5.35237e+11 > > > INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical 
factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): > > > [0] 2621 > > > [1] 2649 > > > INFO(23) (num of pivots eliminated on this processor after factorization): > > > [0] 90423 > > > [1] 93903 > > > RINFOG(1) (global estimated flops for the elimination after analysis): 1.09445e+12 > > > RINFOG(2) (global estimated flops for the assembly after factorization): 8.0829e+08 > > > RINFOG(3) (global estimated flops for the elimination after factorization): 1.09445e+12 > > > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0) > > > INFOG(3) (estimated real workspace for factors on all processors after analysis): 403041366 > > > INFOG(4) (estimated integer workspace for factors on all processors after analysis): 2265748 > > > INFOG(5) (estimated maximum front size in the complete tree): 6663 > > > INFOG(6) (number of nodes in the complete tree): 2812 > > > INFOG(7) (ordering option effectively use after analysis): 5 > > > INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 > > > INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 403041366 > > > INFOG(10) (total integer space store the matrix factors after factorization): 2265766 > > > INFOG(11) (order of largest frontal matrix after factorization): 6663 > > > INFOG(12) (number of off-diagonal pivots): 0 > > > INFOG(13) (number of delayed pivots after factorization): 0 > > > INFOG(14) (number of memory compress after factorization): 0 > > > INFOG(15) (number of steps of iterative refinement after solution): 0 > > > INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2649 > > > INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 5270 > > > INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2649 > > > INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 5270 > > > INFOG(20) (estimated number of entries in the factors): 403041366 > > > INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2121 > > > INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 4174 > > > INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 > > > INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 > > > INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 > > > INFOG(28) (after factorization: number of null pivots encountered): 0 > > > INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 403041366 > > > INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2467, 4922 > > > INFOG(32) (after analysis: type of analysis done): 1 > > > INFOG(33) (value used for ICNTL(8)): 7 > > > INFOG(34) (exponent of the determinant if determinant is requested): 0 > > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_u_) 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=184326, bs=3 > > > total: nonzeros=3.32649e+07, allocated nonzeros=3.32649e+07 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > A01 
> > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=184326, cols=2583, rbs=3, cbs = 1 > > > total: nonzeros=292770, allocated nonzeros=292770 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 16098 nodes, limit used is 5 > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=2583, cols=2583, rbs=3, cbs = 1 > > > total: nonzeros=1.25158e+06, allocated nonzeros=1.25158e+06 > > > total number of mallocs used during MatSetValues calls =0 > > > not using I-node (on process 0) routines > > > linear system matrix = precond matrix: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > rows=186909, cols=186909 > > > total: nonzeros=3.39678e+07, allocated nonzeros=3.39678e+07 > > > total number of mallocs used during MatSetValues calls =0 > > > using I-node (on process 0) routines: found 26829 nodes, limit used is 5 > > > KSPSolve completed > > > > > > > > > Giang > > > > > > On Sun, Apr 17, 2016 at 1:15 AM, Matthew Knepley wrote: > > > On Sat, Apr 16, 2016 at 6:54 PM, Hoang Giang Bui wrote: > > > Hello > > > > > > I'm solving an indefinite problem arising from mesh tying/contact using Lagrange multiplier, the matrix has the form > > > > > > K = [A P^T > > > P 0] > > > > > > I used the FIELDSPLIT preconditioner with one field is the main variable (displacement) and the other field for dual variable (Lagrange multiplier). The block size for each field is 3. According to the manual, I first chose the preconditioner based on Schur complement to treat this problem. > > > > > > > > > For any solver question, please send us the output of > > > > > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > > > > > > > > However, I will comment below > > > > > > The parameters used for the solve is > > > -ksp_type gmres > > > > > > You need 'fgmres' here with the options you have below. > > > > > > -ksp_max_it 300 > > > -ksp_gmres_restart 300 > > > -ksp_gmres_modifiedgramschmidt > > > -pc_fieldsplit_type schur > > > -pc_fieldsplit_schur_fact_type diag > > > -pc_fieldsplit_schur_precondition selfp > > > > > > > > > > > > It could be taking time in the MatMatMult() here if that matrix is dense. Is there any reason to > > > believe that is a good preconditioner for your problem? > > > > > > > > > -pc_fieldsplit_detect_saddle_point > > > -fieldsplit_u_pc_type hypre > > > > > > I would just use MUMPS here to start, especially if it works on the whole problem. Same with the one below. > > > > > > Matt > > > > > > -fieldsplit_u_pc_hypre_type boomeramg > > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > > -fieldsplit_lu_pc_type hypre > > > -fieldsplit_lu_pc_hypre_type boomeramg > > > -fieldsplit_lu_pc_hypre_boomeramg_coarsen_type PMIS > > > > > > For the test case, a small problem is solved on 2 processes. Due to the decomposition, the contact only happens in 1 proc, so the size of Lagrange multiplier dofs on proc 0 is 0. > > > > > > 0: mIndexU.size(): 80490 > > > 0: mIndexLU.size(): 0 > > > 1: mIndexU.size(): 103836 > > > 1: mIndexLU.size(): 2583 > > > > > > However, with this setup the solver takes very long at KSPSolve before going to iteration, and the first iteration seems forever so I have to stop the calculation. I guessed that the solver takes time to compute the Schur complement, but according to the manual only the diagonal of A is used to approximate the Schur complement, so it should not take long to compute this. 
> > > > > > Note that I ran the same problem with direct solver (MUMPS) and it's able to produce the valid results. The parameter for the solve is pretty standard > > > -ksp_type preonly > > > -pc_type lu > > > -pc_factor_mat_solver_package mumps > > > > > > Hence the matrix/rhs must not have any problem here. Do you have any idea or suggestion for this case? > > > > > > > > > Giang > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > From cyrill.von.planta at usi.ch Mon Sep 19 04:55:02 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Mon, 19 Sep 2016 09:55:02 +0000 Subject: [petsc-users] Example for MatInvertBlockDiagonal Message-ID: Dear PETSc-Users, I would like to use the inverted block diagonals of a a matrix. I have seen the function MatInvertBlockDiagonal() but I don?t know how to create a matrix out of them or an array of block matrizes. Does anyone have an example on how to use **values to create a PETSc matrix? Thanks Cyrill From knepley at gmail.com Mon Sep 19 06:21:37 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 Sep 2016 06:21:37 -0500 Subject: [petsc-users] Question about PETScSF usage in DMNetwork/DMPlex In-Reply-To: References: Message-ID: On Fri, Sep 16, 2016 at 12:54 PM, Adrian Maldonado wrote: > Just one addition about one thing I've noticed. > > The section: > > PetscSection Object: 2 MPI processes > type not yet set > Process 0: > ( 0) dim 1 offset 0 > ( 1) dim 1 offset 1 > ( 2) dim 1 offset 2 > ( 3) dim 1 offset 3 > ( 4) dim -2 offset -8 > ( 5) dim -2 offset -9 > ( 6) dim -2 offset -10 > Process 1: > ( 0) dim 1 offset 4 > ( 1) dim 1 offset 5 > ( 2) dim 1 offset 6 > ( 3) dim 1 offset 7 > ( 4) dim 1 offset 8 > ( 5) dim 1 offset 9 > > For the ghost values 4, 5, 6... is encoding the ghost values as rank = > -(-2 + 1) and offset = -(-8 + 1) ? > Yes, the encoding is -(val + 1) Notice that this is reversible. Say my value is v, then my ghost value is -(v + 1) and I can get my original value using -(-(v+1) + 1) = -(-v - 1 + 1) = v Matt > On Fri, Sep 16, 2016 at 11:36 AM, Adrian Maldonado > wrote: > >> Hi, >> >> I am trying to understand some of the data structures DMPlex/DMNetwork >> creates and the relationship among them. >> >> As an example, I have an small test circuit (/src/ksp/ksp/examples/tutoria >> ls/network/ex1.c). >> >> This is a graph that consists on 6 edges and 4 vertices, each one of >> those having one degree of freedom. When ran with two processors, each >> rank will own 3 edges. Rank 0 will own one vertex (3 ghost) and Rank 1 will >> own 3 vertices. >> >> These are some data structures for this problem. 
I am getting these data >> structures inside DMNetworkDistribute >> >> >> DM Object: Parallel Mesh 2 MPI processes >> type: plex >> Parallel Mesh in 1 dimensions: >> 0-cells: 4 3 >> 1-cells: 3 3 >> Labels: >> depth: 2 strata of sizes (4, 3) >> >> This, as I understand, is printing a tree with all the vertices and >> edges in each processor (owned and ghost). >> >> PetscSection Object: 2 MPI processes >> type not yet set >> Process 0: >> ( 0) dim 1 offset 0 >> ( 1) dim 1 offset 1 >> ( 2) dim 1 offset 2 >> ( 3) dim 1 offset 3 >> ( 4) dim -2 offset -8 >> ( 5) dim -2 offset -9 >> ( 6) dim -2 offset -10 >> Process 1: >> ( 0) dim 1 offset 4 >> ( 1) dim 1 offset 5 >> ( 2) dim 1 offset 6 >> ( 3) dim 1 offset 7 >> ( 4) dim 1 offset 8 >> ( 5) dim 1 offset 9 >> >> This is a global PETSc section that gives me the global numbering for the >> owned points and (garbage?) negative values for ghost. >> >> Until here everything is good. But then I print the PetscSF that is >> created by 'DMPlexDistribute'. This I do not understand: >> >> PetscSF Object: Migration SF 2 MPI processes >> type: basic >> sort=rank-order >> [0] Number of roots=10, leaves=7, remote ranks=1 >> [0] 0 <- (0,0) >> [0] 1 <- (0,1) >> [0] 2 <- (0,3) >> [0] 3 <- (0,6) >> [0] 4 <- (0,7) >> [0] 5 <- (0,8) >> [0] 6 <- (0,9) >> [1] Number of roots=0, leaves=6, remote ranks=1 >> [1] 0 <- (0,2) >> [1] 1 <- (0,4) >> [1] 2 <- (0,5) >> [1] 3 <- (0,7) >> [1] 4 <- (0,8) >> [1] 5 <- (0,9) >> [0] Roots referenced by my leaves, by rank >> [0] 0: 7 edges >> [0] 0 <- 0 >> [0] 1 <- 1 >> [0] 2 <- 3 >> [0] 3 <- 6 >> [0] 4 <- 7 >> [0] 5 <- 8 >> [0] 6 <- 9 >> [1] Roots referenced by my leaves, by rank >> [1] 0: 6 edges >> [1] 0 <- 2 >> [1] 1 <- 4 >> [1] 2 <- 5 >> [1] 3 <- 7 >> [1] 4 <- 8 >> [1] 5 <- 9 >> >> I understand that SF is a data structure that saves references to pieces >> of data that are now owned by the process (https://arxiv.org/pdf/1506.06 >> 194v1.pdf, page 4). >> >> Since the only ghost nodes appear in rank 0 (three ghost vertices) I >> would expect something like: >> *rank 0:* >> 4 - (1, 3) (to read: point 4 is owned by rank 1 and is rank's 1 point >> 3) >> etc... >> *rank 1:* >> nothing >> >> Is my intuition correct? If so, what does the star forest that I get from >> DMPlexDistribute mean? I am printing the wrong thing? >> >> Thank you >> >> -- >> D. Adrian Maldonado, PhD Candidate >> Electrical & Computer Engineering Dept. >> Illinois Institute of Technology >> 3301 S. Dearborn Street, Chicago, IL 60616 >> > > > > -- > D. Adrian Maldonado, PhD Candidate > Electrical & Computer Engineering Dept. > Illinois Institute of Technology > 3301 S. Dearborn Street, Chicago, IL 60616 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Sep 19 06:25:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 Sep 2016 06:25:32 -0500 Subject: [petsc-users] Question about PETScSF usage in DMNetwork/DMPlex In-Reply-To: References: Message-ID: On Fri, Sep 16, 2016 at 11:36 AM, Adrian Maldonado wrote: > Hi, > > I am trying to understand some of the data structures DMPlex/DMNetwork > creates and the relationship among them. > > As an example, I have an small test circuit (/src/ksp/ksp/examples/ > tutorials/network/ex1.c). 
> > This is a graph that consists on 6 edges and 4 vertices, each one of those > having one degree of freedom. When ran with two processors, each rank will > own 3 edges. Rank 0 will own one vertex (3 ghost) and Rank 1 will own 3 > vertices. > > These are some data structures for this problem. I am getting these data > structures inside DMNetworkDistribute > > > DM Object: Parallel Mesh 2 MPI processes > type: plex > Parallel Mesh in 1 dimensions: > 0-cells: 4 3 > 1-cells: 3 3 > Labels: > depth: 2 strata of sizes (4, 3) > > This, as I understand, is printing a tree with all the vertices and edges > in each processor (owned and ghost). > > PetscSection Object: 2 MPI processes > type not yet set > Process 0: > ( 0) dim 1 offset 0 > ( 1) dim 1 offset 1 > ( 2) dim 1 offset 2 > ( 3) dim 1 offset 3 > ( 4) dim -2 offset -8 > ( 5) dim -2 offset -9 > ( 6) dim -2 offset -10 > Process 1: > ( 0) dim 1 offset 4 > ( 1) dim 1 offset 5 > ( 2) dim 1 offset 6 > ( 3) dim 1 offset 7 > ( 4) dim 1 offset 8 > ( 5) dim 1 offset 9 > > This is a global PETSc section that gives me the global numbering for the > owned points and (garbage?) negative values for ghost. > > Until here everything is good. But then I print the PetscSF that is > created by 'DMPlexDistribute'. This I do not understand: > 1) You are looking at the MigrationSF, not the eventual PointSF or OffsetSF from the DM. You need DMGetDefaultSF or DMGetPointSF for those. 2) Notice that edges 0,1,3 are sent to proc 0, and edges 2,4,5 are sent to proc 1. Matt > PetscSF Object: Migration SF 2 MPI processes > type: basic > sort=rank-order > [0] Number of roots=10, leaves=7, remote ranks=1 > [0] 0 <- (0,0) > [0] 1 <- (0,1) > [0] 2 <- (0,3) > [0] 3 <- (0,6) > [0] 4 <- (0,7) > [0] 5 <- (0,8) > [0] 6 <- (0,9) > [1] Number of roots=0, leaves=6, remote ranks=1 > [1] 0 <- (0,2) > [1] 1 <- (0,4) > [1] 2 <- (0,5) > [1] 3 <- (0,7) > [1] 4 <- (0,8) > [1] 5 <- (0,9) > [0] Roots referenced by my leaves, by rank > [0] 0: 7 edges > [0] 0 <- 0 > [0] 1 <- 1 > [0] 2 <- 3 > [0] 3 <- 6 > [0] 4 <- 7 > [0] 5 <- 8 > [0] 6 <- 9 > [1] Roots referenced by my leaves, by rank > [1] 0: 6 edges > [1] 0 <- 2 > [1] 1 <- 4 > [1] 2 <- 5 > [1] 3 <- 7 > [1] 4 <- 8 > [1] 5 <- 9 > > I understand that SF is a data structure that saves references to pieces > of data that are now owned by the process (https://arxiv.org/pdf/1506. > 06194v1.pdf, page 4). > > Since the only ghost nodes appear in rank 0 (three ghost vertices) I would > expect something like: > *rank 0:* > 4 - (1, 3) (to read: point 4 is owned by rank 1 and is rank's 1 point 3) > etc... > *rank 1:* > nothing > > Is my intuition correct? If so, what does the star forest that I get from > DMPlexDistribute mean? I am printing the wrong thing? > > Thank you > > -- > D. Adrian Maldonado, PhD Candidate > Electrical & Computer Engineering Dept. > Illinois Institute of Technology > 3301 S. Dearborn Street, Chicago, IL 60616 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ztdepyahoo at 163.com Mon Sep 19 07:15:33 2016 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Mon, 19 Sep 2016 20:15:33 +0800 (CST) Subject: [petsc-users] Does Petsc support matrix diagonalization Message-ID: <469fe956.fd38.157425f7017.Coremail.ztdepyahoo@163.com> Dear friends: I want to diagonalize matrix D: D=PAP^(-1). 
where A is the diagonal matrix , P is the transformation matrix. Does Petsc has this routine to perform this task. Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Sep 19 07:50:57 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 Sep 2016 07:50:57 -0500 Subject: [petsc-users] Does Petsc support matrix diagonalization In-Reply-To: <469fe956.fd38.157425f7017.Coremail.ztdepyahoo@163.com> References: <469fe956.fd38.157425f7017.Coremail.ztdepyahoo@163.com> Message-ID: On Mon, Sep 19, 2016 at 7:15 AM, ??? wrote: > Dear friends: > I want to diagonalize matrix D: > D=PAP^(-1). > where A is the diagonal matrix , P is the transformation matrix. > Does Petsc has this routine to perform this task. > No, you should check out http://slepc.upv.es/ Thanks, Matt > Regards > > > > > > > > > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Sep 19 09:52:14 2016 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 19 Sep 2016 09:52:14 -0500 Subject: [petsc-users] Does Petsc support matrix diagonalization In-Reply-To: <469fe956.fd38.157425f7017.Coremail.ztdepyahoo@163.com> References: <469fe956.fd38.157425f7017.Coremail.ztdepyahoo@163.com> Message-ID: ???: > Dear friends: > I want to diagonalize matrix D: > D=PAP^(-1). > where A is the diagonal matrix , P is the transformation matrix. > Does Petsc has this routine to perform this task. > This is an eigenvalu/singular value decomposition of D. For dense D, you can use Elemental, for sparse D, use Slepc. Hong > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 19 10:18:04 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 19 Sep 2016 10:18:04 -0500 Subject: [petsc-users] Example for MatInvertBlockDiagonal In-Reply-To: References: Message-ID: <136AA62C-B799-46D7-8DA2-DE5542114A67@mcs.anl.gov> Cyrill, This is very specialized for implementing point block Jacobi; I don't think it is something you would want to use directly. If you do want to use it, it simply returns the inverses of the block diagonals in column major form. You can then call MatSetValues() with with each of those blocks into another PETSc matrix. values[i*bs*bs] is the starting point of each block in the array. Barry > On Sep 19, 2016, at 4:55 AM, Cyrill Vonplanta wrote: > > Dear PETSc-Users, > > I would like to use the inverted block diagonals of a a matrix. I have seen the function MatInvertBlockDiagonal() but I don?t know how to create a matrix out of them or an array of block matrizes. > > Does anyone have an example on how to use **values to create a PETSc matrix? > > Thanks > Cyrill From cyrill.von.planta at usi.ch Mon Sep 19 10:43:50 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Mon, 19 Sep 2016 15:43:50 +0000 Subject: [petsc-users] Example for MatInvertBlockDiagonal In-Reply-To: <136AA62C-B799-46D7-8DA2-DE5542114A67@mcs.anl.gov> References: <136AA62C-B799-46D7-8DA2-DE5542114A67@mcs.anl.gov> Message-ID: Barry, Thanks a lot. I?d like to use this for a nonlinear variant of a block-gauss-seidel smoother. I would like to use MatInvertBlockDiagonal for speeding up my variant. 
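(For reference, a minimal sketch of what Barry describes above, pulling the inverted blocks out of MatInvertBlockDiagonal() and inserting them into a separate block-diagonal matrix, might look like the following. The helper name, the fixed block size of 3, and the preallocation are illustrative assumptions rather than code from this thread; error checking is omitted, and it assumes MatInvertBlockDiagonal() is supported for the matrix type in use. The transpose step is there because the blocks come back in column-major order while MatSetValues() takes row-major values by default.)

#include <petscmat.h>

/* Hypothetical helper (not from this thread): gather the inverted diagonal
   blocks returned by MatInvertBlockDiagonal() into a new block-diagonal
   matrix Dinv. Assumes a block size of 3 has already been set on A. */
void BuildInvertedBlockDiagonal(Mat A, Mat *Dinv)
{
  const PetscScalar *vals;
  PetscInt           bs = 3, rstart, rend, nblocks, i, j, k;

  MatInvertBlockDiagonal(A, &vals);      /* vals is owned by A, do not free it */
  MatGetOwnershipRange(A, &rstart, &rend);
  nblocks = (rend - rstart)/bs;

  MatCreate(PetscObjectComm((PetscObject)A), Dinv);
  MatSetSizes(*Dinv, rend-rstart, rend-rstart, PETSC_DETERMINE, PETSC_DETERMINE);
  MatSetType(*Dinv, MATAIJ);
  MatSeqAIJSetPreallocation(*Dinv, bs, NULL);
  MatMPIAIJSetPreallocation(*Dinv, bs, NULL, 0, NULL);

  for (i = 0; i < nblocks; i++) {
    PetscInt    rows[3];
    PetscScalar block[9];
    for (j = 0; j < bs; j++) rows[j] = rstart + i*bs + j;
    /* vals stores each bs x bs block in column-major order starting at
       vals[i*bs*bs]; MatSetValues() expects row-major, so transpose here */
    for (j = 0; j < bs; j++)
      for (k = 0; k < bs; k++) block[j*bs+k] = vals[i*bs*bs + k*bs + j];
    MatSetValues(*Dinv, bs, rows, bs, rows, block, INSERT_VALUES);
  }
  MatAssemblyBegin(*Dinv, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(*Dinv, MAT_FINAL_ASSEMBLY);
}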
I think I can work with this, however I also have the problem to turn my initial matrix into one with a blocksize of 3.When I call: MatConvert(A, MATBAIJ, MAT_INITIAL_MATRIX, Dinverse); Then the matrix Dinverse has blocksize 1 which comes from A. I checked the blocksize before the conversion and it was 3, so it seems to get lost. What is the correct (and elegant) way to turn a matrix into a block matrix? Best Cyrill On 19/09/16 17:18, "Barry Smith" wrote: > > Cyrill, > > This is very specialized for implementing point block Jacobi; I don't think it is something you would want to use directly. > > If you do want to use it, it simply returns the inverses of the block diagonals in column major form. You can then call MatSetValues() with with each of those blocks into another PETSc matrix. values[i*bs*bs] is the starting point of each block in the array. > > Barry > >> On Sep 19, 2016, at 4:55 AM, Cyrill Vonplanta wrote: >> >> Dear PETSc-Users, >> >> I would like to use the inverted block diagonals of a a matrix. I have seen the function MatInvertBlockDiagonal() but I don?t know how to create a matrix out of them or an array of block matrizes. >> >> Does anyone have an example on how to use **values to create a PETSc matrix? >> >> Thanks >> Cyrill > From bsmith at mcs.anl.gov Mon Sep 19 12:40:11 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 19 Sep 2016 12:40:11 -0500 Subject: [petsc-users] Example for MatInvertBlockDiagonal In-Reply-To: References: <136AA62C-B799-46D7-8DA2-DE5542114A67@mcs.anl.gov> Message-ID: <2D8FC5B4-6117-479D-836F-280945B76210@mcs.anl.gov> > On Sep 19, 2016, at 10:43 AM, Cyrill Vonplanta wrote: > > Barry, > > Thanks a lot. I?d like to use this for a nonlinear variant of a block-gauss-seidel smoother. I would like to use MatInvertBlockDiagonal for speeding up my variant. > > I think I can work with this, however I also have the problem to turn my initial matrix into one with a blocksize of 3.When I call: > > MatConvert(A, MATBAIJ, MAT_INITIAL_MATRIX, Dinverse); MatConvert() creates a new matrix so you should not create the Dinverse beforehand; anything you put in Dinverse before is lost..m block size > 1 really only makes sense if the block size is really greater than one. So if A has blocks of size 3 you should create A as BAIJ and thus never need to call the convert routine. You can also set the block size for AIJ matrix to 3 and use MatInvertBlockDiagonal() on that matrix and not use the BAIJ matrix. Finally note that MatInvertBlockDiagonal() ends up calling (for block size 3) PetscKernel_A_gets_inverse_A_3() for each block. It sounds to me like that is what you would want for nonlinear variant of a block-gauss-seidel smoother.. Barry > > > Then the matrix Dinverse has blocksize 1 which comes from A. I checked the blocksize before the conversion and it was 3, so it seems to get lost. > > What is the correct (and elegant) way to turn a matrix into a block matrix? > > > Best > Cyrill > > > > On 19/09/16 17:18, "Barry Smith" wrote: > >> >> Cyrill, >> >> This is very specialized for implementing point block Jacobi; I don't think it is something you would want to use directly. >> >> If you do want to use it, it simply returns the inverses of the block diagonals in column major form. You can then call MatSetValues() with with each of those blocks into another PETSc matrix. values[i*bs*bs] is the starting point of each block in the array. 
>> >> Barry >> >>> On Sep 19, 2016, at 4:55 AM, Cyrill Vonplanta wrote: >>> >>> Dear PETSc-Users, >>> >>> I would like to use the inverted block diagonals of a a matrix. I have seen the function MatInvertBlockDiagonal() but I don?t know how to create a matrix out of them or an array of block matrizes. >>> >>> Does anyone have an example on how to use **values to create a PETSc matrix? >>> >>> Thanks >>> Cyrill >> From david.knezevic at akselos.com Mon Sep 19 14:05:13 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Mon, 19 Sep 2016 15:05:13 -0400 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve Message-ID: When I use MUMPS via PETSc, one issue is that it can sometimes fail with MUMPS error -9, which means that MUMPS didn't allocate a big enough workspace. This can typically be fixed by increasing MUMPS icntl 14, e.g. via the command line option -mat_mumps_icntl_14. However, instead of having to run several times with different command line options, I'd like to be able to automatically increment icntl 14 value in a loop until the solve succeeds. I have a saved matrix which fails when I use it for a solve with MUMPS with 4 MPI processes and the default ictnl values, so I'm using this to check that I can achieve the automatic icntl 14 update, as described above. (The matrix is 14MB so I haven't attached it here, but I'd be happy to send it to anyone else who wants to try this test case out.) I've pasted some test code below which provides a simple test of this idea using two solves. The first solve uses the default value of icntl 14, which fails, and then we update icntl 14 to 30 and solve again. The second solve should succeed since icntl 14 of 30 is sufficient for MUMPS to succeed in this case, but for some reason the second solve still fails. Below I've also pasted the output from -ksp_view, and you can see that ictnl 14 is being updated correctly (see the ICNTL(14) lines in the output), so it's not clear to me why the second solve fails. It seems like MUMPS is ignoring the update to the ictnl value? 
Thanks, David ------------------------------------------------------------ ----------------------------------------- Test code: Mat A; MatCreate(PETSC_COMM_WORLD,&A); MatSetType(A,MATMPIAIJ); PetscViewer petsc_viewer; PetscViewerBinaryOpen( PETSC_COMM_WORLD, "matrix.dat", FILE_MODE_READ, &petsc_viewer); MatLoad(A, petsc_viewer); PetscViewerDestroy(&petsc_viewer); PetscInt m, n; MatGetSize(A, &m, &n); Vec x; VecCreate(PETSC_COMM_WORLD,&x); VecSetSizes(x,PETSC_DECIDE,m); VecSetFromOptions(x); VecSet(x,1.0); Vec b; VecDuplicate(x,&b); KSP ksp; PC pc; KSPCreate(PETSC_COMM_WORLD,&ksp); KSPSetOperators(ksp,A,A); KSPSetType(ksp,KSPPREONLY); KSPGetPC(ksp,&pc); PCSetType(pc,PCCHOLESKY); PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); PCFactorSetUpMatSolverPackage(pc); KSPSetFromOptions(ksp); KSPSetUp(ksp); KSPSolve(ksp,b,x); { KSPConvergedReason reason; KSPGetConvergedReason(ksp, &reason); std::cout << "converged reason: " << reason << std::endl; } Mat F; PCFactorGetMatrix(pc,&F); MatMumpsSetIcntl(F,14,30); KSPSolve(ksp,b,x); { KSPConvergedReason reason; KSPGetConvergedReason(ksp, &reason); std::cout << "converged reason: " << reason << std::endl; } ------------------------------------------------------------ ----------------------------------------- -ksp_view output (ICNTL(14) changes from 20 to 30, but we get "converged reason: -11" for both solves) KSP Object: 4 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: 4 MPI processes type: cholesky Cholesky: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: 4 MPI processes type: mpiaij rows=22878, cols=22878 package used to perform factorization: mumps total: nonzeros=3361617, allocated nonzeros=3361617 total number of mallocs used during MatSetValues calls =0 MUMPS run parameters: SYM (matrix type): 2 PAR (host participation): 1 ICNTL(1) (output for error): 6 ICNTL(2) (output of diagnostic msg): 0 ICNTL(3) (output for global info): 0 ICNTL(4) (level of printing): 0 ICNTL(5) (input mat struct): 0 ICNTL(6) (matrix prescaling): 7 ICNTL(7) (sequentia matrix ordering):7 ICNTL(8) (scalling strategy): 77 ICNTL(10) (max num of refinements): 0 ICNTL(11) (error analysis): 0 ICNTL(12) (efficiency control): 0 ICNTL(13) (efficiency control): 0 ICNTL(14) (percentage of estimated workspace increase): 20 ICNTL(18) (input mat struct): 3 ICNTL(19) (Shur complement info): 0 ICNTL(20) (rhs sparse pattern): 0 ICNTL(21) (solution struct): 1 ICNTL(22) (in-core/out-of-core facility): 0 ICNTL(23) (max size of memory can be allocated locally):0 ICNTL(24) (detection of null pivot rows): 0 ICNTL(25) (computation of a null space basis): 0 ICNTL(26) (Schur options for rhs or solution): 0 ICNTL(27) (experimental parameter): -24 ICNTL(28) (use parallel or sequential ordering): 1 ICNTL(29) (parallel ordering): 0 ICNTL(30) (user-specified set of entries in inv(A)): 0 ICNTL(31) (factors is discarded in the solve phase): 0 ICNTL(33) (compute determinant): 0 CNTL(1) (relative pivoting threshold): 0.01 CNTL(2) (stopping criterion of refinement): 1.49012e-08 CNTL(3) (absolute pivoting threshold): 0. CNTL(4) (value of static pivoting): -1. CNTL(5) (fixation for null pivots): 0. 
RINFO(1) (local estimated flops for the elimination after analysis): [0] 1.84947e+08 [1] 2.42065e+08 [2] 2.53044e+08 [3] 2.18441e+08 RINFO(2) (local estimated flops for the assembly after factorization): [0] 945938. [1] 906795. [2] 897815. [3] 998840. RINFO(3) (local estimated flops for the elimination after factorization): [0] 1.59835e+08 [1] 1.50867e+08 [2] 2.27932e+08 [3] 1.52037e+08 INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): [0] 36 [1] 37 [2] 38 [3] 39 INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): [0] 36 [1] 37 [2] 38 [3] 39 INFO(23) (num of pivots eliminated on this processor after factorization): [0] 6450 [1] 5442 [2] 4386 [3] 5526 RINFOG(1) (global estimated flops for the elimination after analysis): 8.98497e+08 RINFOG(2) (global estimated flops for the assembly after factorization): 3.74939e+06 RINFOG(3) (global estimated flops for the elimination after factorization): 6.9067e+08 (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0) INFOG(3) (estimated real workspace for factors on all processors after analysis): 4082184 INFOG(4) (estimated integer workspace for factors on all processors after analysis): 231846 INFOG(5) (estimated maximum front size in the complete tree): 678 INFOG(6) (number of nodes in the complete tree): 1380 INFOG(7) (ordering option effectively use after analysis): 5 INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 3521904 INFOG(10) (total integer space store the matrix factors after factorization): 229416 INFOG(11) (order of largest frontal matrix after factorization): 678 INFOG(12) (number of off-diagonal pivots): 0 INFOG(13) (number of delayed pivots after factorization): 0 INFOG(14) (number of memory compress after factorization): 0 INFOG(15) (number of steps of iterative refinement after solution): 0 INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 39 INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 150 INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 39 INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 150 INFOG(20) (estimated number of entries in the factors): 3361617 INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 35 INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 136 INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 INFOG(28) (after factorization: number of null pivots encountered): 0 INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 2931438 INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0 INFOG(32) (after analysis: type of analysis done): 1 INFOG(33) (value used for ICNTL(8)): 7 INFOG(34) (exponent of the determinant if determinant is requested): 0 linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpiaij rows=22878, cols=22878 total: nonzeros=1219140, 
allocated nonzeros=1219140 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 1889 nodes, limit used is 5 converged reason: -11 KSP Object: 4 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: 4 MPI processes type: cholesky Cholesky: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: 4 MPI processes type: mpiaij rows=22878, cols=22878 package used to perform factorization: mumps total: nonzeros=3361617, allocated nonzeros=3361617 total number of mallocs used during MatSetValues calls =0 MUMPS run parameters: SYM (matrix type): 2 PAR (host participation): 1 ICNTL(1) (output for error): 6 ICNTL(2) (output of diagnostic msg): 0 ICNTL(3) (output for global info): 0 ICNTL(4) (level of printing): 0 ICNTL(5) (input mat struct): 0 ICNTL(6) (matrix prescaling): 7 ICNTL(7) (sequentia matrix ordering):7 ICNTL(8) (scalling strategy): 77 ICNTL(10) (max num of refinements): 0 ICNTL(11) (error analysis): 0 ICNTL(12) (efficiency control): 0 ICNTL(13) (efficiency control): 0 ICNTL(14) (percentage of estimated workspace increase): 30 ICNTL(18) (input mat struct): 3 ICNTL(19) (Shur complement info): 0 ICNTL(20) (rhs sparse pattern): 0 ICNTL(21) (solution struct): 1 ICNTL(22) (in-core/out-of-core facility): 0 ICNTL(23) (max size of memory can be allocated locally):0 ICNTL(24) (detection of null pivot rows): 0 ICNTL(25) (computation of a null space basis): 0 ICNTL(26) (Schur options for rhs or solution): 0 ICNTL(27) (experimental parameter): -24 ICNTL(28) (use parallel or sequential ordering): 1 ICNTL(29) (parallel ordering): 0 ICNTL(30) (user-specified set of entries in inv(A)): 0 ICNTL(31) (factors is discarded in the solve phase): 0 ICNTL(33) (compute determinant): 0 CNTL(1) (relative pivoting threshold): 0.01 CNTL(2) (stopping criterion of refinement): 1.49012e-08 CNTL(3) (absolute pivoting threshold): 0. CNTL(4) (value of static pivoting): -1. CNTL(5) (fixation for null pivots): 0. RINFO(1) (local estimated flops for the elimination after analysis): [0] 1.84947e+08 [1] 2.42065e+08 [2] 2.53044e+08 [3] 2.18441e+08 RINFO(2) (local estimated flops for the assembly after factorization): [0] 945938. [1] 906795. [2] 897815. [3] 998840. 
RINFO(3) (local estimated flops for the elimination after factorization): [0] 1.59835e+08 [1] 1.50867e+08 [2] 2.27932e+08 [3] 1.52037e+08 INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): [0] 36 [1] 37 [2] 38 [3] 39 INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): [0] 36 [1] 37 [2] 38 [3] 39 INFO(23) (num of pivots eliminated on this processor after factorization): [0] 6450 [1] 5442 [2] 4386 [3] 5526 RINFOG(1) (global estimated flops for the elimination after analysis): 8.98497e+08 RINFOG(2) (global estimated flops for the assembly after factorization): 3.74939e+06 RINFOG(3) (global estimated flops for the elimination after factorization): 6.9067e+08 (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0) INFOG(3) (estimated real workspace for factors on all processors after analysis): 4082184 INFOG(4) (estimated integer workspace for factors on all processors after analysis): 231846 INFOG(5) (estimated maximum front size in the complete tree): 678 INFOG(6) (number of nodes in the complete tree): 1380 INFOG(7) (ordering option effectively use after analysis): 5 INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 3521904 INFOG(10) (total integer space store the matrix factors after factorization): 229416 INFOG(11) (order of largest frontal matrix after factorization): 678 INFOG(12) (number of off-diagonal pivots): 0 INFOG(13) (number of delayed pivots after factorization): 0 INFOG(14) (number of memory compress after factorization): 0 INFOG(15) (number of steps of iterative refinement after solution): 0 INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 39 INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 150 INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 39 INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 150 INFOG(20) (estimated number of entries in the factors): 3361617 INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 35 INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 136 INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 INFOG(28) (after factorization: number of null pivots encountered): 0 INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 2931438 INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0 INFOG(32) (after analysis: type of analysis done): 1 INFOG(33) (value used for ICNTL(8)): 7 INFOG(34) (exponent of the determinant if determinant is requested): 0 linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpiaij rows=22878, cols=22878 total: nonzeros=1219140, allocated nonzeros=1219140 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 1889 nodes, limit used is 5 converged reason: -11 ------------------------------------------------------------ 
----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmaldona at hawk.iit.edu Mon Sep 19 14:21:12 2016 From: dmaldona at hawk.iit.edu (Adrian Maldonado) Date: Mon, 19 Sep 2016 14:21:12 -0500 Subject: [petsc-users] Question about PETScSF usage in DMNetwork/DMPlex In-Reply-To: References: Message-ID: Ok, got it! Thanks On Mon, Sep 19, 2016 at 6:25 AM, Matthew Knepley wrote: > On Fri, Sep 16, 2016 at 11:36 AM, Adrian Maldonado > wrote: > >> Hi, >> >> I am trying to understand some of the data structures DMPlex/DMNetwork >> creates and the relationship among them. >> >> As an example, I have an small test circuit (/src/ksp/ksp/examples/tutoria >> ls/network/ex1.c). >> >> This is a graph that consists on 6 edges and 4 vertices, each one of >> those having one degree of freedom. When ran with two processors, each >> rank will own 3 edges. Rank 0 will own one vertex (3 ghost) and Rank 1 will >> own 3 vertices. >> >> These are some data structures for this problem. I am getting these data >> structures inside DMNetworkDistribute >> >> >> DM Object: Parallel Mesh 2 MPI processes >> type: plex >> Parallel Mesh in 1 dimensions: >> 0-cells: 4 3 >> 1-cells: 3 3 >> Labels: >> depth: 2 strata of sizes (4, 3) >> >> This, as I understand, is printing a tree with all the vertices and >> edges in each processor (owned and ghost). >> >> PetscSection Object: 2 MPI processes >> type not yet set >> Process 0: >> ( 0) dim 1 offset 0 >> ( 1) dim 1 offset 1 >> ( 2) dim 1 offset 2 >> ( 3) dim 1 offset 3 >> ( 4) dim -2 offset -8 >> ( 5) dim -2 offset -9 >> ( 6) dim -2 offset -10 >> Process 1: >> ( 0) dim 1 offset 4 >> ( 1) dim 1 offset 5 >> ( 2) dim 1 offset 6 >> ( 3) dim 1 offset 7 >> ( 4) dim 1 offset 8 >> ( 5) dim 1 offset 9 >> >> This is a global PETSc section that gives me the global numbering for the >> owned points and (garbage?) negative values for ghost. >> >> Until here everything is good. But then I print the PetscSF that is >> created by 'DMPlexDistribute'. This I do not understand: >> > > 1) You are looking at the MigrationSF, not the eventual PointSF or > OffsetSF from the DM. You need > > DMGetDefaultSF or DMGetPointSF > > for those. > > 2) Notice that edges 0,1,3 are sent to proc 0, and edges 2,4,5 are sent to > proc 1. > > Matt > > >> PetscSF Object: Migration SF 2 MPI processes >> type: basic >> sort=rank-order >> [0] Number of roots=10, leaves=7, remote ranks=1 >> [0] 0 <- (0,0) >> [0] 1 <- (0,1) >> [0] 2 <- (0,3) >> [0] 3 <- (0,6) >> [0] 4 <- (0,7) >> [0] 5 <- (0,8) >> [0] 6 <- (0,9) >> [1] Number of roots=0, leaves=6, remote ranks=1 >> [1] 0 <- (0,2) >> [1] 1 <- (0,4) >> [1] 2 <- (0,5) >> [1] 3 <- (0,7) >> [1] 4 <- (0,8) >> [1] 5 <- (0,9) >> [0] Roots referenced by my leaves, by rank >> [0] 0: 7 edges >> [0] 0 <- 0 >> [0] 1 <- 1 >> [0] 2 <- 3 >> [0] 3 <- 6 >> [0] 4 <- 7 >> [0] 5 <- 8 >> [0] 6 <- 9 >> [1] Roots referenced by my leaves, by rank >> [1] 0: 6 edges >> [1] 0 <- 2 >> [1] 1 <- 4 >> [1] 2 <- 5 >> [1] 3 <- 7 >> [1] 4 <- 8 >> [1] 5 <- 9 >> >> I understand that SF is a data structure that saves references to pieces >> of data that are now owned by the process (https://arxiv.org/pdf/1506.06 >> 194v1.pdf, page 4). >> >> Since the only ghost nodes appear in rank 0 (three ghost vertices) I >> would expect something like: >> *rank 0:* >> 4 - (1, 3) (to read: point 4 is owned by rank 1 and is rank's 1 point >> 3) >> etc... >> *rank 1:* >> nothing >> >> Is my intuition correct? 
If so, what does the star forest that I get from >> DMPlexDistribute mean? I am printing the wrong thing? >> >> Thank you >> >> -- >> D. Adrian Maldonado, PhD Candidate >> Electrical & Computer Engineering Dept. >> Illinois Institute of Technology >> 3301 S. Dearborn Street, Chicago, IL 60616 >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- D. Adrian Maldonado, PhD Candidate Electrical & Computer Engineering Dept. Illinois Institute of Technology 3301 S. Dearborn Street, Chicago, IL 60616 -------------- next part -------------- An HTML attachment was scrubbed... URL: From cyrill.von.planta at usi.ch Mon Sep 19 14:21:16 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Mon, 19 Sep 2016 19:21:16 +0000 Subject: [petsc-users] Example for MatInvertBlockDiagonal In-Reply-To: <2D8FC5B4-6117-479D-836F-280945B76210@mcs.anl.gov> References: <136AA62C-B799-46D7-8DA2-DE5542114A67@mcs.anl.gov> <2D8FC5B4-6117-479D-836F-280945B76210@mcs.anl.gov> Message-ID: <905ECA30-E8BC-4083-AE15-F51284E0E814@usi.ch> > block size > 1 really only makes sense if the block size is really greater than one. So if A has blocks of size 3 you should create A as BAIJ and thus never need to call the convert routine. Unfortunately A is not created by my part of the program and comes with blocksize 1. > > You can also set the block size for AIJ matrix to 3 and use MatInvertBlockDiagonal() on that matrix and not use the BAIJ matrix. If I run: ierr = MatSetBlockSize(A, 3); CHKERRQ(ierr); It doesn?t work for me. I get: [1;31m[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0;39m[0;49m[0]PETSC ERROR: Arguments are incompatible [0]PETSC ERROR: Cannot change block size 1 to 3 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 . . What are the constraints for blocksize? From bsmith at mcs.anl.gov Mon Sep 19 14:38:45 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 19 Sep 2016 14:38:45 -0500 Subject: [petsc-users] Example for MatInvertBlockDiagonal In-Reply-To: <905ECA30-E8BC-4083-AE15-F51284E0E814@usi.ch> References: <136AA62C-B799-46D7-8DA2-DE5542114A67@mcs.anl.gov> <2D8FC5B4-6117-479D-836F-280945B76210@mcs.anl.gov> <905ECA30-E8BC-4083-AE15-F51284E0E814@usi.ch> Message-ID: > On Sep 19, 2016, at 2:21 PM, Cyrill Vonplanta wrote: > > >> block size > 1 really only makes sense if the block size is really greater than one. So if A has blocks of size 3 you should create A as BAIJ and thus never need to call the convert routine. > > Unfortunately A is not created by my part of the program and comes with blocksize 1. Ok, copy the code for MatInvertBlockDiagonal_SeqAIJ() into your source code with a different name and modify it to serve your purpose and call it directly instead of calling MatInvertBlockDiagonal > >> >> You can also set the block size for AIJ matrix to 3 and use MatInvertBlockDiagonal() on that matrix and not use the BAIJ matrix. > > If I run: > ierr = MatSetBlockSize(A, 3); CHKERRQ(ierr); > > It doesn?t work for me. 
I get: > > [1;31m[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0;39m[0;49m[0]PETSC ERROR: Arguments are incompatible > [0]PETSC ERROR: Cannot change block size 1 to 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 > . > . > > What are the constraints for block size? You need to set it early in the life of the matrix. Barry > > > > From hzhang at mcs.anl.gov Mon Sep 19 16:47:31 2016 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 19 Sep 2016 16:47:31 -0500 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: David : I'll check it ... Hong When I use MUMPS via PETSc, one issue is that it can sometimes fail with > MUMPS error -9, which means that MUMPS didn't allocate a big enough > workspace. This can typically be fixed by increasing MUMPS icntl 14, e.g. > via the command line option -mat_mumps_icntl_14. > > However, instead of having to run several times with different command > line options, I'd like to be able to automatically increment icntl 14 value > in a loop until the solve succeeds. > > I have a saved matrix which fails when I use it for a solve with MUMPS > with 4 MPI processes and the default ictnl values, so I'm using this to > check that I can achieve the automatic icntl 14 update, as described above. > (The matrix is 14MB so I haven't attached it here, but I'd be happy to send > it to anyone else who wants to try this test case out.) > > I've pasted some test code below which provides a simple test of this idea > using two solves. The first solve uses the default value of icntl 14, which > fails, and then we update icntl 14 to 30 and solve again. The second solve > should succeed since icntl 14 of 30 is sufficient for MUMPS to succeed in > this case, but for some reason the second solve still fails. > > Below I've also pasted the output from -ksp_view, and you can see that > ictnl 14 is being updated correctly (see the ICNTL(14) lines in the > output), so it's not clear to me why the second solve fails. It seems like > MUMPS is ignoring the update to the ictnl value? 
> > Thanks, > David > > ------------------------------------------------------------ > ----------------------------------------- > Test code: > > Mat A; > MatCreate(PETSC_COMM_WORLD,&A); > MatSetType(A,MATMPIAIJ); > > PetscViewer petsc_viewer; > PetscViewerBinaryOpen( PETSC_COMM_WORLD, > "matrix.dat", > FILE_MODE_READ, > &petsc_viewer); > MatLoad(A, petsc_viewer); > PetscViewerDestroy(&petsc_viewer); > > PetscInt m, n; > MatGetSize(A, &m, &n); > > Vec x; > VecCreate(PETSC_COMM_WORLD,&x); > VecSetSizes(x,PETSC_DECIDE,m); > VecSetFromOptions(x); > VecSet(x,1.0); > > Vec b; > VecDuplicate(x,&b); > > KSP ksp; > PC pc; > > KSPCreate(PETSC_COMM_WORLD,&ksp); > KSPSetOperators(ksp,A,A); > > KSPSetType(ksp,KSPPREONLY); > KSPGetPC(ksp,&pc); > > PCSetType(pc,PCCHOLESKY); > > PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverPackage(pc); > > KSPSetFromOptions(ksp); > KSPSetUp(ksp); > > KSPSolve(ksp,b,x); > > { > KSPConvergedReason reason; > KSPGetConvergedReason(ksp, &reason); > std::cout << "converged reason: " << reason << std::endl; > } > > Mat F; > PCFactorGetMatrix(pc,&F); > MatMumpsSetIcntl(F,14,30); > > KSPSolve(ksp,b,x); > > { > KSPConvergedReason reason; > KSPGetConvergedReason(ksp, &reason); > std::cout << "converged reason: " << reason << std::endl; > } > > ------------------------------------------------------------ > ----------------------------------------- > -ksp_view output (ICNTL(14) changes from 20 to 30, but we get "converged > reason: -11" for both solves) > > KSP Object: 4 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: 4 MPI processes > type: cholesky > Cholesky: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 0., needed 0. 
> Factored matrix follows: > Mat Object: 4 MPI processes > type: mpiaij > rows=22878, cols=22878 > package used to perform factorization: mumps > total: nonzeros=3361617, allocated nonzeros=3361617 > total number of mallocs used during MatSetValues calls =0 > MUMPS run parameters: > SYM (matrix type): 2 > PAR (host participation): 1 > ICNTL(1) (output for error): 6 > ICNTL(2) (output of diagnostic msg): 0 > ICNTL(3) (output for global info): 0 > ICNTL(4) (level of printing): 0 > ICNTL(5) (input mat struct): 0 > ICNTL(6) (matrix prescaling): 7 > ICNTL(7) (sequentia matrix ordering):7 > ICNTL(8) (scalling strategy): 77 > ICNTL(10) (max num of refinements): 0 > ICNTL(11) (error analysis): 0 > ICNTL(12) (efficiency control): 0 > ICNTL(13) (efficiency control): 0 > ICNTL(14) (percentage of estimated workspace increase): 20 > ICNTL(18) (input mat struct): 3 > ICNTL(19) (Shur complement info): 0 > ICNTL(20) (rhs sparse pattern): 0 > ICNTL(21) (solution struct): 1 > ICNTL(22) (in-core/out-of-core facility): 0 > ICNTL(23) (max size of memory can be allocated locally):0 > ICNTL(24) (detection of null pivot rows): 0 > ICNTL(25) (computation of a null space basis): 0 > ICNTL(26) (Schur options for rhs or solution): 0 > ICNTL(27) (experimental parameter): -24 > ICNTL(28) (use parallel or sequential ordering): 1 > ICNTL(29) (parallel ordering): 0 > ICNTL(30) (user-specified set of entries in inv(A)): 0 > ICNTL(31) (factors is discarded in the solve phase): 0 > ICNTL(33) (compute determinant): 0 > CNTL(1) (relative pivoting threshold): 0.01 > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > CNTL(3) (absolute pivoting threshold): 0. > CNTL(4) (value of static pivoting): -1. > CNTL(5) (fixation for null pivots): 0. > RINFO(1) (local estimated flops for the elimination after > analysis): > [0] 1.84947e+08 > [1] 2.42065e+08 > [2] 2.53044e+08 > [3] 2.18441e+08 > RINFO(2) (local estimated flops for the assembly after > factorization): > [0] 945938. > [1] 906795. > [2] 897815. > [3] 998840. 
> RINFO(3) (local estimated flops for the elimination after > factorization): > [0] 1.59835e+08 > [1] 1.50867e+08 > [2] 2.27932e+08 > [3] 1.52037e+08 > INFO(15) (estimated size of (in MB) MUMPS internal data for > running numerical factorization): > [0] 36 > [1] 37 > [2] 38 > [3] 39 > INFO(16) (size of (in MB) MUMPS internal data used during > numerical factorization): > [0] 36 > [1] 37 > [2] 38 > [3] 39 > INFO(23) (num of pivots eliminated on this processor after > factorization): > [0] 6450 > [1] 5442 > [2] 4386 > [3] 5526 > RINFOG(1) (global estimated flops for the elimination after > analysis): 8.98497e+08 > RINFOG(2) (global estimated flops for the assembly after > factorization): 3.74939e+06 > RINFOG(3) (global estimated flops for the elimination after > factorization): 6.9067e+08 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0.,0.)*(2^0) > INFOG(3) (estimated real workspace for factors on all > processors after analysis): 4082184 > INFOG(4) (estimated integer workspace for factors on all > processors after analysis): 231846 > INFOG(5) (estimated maximum front size in the complete > tree): 678 > INFOG(6) (number of nodes in the complete tree): 1380 > INFOG(7) (ordering option effectively use after analysis): 5 > INFOG(8) (structural symmetry in percent of the permuted > matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix > factors after factorization): 3521904 > INFOG(10) (total integer space store the matrix factors > after factorization): 229416 > INFOG(11) (order of largest frontal matrix after > factorization): 678 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): 0 > INFOG(15) (number of steps of iterative refinement after > solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal data > for factorization after analysis: value on the most memory consuming > processor): 39 > INFOG(17) (estimated size of all MUMPS internal data for > factorization after analysis: sum over all processors): 150 > INFOG(18) (size of all MUMPS internal data allocated during > factorization: value on the most memory consuming processor): 39 > INFOG(19) (size of all MUMPS internal data allocated during > factorization: sum over all processors): 150 > INFOG(20) (estimated number of entries in the factors): > 3361617 > INFOG(21) (size in MB of memory effectively used during > factorization - value on the most memory consuming processor): 35 > INFOG(22) (size in MB of memory effectively used during > factorization - sum over all processors): 136 > INFOG(23) (after analysis: value of ICNTL(6) effectively > used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively > used): 1 > INFOG(25) (after factorization: number of pivots modified by > static pivoting): 0 > INFOG(28) (after factorization: number of null pivots > encountered): 0 > INFOG(29) (after factorization: effective number of entries > in the factors (sum over all processors)): 2931438 > INFOG(30, 31) (after solution: size in Mbytes of memory used > during solution phase): 0, 0 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is > requested): 0 > linear system matrix = precond matrix: > Mat Object: 4 MPI processes > type: mpiaij > rows=22878, cols=22878 > total: nonzeros=1219140, allocated nonzeros=1219140 > total number of mallocs used during MatSetValues 
calls =0 > using I-node (on process 0) routines: found 1889 nodes, limit used > is 5 > converged reason: -11 > KSP Object: 4 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: 4 MPI processes > type: cholesky > Cholesky: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 0., needed 0. > Factored matrix follows: > Mat Object: 4 MPI processes > type: mpiaij > rows=22878, cols=22878 > package used to perform factorization: mumps > total: nonzeros=3361617, allocated nonzeros=3361617 > total number of mallocs used during MatSetValues calls =0 > MUMPS run parameters: > SYM (matrix type): 2 > PAR (host participation): 1 > ICNTL(1) (output for error): 6 > ICNTL(2) (output of diagnostic msg): 0 > ICNTL(3) (output for global info): 0 > ICNTL(4) (level of printing): 0 > ICNTL(5) (input mat struct): 0 > ICNTL(6) (matrix prescaling): 7 > ICNTL(7) (sequentia matrix ordering):7 > ICNTL(8) (scalling strategy): 77 > ICNTL(10) (max num of refinements): 0 > ICNTL(11) (error analysis): 0 > ICNTL(12) (efficiency control): 0 > ICNTL(13) (efficiency control): 0 > ICNTL(14) (percentage of estimated workspace increase): 30 > ICNTL(18) (input mat struct): 3 > ICNTL(19) (Shur complement info): 0 > ICNTL(20) (rhs sparse pattern): 0 > ICNTL(21) (solution struct): 1 > ICNTL(22) (in-core/out-of-core facility): 0 > ICNTL(23) (max size of memory can be allocated locally):0 > ICNTL(24) (detection of null pivot rows): 0 > ICNTL(25) (computation of a null space basis): 0 > ICNTL(26) (Schur options for rhs or solution): 0 > ICNTL(27) (experimental parameter): -24 > ICNTL(28) (use parallel or sequential ordering): 1 > ICNTL(29) (parallel ordering): 0 > ICNTL(30) (user-specified set of entries in inv(A)): 0 > ICNTL(31) (factors is discarded in the solve phase): 0 > ICNTL(33) (compute determinant): 0 > CNTL(1) (relative pivoting threshold): 0.01 > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > CNTL(3) (absolute pivoting threshold): 0. > CNTL(4) (value of static pivoting): -1. > CNTL(5) (fixation for null pivots): 0. > RINFO(1) (local estimated flops for the elimination after > analysis): > [0] 1.84947e+08 > [1] 2.42065e+08 > [2] 2.53044e+08 > [3] 2.18441e+08 > RINFO(2) (local estimated flops for the assembly after > factorization): > [0] 945938. > [1] 906795. > [2] 897815. > [3] 998840. 
> RINFO(3) (local estimated flops for the elimination after > factorization): > [0] 1.59835e+08 > [1] 1.50867e+08 > [2] 2.27932e+08 > [3] 1.52037e+08 > INFO(15) (estimated size of (in MB) MUMPS internal data for > running numerical factorization): > [0] 36 > [1] 37 > [2] 38 > [3] 39 > INFO(16) (size of (in MB) MUMPS internal data used during > numerical factorization): > [0] 36 > [1] 37 > [2] 38 > [3] 39 > INFO(23) (num of pivots eliminated on this processor after > factorization): > [0] 6450 > [1] 5442 > [2] 4386 > [3] 5526 > RINFOG(1) (global estimated flops for the elimination after > analysis): 8.98497e+08 > RINFOG(2) (global estimated flops for the assembly after > factorization): 3.74939e+06 > RINFOG(3) (global estimated flops for the elimination after > factorization): 6.9067e+08 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0.,0.)*(2^0) > INFOG(3) (estimated real workspace for factors on all > processors after analysis): 4082184 > INFOG(4) (estimated integer workspace for factors on all > processors after analysis): 231846 > INFOG(5) (estimated maximum front size in the complete > tree): 678 > INFOG(6) (number of nodes in the complete tree): 1380 > INFOG(7) (ordering option effectively use after analysis): 5 > INFOG(8) (structural symmetry in percent of the permuted > matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix > factors after factorization): 3521904 > INFOG(10) (total integer space store the matrix factors > after factorization): 229416 > INFOG(11) (order of largest frontal matrix after > factorization): 678 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): 0 > INFOG(15) (number of steps of iterative refinement after > solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal data > for factorization after analysis: value on the most memory consuming > processor): 39 > INFOG(17) (estimated size of all MUMPS internal data for > factorization after analysis: sum over all processors): 150 > INFOG(18) (size of all MUMPS internal data allocated during > factorization: value on the most memory consuming processor): 39 > INFOG(19) (size of all MUMPS internal data allocated during > factorization: sum over all processors): 150 > INFOG(20) (estimated number of entries in the factors): > 3361617 > INFOG(21) (size in MB of memory effectively used during > factorization - value on the most memory consuming processor): 35 > INFOG(22) (size in MB of memory effectively used during > factorization - sum over all processors): 136 > INFOG(23) (after analysis: value of ICNTL(6) effectively > used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively > used): 1 > INFOG(25) (after factorization: number of pivots modified by > static pivoting): 0 > INFOG(28) (after factorization: number of null pivots > encountered): 0 > INFOG(29) (after factorization: effective number of entries > in the factors (sum over all processors)): 2931438 > INFOG(30, 31) (after solution: size in Mbytes of memory used > during solution phase): 0, 0 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is > requested): 0 > linear system matrix = precond matrix: > Mat Object: 4 MPI processes > type: mpiaij > rows=22878, cols=22878 > total: nonzeros=1219140, allocated nonzeros=1219140 > total number of mallocs used during MatSetValues 
calls =0 > using I-node (on process 0) routines: found 1889 nodes, limit used > is 5 > converged reason: -11 > > ------------------------------------------------------------ > ----------------------------------------- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Mon Sep 19 18:26:21 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 20 Sep 2016 01:26:21 +0200 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: On 19 September 2016 at 21:05, David Knezevic wrote: > When I use MUMPS via PETSc, one issue is that it can sometimes fail with > MUMPS error -9, which means that MUMPS didn't allocate a big enough > workspace. This can typically be fixed by increasing MUMPS icntl 14, e.g. > via the command line option -mat_mumps_icntl_14. > > However, instead of having to run several times with different command > line options, I'd like to be able to automatically increment icntl 14 value > in a loop until the solve succeeds. > > I have a saved matrix which fails when I use it for a solve with MUMPS > with 4 MPI processes and the default ictnl values, so I'm using this to > check that I can achieve the automatic icntl 14 update, as described above. > (The matrix is 14MB so I haven't attached it here, but I'd be happy to send > it to anyone else who wants to try this test case out.) > > I've pasted some test code below which provides a simple test of this idea > using two solves. The first solve uses the default value of icntl 14, which > fails, and then we update icntl 14 to 30 and solve again. The second solve > should succeed since icntl 14 of 30 is sufficient for MUMPS to succeed in > this case, but for some reason the second solve still fails. > > Below I've also pasted the output from -ksp_view, and you can see that > ictnl 14 is being updated correctly (see the ICNTL(14) lines in the > output), so it's not clear to me why the second solve fails. It seems like > MUMPS is ignoring the update to the ictnl value? > I believe this parameter is utilized during the numerical factorization phase. In your code, the operator hasn't changed, however you haven't signalled to the KSP that you want to re-perform the numerical factorization. You can do this by calling KSPSetOperators() before your second solve. I think if you do this (please try it), the factorization will be performed again and the new value of icntl will have an effect. Note this is a wild stab in the dark - I haven't dug through the petsc-mumps code in detail... 
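Concretely, the sequence I have in mind is something like the sketch below
(reusing the ksp, pc, A, b, x names and the icntl value 30 from your test
code; whether KSPSetOperators() on its own is enough to trigger a fresh
numerical factorization is exactly the part I'm unsure about):

  Mat F;
  PCFactorGetMatrix(pc,&F);
  MatMumpsSetIcntl(F,14,30);     /* ask MUMPS for 30% extra workspace */

  KSPSetOperators(ksp,A,A);      /* intended to flag the operator so that the
                                    numerical factorization is performed again */
  KSPSolve(ksp,b,x);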
Thanks, Dave > > > Thanks, > David > > ------------------------------------------------------------ > ----------------------------------------- > Test code: > > Mat A; > MatCreate(PETSC_COMM_WORLD,&A); > MatSetType(A,MATMPIAIJ); > > PetscViewer petsc_viewer; > PetscViewerBinaryOpen( PETSC_COMM_WORLD, > "matrix.dat", > FILE_MODE_READ, > &petsc_viewer); > MatLoad(A, petsc_viewer); > PetscViewerDestroy(&petsc_viewer); > > PetscInt m, n; > MatGetSize(A, &m, &n); > > Vec x; > VecCreate(PETSC_COMM_WORLD,&x); > VecSetSizes(x,PETSC_DECIDE,m); > VecSetFromOptions(x); > VecSet(x,1.0); > > Vec b; > VecDuplicate(x,&b); > > KSP ksp; > PC pc; > > KSPCreate(PETSC_COMM_WORLD,&ksp); > KSPSetOperators(ksp,A,A); > > KSPSetType(ksp,KSPPREONLY); > KSPGetPC(ksp,&pc); > > PCSetType(pc,PCCHOLESKY); > > PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverPackage(pc); > > KSPSetFromOptions(ksp); > KSPSetUp(ksp); > > KSPSolve(ksp,b,x); > > { > KSPConvergedReason reason; > KSPGetConvergedReason(ksp, &reason); > std::cout << "converged reason: " << reason << std::endl; > } > > Mat F; > PCFactorGetMatrix(pc,&F); > MatMumpsSetIcntl(F,14,30); > > KSPSolve(ksp,b,x); > > { > KSPConvergedReason reason; > KSPGetConvergedReason(ksp, &reason); > std::cout << "converged reason: " << reason << std::endl; > } > > ------------------------------------------------------------ > ----------------------------------------- > -ksp_view output (ICNTL(14) changes from 20 to 30, but we get "converged > reason: -11" for both solves) > > KSP Object: 4 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: 4 MPI processes > type: cholesky > Cholesky: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 0., needed 0. 
> Factored matrix follows: > Mat Object: 4 MPI processes > type: mpiaij > rows=22878, cols=22878 > package used to perform factorization: mumps > total: nonzeros=3361617, allocated nonzeros=3361617 > total number of mallocs used during MatSetValues calls =0 > MUMPS run parameters: > SYM (matrix type): 2 > PAR (host participation): 1 > ICNTL(1) (output for error): 6 > ICNTL(2) (output of diagnostic msg): 0 > ICNTL(3) (output for global info): 0 > ICNTL(4) (level of printing): 0 > ICNTL(5) (input mat struct): 0 > ICNTL(6) (matrix prescaling): 7 > ICNTL(7) (sequentia matrix ordering):7 > ICNTL(8) (scalling strategy): 77 > ICNTL(10) (max num of refinements): 0 > ICNTL(11) (error analysis): 0 > ICNTL(12) (efficiency control): 0 > ICNTL(13) (efficiency control): 0 > ICNTL(14) (percentage of estimated workspace increase): 20 > ICNTL(18) (input mat struct): 3 > ICNTL(19) (Shur complement info): 0 > ICNTL(20) (rhs sparse pattern): 0 > ICNTL(21) (solution struct): 1 > ICNTL(22) (in-core/out-of-core facility): 0 > ICNTL(23) (max size of memory can be allocated locally):0 > ICNTL(24) (detection of null pivot rows): 0 > ICNTL(25) (computation of a null space basis): 0 > ICNTL(26) (Schur options for rhs or solution): 0 > ICNTL(27) (experimental parameter): -24 > ICNTL(28) (use parallel or sequential ordering): 1 > ICNTL(29) (parallel ordering): 0 > ICNTL(30) (user-specified set of entries in inv(A)): 0 > ICNTL(31) (factors is discarded in the solve phase): 0 > ICNTL(33) (compute determinant): 0 > CNTL(1) (relative pivoting threshold): 0.01 > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > CNTL(3) (absolute pivoting threshold): 0. > CNTL(4) (value of static pivoting): -1. > CNTL(5) (fixation for null pivots): 0. > RINFO(1) (local estimated flops for the elimination after > analysis): > [0] 1.84947e+08 > [1] 2.42065e+08 > [2] 2.53044e+08 > [3] 2.18441e+08 > RINFO(2) (local estimated flops for the assembly after > factorization): > [0] 945938. > [1] 906795. > [2] 897815. > [3] 998840. 
> RINFO(3) (local estimated flops for the elimination after > factorization): > [0] 1.59835e+08 > [1] 1.50867e+08 > [2] 2.27932e+08 > [3] 1.52037e+08 > INFO(15) (estimated size of (in MB) MUMPS internal data for > running numerical factorization): > [0] 36 > [1] 37 > [2] 38 > [3] 39 > INFO(16) (size of (in MB) MUMPS internal data used during > numerical factorization): > [0] 36 > [1] 37 > [2] 38 > [3] 39 > INFO(23) (num of pivots eliminated on this processor after > factorization): > [0] 6450 > [1] 5442 > [2] 4386 > [3] 5526 > RINFOG(1) (global estimated flops for the elimination after > analysis): 8.98497e+08 > RINFOG(2) (global estimated flops for the assembly after > factorization): 3.74939e+06 > RINFOG(3) (global estimated flops for the elimination after > factorization): 6.9067e+08 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0.,0.)*(2^0) > INFOG(3) (estimated real workspace for factors on all > processors after analysis): 4082184 > INFOG(4) (estimated integer workspace for factors on all > processors after analysis): 231846 > INFOG(5) (estimated maximum front size in the complete > tree): 678 > INFOG(6) (number of nodes in the complete tree): 1380 > INFOG(7) (ordering option effectively use after analysis): 5 > INFOG(8) (structural symmetry in percent of the permuted > matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix > factors after factorization): 3521904 > INFOG(10) (total integer space store the matrix factors > after factorization): 229416 > INFOG(11) (order of largest frontal matrix after > factorization): 678 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): 0 > INFOG(15) (number of steps of iterative refinement after > solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal data > for factorization after analysis: value on the most memory consuming > processor): 39 > INFOG(17) (estimated size of all MUMPS internal data for > factorization after analysis: sum over all processors): 150 > INFOG(18) (size of all MUMPS internal data allocated during > factorization: value on the most memory consuming processor): 39 > INFOG(19) (size of all MUMPS internal data allocated during > factorization: sum over all processors): 150 > INFOG(20) (estimated number of entries in the factors): > 3361617 > INFOG(21) (size in MB of memory effectively used during > factorization - value on the most memory consuming processor): 35 > INFOG(22) (size in MB of memory effectively used during > factorization - sum over all processors): 136 > INFOG(23) (after analysis: value of ICNTL(6) effectively > used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively > used): 1 > INFOG(25) (after factorization: number of pivots modified by > static pivoting): 0 > INFOG(28) (after factorization: number of null pivots > encountered): 0 > INFOG(29) (after factorization: effective number of entries > in the factors (sum over all processors)): 2931438 > INFOG(30, 31) (after solution: size in Mbytes of memory used > during solution phase): 0, 0 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is > requested): 0 > linear system matrix = precond matrix: > Mat Object: 4 MPI processes > type: mpiaij > rows=22878, cols=22878 > total: nonzeros=1219140, allocated nonzeros=1219140 > total number of mallocs used during MatSetValues 
calls =0 > using I-node (on process 0) routines: found 1889 nodes, limit used > is 5 > converged reason: -11 > KSP Object: 4 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: 4 MPI processes > type: cholesky > Cholesky: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 0., needed 0. > Factored matrix follows: > Mat Object: 4 MPI processes > type: mpiaij > rows=22878, cols=22878 > package used to perform factorization: mumps > total: nonzeros=3361617, allocated nonzeros=3361617 > total number of mallocs used during MatSetValues calls =0 > MUMPS run parameters: > SYM (matrix type): 2 > PAR (host participation): 1 > ICNTL(1) (output for error): 6 > ICNTL(2) (output of diagnostic msg): 0 > ICNTL(3) (output for global info): 0 > ICNTL(4) (level of printing): 0 > ICNTL(5) (input mat struct): 0 > ICNTL(6) (matrix prescaling): 7 > ICNTL(7) (sequentia matrix ordering):7 > ICNTL(8) (scalling strategy): 77 > ICNTL(10) (max num of refinements): 0 > ICNTL(11) (error analysis): 0 > ICNTL(12) (efficiency control): 0 > ICNTL(13) (efficiency control): 0 > ICNTL(14) (percentage of estimated workspace increase): 30 > ICNTL(18) (input mat struct): 3 > ICNTL(19) (Shur complement info): 0 > ICNTL(20) (rhs sparse pattern): 0 > ICNTL(21) (solution struct): 1 > ICNTL(22) (in-core/out-of-core facility): 0 > ICNTL(23) (max size of memory can be allocated locally):0 > ICNTL(24) (detection of null pivot rows): 0 > ICNTL(25) (computation of a null space basis): 0 > ICNTL(26) (Schur options for rhs or solution): 0 > ICNTL(27) (experimental parameter): -24 > ICNTL(28) (use parallel or sequential ordering): 1 > ICNTL(29) (parallel ordering): 0 > ICNTL(30) (user-specified set of entries in inv(A)): 0 > ICNTL(31) (factors is discarded in the solve phase): 0 > ICNTL(33) (compute determinant): 0 > CNTL(1) (relative pivoting threshold): 0.01 > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > CNTL(3) (absolute pivoting threshold): 0. > CNTL(4) (value of static pivoting): -1. > CNTL(5) (fixation for null pivots): 0. > RINFO(1) (local estimated flops for the elimination after > analysis): > [0] 1.84947e+08 > [1] 2.42065e+08 > [2] 2.53044e+08 > [3] 2.18441e+08 > RINFO(2) (local estimated flops for the assembly after > factorization): > [0] 945938. > [1] 906795. > [2] 897815. > [3] 998840. 
> RINFO(3) (local estimated flops for the elimination after > factorization): > [0] 1.59835e+08 > [1] 1.50867e+08 > [2] 2.27932e+08 > [3] 1.52037e+08 > INFO(15) (estimated size of (in MB) MUMPS internal data for > running numerical factorization): > [0] 36 > [1] 37 > [2] 38 > [3] 39 > INFO(16) (size of (in MB) MUMPS internal data used during > numerical factorization): > [0] 36 > [1] 37 > [2] 38 > [3] 39 > INFO(23) (num of pivots eliminated on this processor after > factorization): > [0] 6450 > [1] 5442 > [2] 4386 > [3] 5526 > RINFOG(1) (global estimated flops for the elimination after > analysis): 8.98497e+08 > RINFOG(2) (global estimated flops for the assembly after > factorization): 3.74939e+06 > RINFOG(3) (global estimated flops for the elimination after > factorization): 6.9067e+08 > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0.,0.)*(2^0) > INFOG(3) (estimated real workspace for factors on all > processors after analysis): 4082184 > INFOG(4) (estimated integer workspace for factors on all > processors after analysis): 231846 > INFOG(5) (estimated maximum front size in the complete > tree): 678 > INFOG(6) (number of nodes in the complete tree): 1380 > INFOG(7) (ordering option effectively use after analysis): 5 > INFOG(8) (structural symmetry in percent of the permuted > matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix > factors after factorization): 3521904 > INFOG(10) (total integer space store the matrix factors > after factorization): 229416 > INFOG(11) (order of largest frontal matrix after > factorization): 678 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): 0 > INFOG(15) (number of steps of iterative refinement after > solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal data > for factorization after analysis: value on the most memory consuming > processor): 39 > INFOG(17) (estimated size of all MUMPS internal data for > factorization after analysis: sum over all processors): 150 > INFOG(18) (size of all MUMPS internal data allocated during > factorization: value on the most memory consuming processor): 39 > INFOG(19) (size of all MUMPS internal data allocated during > factorization: sum over all processors): 150 > INFOG(20) (estimated number of entries in the factors): > 3361617 > INFOG(21) (size in MB of memory effectively used during > factorization - value on the most memory consuming processor): 35 > INFOG(22) (size in MB of memory effectively used during > factorization - sum over all processors): 136 > INFOG(23) (after analysis: value of ICNTL(6) effectively > used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively > used): 1 > INFOG(25) (after factorization: number of pivots modified by > static pivoting): 0 > INFOG(28) (after factorization: number of null pivots > encountered): 0 > INFOG(29) (after factorization: effective number of entries > in the factors (sum over all processors)): 2931438 > INFOG(30, 31) (after solution: size in Mbytes of memory used > during solution phase): 0, 0 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is > requested): 0 > linear system matrix = precond matrix: > Mat Object: 4 MPI processes > type: mpiaij > rows=22878, cols=22878 > total: nonzeros=1219140, allocated nonzeros=1219140 > total number of mallocs used during MatSetValues 
calls =0 > using I-node (on process 0) routines: found 1889 nodes, limit used > is 5 > converged reason: -11 > > ------------------------------------------------------------ > ----------------------------------------- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Mon Sep 19 20:33:53 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Mon, 19 Sep 2016 21:33:53 -0400 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: On Mon, Sep 19, 2016 at 7:26 PM, Dave May wrote: > > > On 19 September 2016 at 21:05, David Knezevic > wrote: > >> When I use MUMPS via PETSc, one issue is that it can sometimes fail with >> MUMPS error -9, which means that MUMPS didn't allocate a big enough >> workspace. This can typically be fixed by increasing MUMPS icntl 14, e.g. >> via the command line option -mat_mumps_icntl_14. >> >> However, instead of having to run several times with different command >> line options, I'd like to be able to automatically increment icntl 14 value >> in a loop until the solve succeeds. >> >> I have a saved matrix which fails when I use it for a solve with MUMPS >> with 4 MPI processes and the default ictnl values, so I'm using this to >> check that I can achieve the automatic icntl 14 update, as described above. >> (The matrix is 14MB so I haven't attached it here, but I'd be happy to send >> it to anyone else who wants to try this test case out.) >> >> I've pasted some test code below which provides a simple test of this >> idea using two solves. The first solve uses the default value of icntl 14, >> which fails, and then we update icntl 14 to 30 and solve again. The second >> solve should succeed since icntl 14 of 30 is sufficient for MUMPS to >> succeed in this case, but for some reason the second solve still fails. >> >> Below I've also pasted the output from -ksp_view, and you can see that >> ictnl 14 is being updated correctly (see the ICNTL(14) lines in the >> output), so it's not clear to me why the second solve fails. It seems like >> MUMPS is ignoring the update to the ictnl value? >> > > I believe this parameter is utilized during the numerical factorization > phase. > In your code, the operator hasn't changed, however you haven't signalled > to the KSP that you want to re-perform the numerical factorization. > You can do this by calling KSPSetOperators() before your second solve. > I think if you do this (please try it), the factorization will be > performed again and the new value of icntl will have an effect. > > Note this is a wild stab in the dark - I haven't dug through the > petsc-mumps code in detail... > That sounds like a plausible guess to me, but unfortunately it didn't work. I added KSPSetOperators(ksp,A,A); before the second solve and I got the same behavior as before. 
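For reference, the retry pattern I'm ultimately aiming for is roughly the
sketch below (same ksp, pc, A, b, x as in the test code; max_attempts is just
an illustrative cap, and the KSPSetOperators() call is the step that, as
described above, does not seem to trigger a new factorization):

  KSPSolve(ksp,b,x);

  KSPConvergedReason reason;
  KSPGetConvergedReason(ksp, &reason);

  PetscInt icntl_14 = 20;              /* MUMPS default workspace increase (%) */
  PetscInt attempt = 0;
  const PetscInt max_attempts = 5;     /* illustrative cap only */

  while (reason < 0 && attempt < max_attempts)
  {
    Mat F;
    PCFactorGetMatrix(pc,&F);
    icntl_14 += 10;
    MatMumpsSetIcntl(F,14,icntl_14);   /* grow the workspace estimate */

    KSPSetOperators(ksp,A,A);          /* supposed to force a refactorization,
                                          but so far it has no effect here */
    KSPSolve(ksp,b,x);
    KSPGetConvergedReason(ksp, &reason);
    ++attempt;
  }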
Thanks, David > ------------------------------------------------------------ >> ----------------------------------------- >> Test code: >> >> Mat A; >> MatCreate(PETSC_COMM_WORLD,&A); >> MatSetType(A,MATMPIAIJ); >> >> PetscViewer petsc_viewer; >> PetscViewerBinaryOpen( PETSC_COMM_WORLD, >> "matrix.dat", >> FILE_MODE_READ, >> &petsc_viewer); >> MatLoad(A, petsc_viewer); >> PetscViewerDestroy(&petsc_viewer); >> >> PetscInt m, n; >> MatGetSize(A, &m, &n); >> >> Vec x; >> VecCreate(PETSC_COMM_WORLD,&x); >> VecSetSizes(x,PETSC_DECIDE,m); >> VecSetFromOptions(x); >> VecSet(x,1.0); >> >> Vec b; >> VecDuplicate(x,&b); >> >> KSP ksp; >> PC pc; >> >> KSPCreate(PETSC_COMM_WORLD,&ksp); >> KSPSetOperators(ksp,A,A); >> >> KSPSetType(ksp,KSPPREONLY); >> KSPGetPC(ksp,&pc); >> >> PCSetType(pc,PCCHOLESKY); >> >> PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); >> PCFactorSetUpMatSolverPackage(pc); >> >> KSPSetFromOptions(ksp); >> KSPSetUp(ksp); >> >> KSPSolve(ksp,b,x); >> >> { >> KSPConvergedReason reason; >> KSPGetConvergedReason(ksp, &reason); >> std::cout << "converged reason: " << reason << std::endl; >> } >> >> Mat F; >> PCFactorGetMatrix(pc,&F); >> MatMumpsSetIcntl(F,14,30); >> >> KSPSolve(ksp,b,x); >> >> { >> KSPConvergedReason reason; >> KSPGetConvergedReason(ksp, &reason); >> std::cout << "converged reason: " << reason << std::endl; >> } >> >> ------------------------------------------------------------ >> ----------------------------------------- >> -ksp_view output (ICNTL(14) changes from 20 to 30, but we get "converged >> reason: -11" for both solves) >> >> KSP Object: 4 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: 4 MPI processes >> type: cholesky >> Cholesky: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 0., needed 0. 
>> Factored matrix follows: >> Mat Object: 4 MPI processes >> type: mpiaij >> rows=22878, cols=22878 >> package used to perform factorization: mumps >> total: nonzeros=3361617, allocated nonzeros=3361617 >> total number of mallocs used during MatSetValues calls =0 >> MUMPS run parameters: >> SYM (matrix type): 2 >> PAR (host participation): 1 >> ICNTL(1) (output for error): 6 >> ICNTL(2) (output of diagnostic msg): 0 >> ICNTL(3) (output for global info): 0 >> ICNTL(4) (level of printing): 0 >> ICNTL(5) (input mat struct): 0 >> ICNTL(6) (matrix prescaling): 7 >> ICNTL(7) (sequentia matrix ordering):7 >> ICNTL(8) (scalling strategy): 77 >> ICNTL(10) (max num of refinements): 0 >> ICNTL(11) (error analysis): 0 >> ICNTL(12) (efficiency control): 0 >> ICNTL(13) (efficiency control): 0 >> ICNTL(14) (percentage of estimated workspace increase): 20 >> ICNTL(18) (input mat struct): 3 >> ICNTL(19) (Shur complement info): 0 >> ICNTL(20) (rhs sparse pattern): 0 >> ICNTL(21) (solution struct): 1 >> ICNTL(22) (in-core/out-of-core facility): 0 >> ICNTL(23) (max size of memory can be allocated locally):0 >> ICNTL(24) (detection of null pivot rows): 0 >> ICNTL(25) (computation of a null space basis): 0 >> ICNTL(26) (Schur options for rhs or solution): 0 >> ICNTL(27) (experimental parameter): -24 >> ICNTL(28) (use parallel or sequential ordering): 1 >> ICNTL(29) (parallel ordering): 0 >> ICNTL(30) (user-specified set of entries in inv(A)): 0 >> ICNTL(31) (factors is discarded in the solve phase): 0 >> ICNTL(33) (compute determinant): 0 >> CNTL(1) (relative pivoting threshold): 0.01 >> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >> CNTL(3) (absolute pivoting threshold): 0. >> CNTL(4) (value of static pivoting): -1. >> CNTL(5) (fixation for null pivots): 0. >> RINFO(1) (local estimated flops for the elimination after >> analysis): >> [0] 1.84947e+08 >> [1] 2.42065e+08 >> [2] 2.53044e+08 >> [3] 2.18441e+08 >> RINFO(2) (local estimated flops for the assembly after >> factorization): >> [0] 945938. >> [1] 906795. >> [2] 897815. >> [3] 998840. 
>> RINFO(3) (local estimated flops for the elimination after >> factorization): >> [0] 1.59835e+08 >> [1] 1.50867e+08 >> [2] 2.27932e+08 >> [3] 1.52037e+08 >> INFO(15) (estimated size of (in MB) MUMPS internal data for >> running numerical factorization): >> [0] 36 >> [1] 37 >> [2] 38 >> [3] 39 >> INFO(16) (size of (in MB) MUMPS internal data used during >> numerical factorization): >> [0] 36 >> [1] 37 >> [2] 38 >> [3] 39 >> INFO(23) (num of pivots eliminated on this processor after >> factorization): >> [0] 6450 >> [1] 5442 >> [2] 4386 >> [3] 5526 >> RINFOG(1) (global estimated flops for the elimination after >> analysis): 8.98497e+08 >> RINFOG(2) (global estimated flops for the assembly after >> factorization): 3.74939e+06 >> RINFOG(3) (global estimated flops for the elimination after >> factorization): 6.9067e+08 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >> (0.,0.)*(2^0) >> INFOG(3) (estimated real workspace for factors on all >> processors after analysis): 4082184 >> INFOG(4) (estimated integer workspace for factors on all >> processors after analysis): 231846 >> INFOG(5) (estimated maximum front size in the complete >> tree): 678 >> INFOG(6) (number of nodes in the complete tree): 1380 >> INFOG(7) (ordering option effectively use after analysis): >> 5 >> INFOG(8) (structural symmetry in percent of the permuted >> matrix after analysis): 100 >> INFOG(9) (total real/complex workspace to store the matrix >> factors after factorization): 3521904 >> INFOG(10) (total integer space store the matrix factors >> after factorization): 229416 >> INFOG(11) (order of largest frontal matrix after >> factorization): 678 >> INFOG(12) (number of off-diagonal pivots): 0 >> INFOG(13) (number of delayed pivots after factorization): 0 >> INFOG(14) (number of memory compress after factorization): >> 0 >> INFOG(15) (number of steps of iterative refinement after >> solution): 0 >> INFOG(16) (estimated size (in MB) of all MUMPS internal >> data for factorization after analysis: value on the most memory consuming >> processor): 39 >> INFOG(17) (estimated size of all MUMPS internal data for >> factorization after analysis: sum over all processors): 150 >> INFOG(18) (size of all MUMPS internal data allocated during >> factorization: value on the most memory consuming processor): 39 >> INFOG(19) (size of all MUMPS internal data allocated during >> factorization: sum over all processors): 150 >> INFOG(20) (estimated number of entries in the factors): >> 3361617 >> INFOG(21) (size in MB of memory effectively used during >> factorization - value on the most memory consuming processor): 35 >> INFOG(22) (size in MB of memory effectively used during >> factorization - sum over all processors): 136 >> INFOG(23) (after analysis: value of ICNTL(6) effectively >> used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) effectively >> used): 1 >> INFOG(25) (after factorization: number of pivots modified >> by static pivoting): 0 >> INFOG(28) (after factorization: number of null pivots >> encountered): 0 >> INFOG(29) (after factorization: effective number of entries >> in the factors (sum over all processors)): 2931438 >> INFOG(30, 31) (after solution: size in Mbytes of memory >> used during solution phase): 0, 0 >> INFOG(32) (after analysis: type of analysis done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if determinant is >> requested): 0 >> linear system matrix = precond matrix: >> Mat Object: 4 MPI processes >> type: mpiaij >> rows=22878, cols=22878 >> total: 
nonzeros=1219140, allocated nonzeros=1219140 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 1889 nodes, limit used >> is 5 >> converged reason: -11 >> KSP Object: 4 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: 4 MPI processes >> type: cholesky >> Cholesky: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: natural >> factor fill ratio given 0., needed 0. >> Factored matrix follows: >> Mat Object: 4 MPI processes >> type: mpiaij >> rows=22878, cols=22878 >> package used to perform factorization: mumps >> total: nonzeros=3361617, allocated nonzeros=3361617 >> total number of mallocs used during MatSetValues calls =0 >> MUMPS run parameters: >> SYM (matrix type): 2 >> PAR (host participation): 1 >> ICNTL(1) (output for error): 6 >> ICNTL(2) (output of diagnostic msg): 0 >> ICNTL(3) (output for global info): 0 >> ICNTL(4) (level of printing): 0 >> ICNTL(5) (input mat struct): 0 >> ICNTL(6) (matrix prescaling): 7 >> ICNTL(7) (sequentia matrix ordering):7 >> ICNTL(8) (scalling strategy): 77 >> ICNTL(10) (max num of refinements): 0 >> ICNTL(11) (error analysis): 0 >> ICNTL(12) (efficiency control): 0 >> ICNTL(13) (efficiency control): 0 >> ICNTL(14) (percentage of estimated workspace increase): 30 >> ICNTL(18) (input mat struct): 3 >> ICNTL(19) (Shur complement info): 0 >> ICNTL(20) (rhs sparse pattern): 0 >> ICNTL(21) (solution struct): 1 >> ICNTL(22) (in-core/out-of-core facility): 0 >> ICNTL(23) (max size of memory can be allocated locally):0 >> ICNTL(24) (detection of null pivot rows): 0 >> ICNTL(25) (computation of a null space basis): 0 >> ICNTL(26) (Schur options for rhs or solution): 0 >> ICNTL(27) (experimental parameter): -24 >> ICNTL(28) (use parallel or sequential ordering): 1 >> ICNTL(29) (parallel ordering): 0 >> ICNTL(30) (user-specified set of entries in inv(A)): 0 >> ICNTL(31) (factors is discarded in the solve phase): 0 >> ICNTL(33) (compute determinant): 0 >> CNTL(1) (relative pivoting threshold): 0.01 >> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >> CNTL(3) (absolute pivoting threshold): 0. >> CNTL(4) (value of static pivoting): -1. >> CNTL(5) (fixation for null pivots): 0. >> RINFO(1) (local estimated flops for the elimination after >> analysis): >> [0] 1.84947e+08 >> [1] 2.42065e+08 >> [2] 2.53044e+08 >> [3] 2.18441e+08 >> RINFO(2) (local estimated flops for the assembly after >> factorization): >> [0] 945938. >> [1] 906795. >> [2] 897815. >> [3] 998840. 
>> RINFO(3) (local estimated flops for the elimination after >> factorization): >> [0] 1.59835e+08 >> [1] 1.50867e+08 >> [2] 2.27932e+08 >> [3] 1.52037e+08 >> INFO(15) (estimated size of (in MB) MUMPS internal data for >> running numerical factorization): >> [0] 36 >> [1] 37 >> [2] 38 >> [3] 39 >> INFO(16) (size of (in MB) MUMPS internal data used during >> numerical factorization): >> [0] 36 >> [1] 37 >> [2] 38 >> [3] 39 >> INFO(23) (num of pivots eliminated on this processor after >> factorization): >> [0] 6450 >> [1] 5442 >> [2] 4386 >> [3] 5526 >> RINFOG(1) (global estimated flops for the elimination after >> analysis): 8.98497e+08 >> RINFOG(2) (global estimated flops for the assembly after >> factorization): 3.74939e+06 >> RINFOG(3) (global estimated flops for the elimination after >> factorization): 6.9067e+08 >> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >> (0.,0.)*(2^0) >> INFOG(3) (estimated real workspace for factors on all >> processors after analysis): 4082184 >> INFOG(4) (estimated integer workspace for factors on all >> processors after analysis): 231846 >> INFOG(5) (estimated maximum front size in the complete >> tree): 678 >> INFOG(6) (number of nodes in the complete tree): 1380 >> INFOG(7) (ordering option effectively use after analysis): >> 5 >> INFOG(8) (structural symmetry in percent of the permuted >> matrix after analysis): 100 >> INFOG(9) (total real/complex workspace to store the matrix >> factors after factorization): 3521904 >> INFOG(10) (total integer space store the matrix factors >> after factorization): 229416 >> INFOG(11) (order of largest frontal matrix after >> factorization): 678 >> INFOG(12) (number of off-diagonal pivots): 0 >> INFOG(13) (number of delayed pivots after factorization): 0 >> INFOG(14) (number of memory compress after factorization): >> 0 >> INFOG(15) (number of steps of iterative refinement after >> solution): 0 >> INFOG(16) (estimated size (in MB) of all MUMPS internal >> data for factorization after analysis: value on the most memory consuming >> processor): 39 >> INFOG(17) (estimated size of all MUMPS internal data for >> factorization after analysis: sum over all processors): 150 >> INFOG(18) (size of all MUMPS internal data allocated during >> factorization: value on the most memory consuming processor): 39 >> INFOG(19) (size of all MUMPS internal data allocated during >> factorization: sum over all processors): 150 >> INFOG(20) (estimated number of entries in the factors): >> 3361617 >> INFOG(21) (size in MB of memory effectively used during >> factorization - value on the most memory consuming processor): 35 >> INFOG(22) (size in MB of memory effectively used during >> factorization - sum over all processors): 136 >> INFOG(23) (after analysis: value of ICNTL(6) effectively >> used): 0 >> INFOG(24) (after analysis: value of ICNTL(12) effectively >> used): 1 >> INFOG(25) (after factorization: number of pivots modified >> by static pivoting): 0 >> INFOG(28) (after factorization: number of null pivots >> encountered): 0 >> INFOG(29) (after factorization: effective number of entries >> in the factors (sum over all processors)): 2931438 >> INFOG(30, 31) (after solution: size in Mbytes of memory >> used during solution phase): 0, 0 >> INFOG(32) (after analysis: type of analysis done): 1 >> INFOG(33) (value used for ICNTL(8)): 7 >> INFOG(34) (exponent of the determinant if determinant is >> requested): 0 >> linear system matrix = precond matrix: >> Mat Object: 4 MPI processes >> type: mpiaij >> rows=22878, cols=22878 >> total: 
nonzeros=1219140, allocated nonzeros=1219140 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 1889 nodes, limit used >> is 5 >> converged reason: -11 >> >> ------------------------------------------------------------ >> ----------------------------------------- >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From keceli at gmail.com Mon Sep 19 20:38:30 2016 From: keceli at gmail.com (=?UTF-8?Q?murat_ke=C3=A7eli?=) Date: Mon, 19 Sep 2016 20:38:30 -0500 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: Another guess: maybe you also need KSPSetUp(ksp); before the second KSPSolve(ksp,b,x);. Murat Keceli ? On Mon, Sep 19, 2016 at 8:33 PM, David Knezevic wrote: > On Mon, Sep 19, 2016 at 7:26 PM, Dave May wrote: > >> >> >> On 19 September 2016 at 21:05, David Knezevic > > wrote: >> >>> When I use MUMPS via PETSc, one issue is that it can sometimes fail with >>> MUMPS error -9, which means that MUMPS didn't allocate a big enough >>> workspace. This can typically be fixed by increasing MUMPS icntl 14, e.g. >>> via the command line option -mat_mumps_icntl_14. >>> >>> However, instead of having to run several times with different command >>> line options, I'd like to be able to automatically increment icntl 14 value >>> in a loop until the solve succeeds. >>> >>> I have a saved matrix which fails when I use it for a solve with MUMPS >>> with 4 MPI processes and the default ictnl values, so I'm using this to >>> check that I can achieve the automatic icntl 14 update, as described above. >>> (The matrix is 14MB so I haven't attached it here, but I'd be happy to send >>> it to anyone else who wants to try this test case out.) >>> >>> I've pasted some test code below which provides a simple test of this >>> idea using two solves. The first solve uses the default value of icntl 14, >>> which fails, and then we update icntl 14 to 30 and solve again. The second >>> solve should succeed since icntl 14 of 30 is sufficient for MUMPS to >>> succeed in this case, but for some reason the second solve still fails. >>> >>> Below I've also pasted the output from -ksp_view, and you can see that >>> ictnl 14 is being updated correctly (see the ICNTL(14) lines in the >>> output), so it's not clear to me why the second solve fails. It seems like >>> MUMPS is ignoring the update to the ictnl value? >>> >> >> I believe this parameter is utilized during the numerical factorization >> phase. >> In your code, the operator hasn't changed, however you haven't signalled >> to the KSP that you want to re-perform the numerical factorization. >> You can do this by calling KSPSetOperators() before your second solve. >> I think if you do this (please try it), the factorization will be >> performed again and the new value of icntl will have an effect. >> >> Note this is a wild stab in the dark - I haven't dug through the >> petsc-mumps code in detail... >> > > That sounds like a plausible guess to me, but unfortunately it didn't > work. I added KSPSetOperators(ksp,A,A); before the second solve and I got > the same behavior as before. 
> > Thanks, > David > > > > > >> ------------------------------------------------------------ >>> ----------------------------------------- >>> Test code: >>> >>> Mat A; >>> MatCreate(PETSC_COMM_WORLD,&A); >>> MatSetType(A,MATMPIAIJ); >>> >>> PetscViewer petsc_viewer; >>> PetscViewerBinaryOpen( PETSC_COMM_WORLD, >>> "matrix.dat", >>> FILE_MODE_READ, >>> &petsc_viewer); >>> MatLoad(A, petsc_viewer); >>> PetscViewerDestroy(&petsc_viewer); >>> >>> PetscInt m, n; >>> MatGetSize(A, &m, &n); >>> >>> Vec x; >>> VecCreate(PETSC_COMM_WORLD,&x); >>> VecSetSizes(x,PETSC_DECIDE,m); >>> VecSetFromOptions(x); >>> VecSet(x,1.0); >>> >>> Vec b; >>> VecDuplicate(x,&b); >>> >>> KSP ksp; >>> PC pc; >>> >>> KSPCreate(PETSC_COMM_WORLD,&ksp); >>> KSPSetOperators(ksp,A,A); >>> >>> KSPSetType(ksp,KSPPREONLY); >>> KSPGetPC(ksp,&pc); >>> >>> PCSetType(pc,PCCHOLESKY); >>> >>> PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); >>> PCFactorSetUpMatSolverPackage(pc); >>> >>> KSPSetFromOptions(ksp); >>> KSPSetUp(ksp); >>> >>> KSPSolve(ksp,b,x); >>> >>> { >>> KSPConvergedReason reason; >>> KSPGetConvergedReason(ksp, &reason); >>> std::cout << "converged reason: " << reason << std::endl; >>> } >>> >>> Mat F; >>> PCFactorGetMatrix(pc,&F); >>> MatMumpsSetIcntl(F,14,30); >>> >>> KSPSolve(ksp,b,x); >>> >>> { >>> KSPConvergedReason reason; >>> KSPGetConvergedReason(ksp, &reason); >>> std::cout << "converged reason: " << reason << std::endl; >>> } >>> >>> ------------------------------------------------------------ >>> ----------------------------------------- >>> -ksp_view output (ICNTL(14) changes from 20 to 30, but we get "converged >>> reason: -11" for both solves) >>> >>> KSP Object: 4 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: 4 MPI processes >>> type: cholesky >>> Cholesky: out-of-place factorization >>> tolerance for zero pivot 2.22045e-14 >>> matrix ordering: natural >>> factor fill ratio given 0., needed 0. 
>>> Factored matrix follows: >>> Mat Object: 4 MPI processes >>> type: mpiaij >>> rows=22878, cols=22878 >>> package used to perform factorization: mumps >>> total: nonzeros=3361617, allocated nonzeros=3361617 >>> total number of mallocs used during MatSetValues calls =0 >>> MUMPS run parameters: >>> SYM (matrix type): 2 >>> PAR (host participation): 1 >>> ICNTL(1) (output for error): 6 >>> ICNTL(2) (output of diagnostic msg): 0 >>> ICNTL(3) (output for global info): 0 >>> ICNTL(4) (level of printing): 0 >>> ICNTL(5) (input mat struct): 0 >>> ICNTL(6) (matrix prescaling): 7 >>> ICNTL(7) (sequentia matrix ordering):7 >>> ICNTL(8) (scalling strategy): 77 >>> ICNTL(10) (max num of refinements): 0 >>> ICNTL(11) (error analysis): 0 >>> ICNTL(12) (efficiency control): 0 >>> ICNTL(13) (efficiency control): 0 >>> ICNTL(14) (percentage of estimated workspace increase): 20 >>> ICNTL(18) (input mat struct): 3 >>> ICNTL(19) (Shur complement info): 0 >>> ICNTL(20) (rhs sparse pattern): 0 >>> ICNTL(21) (solution struct): 1 >>> ICNTL(22) (in-core/out-of-core facility): 0 >>> ICNTL(23) (max size of memory can be allocated locally):0 >>> ICNTL(24) (detection of null pivot rows): 0 >>> ICNTL(25) (computation of a null space basis): 0 >>> ICNTL(26) (Schur options for rhs or solution): 0 >>> ICNTL(27) (experimental parameter): >>> -24 >>> ICNTL(28) (use parallel or sequential ordering): 1 >>> ICNTL(29) (parallel ordering): 0 >>> ICNTL(30) (user-specified set of entries in inv(A)): 0 >>> ICNTL(31) (factors is discarded in the solve phase): 0 >>> ICNTL(33) (compute determinant): 0 >>> CNTL(1) (relative pivoting threshold): 0.01 >>> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >>> CNTL(3) (absolute pivoting threshold): 0. >>> CNTL(4) (value of static pivoting): -1. >>> CNTL(5) (fixation for null pivots): 0. >>> RINFO(1) (local estimated flops for the elimination after >>> analysis): >>> [0] 1.84947e+08 >>> [1] 2.42065e+08 >>> [2] 2.53044e+08 >>> [3] 2.18441e+08 >>> RINFO(2) (local estimated flops for the assembly after >>> factorization): >>> [0] 945938. >>> [1] 906795. >>> [2] 897815. >>> [3] 998840. 
>>> RINFO(3) (local estimated flops for the elimination after >>> factorization): >>> [0] 1.59835e+08 >>> [1] 1.50867e+08 >>> [2] 2.27932e+08 >>> [3] 1.52037e+08 >>> INFO(15) (estimated size of (in MB) MUMPS internal data >>> for running numerical factorization): >>> [0] 36 >>> [1] 37 >>> [2] 38 >>> [3] 39 >>> INFO(16) (size of (in MB) MUMPS internal data used during >>> numerical factorization): >>> [0] 36 >>> [1] 37 >>> [2] 38 >>> [3] 39 >>> INFO(23) (num of pivots eliminated on this processor after >>> factorization): >>> [0] 6450 >>> [1] 5442 >>> [2] 4386 >>> [3] 5526 >>> RINFOG(1) (global estimated flops for the elimination >>> after analysis): 8.98497e+08 >>> RINFOG(2) (global estimated flops for the assembly after >>> factorization): 3.74939e+06 >>> RINFOG(3) (global estimated flops for the elimination >>> after factorization): 6.9067e+08 >>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>> (0.,0.)*(2^0) >>> INFOG(3) (estimated real workspace for factors on all >>> processors after analysis): 4082184 >>> INFOG(4) (estimated integer workspace for factors on all >>> processors after analysis): 231846 >>> INFOG(5) (estimated maximum front size in the complete >>> tree): 678 >>> INFOG(6) (number of nodes in the complete tree): 1380 >>> INFOG(7) (ordering option effectively use after analysis): >>> 5 >>> INFOG(8) (structural symmetry in percent of the permuted >>> matrix after analysis): 100 >>> INFOG(9) (total real/complex workspace to store the matrix >>> factors after factorization): 3521904 >>> INFOG(10) (total integer space store the matrix factors >>> after factorization): 229416 >>> INFOG(11) (order of largest frontal matrix after >>> factorization): 678 >>> INFOG(12) (number of off-diagonal pivots): 0 >>> INFOG(13) (number of delayed pivots after factorization): >>> 0 >>> INFOG(14) (number of memory compress after factorization): >>> 0 >>> INFOG(15) (number of steps of iterative refinement after >>> solution): 0 >>> INFOG(16) (estimated size (in MB) of all MUMPS internal >>> data for factorization after analysis: value on the most memory consuming >>> processor): 39 >>> INFOG(17) (estimated size of all MUMPS internal data for >>> factorization after analysis: sum over all processors): 150 >>> INFOG(18) (size of all MUMPS internal data allocated >>> during factorization: value on the most memory consuming processor): 39 >>> INFOG(19) (size of all MUMPS internal data allocated >>> during factorization: sum over all processors): 150 >>> INFOG(20) (estimated number of entries in the factors): >>> 3361617 >>> INFOG(21) (size in MB of memory effectively used during >>> factorization - value on the most memory consuming processor): 35 >>> INFOG(22) (size in MB of memory effectively used during >>> factorization - sum over all processors): 136 >>> INFOG(23) (after analysis: value of ICNTL(6) effectively >>> used): 0 >>> INFOG(24) (after analysis: value of ICNTL(12) effectively >>> used): 1 >>> INFOG(25) (after factorization: number of pivots modified >>> by static pivoting): 0 >>> INFOG(28) (after factorization: number of null pivots >>> encountered): 0 >>> INFOG(29) (after factorization: effective number of >>> entries in the factors (sum over all processors)): 2931438 >>> INFOG(30, 31) (after solution: size in Mbytes of memory >>> used during solution phase): 0, 0 >>> INFOG(32) (after analysis: type of analysis done): 1 >>> INFOG(33) (value used for ICNTL(8)): 7 >>> INFOG(34) (exponent of the determinant if determinant is >>> requested): 0 >>> linear system matrix = precond 
matrix: >>> Mat Object: 4 MPI processes >>> type: mpiaij >>> rows=22878, cols=22878 >>> total: nonzeros=1219140, allocated nonzeros=1219140 >>> total number of mallocs used during MatSetValues calls =0 >>> using I-node (on process 0) routines: found 1889 nodes, limit used >>> is 5 >>> converged reason: -11 >>> KSP Object: 4 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: 4 MPI processes >>> type: cholesky >>> Cholesky: out-of-place factorization >>> tolerance for zero pivot 2.22045e-14 >>> matrix ordering: natural >>> factor fill ratio given 0., needed 0. >>> Factored matrix follows: >>> Mat Object: 4 MPI processes >>> type: mpiaij >>> rows=22878, cols=22878 >>> package used to perform factorization: mumps >>> total: nonzeros=3361617, allocated nonzeros=3361617 >>> total number of mallocs used during MatSetValues calls =0 >>> MUMPS run parameters: >>> SYM (matrix type): 2 >>> PAR (host participation): 1 >>> ICNTL(1) (output for error): 6 >>> ICNTL(2) (output of diagnostic msg): 0 >>> ICNTL(3) (output for global info): 0 >>> ICNTL(4) (level of printing): 0 >>> ICNTL(5) (input mat struct): 0 >>> ICNTL(6) (matrix prescaling): 7 >>> ICNTL(7) (sequentia matrix ordering):7 >>> ICNTL(8) (scalling strategy): 77 >>> ICNTL(10) (max num of refinements): 0 >>> ICNTL(11) (error analysis): 0 >>> ICNTL(12) (efficiency control): 0 >>> ICNTL(13) (efficiency control): 0 >>> ICNTL(14) (percentage of estimated workspace increase): 30 >>> ICNTL(18) (input mat struct): 3 >>> ICNTL(19) (Shur complement info): 0 >>> ICNTL(20) (rhs sparse pattern): 0 >>> ICNTL(21) (solution struct): 1 >>> ICNTL(22) (in-core/out-of-core facility): 0 >>> ICNTL(23) (max size of memory can be allocated locally):0 >>> ICNTL(24) (detection of null pivot rows): 0 >>> ICNTL(25) (computation of a null space basis): 0 >>> ICNTL(26) (Schur options for rhs or solution): 0 >>> ICNTL(27) (experimental parameter): >>> -24 >>> ICNTL(28) (use parallel or sequential ordering): 1 >>> ICNTL(29) (parallel ordering): 0 >>> ICNTL(30) (user-specified set of entries in inv(A)): 0 >>> ICNTL(31) (factors is discarded in the solve phase): 0 >>> ICNTL(33) (compute determinant): 0 >>> CNTL(1) (relative pivoting threshold): 0.01 >>> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >>> CNTL(3) (absolute pivoting threshold): 0. >>> CNTL(4) (value of static pivoting): -1. >>> CNTL(5) (fixation for null pivots): 0. >>> RINFO(1) (local estimated flops for the elimination after >>> analysis): >>> [0] 1.84947e+08 >>> [1] 2.42065e+08 >>> [2] 2.53044e+08 >>> [3] 2.18441e+08 >>> RINFO(2) (local estimated flops for the assembly after >>> factorization): >>> [0] 945938. >>> [1] 906795. >>> [2] 897815. >>> [3] 998840. 
>>> RINFO(3) (local estimated flops for the elimination after >>> factorization): >>> [0] 1.59835e+08 >>> [1] 1.50867e+08 >>> [2] 2.27932e+08 >>> [3] 1.52037e+08 >>> INFO(15) (estimated size of (in MB) MUMPS internal data >>> for running numerical factorization): >>> [0] 36 >>> [1] 37 >>> [2] 38 >>> [3] 39 >>> INFO(16) (size of (in MB) MUMPS internal data used during >>> numerical factorization): >>> [0] 36 >>> [1] 37 >>> [2] 38 >>> [3] 39 >>> INFO(23) (num of pivots eliminated on this processor after >>> factorization): >>> [0] 6450 >>> [1] 5442 >>> [2] 4386 >>> [3] 5526 >>> RINFOG(1) (global estimated flops for the elimination >>> after analysis): 8.98497e+08 >>> RINFOG(2) (global estimated flops for the assembly after >>> factorization): 3.74939e+06 >>> RINFOG(3) (global estimated flops for the elimination >>> after factorization): 6.9067e+08 >>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>> (0.,0.)*(2^0) >>> INFOG(3) (estimated real workspace for factors on all >>> processors after analysis): 4082184 >>> INFOG(4) (estimated integer workspace for factors on all >>> processors after analysis): 231846 >>> INFOG(5) (estimated maximum front size in the complete >>> tree): 678 >>> INFOG(6) (number of nodes in the complete tree): 1380 >>> INFOG(7) (ordering option effectively use after analysis): >>> 5 >>> INFOG(8) (structural symmetry in percent of the permuted >>> matrix after analysis): 100 >>> INFOG(9) (total real/complex workspace to store the matrix >>> factors after factorization): 3521904 >>> INFOG(10) (total integer space store the matrix factors >>> after factorization): 229416 >>> INFOG(11) (order of largest frontal matrix after >>> factorization): 678 >>> INFOG(12) (number of off-diagonal pivots): 0 >>> INFOG(13) (number of delayed pivots after factorization): >>> 0 >>> INFOG(14) (number of memory compress after factorization): >>> 0 >>> INFOG(15) (number of steps of iterative refinement after >>> solution): 0 >>> INFOG(16) (estimated size (in MB) of all MUMPS internal >>> data for factorization after analysis: value on the most memory consuming >>> processor): 39 >>> INFOG(17) (estimated size of all MUMPS internal data for >>> factorization after analysis: sum over all processors): 150 >>> INFOG(18) (size of all MUMPS internal data allocated >>> during factorization: value on the most memory consuming processor): 39 >>> INFOG(19) (size of all MUMPS internal data allocated >>> during factorization: sum over all processors): 150 >>> INFOG(20) (estimated number of entries in the factors): >>> 3361617 >>> INFOG(21) (size in MB of memory effectively used during >>> factorization - value on the most memory consuming processor): 35 >>> INFOG(22) (size in MB of memory effectively used during >>> factorization - sum over all processors): 136 >>> INFOG(23) (after analysis: value of ICNTL(6) effectively >>> used): 0 >>> INFOG(24) (after analysis: value of ICNTL(12) effectively >>> used): 1 >>> INFOG(25) (after factorization: number of pivots modified >>> by static pivoting): 0 >>> INFOG(28) (after factorization: number of null pivots >>> encountered): 0 >>> INFOG(29) (after factorization: effective number of >>> entries in the factors (sum over all processors)): 2931438 >>> INFOG(30, 31) (after solution: size in Mbytes of memory >>> used during solution phase): 0, 0 >>> INFOG(32) (after analysis: type of analysis done): 1 >>> INFOG(33) (value used for ICNTL(8)): 7 >>> INFOG(34) (exponent of the determinant if determinant is >>> requested): 0 >>> linear system matrix = precond 
matrix: >>> Mat Object: 4 MPI processes >>> type: mpiaij >>> rows=22878, cols=22878 >>> total: nonzeros=1219140, allocated nonzeros=1219140 >>> total number of mallocs used during MatSetValues calls =0 >>> using I-node (on process 0) routines: found 1889 nodes, limit used >>> is 5 >>> converged reason: -11 >>> >>> ------------------------------------------------------------ >>> ----------------------------------------- >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Mon Sep 19 20:45:38 2016 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 19 Sep 2016 19:45:38 -0600 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: Placing PCReset(PC pc) before the second kspsolve might works. Fande Kong, On Mon, Sep 19, 2016 at 7:38 PM, murat ke?eli wrote: > Another guess: maybe you also need KSPSetUp(ksp); before the second > KSPSolve(ksp,b,x);. > > Murat Keceli > ? > > On Mon, Sep 19, 2016 at 8:33 PM, David Knezevic < > david.knezevic at akselos.com> wrote: > >> On Mon, Sep 19, 2016 at 7:26 PM, Dave May >> wrote: >> >>> >>> >>> On 19 September 2016 at 21:05, David Knezevic < >>> david.knezevic at akselos.com> wrote: >>> >>>> When I use MUMPS via PETSc, one issue is that it can sometimes fail >>>> with MUMPS error -9, which means that MUMPS didn't allocate a big enough >>>> workspace. This can typically be fixed by increasing MUMPS icntl 14, e.g. >>>> via the command line option -mat_mumps_icntl_14. >>>> >>>> However, instead of having to run several times with different command >>>> line options, I'd like to be able to automatically increment icntl 14 value >>>> in a loop until the solve succeeds. >>>> >>>> I have a saved matrix which fails when I use it for a solve with MUMPS >>>> with 4 MPI processes and the default ictnl values, so I'm using this to >>>> check that I can achieve the automatic icntl 14 update, as described above. >>>> (The matrix is 14MB so I haven't attached it here, but I'd be happy to send >>>> it to anyone else who wants to try this test case out.) >>>> >>>> I've pasted some test code below which provides a simple test of this >>>> idea using two solves. The first solve uses the default value of icntl 14, >>>> which fails, and then we update icntl 14 to 30 and solve again. The second >>>> solve should succeed since icntl 14 of 30 is sufficient for MUMPS to >>>> succeed in this case, but for some reason the second solve still fails. >>>> >>>> Below I've also pasted the output from -ksp_view, and you can see that >>>> ictnl 14 is being updated correctly (see the ICNTL(14) lines in the >>>> output), so it's not clear to me why the second solve fails. It seems like >>>> MUMPS is ignoring the update to the ictnl value? >>>> >>> >>> I believe this parameter is utilized during the numerical factorization >>> phase. >>> In your code, the operator hasn't changed, however you haven't signalled >>> to the KSP that you want to re-perform the numerical factorization. >>> You can do this by calling KSPSetOperators() before your second solve. >>> I think if you do this (please try it), the factorization will be >>> performed again and the new value of icntl will have an effect. >>> >>> Note this is a wild stab in the dark - I haven't dug through the >>> petsc-mumps code in detail... >>> >> >> That sounds like a plausible guess to me, but unfortunately it didn't >> work. I added KSPSetOperators(ksp,A,A); before the second solve and I >> got the same behavior as before. 
>> >> Thanks, >> David >> >> >> >> >> >>> ------------------------------------------------------------ >>>> ----------------------------------------- >>>> Test code: >>>> >>>> Mat A; >>>> MatCreate(PETSC_COMM_WORLD,&A); >>>> MatSetType(A,MATMPIAIJ); >>>> >>>> PetscViewer petsc_viewer; >>>> PetscViewerBinaryOpen( PETSC_COMM_WORLD, >>>> "matrix.dat", >>>> FILE_MODE_READ, >>>> &petsc_viewer); >>>> MatLoad(A, petsc_viewer); >>>> PetscViewerDestroy(&petsc_viewer); >>>> >>>> PetscInt m, n; >>>> MatGetSize(A, &m, &n); >>>> >>>> Vec x; >>>> VecCreate(PETSC_COMM_WORLD,&x); >>>> VecSetSizes(x,PETSC_DECIDE,m); >>>> VecSetFromOptions(x); >>>> VecSet(x,1.0); >>>> >>>> Vec b; >>>> VecDuplicate(x,&b); >>>> >>>> KSP ksp; >>>> PC pc; >>>> >>>> KSPCreate(PETSC_COMM_WORLD,&ksp); >>>> KSPSetOperators(ksp,A,A); >>>> >>>> KSPSetType(ksp,KSPPREONLY); >>>> KSPGetPC(ksp,&pc); >>>> >>>> PCSetType(pc,PCCHOLESKY); >>>> >>>> PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); >>>> PCFactorSetUpMatSolverPackage(pc); >>>> >>>> KSPSetFromOptions(ksp); >>>> KSPSetUp(ksp); >>>> >>>> KSPSolve(ksp,b,x); >>>> >>>> { >>>> KSPConvergedReason reason; >>>> KSPGetConvergedReason(ksp, &reason); >>>> std::cout << "converged reason: " << reason << std::endl; >>>> } >>>> >>>> Mat F; >>>> PCFactorGetMatrix(pc,&F); >>>> MatMumpsSetIcntl(F,14,30); >>>> >>>> KSPSolve(ksp,b,x); >>>> >>>> { >>>> KSPConvergedReason reason; >>>> KSPGetConvergedReason(ksp, &reason); >>>> std::cout << "converged reason: " << reason << std::endl; >>>> } >>>> >>>> ------------------------------------------------------------ >>>> ----------------------------------------- >>>> -ksp_view output (ICNTL(14) changes from 20 to 30, but we get >>>> "converged reason: -11" for both solves) >>>> >>>> KSP Object: 4 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: 4 MPI processes >>>> type: cholesky >>>> Cholesky: out-of-place factorization >>>> tolerance for zero pivot 2.22045e-14 >>>> matrix ordering: natural >>>> factor fill ratio given 0., needed 0. 
>>>> Factored matrix follows: >>>> Mat Object: 4 MPI processes >>>> type: mpiaij >>>> rows=22878, cols=22878 >>>> package used to perform factorization: mumps >>>> total: nonzeros=3361617, allocated nonzeros=3361617 >>>> total number of mallocs used during MatSetValues calls =0 >>>> MUMPS run parameters: >>>> SYM (matrix type): 2 >>>> PAR (host participation): 1 >>>> ICNTL(1) (output for error): 6 >>>> ICNTL(2) (output of diagnostic msg): 0 >>>> ICNTL(3) (output for global info): 0 >>>> ICNTL(4) (level of printing): 0 >>>> ICNTL(5) (input mat struct): 0 >>>> ICNTL(6) (matrix prescaling): 7 >>>> ICNTL(7) (sequentia matrix ordering):7 >>>> ICNTL(8) (scalling strategy): 77 >>>> ICNTL(10) (max num of refinements): 0 >>>> ICNTL(11) (error analysis): 0 >>>> ICNTL(12) (efficiency control): 0 >>>> ICNTL(13) (efficiency control): 0 >>>> ICNTL(14) (percentage of estimated workspace increase): >>>> 20 >>>> ICNTL(18) (input mat struct): 3 >>>> ICNTL(19) (Shur complement info): 0 >>>> ICNTL(20) (rhs sparse pattern): 0 >>>> ICNTL(21) (solution struct): 1 >>>> ICNTL(22) (in-core/out-of-core facility): 0 >>>> ICNTL(23) (max size of memory can be allocated locally):0 >>>> ICNTL(24) (detection of null pivot rows): 0 >>>> ICNTL(25) (computation of a null space basis): 0 >>>> ICNTL(26) (Schur options for rhs or solution): 0 >>>> ICNTL(27) (experimental parameter): >>>> -24 >>>> ICNTL(28) (use parallel or sequential ordering): 1 >>>> ICNTL(29) (parallel ordering): 0 >>>> ICNTL(30) (user-specified set of entries in inv(A)): 0 >>>> ICNTL(31) (factors is discarded in the solve phase): 0 >>>> ICNTL(33) (compute determinant): 0 >>>> CNTL(1) (relative pivoting threshold): 0.01 >>>> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >>>> CNTL(3) (absolute pivoting threshold): 0. >>>> CNTL(4) (value of static pivoting): -1. >>>> CNTL(5) (fixation for null pivots): 0. >>>> RINFO(1) (local estimated flops for the elimination after >>>> analysis): >>>> [0] 1.84947e+08 >>>> [1] 2.42065e+08 >>>> [2] 2.53044e+08 >>>> [3] 2.18441e+08 >>>> RINFO(2) (local estimated flops for the assembly after >>>> factorization): >>>> [0] 945938. >>>> [1] 906795. >>>> [2] 897815. >>>> [3] 998840. 
>>>> RINFO(3) (local estimated flops for the elimination after >>>> factorization): >>>> [0] 1.59835e+08 >>>> [1] 1.50867e+08 >>>> [2] 2.27932e+08 >>>> [3] 1.52037e+08 >>>> INFO(15) (estimated size of (in MB) MUMPS internal data >>>> for running numerical factorization): >>>> [0] 36 >>>> [1] 37 >>>> [2] 38 >>>> [3] 39 >>>> INFO(16) (size of (in MB) MUMPS internal data used during >>>> numerical factorization): >>>> [0] 36 >>>> [1] 37 >>>> [2] 38 >>>> [3] 39 >>>> INFO(23) (num of pivots eliminated on this processor >>>> after factorization): >>>> [0] 6450 >>>> [1] 5442 >>>> [2] 4386 >>>> [3] 5526 >>>> RINFOG(1) (global estimated flops for the elimination >>>> after analysis): 8.98497e+08 >>>> RINFOG(2) (global estimated flops for the assembly after >>>> factorization): 3.74939e+06 >>>> RINFOG(3) (global estimated flops for the elimination >>>> after factorization): 6.9067e+08 >>>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>>> (0.,0.)*(2^0) >>>> INFOG(3) (estimated real workspace for factors on all >>>> processors after analysis): 4082184 >>>> INFOG(4) (estimated integer workspace for factors on all >>>> processors after analysis): 231846 >>>> INFOG(5) (estimated maximum front size in the complete >>>> tree): 678 >>>> INFOG(6) (number of nodes in the complete tree): 1380 >>>> INFOG(7) (ordering option effectively use after >>>> analysis): 5 >>>> INFOG(8) (structural symmetry in percent of the permuted >>>> matrix after analysis): 100 >>>> INFOG(9) (total real/complex workspace to store the >>>> matrix factors after factorization): 3521904 >>>> INFOG(10) (total integer space store the matrix factors >>>> after factorization): 229416 >>>> INFOG(11) (order of largest frontal matrix after >>>> factorization): 678 >>>> INFOG(12) (number of off-diagonal pivots): 0 >>>> INFOG(13) (number of delayed pivots after factorization): >>>> 0 >>>> INFOG(14) (number of memory compress after >>>> factorization): 0 >>>> INFOG(15) (number of steps of iterative refinement after >>>> solution): 0 >>>> INFOG(16) (estimated size (in MB) of all MUMPS internal >>>> data for factorization after analysis: value on the most memory consuming >>>> processor): 39 >>>> INFOG(17) (estimated size of all MUMPS internal data for >>>> factorization after analysis: sum over all processors): 150 >>>> INFOG(18) (size of all MUMPS internal data allocated >>>> during factorization: value on the most memory consuming processor): 39 >>>> INFOG(19) (size of all MUMPS internal data allocated >>>> during factorization: sum over all processors): 150 >>>> INFOG(20) (estimated number of entries in the factors): >>>> 3361617 >>>> INFOG(21) (size in MB of memory effectively used during >>>> factorization - value on the most memory consuming processor): 35 >>>> INFOG(22) (size in MB of memory effectively used during >>>> factorization - sum over all processors): 136 >>>> INFOG(23) (after analysis: value of ICNTL(6) effectively >>>> used): 0 >>>> INFOG(24) (after analysis: value of ICNTL(12) effectively >>>> used): 1 >>>> INFOG(25) (after factorization: number of pivots modified >>>> by static pivoting): 0 >>>> INFOG(28) (after factorization: number of null pivots >>>> encountered): 0 >>>> INFOG(29) (after factorization: effective number of >>>> entries in the factors (sum over all processors)): 2931438 >>>> INFOG(30, 31) (after solution: size in Mbytes of memory >>>> used during solution phase): 0, 0 >>>> INFOG(32) (after analysis: type of analysis done): 1 >>>> INFOG(33) (value used for ICNTL(8)): 7 >>>> INFOG(34) (exponent of the 
determinant if determinant is >>>> requested): 0 >>>> linear system matrix = precond matrix: >>>> Mat Object: 4 MPI processes >>>> type: mpiaij >>>> rows=22878, cols=22878 >>>> total: nonzeros=1219140, allocated nonzeros=1219140 >>>> total number of mallocs used during MatSetValues calls =0 >>>> using I-node (on process 0) routines: found 1889 nodes, limit >>>> used is 5 >>>> converged reason: -11 >>>> KSP Object: 4 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: 4 MPI processes >>>> type: cholesky >>>> Cholesky: out-of-place factorization >>>> tolerance for zero pivot 2.22045e-14 >>>> matrix ordering: natural >>>> factor fill ratio given 0., needed 0. >>>> Factored matrix follows: >>>> Mat Object: 4 MPI processes >>>> type: mpiaij >>>> rows=22878, cols=22878 >>>> package used to perform factorization: mumps >>>> total: nonzeros=3361617, allocated nonzeros=3361617 >>>> total number of mallocs used during MatSetValues calls =0 >>>> MUMPS run parameters: >>>> SYM (matrix type): 2 >>>> PAR (host participation): 1 >>>> ICNTL(1) (output for error): 6 >>>> ICNTL(2) (output of diagnostic msg): 0 >>>> ICNTL(3) (output for global info): 0 >>>> ICNTL(4) (level of printing): 0 >>>> ICNTL(5) (input mat struct): 0 >>>> ICNTL(6) (matrix prescaling): 7 >>>> ICNTL(7) (sequentia matrix ordering):7 >>>> ICNTL(8) (scalling strategy): 77 >>>> ICNTL(10) (max num of refinements): 0 >>>> ICNTL(11) (error analysis): 0 >>>> ICNTL(12) (efficiency control): 0 >>>> ICNTL(13) (efficiency control): 0 >>>> ICNTL(14) (percentage of estimated workspace increase): >>>> 30 >>>> ICNTL(18) (input mat struct): 3 >>>> ICNTL(19) (Shur complement info): 0 >>>> ICNTL(20) (rhs sparse pattern): 0 >>>> ICNTL(21) (solution struct): 1 >>>> ICNTL(22) (in-core/out-of-core facility): 0 >>>> ICNTL(23) (max size of memory can be allocated locally):0 >>>> ICNTL(24) (detection of null pivot rows): 0 >>>> ICNTL(25) (computation of a null space basis): 0 >>>> ICNTL(26) (Schur options for rhs or solution): 0 >>>> ICNTL(27) (experimental parameter): >>>> -24 >>>> ICNTL(28) (use parallel or sequential ordering): 1 >>>> ICNTL(29) (parallel ordering): 0 >>>> ICNTL(30) (user-specified set of entries in inv(A)): 0 >>>> ICNTL(31) (factors is discarded in the solve phase): 0 >>>> ICNTL(33) (compute determinant): 0 >>>> CNTL(1) (relative pivoting threshold): 0.01 >>>> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >>>> CNTL(3) (absolute pivoting threshold): 0. >>>> CNTL(4) (value of static pivoting): -1. >>>> CNTL(5) (fixation for null pivots): 0. >>>> RINFO(1) (local estimated flops for the elimination after >>>> analysis): >>>> [0] 1.84947e+08 >>>> [1] 2.42065e+08 >>>> [2] 2.53044e+08 >>>> [3] 2.18441e+08 >>>> RINFO(2) (local estimated flops for the assembly after >>>> factorization): >>>> [0] 945938. >>>> [1] 906795. >>>> [2] 897815. >>>> [3] 998840. 
>>>> RINFO(3) (local estimated flops for the elimination after >>>> factorization): >>>> [0] 1.59835e+08 >>>> [1] 1.50867e+08 >>>> [2] 2.27932e+08 >>>> [3] 1.52037e+08 >>>> INFO(15) (estimated size of (in MB) MUMPS internal data >>>> for running numerical factorization): >>>> [0] 36 >>>> [1] 37 >>>> [2] 38 >>>> [3] 39 >>>> INFO(16) (size of (in MB) MUMPS internal data used during >>>> numerical factorization): >>>> [0] 36 >>>> [1] 37 >>>> [2] 38 >>>> [3] 39 >>>> INFO(23) (num of pivots eliminated on this processor >>>> after factorization): >>>> [0] 6450 >>>> [1] 5442 >>>> [2] 4386 >>>> [3] 5526 >>>> RINFOG(1) (global estimated flops for the elimination >>>> after analysis): 8.98497e+08 >>>> RINFOG(2) (global estimated flops for the assembly after >>>> factorization): 3.74939e+06 >>>> RINFOG(3) (global estimated flops for the elimination >>>> after factorization): 6.9067e+08 >>>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>>> (0.,0.)*(2^0) >>>> INFOG(3) (estimated real workspace for factors on all >>>> processors after analysis): 4082184 >>>> INFOG(4) (estimated integer workspace for factors on all >>>> processors after analysis): 231846 >>>> INFOG(5) (estimated maximum front size in the complete >>>> tree): 678 >>>> INFOG(6) (number of nodes in the complete tree): 1380 >>>> INFOG(7) (ordering option effectively use after >>>> analysis): 5 >>>> INFOG(8) (structural symmetry in percent of the permuted >>>> matrix after analysis): 100 >>>> INFOG(9) (total real/complex workspace to store the >>>> matrix factors after factorization): 3521904 >>>> INFOG(10) (total integer space store the matrix factors >>>> after factorization): 229416 >>>> INFOG(11) (order of largest frontal matrix after >>>> factorization): 678 >>>> INFOG(12) (number of off-diagonal pivots): 0 >>>> INFOG(13) (number of delayed pivots after factorization): >>>> 0 >>>> INFOG(14) (number of memory compress after >>>> factorization): 0 >>>> INFOG(15) (number of steps of iterative refinement after >>>> solution): 0 >>>> INFOG(16) (estimated size (in MB) of all MUMPS internal >>>> data for factorization after analysis: value on the most memory consuming >>>> processor): 39 >>>> INFOG(17) (estimated size of all MUMPS internal data for >>>> factorization after analysis: sum over all processors): 150 >>>> INFOG(18) (size of all MUMPS internal data allocated >>>> during factorization: value on the most memory consuming processor): 39 >>>> INFOG(19) (size of all MUMPS internal data allocated >>>> during factorization: sum over all processors): 150 >>>> INFOG(20) (estimated number of entries in the factors): >>>> 3361617 >>>> INFOG(21) (size in MB of memory effectively used during >>>> factorization - value on the most memory consuming processor): 35 >>>> INFOG(22) (size in MB of memory effectively used during >>>> factorization - sum over all processors): 136 >>>> INFOG(23) (after analysis: value of ICNTL(6) effectively >>>> used): 0 >>>> INFOG(24) (after analysis: value of ICNTL(12) effectively >>>> used): 1 >>>> INFOG(25) (after factorization: number of pivots modified >>>> by static pivoting): 0 >>>> INFOG(28) (after factorization: number of null pivots >>>> encountered): 0 >>>> INFOG(29) (after factorization: effective number of >>>> entries in the factors (sum over all processors)): 2931438 >>>> INFOG(30, 31) (after solution: size in Mbytes of memory >>>> used during solution phase): 0, 0 >>>> INFOG(32) (after analysis: type of analysis done): 1 >>>> INFOG(33) (value used for ICNTL(8)): 7 >>>> INFOG(34) (exponent of the 
determinant if determinant is >>>> requested): 0 >>>> linear system matrix = precond matrix: >>>> Mat Object: 4 MPI processes >>>> type: mpiaij >>>> rows=22878, cols=22878 >>>> total: nonzeros=1219140, allocated nonzeros=1219140 >>>> total number of mallocs used during MatSetValues calls =0 >>>> using I-node (on process 0) routines: found 1889 nodes, limit >>>> used is 5 >>>> converged reason: -11 >>>> >>>> ------------------------------------------------------------ >>>> ----------------------------------------- >>>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Mon Sep 19 20:52:59 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Mon, 19 Sep 2016 21:52:59 -0400 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: On Mon, Sep 19, 2016 at 9:45 PM, Fande Kong wrote: > Placing PCReset(PC pc) before the second kspsolve might works. > > Fande Kong, > > On Mon, Sep 19, 2016 at 7:38 PM, murat ke?eli wrote: > >> Another guess: maybe you also need KSPSetUp(ksp); before the second >> KSPSolve(ksp,b,x);. >> >> Murat Keceli >> > Thanks for the suggestions. I just tried these, and they didn't work either unfortunately. David > ? >> >> On Mon, Sep 19, 2016 at 8:33 PM, David Knezevic < >> david.knezevic at akselos.com> wrote: >> >>> On Mon, Sep 19, 2016 at 7:26 PM, Dave May >>> wrote: >>> >>>> >>>> >>>> On 19 September 2016 at 21:05, David Knezevic < >>>> david.knezevic at akselos.com> wrote: >>>> >>>>> When I use MUMPS via PETSc, one issue is that it can sometimes fail >>>>> with MUMPS error -9, which means that MUMPS didn't allocate a big enough >>>>> workspace. This can typically be fixed by increasing MUMPS icntl 14, e.g. >>>>> via the command line option -mat_mumps_icntl_14. >>>>> >>>>> However, instead of having to run several times with different command >>>>> line options, I'd like to be able to automatically increment icntl 14 value >>>>> in a loop until the solve succeeds. >>>>> >>>>> I have a saved matrix which fails when I use it for a solve with MUMPS >>>>> with 4 MPI processes and the default ictnl values, so I'm using this to >>>>> check that I can achieve the automatic icntl 14 update, as described above. >>>>> (The matrix is 14MB so I haven't attached it here, but I'd be happy to send >>>>> it to anyone else who wants to try this test case out.) >>>>> >>>>> I've pasted some test code below which provides a simple test of this >>>>> idea using two solves. The first solve uses the default value of icntl 14, >>>>> which fails, and then we update icntl 14 to 30 and solve again. The second >>>>> solve should succeed since icntl 14 of 30 is sufficient for MUMPS to >>>>> succeed in this case, but for some reason the second solve still fails. >>>>> >>>>> Below I've also pasted the output from -ksp_view, and you can see that >>>>> ictnl 14 is being updated correctly (see the ICNTL(14) lines in the >>>>> output), so it's not clear to me why the second solve fails. It seems like >>>>> MUMPS is ignoring the update to the ictnl value? >>>>> >>>> >>>> I believe this parameter is utilized during the numerical factorization >>>> phase. >>>> In your code, the operator hasn't changed, however you haven't >>>> signalled to the KSP that you want to re-perform the numerical >>>> factorization. >>>> You can do this by calling KSPSetOperators() before your second solve. 
>>>> I think if you do this (please try it), the factorization will be >>>> performed again and the new value of icntl will have an effect. >>>> >>>> Note this is a wild stab in the dark - I haven't dug through the >>>> petsc-mumps code in detail... >>>> >>> >>> That sounds like a plausible guess to me, but unfortunately it didn't >>> work. I added KSPSetOperators(ksp,A,A); before the second solve and I >>> got the same behavior as before. >>> >>> Thanks, >>> David >>> >>> >>> >>> >>> >>>> ------------------------------------------------------------ >>>>> ----------------------------------------- >>>>> Test code: >>>>> >>>>> Mat A; >>>>> MatCreate(PETSC_COMM_WORLD,&A); >>>>> MatSetType(A,MATMPIAIJ); >>>>> >>>>> PetscViewer petsc_viewer; >>>>> PetscViewerBinaryOpen( PETSC_COMM_WORLD, >>>>> "matrix.dat", >>>>> FILE_MODE_READ, >>>>> &petsc_viewer); >>>>> MatLoad(A, petsc_viewer); >>>>> PetscViewerDestroy(&petsc_viewer); >>>>> >>>>> PetscInt m, n; >>>>> MatGetSize(A, &m, &n); >>>>> >>>>> Vec x; >>>>> VecCreate(PETSC_COMM_WORLD,&x); >>>>> VecSetSizes(x,PETSC_DECIDE,m); >>>>> VecSetFromOptions(x); >>>>> VecSet(x,1.0); >>>>> >>>>> Vec b; >>>>> VecDuplicate(x,&b); >>>>> >>>>> KSP ksp; >>>>> PC pc; >>>>> >>>>> KSPCreate(PETSC_COMM_WORLD,&ksp); >>>>> KSPSetOperators(ksp,A,A); >>>>> >>>>> KSPSetType(ksp,KSPPREONLY); >>>>> KSPGetPC(ksp,&pc); >>>>> >>>>> PCSetType(pc,PCCHOLESKY); >>>>> >>>>> PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); >>>>> PCFactorSetUpMatSolverPackage(pc); >>>>> >>>>> KSPSetFromOptions(ksp); >>>>> KSPSetUp(ksp); >>>>> >>>>> KSPSolve(ksp,b,x); >>>>> >>>>> { >>>>> KSPConvergedReason reason; >>>>> KSPGetConvergedReason(ksp, &reason); >>>>> std::cout << "converged reason: " << reason << std::endl; >>>>> } >>>>> >>>>> Mat F; >>>>> PCFactorGetMatrix(pc,&F); >>>>> MatMumpsSetIcntl(F,14,30); >>>>> >>>>> KSPSolve(ksp,b,x); >>>>> >>>>> { >>>>> KSPConvergedReason reason; >>>>> KSPGetConvergedReason(ksp, &reason); >>>>> std::cout << "converged reason: " << reason << std::endl; >>>>> } >>>>> >>>>> ------------------------------------------------------------ >>>>> ----------------------------------------- >>>>> -ksp_view output (ICNTL(14) changes from 20 to 30, but we get >>>>> "converged reason: -11" for both solves) >>>>> >>>>> KSP Object: 4 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: 4 MPI processes >>>>> type: cholesky >>>>> Cholesky: out-of-place factorization >>>>> tolerance for zero pivot 2.22045e-14 >>>>> matrix ordering: natural >>>>> factor fill ratio given 0., needed 0. 
>>>>> Factored matrix follows: >>>>> Mat Object: 4 MPI processes >>>>> type: mpiaij >>>>> rows=22878, cols=22878 >>>>> package used to perform factorization: mumps >>>>> total: nonzeros=3361617, allocated nonzeros=3361617 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> MUMPS run parameters: >>>>> SYM (matrix type): 2 >>>>> PAR (host participation): 1 >>>>> ICNTL(1) (output for error): 6 >>>>> ICNTL(2) (output of diagnostic msg): 0 >>>>> ICNTL(3) (output for global info): 0 >>>>> ICNTL(4) (level of printing): 0 >>>>> ICNTL(5) (input mat struct): 0 >>>>> ICNTL(6) (matrix prescaling): 7 >>>>> ICNTL(7) (sequentia matrix ordering):7 >>>>> ICNTL(8) (scalling strategy): 77 >>>>> ICNTL(10) (max num of refinements): 0 >>>>> ICNTL(11) (error analysis): 0 >>>>> ICNTL(12) (efficiency control): >>>>> 0 >>>>> ICNTL(13) (efficiency control): >>>>> 0 >>>>> ICNTL(14) (percentage of estimated workspace increase): >>>>> 20 >>>>> ICNTL(18) (input mat struct): >>>>> 3 >>>>> ICNTL(19) (Shur complement info): >>>>> 0 >>>>> ICNTL(20) (rhs sparse pattern): >>>>> 0 >>>>> ICNTL(21) (solution struct): >>>>> 1 >>>>> ICNTL(22) (in-core/out-of-core facility): >>>>> 0 >>>>> ICNTL(23) (max size of memory can be allocated >>>>> locally):0 >>>>> ICNTL(24) (detection of null pivot rows): >>>>> 0 >>>>> ICNTL(25) (computation of a null space basis): >>>>> 0 >>>>> ICNTL(26) (Schur options for rhs or solution): >>>>> 0 >>>>> ICNTL(27) (experimental parameter): >>>>> -24 >>>>> ICNTL(28) (use parallel or sequential ordering): >>>>> 1 >>>>> ICNTL(29) (parallel ordering): >>>>> 0 >>>>> ICNTL(30) (user-specified set of entries in inv(A)): >>>>> 0 >>>>> ICNTL(31) (factors is discarded in the solve phase): >>>>> 0 >>>>> ICNTL(33) (compute determinant): >>>>> 0 >>>>> CNTL(1) (relative pivoting threshold): 0.01 >>>>> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >>>>> CNTL(3) (absolute pivoting threshold): 0. >>>>> CNTL(4) (value of static pivoting): -1. >>>>> CNTL(5) (fixation for null pivots): 0. >>>>> RINFO(1) (local estimated flops for the elimination >>>>> after analysis): >>>>> [0] 1.84947e+08 >>>>> [1] 2.42065e+08 >>>>> [2] 2.53044e+08 >>>>> [3] 2.18441e+08 >>>>> RINFO(2) (local estimated flops for the assembly after >>>>> factorization): >>>>> [0] 945938. >>>>> [1] 906795. >>>>> [2] 897815. >>>>> [3] 998840. 
>>>>> RINFO(3) (local estimated flops for the elimination >>>>> after factorization): >>>>> [0] 1.59835e+08 >>>>> [1] 1.50867e+08 >>>>> [2] 2.27932e+08 >>>>> [3] 1.52037e+08 >>>>> INFO(15) (estimated size of (in MB) MUMPS internal data >>>>> for running numerical factorization): >>>>> [0] 36 >>>>> [1] 37 >>>>> [2] 38 >>>>> [3] 39 >>>>> INFO(16) (size of (in MB) MUMPS internal data used >>>>> during numerical factorization): >>>>> [0] 36 >>>>> [1] 37 >>>>> [2] 38 >>>>> [3] 39 >>>>> INFO(23) (num of pivots eliminated on this processor >>>>> after factorization): >>>>> [0] 6450 >>>>> [1] 5442 >>>>> [2] 4386 >>>>> [3] 5526 >>>>> RINFOG(1) (global estimated flops for the elimination >>>>> after analysis): 8.98497e+08 >>>>> RINFOG(2) (global estimated flops for the assembly after >>>>> factorization): 3.74939e+06 >>>>> RINFOG(3) (global estimated flops for the elimination >>>>> after factorization): 6.9067e+08 >>>>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>>>> (0.,0.)*(2^0) >>>>> INFOG(3) (estimated real workspace for factors on all >>>>> processors after analysis): 4082184 >>>>> INFOG(4) (estimated integer workspace for factors on all >>>>> processors after analysis): 231846 >>>>> INFOG(5) (estimated maximum front size in the complete >>>>> tree): 678 >>>>> INFOG(6) (number of nodes in the complete tree): 1380 >>>>> INFOG(7) (ordering option effectively use after >>>>> analysis): 5 >>>>> INFOG(8) (structural symmetry in percent of the permuted >>>>> matrix after analysis): 100 >>>>> INFOG(9) (total real/complex workspace to store the >>>>> matrix factors after factorization): 3521904 >>>>> INFOG(10) (total integer space store the matrix factors >>>>> after factorization): 229416 >>>>> INFOG(11) (order of largest frontal matrix after >>>>> factorization): 678 >>>>> INFOG(12) (number of off-diagonal pivots): 0 >>>>> INFOG(13) (number of delayed pivots after >>>>> factorization): 0 >>>>> INFOG(14) (number of memory compress after >>>>> factorization): 0 >>>>> INFOG(15) (number of steps of iterative refinement after >>>>> solution): 0 >>>>> INFOG(16) (estimated size (in MB) of all MUMPS internal >>>>> data for factorization after analysis: value on the most memory consuming >>>>> processor): 39 >>>>> INFOG(17) (estimated size of all MUMPS internal data for >>>>> factorization after analysis: sum over all processors): 150 >>>>> INFOG(18) (size of all MUMPS internal data allocated >>>>> during factorization: value on the most memory consuming processor): 39 >>>>> INFOG(19) (size of all MUMPS internal data allocated >>>>> during factorization: sum over all processors): 150 >>>>> INFOG(20) (estimated number of entries in the factors): >>>>> 3361617 >>>>> INFOG(21) (size in MB of memory effectively used during >>>>> factorization - value on the most memory consuming processor): 35 >>>>> INFOG(22) (size in MB of memory effectively used during >>>>> factorization - sum over all processors): 136 >>>>> INFOG(23) (after analysis: value of ICNTL(6) effectively >>>>> used): 0 >>>>> INFOG(24) (after analysis: value of ICNTL(12) >>>>> effectively used): 1 >>>>> INFOG(25) (after factorization: number of pivots >>>>> modified by static pivoting): 0 >>>>> INFOG(28) (after factorization: number of null pivots >>>>> encountered): 0 >>>>> INFOG(29) (after factorization: effective number of >>>>> entries in the factors (sum over all processors)): 2931438 >>>>> INFOG(30, 31) (after solution: size in Mbytes of memory >>>>> used during solution phase): 0, 0 >>>>> INFOG(32) (after analysis: type of analysis 
done): 1 >>>>> INFOG(33) (value used for ICNTL(8)): 7 >>>>> INFOG(34) (exponent of the determinant if determinant is >>>>> requested): 0 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 4 MPI processes >>>>> type: mpiaij >>>>> rows=22878, cols=22878 >>>>> total: nonzeros=1219140, allocated nonzeros=1219140 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> using I-node (on process 0) routines: found 1889 nodes, limit >>>>> used is 5 >>>>> converged reason: -11 >>>>> KSP Object: 4 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: 4 MPI processes >>>>> type: cholesky >>>>> Cholesky: out-of-place factorization >>>>> tolerance for zero pivot 2.22045e-14 >>>>> matrix ordering: natural >>>>> factor fill ratio given 0., needed 0. >>>>> Factored matrix follows: >>>>> Mat Object: 4 MPI processes >>>>> type: mpiaij >>>>> rows=22878, cols=22878 >>>>> package used to perform factorization: mumps >>>>> total: nonzeros=3361617, allocated nonzeros=3361617 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> MUMPS run parameters: >>>>> SYM (matrix type): 2 >>>>> PAR (host participation): 1 >>>>> ICNTL(1) (output for error): 6 >>>>> ICNTL(2) (output of diagnostic msg): 0 >>>>> ICNTL(3) (output for global info): 0 >>>>> ICNTL(4) (level of printing): 0 >>>>> ICNTL(5) (input mat struct): 0 >>>>> ICNTL(6) (matrix prescaling): 7 >>>>> ICNTL(7) (sequentia matrix ordering):7 >>>>> ICNTL(8) (scalling strategy): 77 >>>>> ICNTL(10) (max num of refinements): 0 >>>>> ICNTL(11) (error analysis): 0 >>>>> ICNTL(12) (efficiency control): >>>>> 0 >>>>> ICNTL(13) (efficiency control): >>>>> 0 >>>>> ICNTL(14) (percentage of estimated workspace increase): >>>>> 30 >>>>> ICNTL(18) (input mat struct): >>>>> 3 >>>>> ICNTL(19) (Shur complement info): >>>>> 0 >>>>> ICNTL(20) (rhs sparse pattern): >>>>> 0 >>>>> ICNTL(21) (solution struct): >>>>> 1 >>>>> ICNTL(22) (in-core/out-of-core facility): >>>>> 0 >>>>> ICNTL(23) (max size of memory can be allocated >>>>> locally):0 >>>>> ICNTL(24) (detection of null pivot rows): >>>>> 0 >>>>> ICNTL(25) (computation of a null space basis): >>>>> 0 >>>>> ICNTL(26) (Schur options for rhs or solution): >>>>> 0 >>>>> ICNTL(27) (experimental parameter): >>>>> -24 >>>>> ICNTL(28) (use parallel or sequential ordering): >>>>> 1 >>>>> ICNTL(29) (parallel ordering): >>>>> 0 >>>>> ICNTL(30) (user-specified set of entries in inv(A)): >>>>> 0 >>>>> ICNTL(31) (factors is discarded in the solve phase): >>>>> 0 >>>>> ICNTL(33) (compute determinant): >>>>> 0 >>>>> CNTL(1) (relative pivoting threshold): 0.01 >>>>> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >>>>> CNTL(3) (absolute pivoting threshold): 0. >>>>> CNTL(4) (value of static pivoting): -1. >>>>> CNTL(5) (fixation for null pivots): 0. >>>>> RINFO(1) (local estimated flops for the elimination >>>>> after analysis): >>>>> [0] 1.84947e+08 >>>>> [1] 2.42065e+08 >>>>> [2] 2.53044e+08 >>>>> [3] 2.18441e+08 >>>>> RINFO(2) (local estimated flops for the assembly after >>>>> factorization): >>>>> [0] 945938. >>>>> [1] 906795. >>>>> [2] 897815. >>>>> [3] 998840. 
>>>>> RINFO(3) (local estimated flops for the elimination >>>>> after factorization): >>>>> [0] 1.59835e+08 >>>>> [1] 1.50867e+08 >>>>> [2] 2.27932e+08 >>>>> [3] 1.52037e+08 >>>>> INFO(15) (estimated size of (in MB) MUMPS internal data >>>>> for running numerical factorization): >>>>> [0] 36 >>>>> [1] 37 >>>>> [2] 38 >>>>> [3] 39 >>>>> INFO(16) (size of (in MB) MUMPS internal data used >>>>> during numerical factorization): >>>>> [0] 36 >>>>> [1] 37 >>>>> [2] 38 >>>>> [3] 39 >>>>> INFO(23) (num of pivots eliminated on this processor >>>>> after factorization): >>>>> [0] 6450 >>>>> [1] 5442 >>>>> [2] 4386 >>>>> [3] 5526 >>>>> RINFOG(1) (global estimated flops for the elimination >>>>> after analysis): 8.98497e+08 >>>>> RINFOG(2) (global estimated flops for the assembly after >>>>> factorization): 3.74939e+06 >>>>> RINFOG(3) (global estimated flops for the elimination >>>>> after factorization): 6.9067e+08 >>>>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>>>> (0.,0.)*(2^0) >>>>> INFOG(3) (estimated real workspace for factors on all >>>>> processors after analysis): 4082184 >>>>> INFOG(4) (estimated integer workspace for factors on all >>>>> processors after analysis): 231846 >>>>> INFOG(5) (estimated maximum front size in the complete >>>>> tree): 678 >>>>> INFOG(6) (number of nodes in the complete tree): 1380 >>>>> INFOG(7) (ordering option effectively use after >>>>> analysis): 5 >>>>> INFOG(8) (structural symmetry in percent of the permuted >>>>> matrix after analysis): 100 >>>>> INFOG(9) (total real/complex workspace to store the >>>>> matrix factors after factorization): 3521904 >>>>> INFOG(10) (total integer space store the matrix factors >>>>> after factorization): 229416 >>>>> INFOG(11) (order of largest frontal matrix after >>>>> factorization): 678 >>>>> INFOG(12) (number of off-diagonal pivots): 0 >>>>> INFOG(13) (number of delayed pivots after >>>>> factorization): 0 >>>>> INFOG(14) (number of memory compress after >>>>> factorization): 0 >>>>> INFOG(15) (number of steps of iterative refinement after >>>>> solution): 0 >>>>> INFOG(16) (estimated size (in MB) of all MUMPS internal >>>>> data for factorization after analysis: value on the most memory consuming >>>>> processor): 39 >>>>> INFOG(17) (estimated size of all MUMPS internal data for >>>>> factorization after analysis: sum over all processors): 150 >>>>> INFOG(18) (size of all MUMPS internal data allocated >>>>> during factorization: value on the most memory consuming processor): 39 >>>>> INFOG(19) (size of all MUMPS internal data allocated >>>>> during factorization: sum over all processors): 150 >>>>> INFOG(20) (estimated number of entries in the factors): >>>>> 3361617 >>>>> INFOG(21) (size in MB of memory effectively used during >>>>> factorization - value on the most memory consuming processor): 35 >>>>> INFOG(22) (size in MB of memory effectively used during >>>>> factorization - sum over all processors): 136 >>>>> INFOG(23) (after analysis: value of ICNTL(6) effectively >>>>> used): 0 >>>>> INFOG(24) (after analysis: value of ICNTL(12) >>>>> effectively used): 1 >>>>> INFOG(25) (after factorization: number of pivots >>>>> modified by static pivoting): 0 >>>>> INFOG(28) (after factorization: number of null pivots >>>>> encountered): 0 >>>>> INFOG(29) (after factorization: effective number of >>>>> entries in the factors (sum over all processors)): 2931438 >>>>> INFOG(30, 31) (after solution: size in Mbytes of memory >>>>> used during solution phase): 0, 0 >>>>> INFOG(32) (after analysis: type of analysis 
done): 1 >>>>> INFOG(33) (value used for ICNTL(8)): 7 >>>>> INFOG(34) (exponent of the determinant if determinant is >>>>> requested): 0 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 4 MPI processes >>>>> type: mpiaij >>>>> rows=22878, cols=22878 >>>>> total: nonzeros=1219140, allocated nonzeros=1219140 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> using I-node (on process 0) routines: found 1889 nodes, limit >>>>> used is 5 >>>>> converged reason: -11 >>>>> >>>>> ------------------------------------------------------------ >>>>> ----------------------------------------- >>>>> >>>> >>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Sep 19 22:04:22 2016 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 19 Sep 2016 22:04:22 -0500 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: David : I did following: PC pc; Mat F; ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); ierr = PCReset(pc);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr); ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr); ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr); ierr = MatMumpsSetIcntl(F,14,30);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); Then it resolves the matrix equation with ICNTL(14)=30. Attached is modified petsc/src/ksp/ksp/examples/tutorials/ex10.c. Using in with your matrix.dat, I get mpiexec -n 4 ./ex10 -f0 matrix.dat -rhs 0 -ksp_reason Number of iterations = 0 KSPConvergedReason: -11 Reset PC with ICNTL(14)=30 ... KSPConvergedReason: 2 Hong On Mon, Sep 19, 2016 at 9:45 PM, Fande Kong wrote: > >> Placing PCReset(PC pc) before the second kspsolve might works. >> >> Fande Kong, >> >> On Mon, Sep 19, 2016 at 7:38 PM, murat ke?eli wrote: >> >>> Another guess: maybe you also need KSPSetUp(ksp); before the second >>> KSPSolve(ksp,b,x);. >>> >>> Murat Keceli >>> >> > Thanks for the suggestions. I just tried these, and they didn't work > either unfortunately. > > David > > > > > >> ? >>> >>> On Mon, Sep 19, 2016 at 8:33 PM, David Knezevic < >>> david.knezevic at akselos.com> wrote: >>> >>>> On Mon, Sep 19, 2016 at 7:26 PM, Dave May >>>> wrote: >>>> >>>>> >>>>> >>>>> On 19 September 2016 at 21:05, David Knezevic < >>>>> david.knezevic at akselos.com> wrote: >>>>> >>>>>> When I use MUMPS via PETSc, one issue is that it can sometimes fail >>>>>> with MUMPS error -9, which means that MUMPS didn't allocate a big enough >>>>>> workspace. This can typically be fixed by increasing MUMPS icntl 14, e.g. >>>>>> via the command line option -mat_mumps_icntl_14. >>>>>> >>>>>> However, instead of having to run several times with different >>>>>> command line options, I'd like to be able to automatically increment icntl >>>>>> 14 value in a loop until the solve succeeds. >>>>>> >>>>>> I have a saved matrix which fails when I use it for a solve with >>>>>> MUMPS with 4 MPI processes and the default ictnl values, so I'm using this >>>>>> to check that I can achieve the automatic icntl 14 update, as described >>>>>> above. (The matrix is 14MB so I haven't attached it here, but I'd be happy >>>>>> to send it to anyone else who wants to try this test case out.) >>>>>> >>>>>> I've pasted some test code below which provides a simple test of this >>>>>> idea using two solves. 
The first solve uses the default value of icntl 14, >>>>>> which fails, and then we update icntl 14 to 30 and solve again. The second >>>>>> solve should succeed since icntl 14 of 30 is sufficient for MUMPS to >>>>>> succeed in this case, but for some reason the second solve still fails. >>>>>> >>>>>> Below I've also pasted the output from -ksp_view, and you can see >>>>>> that ictnl 14 is being updated correctly (see the ICNTL(14) lines in the >>>>>> output), so it's not clear to me why the second solve fails. It seems like >>>>>> MUMPS is ignoring the update to the ictnl value? >>>>>> >>>>> >>>>> I believe this parameter is utilized during the numerical >>>>> factorization phase. >>>>> In your code, the operator hasn't changed, however you haven't >>>>> signalled to the KSP that you want to re-perform the numerical >>>>> factorization. >>>>> You can do this by calling KSPSetOperators() before your second solve. >>>>> I think if you do this (please try it), the factorization will be >>>>> performed again and the new value of icntl will have an effect. >>>>> >>>>> Note this is a wild stab in the dark - I haven't dug through the >>>>> petsc-mumps code in detail... >>>>> >>>> >>>> That sounds like a plausible guess to me, but unfortunately it didn't >>>> work. I added KSPSetOperators(ksp,A,A); before the second solve and I >>>> got the same behavior as before. >>>> >>>> Thanks, >>>> David >>>> >>>> >>>> >>>> >>>> >>>>> ------------------------------------------------------------ >>>>>> ----------------------------------------- >>>>>> Test code: >>>>>> >>>>>> Mat A; >>>>>> MatCreate(PETSC_COMM_WORLD,&A); >>>>>> MatSetType(A,MATMPIAIJ); >>>>>> >>>>>> PetscViewer petsc_viewer; >>>>>> PetscViewerBinaryOpen( PETSC_COMM_WORLD, >>>>>> "matrix.dat", >>>>>> FILE_MODE_READ, >>>>>> &petsc_viewer); >>>>>> MatLoad(A, petsc_viewer); >>>>>> PetscViewerDestroy(&petsc_viewer); >>>>>> >>>>>> PetscInt m, n; >>>>>> MatGetSize(A, &m, &n); >>>>>> >>>>>> Vec x; >>>>>> VecCreate(PETSC_COMM_WORLD,&x); >>>>>> VecSetSizes(x,PETSC_DECIDE,m); >>>>>> VecSetFromOptions(x); >>>>>> VecSet(x,1.0); >>>>>> >>>>>> Vec b; >>>>>> VecDuplicate(x,&b); >>>>>> >>>>>> KSP ksp; >>>>>> PC pc; >>>>>> >>>>>> KSPCreate(PETSC_COMM_WORLD,&ksp); >>>>>> KSPSetOperators(ksp,A,A); >>>>>> >>>>>> KSPSetType(ksp,KSPPREONLY); >>>>>> KSPGetPC(ksp,&pc); >>>>>> >>>>>> PCSetType(pc,PCCHOLESKY); >>>>>> >>>>>> PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); >>>>>> PCFactorSetUpMatSolverPackage(pc); >>>>>> >>>>>> KSPSetFromOptions(ksp); >>>>>> KSPSetUp(ksp); >>>>>> >>>>>> KSPSolve(ksp,b,x); >>>>>> >>>>>> { >>>>>> KSPConvergedReason reason; >>>>>> KSPGetConvergedReason(ksp, &reason); >>>>>> std::cout << "converged reason: " << reason << std::endl; >>>>>> } >>>>>> >>>>>> Mat F; >>>>>> PCFactorGetMatrix(pc,&F); >>>>>> MatMumpsSetIcntl(F,14,30); >>>>>> >>>>>> KSPSolve(ksp,b,x); >>>>>> >>>>>> { >>>>>> KSPConvergedReason reason; >>>>>> KSPGetConvergedReason(ksp, &reason); >>>>>> std::cout << "converged reason: " << reason << std::endl; >>>>>> } >>>>>> >>>>>> ------------------------------------------------------------ >>>>>> ----------------------------------------- >>>>>> -ksp_view output (ICNTL(14) changes from 20 to 30, but we get >>>>>> "converged reason: -11" for both solves) >>>>>> >>>>>> KSP Object: 4 MPI processes >>>>>> type: preonly >>>>>> maximum iterations=10000, initial guess is zero >>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>>>>>> left preconditioning >>>>>> using NONE norm type for convergence test >>>>>> PC Object: 4 MPI processes >>>>>> type: cholesky >>>>>> Cholesky: out-of-place factorization >>>>>> tolerance for zero pivot 2.22045e-14 >>>>>> matrix ordering: natural >>>>>> factor fill ratio given 0., needed 0. >>>>>> Factored matrix follows: >>>>>> Mat Object: 4 MPI processes >>>>>> type: mpiaij >>>>>> rows=22878, cols=22878 >>>>>> package used to perform factorization: mumps >>>>>> total: nonzeros=3361617, allocated nonzeros=3361617 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> MUMPS run parameters: >>>>>> SYM (matrix type): 2 >>>>>> PAR (host participation): 1 >>>>>> ICNTL(1) (output for error): 6 >>>>>> ICNTL(2) (output of diagnostic msg): 0 >>>>>> ICNTL(3) (output for global info): 0 >>>>>> ICNTL(4) (level of printing): 0 >>>>>> ICNTL(5) (input mat struct): 0 >>>>>> ICNTL(6) (matrix prescaling): 7 >>>>>> ICNTL(7) (sequentia matrix ordering):7 >>>>>> ICNTL(8) (scalling strategy): 77 >>>>>> ICNTL(10) (max num of refinements): 0 >>>>>> ICNTL(11) (error analysis): 0 >>>>>> ICNTL(12) (efficiency control): >>>>>> 0 >>>>>> ICNTL(13) (efficiency control): >>>>>> 0 >>>>>> ICNTL(14) (percentage of estimated workspace increase): >>>>>> 20 >>>>>> ICNTL(18) (input mat struct): >>>>>> 3 >>>>>> ICNTL(19) (Shur complement info): >>>>>> 0 >>>>>> ICNTL(20) (rhs sparse pattern): >>>>>> 0 >>>>>> ICNTL(21) (solution struct): >>>>>> 1 >>>>>> ICNTL(22) (in-core/out-of-core facility): >>>>>> 0 >>>>>> ICNTL(23) (max size of memory can be allocated >>>>>> locally):0 >>>>>> ICNTL(24) (detection of null pivot rows): >>>>>> 0 >>>>>> ICNTL(25) (computation of a null space basis): >>>>>> 0 >>>>>> ICNTL(26) (Schur options for rhs or solution): >>>>>> 0 >>>>>> ICNTL(27) (experimental parameter): >>>>>> -24 >>>>>> ICNTL(28) (use parallel or sequential ordering): >>>>>> 1 >>>>>> ICNTL(29) (parallel ordering): >>>>>> 0 >>>>>> ICNTL(30) (user-specified set of entries in inv(A)): >>>>>> 0 >>>>>> ICNTL(31) (factors is discarded in the solve phase): >>>>>> 0 >>>>>> ICNTL(33) (compute determinant): >>>>>> 0 >>>>>> CNTL(1) (relative pivoting threshold): 0.01 >>>>>> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >>>>>> CNTL(3) (absolute pivoting threshold): 0. >>>>>> CNTL(4) (value of static pivoting): -1. >>>>>> CNTL(5) (fixation for null pivots): 0. >>>>>> RINFO(1) (local estimated flops for the elimination >>>>>> after analysis): >>>>>> [0] 1.84947e+08 >>>>>> [1] 2.42065e+08 >>>>>> [2] 2.53044e+08 >>>>>> [3] 2.18441e+08 >>>>>> RINFO(2) (local estimated flops for the assembly after >>>>>> factorization): >>>>>> [0] 945938. >>>>>> [1] 906795. >>>>>> [2] 897815. >>>>>> [3] 998840. 
>>>>>> RINFO(3) (local estimated flops for the elimination >>>>>> after factorization): >>>>>> [0] 1.59835e+08 >>>>>> [1] 1.50867e+08 >>>>>> [2] 2.27932e+08 >>>>>> [3] 1.52037e+08 >>>>>> INFO(15) (estimated size of (in MB) MUMPS internal data >>>>>> for running numerical factorization): >>>>>> [0] 36 >>>>>> [1] 37 >>>>>> [2] 38 >>>>>> [3] 39 >>>>>> INFO(16) (size of (in MB) MUMPS internal data used >>>>>> during numerical factorization): >>>>>> [0] 36 >>>>>> [1] 37 >>>>>> [2] 38 >>>>>> [3] 39 >>>>>> INFO(23) (num of pivots eliminated on this processor >>>>>> after factorization): >>>>>> [0] 6450 >>>>>> [1] 5442 >>>>>> [2] 4386 >>>>>> [3] 5526 >>>>>> RINFOG(1) (global estimated flops for the elimination >>>>>> after analysis): 8.98497e+08 >>>>>> RINFOG(2) (global estimated flops for the assembly >>>>>> after factorization): 3.74939e+06 >>>>>> RINFOG(3) (global estimated flops for the elimination >>>>>> after factorization): 6.9067e+08 >>>>>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>>>>> (0.,0.)*(2^0) >>>>>> INFOG(3) (estimated real workspace for factors on all >>>>>> processors after analysis): 4082184 >>>>>> INFOG(4) (estimated integer workspace for factors on >>>>>> all processors after analysis): 231846 >>>>>> INFOG(5) (estimated maximum front size in the complete >>>>>> tree): 678 >>>>>> INFOG(6) (number of nodes in the complete tree): 1380 >>>>>> INFOG(7) (ordering option effectively use after >>>>>> analysis): 5 >>>>>> INFOG(8) (structural symmetry in percent of the >>>>>> permuted matrix after analysis): 100 >>>>>> INFOG(9) (total real/complex workspace to store the >>>>>> matrix factors after factorization): 3521904 >>>>>> INFOG(10) (total integer space store the matrix factors >>>>>> after factorization): 229416 >>>>>> INFOG(11) (order of largest frontal matrix after >>>>>> factorization): 678 >>>>>> INFOG(12) (number of off-diagonal pivots): 0 >>>>>> INFOG(13) (number of delayed pivots after >>>>>> factorization): 0 >>>>>> INFOG(14) (number of memory compress after >>>>>> factorization): 0 >>>>>> INFOG(15) (number of steps of iterative refinement >>>>>> after solution): 0 >>>>>> INFOG(16) (estimated size (in MB) of all MUMPS internal >>>>>> data for factorization after analysis: value on the most memory consuming >>>>>> processor): 39 >>>>>> INFOG(17) (estimated size of all MUMPS internal data >>>>>> for factorization after analysis: sum over all processors): 150 >>>>>> INFOG(18) (size of all MUMPS internal data allocated >>>>>> during factorization: value on the most memory consuming processor): 39 >>>>>> INFOG(19) (size of all MUMPS internal data allocated >>>>>> during factorization: sum over all processors): 150 >>>>>> INFOG(20) (estimated number of entries in the factors): >>>>>> 3361617 >>>>>> INFOG(21) (size in MB of memory effectively used during >>>>>> factorization - value on the most memory consuming processor): 35 >>>>>> INFOG(22) (size in MB of memory effectively used during >>>>>> factorization - sum over all processors): 136 >>>>>> INFOG(23) (after analysis: value of ICNTL(6) >>>>>> effectively used): 0 >>>>>> INFOG(24) (after analysis: value of ICNTL(12) >>>>>> effectively used): 1 >>>>>> INFOG(25) (after factorization: number of pivots >>>>>> modified by static pivoting): 0 >>>>>> INFOG(28) (after factorization: number of null pivots >>>>>> encountered): 0 >>>>>> INFOG(29) (after factorization: effective number of >>>>>> entries in the factors (sum over all processors)): 2931438 >>>>>> INFOG(30, 31) (after solution: size in Mbytes of memory >>>>>> used 
during solution phase): 0, 0 >>>>>> INFOG(32) (after analysis: type of analysis done): 1 >>>>>> INFOG(33) (value used for ICNTL(8)): 7 >>>>>> INFOG(34) (exponent of the determinant if determinant >>>>>> is requested): 0 >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 4 MPI processes >>>>>> type: mpiaij >>>>>> rows=22878, cols=22878 >>>>>> total: nonzeros=1219140, allocated nonzeros=1219140 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> using I-node (on process 0) routines: found 1889 nodes, limit >>>>>> used is 5 >>>>>> converged reason: -11 >>>>>> KSP Object: 4 MPI processes >>>>>> type: preonly >>>>>> maximum iterations=10000, initial guess is zero >>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using NONE norm type for convergence test >>>>>> PC Object: 4 MPI processes >>>>>> type: cholesky >>>>>> Cholesky: out-of-place factorization >>>>>> tolerance for zero pivot 2.22045e-14 >>>>>> matrix ordering: natural >>>>>> factor fill ratio given 0., needed 0. >>>>>> Factored matrix follows: >>>>>> Mat Object: 4 MPI processes >>>>>> type: mpiaij >>>>>> rows=22878, cols=22878 >>>>>> package used to perform factorization: mumps >>>>>> total: nonzeros=3361617, allocated nonzeros=3361617 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> MUMPS run parameters: >>>>>> SYM (matrix type): 2 >>>>>> PAR (host participation): 1 >>>>>> ICNTL(1) (output for error): 6 >>>>>> ICNTL(2) (output of diagnostic msg): 0 >>>>>> ICNTL(3) (output for global info): 0 >>>>>> ICNTL(4) (level of printing): 0 >>>>>> ICNTL(5) (input mat struct): 0 >>>>>> ICNTL(6) (matrix prescaling): 7 >>>>>> ICNTL(7) (sequentia matrix ordering):7 >>>>>> ICNTL(8) (scalling strategy): 77 >>>>>> ICNTL(10) (max num of refinements): 0 >>>>>> ICNTL(11) (error analysis): 0 >>>>>> ICNTL(12) (efficiency control): >>>>>> 0 >>>>>> ICNTL(13) (efficiency control): >>>>>> 0 >>>>>> ICNTL(14) (percentage of estimated workspace increase): >>>>>> 30 >>>>>> ICNTL(18) (input mat struct): >>>>>> 3 >>>>>> ICNTL(19) (Shur complement info): >>>>>> 0 >>>>>> ICNTL(20) (rhs sparse pattern): >>>>>> 0 >>>>>> ICNTL(21) (solution struct): >>>>>> 1 >>>>>> ICNTL(22) (in-core/out-of-core facility): >>>>>> 0 >>>>>> ICNTL(23) (max size of memory can be allocated >>>>>> locally):0 >>>>>> ICNTL(24) (detection of null pivot rows): >>>>>> 0 >>>>>> ICNTL(25) (computation of a null space basis): >>>>>> 0 >>>>>> ICNTL(26) (Schur options for rhs or solution): >>>>>> 0 >>>>>> ICNTL(27) (experimental parameter): >>>>>> -24 >>>>>> ICNTL(28) (use parallel or sequential ordering): >>>>>> 1 >>>>>> ICNTL(29) (parallel ordering): >>>>>> 0 >>>>>> ICNTL(30) (user-specified set of entries in inv(A)): >>>>>> 0 >>>>>> ICNTL(31) (factors is discarded in the solve phase): >>>>>> 0 >>>>>> ICNTL(33) (compute determinant): >>>>>> 0 >>>>>> CNTL(1) (relative pivoting threshold): 0.01 >>>>>> CNTL(2) (stopping criterion of refinement): 1.49012e-08 >>>>>> CNTL(3) (absolute pivoting threshold): 0. >>>>>> CNTL(4) (value of static pivoting): -1. >>>>>> CNTL(5) (fixation for null pivots): 0. >>>>>> RINFO(1) (local estimated flops for the elimination >>>>>> after analysis): >>>>>> [0] 1.84947e+08 >>>>>> [1] 2.42065e+08 >>>>>> [2] 2.53044e+08 >>>>>> [3] 2.18441e+08 >>>>>> RINFO(2) (local estimated flops for the assembly after >>>>>> factorization): >>>>>> [0] 945938. >>>>>> [1] 906795. >>>>>> [2] 897815. >>>>>> [3] 998840. 
>>>>>> RINFO(3) (local estimated flops for the elimination >>>>>> after factorization): >>>>>> [0] 1.59835e+08 >>>>>> [1] 1.50867e+08 >>>>>> [2] 2.27932e+08 >>>>>> [3] 1.52037e+08 >>>>>> INFO(15) (estimated size of (in MB) MUMPS internal data >>>>>> for running numerical factorization): >>>>>> [0] 36 >>>>>> [1] 37 >>>>>> [2] 38 >>>>>> [3] 39 >>>>>> INFO(16) (size of (in MB) MUMPS internal data used >>>>>> during numerical factorization): >>>>>> [0] 36 >>>>>> [1] 37 >>>>>> [2] 38 >>>>>> [3] 39 >>>>>> INFO(23) (num of pivots eliminated on this processor >>>>>> after factorization): >>>>>> [0] 6450 >>>>>> [1] 5442 >>>>>> [2] 4386 >>>>>> [3] 5526 >>>>>> RINFOG(1) (global estimated flops for the elimination >>>>>> after analysis): 8.98497e+08 >>>>>> RINFOG(2) (global estimated flops for the assembly >>>>>> after factorization): 3.74939e+06 >>>>>> RINFOG(3) (global estimated flops for the elimination >>>>>> after factorization): 6.9067e+08 >>>>>> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): >>>>>> (0.,0.)*(2^0) >>>>>> INFOG(3) (estimated real workspace for factors on all >>>>>> processors after analysis): 4082184 >>>>>> INFOG(4) (estimated integer workspace for factors on >>>>>> all processors after analysis): 231846 >>>>>> INFOG(5) (estimated maximum front size in the complete >>>>>> tree): 678 >>>>>> INFOG(6) (number of nodes in the complete tree): 1380 >>>>>> INFOG(7) (ordering option effectively use after >>>>>> analysis): 5 >>>>>> INFOG(8) (structural symmetry in percent of the >>>>>> permuted matrix after analysis): 100 >>>>>> INFOG(9) (total real/complex workspace to store the >>>>>> matrix factors after factorization): 3521904 >>>>>> INFOG(10) (total integer space store the matrix factors >>>>>> after factorization): 229416 >>>>>> INFOG(11) (order of largest frontal matrix after >>>>>> factorization): 678 >>>>>> INFOG(12) (number of off-diagonal pivots): 0 >>>>>> INFOG(13) (number of delayed pivots after >>>>>> factorization): 0 >>>>>> INFOG(14) (number of memory compress after >>>>>> factorization): 0 >>>>>> INFOG(15) (number of steps of iterative refinement >>>>>> after solution): 0 >>>>>> INFOG(16) (estimated size (in MB) of all MUMPS internal >>>>>> data for factorization after analysis: value on the most memory consuming >>>>>> processor): 39 >>>>>> INFOG(17) (estimated size of all MUMPS internal data >>>>>> for factorization after analysis: sum over all processors): 150 >>>>>> INFOG(18) (size of all MUMPS internal data allocated >>>>>> during factorization: value on the most memory consuming processor): 39 >>>>>> INFOG(19) (size of all MUMPS internal data allocated >>>>>> during factorization: sum over all processors): 150 >>>>>> INFOG(20) (estimated number of entries in the factors): >>>>>> 3361617 >>>>>> INFOG(21) (size in MB of memory effectively used during >>>>>> factorization - value on the most memory consuming processor): 35 >>>>>> INFOG(22) (size in MB of memory effectively used during >>>>>> factorization - sum over all processors): 136 >>>>>> INFOG(23) (after analysis: value of ICNTL(6) >>>>>> effectively used): 0 >>>>>> INFOG(24) (after analysis: value of ICNTL(12) >>>>>> effectively used): 1 >>>>>> INFOG(25) (after factorization: number of pivots >>>>>> modified by static pivoting): 0 >>>>>> INFOG(28) (after factorization: number of null pivots >>>>>> encountered): 0 >>>>>> INFOG(29) (after factorization: effective number of >>>>>> entries in the factors (sum over all processors)): 2931438 >>>>>> INFOG(30, 31) (after solution: size in Mbytes of memory >>>>>> used 
during solution phase): 0, 0 >>>>>> INFOG(32) (after analysis: type of analysis done): 1 >>>>>> INFOG(33) (value used for ICNTL(8)): 7 >>>>>> INFOG(34) (exponent of the determinant if determinant >>>>>> is requested): 0 >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 4 MPI processes >>>>>> type: mpiaij >>>>>> rows=22878, cols=22878 >>>>>> total: nonzeros=1219140, allocated nonzeros=1219140 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> using I-node (on process 0) routines: found 1889 nodes, limit >>>>>> used is 5 >>>>>> converged reason: -11 >>>>>> >>>>>> ------------------------------------------------------------ >>>>>> ----------------------------------------- >>>>>> >>>>> >>>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex10.c Type: application/octet-stream Size: 20232 bytes Desc: not available URL: From mailinglists at xgm.de Tue Sep 20 06:44:01 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Tue, 20 Sep 2016 13:44:01 +0200 Subject: [petsc-users] Computing condition number Message-ID: <6b19c620-7621-790e-799c-b2a2e89db896@xgm.de> Hello, to compute / approximate the condition number of a MATSBAIJ, I put -ksp_view # Conditon number estimate -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 in my .petscrc Output is like: [...] 566 KSP Residual norm 1.241765807317e-07 % max 7.020130499234e+02 min 6.429054752025e-04 max/min 1.091938203983e+06 567 KSP Residual norm 1.219340847328e-07 % max 7.020130499278e+02 min 6.423976501709e-04 max/min 1.092801397609e+06 568 KSP Residual norm 1.198886059519e-07 % max 7.020130499320e+02 min 6.419283878172e-04 max/min 1.093600257062e+06 569 KSP Residual norm 1.178377018879e-07 % max 7.020130499362e+02 min 6.414517591235e-04 max/min 1.094412853268e+06 KSP Object: 8 MPI processes type: gmres GMRES: restart=1000, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-09, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test PC Object: 8 MPI processes type: none linear system matrix = precond matrix: Mat Object: C 8 MPI processes type: mpisbaij rows=14404, cols=14404 total: nonzeros=1059188, allocated nonzeros=1244312 total number of mallocs used during MatSetValues calls =75272 block size is 1 (0) 13:30:20 [precice::impl::SolverInterfaceImpl]:380 in initialize: it 1 of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | ongoing yes | dt complete no | (0) 13:30:20 [precice::mapping::PetRadialBasisFctMapping]:519 in map: Mapping Data consistent from MeshA (ID 0) to MeshB (ID 1) 0 KSP Residual norm 1.178378120697e-07 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00 KSP Object: 8 MPI processes type: gmres GMRES: restart=1000, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-09, absolute=1e-50, divergence=10000. 
left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test PC Object: 8 MPI processes type: none linear system matrix = precond matrix: Mat Object: C 8 MPI processes type: mpisbaij rows=14404, cols=14404 total: nonzeros=1059188, allocated nonzeros=1244312 total number of mallocs used during MatSetValues calls =75272 block size is 1 The approximate condition number is the max/min value, 1.094412853268e+06 here? Just make sure my mathematical illerateracy does not spoil my report... Best thanks, Florian From knepley at gmail.com Tue Sep 20 06:45:13 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 Sep 2016 06:45:13 -0500 Subject: [petsc-users] Computing condition number In-Reply-To: <6b19c620-7621-790e-799c-b2a2e89db896@xgm.de> References: <6b19c620-7621-790e-799c-b2a2e89db896@xgm.de> Message-ID: On Tue, Sep 20, 2016 at 6:44 AM, Florian Lindner wrote: > Hello, > > to compute / approximate the condition number of a MATSBAIJ, I put > > -ksp_view > > # Conditon number estimate > -pc_type none > -ksp_type gmres > -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > > in my .petscrc > > Output is like: > > [...] > 566 KSP Residual norm 1.241765807317e-07 % max 7.020130499234e+02 min > 6.429054752025e-04 max/min 1.091938203983e+06 > 567 KSP Residual norm 1.219340847328e-07 % max 7.020130499278e+02 min > 6.423976501709e-04 max/min 1.092801397609e+06 > 568 KSP Residual norm 1.198886059519e-07 % max 7.020130499320e+02 min > 6.419283878172e-04 max/min 1.093600257062e+06 > 569 KSP Residual norm 1.178377018879e-07 % max 7.020130499362e+02 min > 6.414517591235e-04 max/min 1.094412853268e+06 > KSP Object: 8 MPI processes > type: gmres > GMRES: restart=1000, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: 8 MPI processes > type: none > linear system matrix = precond matrix: > Mat Object: C 8 MPI processes > type: mpisbaij > rows=14404, cols=14404 > total: nonzeros=1059188, allocated nonzeros=1244312 > total number of mallocs used during MatSetValues calls =75272 > block size is 1 > (0) 13:30:20 [precice::impl::SolverInterfaceImpl]:380 in initialize: it 1 > of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | > ongoing yes | dt complete no | > (0) 13:30:20 [precice::mapping::PetRadialBasisFctMapping]:519 in map: > Mapping Data consistent from MeshA (ID 0) to MeshB > (ID 1) > 0 KSP Residual norm 1.178378120697e-07 % max 1.000000000000e+00 min > 1.000000000000e+00 max/min 1.000000000000e+00 > KSP Object: 8 MPI processes > type: gmres > GMRES: restart=1000, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: 8 MPI processes > type: none > linear system matrix = precond matrix: > Mat Object: C 8 MPI processes > type: mpisbaij > rows=14404, cols=14404 > total: nonzeros=1059188, allocated nonzeros=1244312 > total number of mallocs used during MatSetValues calls =75272 > block size is 1 > > > The approximate condition number is the max/min value, 1.094412853268e+06 > here? > Yes. 
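For reference, the same estimate can also be obtained from code rather than from the monitor lines; a minimal sketch follows. It is not taken from this exchange: the helper name EstimateConditionNumber is illustrative, and it assumes the KSP already has the MATSBAIJ operator attached. With -pc_type none the reported singular values are those of the operator itself, and the max/min ratio is only a trustworthy 2-norm estimate if GMRES runs enough iterations without restarting.

#include <petscksp.h>

PetscErrorCode EstimateConditionNumber(KSP ksp, Vec b, Vec x)
{
  PC             pc;
  PetscReal      emax, emin;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPSetType(ksp, KSPGMRES);CHKERRQ(ierr);
  ierr = KSPGMRESSetRestart(ksp, 1000);CHKERRQ(ierr);                /* same as -ksp_gmres_restart 1000 */
  ierr = KSPSetComputeSingularValues(ksp, PETSC_TRUE);CHKERRQ(ierr); /* programmatic counterpart of -ksp_monitor_singular_value, without the per-iteration printing */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCNONE);CHKERRQ(ierr);                        /* same as -pc_type none */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPComputeExtremeSingularValues(ksp, &emax, &emin);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "condition number estimate: %g\n", (double)(emax/emin));CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The large restart matters for the same reason as on the command line: once GMRES restarts, the singular value estimates only describe the last restart cycle.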
Thanks, Matt > Just make sure my mathematical illerateracy does not spoil my report... > > > Best thanks, > Florian > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Tue Sep 20 09:50:49 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Tue, 20 Sep 2016 10:50:49 -0400 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: On Mon, Sep 19, 2016 at 11:04 PM, Hong wrote: > David : > I did following: > > PC pc; > Mat F; > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > ierr = PCReset(pc);CHKERRQ(ierr); > ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr); > ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); > > ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ( > ierr); > ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr); > ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr); > ierr = MatMumpsSetIcntl(F,14,30);CHKERRQ(ierr); > > ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); > > Then it resolves the matrix equation with ICNTL(14)=30. > Attached is modified petsc/src/ksp/ksp/examples/tutorials/ex10.c. > Using in with your matrix.dat, I get > > mpiexec -n 4 ./ex10 -f0 matrix.dat -rhs 0 -ksp_reason > Number of iterations = 0 > KSPConvergedReason: -11 > Reset PC with ICNTL(14)=30 ... > KSPConvergedReason: 2 > Hong, Thanks very much for your test code. I get the same output as you when I run "mpiexec -n 4 ./ex10 -f0 matrix.dat -rhs 0 -ksp_reason". However, I used KSPPREONLY in my original test code, and if I add KSPSetType(ksp,KSPPREONLY) in your modified exc10.c after the line KSPCreate(PETSC_COMM_WORLD,&ksp) then I get the following output: mpiexec -np 4 ./mumps_test-opt -f0 matrix.dat -rhs 0 -ksp_reason Number of iterations = 0 KSPConvergedReason: -11 Reset PC with ICNTL(14)=30 ... KSPConvergedReason: -11 So it seems like the icntl data is not being updated when we use PREONLY. Do you know how to fix this? Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue Sep 20 13:01:52 2016 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 20 Sep 2016 13:01:52 -0500 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: David : This is a bug in PETSc. A change $ git diff ../../../pc/impls/factor/cholesky/cholesky.c diff --git a/src/ksp/pc/impls/factor/cholesky/cholesky.c b/src/ksp/pc/impls/factor/cholesky/cholesky.c index 953d551..cc28369 100644 --- a/src/ksp/pc/impls/factor/cholesky/cholesky.c +++ b/src/ksp/pc/impls/factor/cholesky/cholesky.c @@ -141,9 +141,7 @@ static PetscErrorCode PCSetUp_Cholesky(PC pc) ierr = MatCholeskyFactorNumeric(((PC_Factor*)dir)->fact,pc->pmat,&((PC_Factor*)dir)->info);CHKERRQ(ierr); ierr = MatFactorGetError(((PC_Factor*)dir)->fact,&err);CHKERRQ(ierr); - if (err) { /* FactorNumeric() fails */ - pc->failedreason = (PCFailedReason)err; - } + pc->failedreason = (PCFailedReason)err; } fixed the problem. I'll fix this problem in petsc-release, including other routines. Thanks for reporting the bug and sending matrix.dat. Let us know whenever you encounter problem using PETSc. 
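Putting the pieces of this thread together, the automatic ICNTL(14) retry that started the thread could then be sketched roughly as below. The loop bound, the increment of 10 and the variable names are illustrative assumptions; ksp, A, b and x are as in the test code above, and the sketch presumes a PETSc with this fix, so that a failed MUMPS numeric factorization is visible through KSPGetConvergedReason() instead of being silently ignored.

  PC                 pc;
  Mat                F;
  KSPConvergedReason reason;
  PetscInt           icntl14 = 20;   /* MUMPS default workspace increase (%) */
  PetscInt           attempt;
  PetscErrorCode     ierr;

  for (attempt = 0; attempt < 5; attempt++) {
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCReset(pc);CHKERRQ(ierr);                             /* force a fresh numeric factorization */
    ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr);
    ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr);
    ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr);
    ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr);
    ierr = MatMumpsSetIcntl(F,14,icntl14);CHKERRQ(ierr);

    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = KSPGetConvergedReason(ksp,&reason);CHKERRQ(ierr);
    if (reason > 0) break;           /* converged; a value of -11 here means the factorization failed */
    icntl14 += 10;                   /* retry with more MUMPS workspace */
  }

The first pass uses the MUMPS default of 20, so a matrix that factors with the default costs nothing extra; only on failure is the preconditioner rebuilt with a larger workspace.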
Hong On Mon, Sep 19, 2016 at 11:04 PM, Hong wrote: > >> David : >> I did following: >> >> PC pc; >> Mat F; >> ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); >> ierr = PCReset(pc);CHKERRQ(ierr); >> ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr); >> ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); >> >> ierr = PCFactorSetMatSolverPackage(pc >> ,MATSOLVERMUMPS);CHKERRQ(ierr); >> ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr); >> ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr); >> ierr = MatMumpsSetIcntl(F,14,30);CHKERRQ(ierr); >> >> ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); >> >> Then it resolves the matrix equation with ICNTL(14)=30. >> Attached is modified petsc/src/ksp/ksp/examples/tutorials/ex10.c. >> Using in with your matrix.dat, I get >> >> mpiexec -n 4 ./ex10 -f0 matrix.dat -rhs 0 -ksp_reason >> Number of iterations = 0 >> KSPConvergedReason: -11 >> Reset PC with ICNTL(14)=30 ... >> KSPConvergedReason: 2 >> > > Hong, > > Thanks very much for your test code. I get the same output as you when I > run "mpiexec -n 4 ./ex10 -f0 matrix.dat -rhs 0 -ksp_reason". > > However, I used KSPPREONLY in my original test code, and if I > add KSPSetType(ksp,KSPPREONLY) in your modified exc10.c after the line > KSPCreate(PETSC_COMM_WORLD,&ksp) then I get the following output: > > mpiexec -np 4 ./mumps_test-opt -f0 matrix.dat -rhs 0 -ksp_reason > Number of iterations = 0 > KSPConvergedReason: -11 > Reset PC with ICNTL(14)=30 ... > KSPConvergedReason: -11 > > So it seems like the icntl data is not being updated when we use PREONLY. > Do you know how to fix this? > > Thanks, > David > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Tue Sep 20 13:08:38 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Tue, 20 Sep 2016 14:08:38 -0400 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: On Tue, Sep 20, 2016 at 2:01 PM, Hong wrote: > David : > This is a bug in PETSc. A change > $ git diff ../../../pc/impls/factor/cholesky/cholesky.c > diff --git a/src/ksp/pc/impls/factor/cholesky/cholesky.c > b/src/ksp/pc/impls/factor/cholesky/cholesky.c > index 953d551..cc28369 100644 > --- a/src/ksp/pc/impls/factor/cholesky/cholesky.c > +++ b/src/ksp/pc/impls/factor/cholesky/cholesky.c > @@ -141,9 +141,7 @@ static PetscErrorCode PCSetUp_Cholesky(PC pc) > > ierr = MatCholeskyFactorNumeric(((PC_Factor*)dir)->fact,pc->pmat,&( > (PC_Factor*)dir)->info);CHKERRQ(ierr); > ierr = MatFactorGetError(((PC_Factor*)dir)->fact,&err);CHKERRQ(ierr); > - if (err) { /* FactorNumeric() fails */ > - pc->failedreason = (PCFailedReason)err; > - } > + pc->failedreason = (PCFailedReason)err; > } > > fixed the problem. I'll fix this problem in petsc-release, including other > routines. > Thanks for reporting the bug and sending matrix.dat. Let us know whenever > you encounter problem using PETSc. > OK, great, thanks for the fix. Will this fix be included in the next patch release of 3.7? Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue Sep 20 13:24:03 2016 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 20 Sep 2016 13:24:03 -0500 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: David: I'll patch petsc-maint (v3.7), then merge it to petsc-master. It might take 1-2 days for regression tests. Hong On Tue, Sep 20, 2016 at 2:01 PM, Hong wrote: > >> David : >> This is a bug in PETSc. 
A change >> $ git diff ../../../pc/impls/factor/cholesky/cholesky.c >> diff --git a/src/ksp/pc/impls/factor/cholesky/cholesky.c >> b/src/ksp/pc/impls/factor/cholesky/cholesky.c >> index 953d551..cc28369 100644 >> --- a/src/ksp/pc/impls/factor/cholesky/cholesky.c >> +++ b/src/ksp/pc/impls/factor/cholesky/cholesky.c >> @@ -141,9 +141,7 @@ static PetscErrorCode PCSetUp_Cholesky(PC pc) >> >> ierr = MatCholeskyFactorNumeric(((PC_Factor*)dir)->fact,pc->pmat,&( >> (PC_Factor*)dir)->info);CHKERRQ(ierr); >> ierr = MatFactorGetError(((PC_Factor*)dir)->fact,&err);CHKERRQ(ierr >> ); >> - if (err) { /* FactorNumeric() fails */ >> - pc->failedreason = (PCFailedReason)err; >> - } >> + pc->failedreason = (PCFailedReason)err; >> } >> >> fixed the problem. I'll fix this problem in petsc-release, including >> other routines. >> Thanks for reporting the bug and sending matrix.dat. Let us know whenever >> you encounter problem using PETSc. >> > > > OK, great, thanks for the fix. > > Will this fix be included in the next patch release of 3.7? > > Thanks, > David > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.knezevic at akselos.com Wed Sep 21 16:48:48 2016 From: david.knezevic at akselos.com (David Knezevic) Date: Wed, 21 Sep 2016 17:48:48 -0400 Subject: [petsc-users] Issue updating MUMPS ictnl after failed solve In-Reply-To: References: Message-ID: On Tue, Sep 20, 2016 at 2:24 PM, Hong wrote: > David: > I'll patch petsc-maint (v3.7), then merge it to petsc-master. > It might take 1-2 days for regression tests. > > I pulled the patched version and it works well for me now, thanks! David > On Tue, Sep 20, 2016 at 2:01 PM, Hong wrote: >> >>> David : >>> This is a bug in PETSc. A change >>> $ git diff ../../../pc/impls/factor/cholesky/cholesky.c >>> diff --git a/src/ksp/pc/impls/factor/cholesky/cholesky.c >>> b/src/ksp/pc/impls/factor/cholesky/cholesky.c >>> index 953d551..cc28369 100644 >>> --- a/src/ksp/pc/impls/factor/cholesky/cholesky.c >>> +++ b/src/ksp/pc/impls/factor/cholesky/cholesky.c >>> @@ -141,9 +141,7 @@ static PetscErrorCode PCSetUp_Cholesky(PC pc) >>> >>> ierr = MatCholeskyFactorNumeric(((PC_Factor*)dir)->fact,pc->pmat,&( >>> (PC_Factor*)dir)->info);CHKERRQ(ierr); >>> ierr = MatFactorGetError(((PC_Factor*)dir)->fact,&err);CHKERRQ(ierr >>> ); >>> - if (err) { /* FactorNumeric() fails */ >>> - pc->failedreason = (PCFailedReason)err; >>> - } >>> + pc->failedreason = (PCFailedReason)err; >>> } >>> >>> fixed the problem. I'll fix this problem in petsc-release, including >>> other routines. >>> Thanks for reporting the bug and sending matrix.dat. Let us know >>> whenever you encounter problem using PETSc. >>> >> >> >> OK, great, thanks for the fix. >> >> Will this fix be included in the next patch release of 3.7? >> >> Thanks, >> David >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mailinglists at xgm.de Thu Sep 22 05:42:15 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Thu, 22 Sep 2016 12:42:15 +0200 Subject: [petsc-users] Write binary to matrix Message-ID: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> Hello, I want to write a MATSBAIJ to a file in binary, so that I can load it later using MatLoad. However, I keep getting the error: [5]PETSC ERROR: No support for this operation for this object type! 
[5]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets unless SubViewer contains the rank 0 process [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/binv.c The rank 0 is included, as you can see below, I use PETSC_COMM_WORLD and the matrix is also created like that. The code looks like: PetscErrorCode ierr = 0; PetscViewer viewer; PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), FILE_MODE_WRITE, &viewer); CHKERRV(ierr); MatView(matrix, viewer); CHKERRV(ierr); PetscViewerDestroy(&viewer); Thanks, Florian From dave.mayhem23 at gmail.com Thu Sep 22 05:53:40 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 22 Sep 2016 12:53:40 +0200 Subject: [petsc-users] Write binary to matrix In-Reply-To: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> Message-ID: On Thursday, 22 September 2016, Florian Lindner > wrote: > Hello, > > I want to write a MATSBAIJ to a file in binary, so that I can load it > later using MatLoad. > > However, I keep getting the error: > > [5]PETSC ERROR: No support for this operation for this object type! > [5]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets > unless SubViewer contains the rank 0 process > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in > /data/scratch/lindnefn/software/petsc/src/sys/classes/ > viewer/impls/binary/binv.c > > The rank 0 is included, as you can see below, I use PETSC_COMM_WORLD and > the matrix is also created like that. > > The code looks like: > > PetscErrorCode ierr = 0; > PetscViewer viewer; > PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), > FILE_MODE_WRITE, &viewer); CHKERRV(ierr); > MatView(matrix, viewer); CHKERRV(ierr); > PetscViewerDestroy(&viewer); The code snippet looks weird. The error could be related to your usage of the error checking macros, Eg the fact you set ierr to zero rather than assigning it to the return value of your petsc function calls. You should do ierr = petscfunc();CHQERRQ(ierr); And why do you use CHQERRV and not CHKERRQ? Thanks Dave > > Thanks, > Florian > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 22 06:32:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 Sep 2016 06:32:01 -0500 Subject: [petsc-users] Write binary to matrix In-Reply-To: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> Message-ID: On Thu, Sep 22, 2016 at 5:42 AM, Florian Lindner wrote: > Hello, > > I want to write a MATSBAIJ to a file in binary, so that I can load it > later using MatLoad. > > However, I keep getting the error: > > [5]PETSC ERROR: No support for this operation for this object type! > [5]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets > unless SubViewer contains the rank 0 process > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/ > binv.c > Do not truncate the stack. Run under valgrind. Thanks, Matt > The rank 0 is included, as you can see below, I use PETSC_COMM_WORLD and > the matrix is also created like that. 
> > The code looks like: > > PetscErrorCode ierr = 0; > PetscViewer viewer; > PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), > FILE_MODE_WRITE, &viewer); CHKERRV(ierr); > MatView(matrix, viewer); CHKERRV(ierr); > PetscViewerDestroy(&viewer); > > Thanks, > Florian > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mailinglists at xgm.de Thu Sep 22 08:32:09 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Thu, 22 Sep 2016 15:32:09 +0200 Subject: [petsc-users] Write binary to matrix In-Reply-To: References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> Message-ID: <26501f48-2d44-04b4-e193-aeda3ee9c64b@xgm.de> Am 22.09.2016 um 12:53 schrieb Dave May: > > > On Thursday, 22 September 2016, Florian Lindner > wrote: > > Hello, > > I want to write a MATSBAIJ to a file in binary, so that I can load it later using MatLoad. > > However, I keep getting the error: > > [5]PETSC ERROR: No support for this operation for this object type! > [5]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets unless SubViewer contains the rank 0 process > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/binv.c > > The rank 0 is included, as you can see below, I use PETSC_COMM_WORLD and the matrix is also created like that. > > The code looks like: > > PetscErrorCode ierr = 0; > PetscViewer viewer; > PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), FILE_MODE_WRITE, &viewer); CHKERRV(ierr); > MatView(matrix, viewer); CHKERRV(ierr); > PetscViewerDestroy(&viewer); > > > The code snippet looks weird. > > The error could be related to your usage of the error checking macros, Eg the fact you set ierr to zero rather than > assigning it to the return value of your petsc function calls. > > You should do > ierr = petscfunc();CHQERRQ(ierr); > > And why do you use CHQERRV and not CHKERRQ? Hey, sorry, I copied that code from different locations and edited it, incompletely. Unfortunatly I was unable to reproduce the problem with a small snippet, see my other (coming) mail in this thread. Best, Florian From mailinglists at xgm.de Thu Sep 22 09:12:22 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Thu, 22 Sep 2016 16:12:22 +0200 Subject: [petsc-users] Write binary to matrix In-Reply-To: References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> Message-ID: <2f78a83c-650c-b6b7-914b-8ad76cb785b0@xgm.de> Hey, this code reproduces the error when run with 2 or more ranks. 
#include #include int main(int argc, char *argv[]) { PetscInitialize(&argc, &argv, "", NULL); Mat matrix; MatCreate(PETSC_COMM_WORLD, &matrix); MatSetType(matrix, MATSBAIJ); MatSetSizes(matrix, 10, 10, PETSC_DETERMINE, PETSC_DETERMINE); MatSetFromOptions(matrix); MatSetUp(matrix); MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); PetscViewer viewer; PetscViewerBinaryOpen(PETSC_COMM_WORLD, "test.mat", FILE_MODE_WRITE, &viewer); MatView(matrix, viewer); PetscViewerDestroy(&viewer); MatDestroy(&matrix); PetscFinalize(); } The complete output is: lindnefn at neon /data/scratch/lindnefn/aste (git)-[master] % mpic++ petsc.cpp -lpetsc && mpirun -n 2 ./a.out [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets unless SubViewer contains the rank 0 process [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named neon by lindnefn Thu Sep 22 16:10:34 2016 [0]PETSC ERROR: Configure options --with-debugging=1 --download-petsc4py=yes --download-mpi4py=yes --download-superlu_dist --download-parmetis --download-metis [0]PETSC ERROR: #1 PetscViewerGetSubViewer_Binary() line 46 in /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/binv.c [0]PETSC ERROR: #2 PetscViewerGetSubViewer() line 43 in /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/interface/dupl.c [0]PETSC ERROR: #3 MatView_MPISBAIJ_ASCIIorDraworSocket() line 900 in /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/mpi/mpisbaij.c [0]PETSC ERROR: #4 MatView_MPISBAIJ() line 926 in /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/mpi/mpisbaij.c [0]PETSC ERROR: #5 MatView() line 901 in /data/scratch/lindnefn/software/petsc/src/mat/interface/matrix.c WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! 
Option left: name:-ksp_converged_reason (no value) Option left: name:-ksp_final_residual (no value) Option left: name:-ksp_view (no value) [neon:113111] *** Process received signal *** [neon:113111] Signal: Aborted (6) [neon:113111] Signal code: (-6) [neon:113111] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7feed8958cb0] [neon:113111] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7feed8958c37] [neon:113111] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7feed895c028] [neon:113111] [ 3] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscTraceBackErrorHandler+0x563) [0x7feed8d8db31] [neon:113111] [ 4] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscError+0x374) [0x7feed8d88750] [neon:113111] [ 5] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x19b2f6) [0x7feed8e822f6] [neon:113111] [ 6] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscViewerGetSubViewer+0x4f1) [0x7feed8e803cb] [neon:113111] [ 7] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x860c95) [0x7feed9547c95] [neon:113111] [ 8] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x861494) [0x7feed9548494] [neon:113111] [ 9] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(MatView+0x12b6) [0x7feed971c08f] [neon:113111] [10] ./a.out() [0x400b8b] [neon:113111] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7feed8943f45] [neon:113111] [12] ./a.out() [0x4009e9] [neon:113111] *** End of error message *** -------------------------------------------------------------------------- mpirun noticed that process rank 1 with PID 113111 on node neon exited on signal 6 (Aborted). -------------------------------------------------------------------------- Thanks, Florian Am 22.09.2016 um 13:32 schrieb Matthew Knepley: > On Thu, Sep 22, 2016 at 5:42 AM, Florian Lindner > wrote: > > Hello, > > I want to write a MATSBAIJ to a file in binary, so that I can load it later using MatLoad. > > However, I keep getting the error: > > [5]PETSC ERROR: No support for this operation for this object type! > [5]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets unless SubViewer contains the rank 0 process > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/binv.c > > > Do not truncate the stack. > > Run under valgrind. > > Thanks, > > Matt > > > The rank 0 is included, as you can see below, I use PETSC_COMM_WORLD and the matrix is also created like that. > > The code looks like: > > PetscErrorCode ierr = 0; > PetscViewer viewer; > PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), FILE_MODE_WRITE, &viewer); CHKERRV(ierr); > MatView(matrix, viewer); CHKERRV(ierr); > PetscViewerDestroy(&viewer); > > Thanks, > Florian > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any > results to which their experiments lead. 
> -- Norbert Wiener From hzhang at mcs.anl.gov Thu Sep 22 11:34:43 2016 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 22 Sep 2016 11:34:43 -0500 Subject: [petsc-users] Write binary to matrix In-Reply-To: <2f78a83c-650c-b6b7-914b-8ad76cb785b0@xgm.de> References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> <2f78a83c-650c-b6b7-914b-8ad76cb785b0@xgm.de> Message-ID: Florian: Would it work if replacing MATSBAIJ to MATAIJ or MATMPISBAIJ? Hong Hey, > > this code reproduces the error when run with 2 or more ranks. > > #include > #include > > int main(int argc, char *argv[]) > { > PetscInitialize(&argc, &argv, "", NULL); > > Mat matrix; > MatCreate(PETSC_COMM_WORLD, &matrix); > MatSetType(matrix, MATSBAIJ); > MatSetSizes(matrix, 10, 10, PETSC_DETERMINE, PETSC_DETERMINE); > MatSetFromOptions(matrix); > MatSetUp(matrix); > > MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); > > PetscViewer viewer; > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "test.mat", FILE_MODE_WRITE, > &viewer); > MatView(matrix, viewer); > PetscViewerDestroy(&viewer); > MatDestroy(&matrix); > > PetscFinalize(); > } > > > The complete output is: > > > lindnefn at neon /data/scratch/lindnefn/aste (git)-[master] % mpic++ > petsc.cpp -lpetsc && mpirun -n 2 ./a.out > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets > unless SubViewer contains the rank 0 process > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named neon by lindnefn > Thu Sep 22 16:10:34 2016 > [0]PETSC ERROR: Configure options --with-debugging=1 > --download-petsc4py=yes --download-mpi4py=yes > --download-superlu_dist --download-parmetis --download-metis > [0]PETSC ERROR: #1 PetscViewerGetSubViewer_Binary() line 46 in > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/ > binv.c > [0]PETSC ERROR: #2 PetscViewerGetSubViewer() line 43 in > /data/scratch/lindnefn/software/petsc/src/sys/ > classes/viewer/interface/dupl.c > [0]PETSC ERROR: #3 MatView_MPISBAIJ_ASCIIorDraworSocket() line 900 in > /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/mpi/mpisbaij.c > [0]PETSC ERROR: #4 MatView_MPISBAIJ() line 926 in /data/scratch/lindnefn/ > software/petsc/src/mat/impls/sbaij/mpi/mpisbaij.c > [0]PETSC ERROR: #5 MatView() line 901 in /data/scratch/lindnefn/ > software/petsc/src/mat/interface/matrix.c > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! 
> Option left: name:-ksp_converged_reason (no value) > Option left: name:-ksp_final_residual (no value) > Option left: name:-ksp_view (no value) > [neon:113111] *** Process received signal *** > [neon:113111] Signal: Aborted (6) > [neon:113111] Signal code: (-6) > [neon:113111] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) > [0x7feed8958cb0] > [neon:113111] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) > [0x7feed8958c37] > [neon:113111] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) > [0x7feed895c028] > [neon:113111] [ 3] > /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(PetscTraceBackErrorHandler+0x563) > [0x7feed8d8db31] > [neon:113111] [ 4] /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(PetscError+0x374) > [0x7feed8d88750] > [neon:113111] [ 5] /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(+0x19b2f6) [0x7feed8e822f6] > [neon:113111] [ 6] > /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(PetscViewerGetSubViewer+0x4f1) > [0x7feed8e803cb] > [neon:113111] [ 7] /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(+0x860c95) [0x7feed9547c95] > [neon:113111] [ 8] /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(+0x861494) [0x7feed9548494] > [neon:113111] [ 9] /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(MatView+0x12b6) > [0x7feed971c08f] > [neon:113111] [10] ./a.out() [0x400b8b] > [neon:113111] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) > [0x7feed8943f45] > [neon:113111] [12] ./a.out() [0x4009e9] > [neon:113111] *** End of error message *** > -------------------------------------------------------------------------- > mpirun noticed that process rank 1 with PID 113111 on node neon exited on > signal 6 (Aborted). > -------------------------------------------------------------------------- > > Thanks, > Florian > > > > Am 22.09.2016 um 13:32 schrieb Matthew Knepley: > > On Thu, Sep 22, 2016 at 5:42 AM, Florian Lindner > wrote: > > > > Hello, > > > > I want to write a MATSBAIJ to a file in binary, so that I can load > it later using MatLoad. > > > > However, I keep getting the error: > > > > [5]PETSC ERROR: No support for this operation for this object type! > > [5]PETSC ERROR: Cannot get subcomm viewer for binary files or > sockets unless SubViewer contains the rank 0 process > > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in > > /data/scratch/lindnefn/software/petsc/src/sys/ > classes/viewer/impls/binary/binv.c > > > > > > Do not truncate the stack. > > > > Run under valgrind. > > > > Thanks, > > > > Matt > > > > > > The rank 0 is included, as you can see below, I use PETSC_COMM_WORLD > and the matrix is also created like that. > > > > The code looks like: > > > > PetscErrorCode ierr = 0; > > PetscViewer viewer; > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), > FILE_MODE_WRITE, &viewer); CHKERRV(ierr); > > MatView(matrix, viewer); CHKERRV(ierr); > > PetscViewerDestroy(&viewer); > > > > Thanks, > > Florian > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any > > results to which their experiments lead. > > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mono at dtu.dk Fri Sep 23 03:48:31 2016 From: mono at dtu.dk (=?utf-8?B?TW9ydGVuIE5vYmVsLUrDuHJnZW5zZW4=?=) Date: Fri, 23 Sep 2016 08:48:31 +0000 Subject: [petsc-users] DMPlex problem In-Reply-To: References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk>, Message-ID: <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> Dear PETSc developers Any update on this issue regarding DMPlex? Or is there any obvious workaround that we are unaware of? Also should we additionally register the issue on Bitbucket or is reporting the issue on the mailing list enough? Kind regards, Morten ________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Friday, September 09, 2016 12:21 PM To: Morten Nobel-J?rgensen Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen > wrote: Dear PETSc developers and users, Last week we posted a question regarding an error with DMPlex and multiple dofs and have not gotten any feedback yet. This is uncharted waters for us, since we have gotten used to an extremely fast feedback from the PETSc crew. So - with the chance of sounding impatient and ungrateful - we would like to hear if anybody has any ideas that could point us in the right direction? This is my fault. You have not gotten a response because everyone else was waiting for me, and I have been slow because I just moved houses at the same time as term started here. Sorry about that. The example ran for me and I saw your problem. The local-tp-global map is missing for some reason. I am tracking it down now. It should be made by DMCreateMatrix(), so this is mysterious. I hope to have this fixed by early next week. Thanks, Matt We have created a small example problem that demonstrates the error in the matrix assembly. Thanks, Morten -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mailinglists at xgm.de Fri Sep 23 04:15:13 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Fri, 23 Sep 2016 11:15:13 +0200 Subject: [petsc-users] Write binary to matrix In-Reply-To: References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> <2f78a83c-650c-b6b7-914b-8ad76cb785b0@xgm.de> Message-ID: <0ab6d96e-2056-5683-2d9f-6aac46fd86b8@xgm.de> Am 22.09.2016 um 18:34 schrieb Hong: > Florian: > Would it work if replacing MATSBAIJ to MATAIJ or MATMPISBAIJ? MATAIJ works, but is not an option for my actual application. MATMPISBAIJ does not work. Not very suprisingly, since afaik setting it to MATSBAIJ and executing it on multiple MPI ranks actually results in MATMPISBAIJ. Best, Florian > > Hong > > Hey, > > this code reproduces the error when run with 2 or more ranks. 
> > #include > #include > > int main(int argc, char *argv[]) > { > PetscInitialize(&argc, &argv, "", NULL); > > Mat matrix; > MatCreate(PETSC_COMM_WORLD, &matrix); > MatSetType(matrix, MATSBAIJ); > MatSetSizes(matrix, 10, 10, PETSC_DETERMINE, PETSC_DETERMINE); > MatSetFromOptions(matrix); > MatSetUp(matrix); > > MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); > > PetscViewer viewer; > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "test.mat", FILE_MODE_WRITE, &viewer); > MatView(matrix, viewer); > PetscViewerDestroy(&viewer); > MatDestroy(&matrix); > > PetscFinalize(); > } > > > The complete output is: > > > lindnefn at neon /data/scratch/lindnefn/aste (git)-[master] % mpic++ petsc.cpp -lpetsc && mpirun -n 2 ./a.out > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets unless SubViewer contains the rank 0 process > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named neon by lindnefn Thu Sep 22 16:10:34 2016 > [0]PETSC ERROR: Configure options --with-debugging=1 --download-petsc4py=yes --download-mpi4py=yes > --download-superlu_dist --download-parmetis --download-metis > [0]PETSC ERROR: #1 PetscViewerGetSubViewer_Binary() line 46 in > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/binv.c > [0]PETSC ERROR: #2 PetscViewerGetSubViewer() line 43 in > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/interface/dupl.c > [0]PETSC ERROR: #3 MatView_MPISBAIJ_ASCIIorDraworSocket() line 900 in > /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/mpi/mpisbaij.c > [0]PETSC ERROR: #4 MatView_MPISBAIJ() line 926 in > /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/mpi/mpisbaij.c > [0]PETSC ERROR: #5 MatView() line 901 in /data/scratch/lindnefn/software/petsc/src/mat/interface/matrix.c > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! 
> Option left: name:-ksp_converged_reason (no value) > Option left: name:-ksp_final_residual (no value) > Option left: name:-ksp_view (no value) > [neon:113111] *** Process received signal *** > [neon:113111] Signal: Aborted (6) > [neon:113111] Signal code: (-6) > [neon:113111] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7feed8958cb0] > [neon:113111] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7feed8958c37] > [neon:113111] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7feed895c028] > [neon:113111] [ 3] > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscTraceBackErrorHandler+0x563) > [0x7feed8d8db31] > [neon:113111] [ 4] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscError+0x374) > [0x7feed8d88750] > [neon:113111] [ 5] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x19b2f6) > [0x7feed8e822f6] > [neon:113111] [ 6] > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscViewerGetSubViewer+0x4f1) > [0x7feed8e803cb] > [neon:113111] [ 7] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x860c95) > [0x7feed9547c95] > [neon:113111] [ 8] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x861494) > [0x7feed9548494] > [neon:113111] [ 9] /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(MatView+0x12b6) > [0x7feed971c08f] > [neon:113111] [10] ./a.out() [0x400b8b] > [neon:113111] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7feed8943f45] > [neon:113111] [12] ./a.out() [0x4009e9] > [neon:113111] *** End of error message *** > -------------------------------------------------------------------------- > mpirun noticed that process rank 1 with PID 113111 on node neon exited on signal 6 (Aborted). > -------------------------------------------------------------------------- > > Thanks, > Florian > > > > Am 22.09.2016 um 13:32 schrieb Matthew Knepley: > > On Thu, Sep 22, 2016 at 5:42 AM, Florian Lindner > >> wrote: > > > > Hello, > > > > I want to write a MATSBAIJ to a file in binary, so that I can load it later using MatLoad. > > > > However, I keep getting the error: > > > > [5]PETSC ERROR: No support for this operation for this object type! > > [5]PETSC ERROR: Cannot get subcomm viewer for binary files or sockets unless SubViewer contains the rank 0 process > > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in > > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/binv.c > > > > > > Do not truncate the stack. > > > > Run under valgrind. > > > > Thanks, > > > > Matt > > > > > > The rank 0 is included, as you can see below, I use PETSC_COMM_WORLD and the matrix is also created like that. > > > > The code looks like: > > > > PetscErrorCode ierr = 0; > > PetscViewer viewer; > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), FILE_MODE_WRITE, &viewer); CHKERRV(ierr); > > MatView(matrix, viewer); CHKERRV(ierr); > > PetscViewerDestroy(&viewer); > > > > Thanks, > > Florian > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any > > results to which their experiments lead. 
> > -- Norbert Wiener > > From knepley at gmail.com Fri Sep 23 07:45:54 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Sep 2016 07:45:54 -0500 Subject: [petsc-users] DMPlex problem In-Reply-To: <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> Message-ID: On Fri, Sep 23, 2016 at 3:48 AM, Morten Nobel-J?rgensen wrote: > Dear PETSc developers > > Any update on this issue regarding DMPlex? Or is there any obvious > workaround that we are unaware of? > I have fixed this bug. It did not come up in nightly tests because we are not using MatSetValuesLocal(). Instead we use MatSetValuesClosure() which translates differently. Here is the branch https://bitbucket.org/petsc/petsc/branch/knepley/fix-dm-ltog-bs and I have merged it to next. It will go to master in a day or two. > Also should we additionally register the issue on Bitbucket or is > reporting the issue on the mailing list enough? > Normally we are faster, but the start of the semester was hard this year. Thanks, Matt > Kind regards, > Morten > > ------------------------------ > *From:* Matthew Knepley [knepley at gmail.com] > *Sent:* Friday, September 09, 2016 12:21 PM > *To:* Morten Nobel-J?rgensen > *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? > *Subject:* Re: [petsc-users] DMPlex problem > > On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen > wrote: > >> Dear PETSc developers and users, >> >> Last week we posted a question regarding an error with DMPlex and >> multiple dofs and have not gotten any feedback yet. This is uncharted >> waters for us, since we have gotten used to an extremely fast feedback from >> the PETSc crew. So - with the chance of sounding impatient and ungrateful - >> we would like to hear if anybody has any ideas that could point us in the >> right direction? >> > > This is my fault. You have not gotten a response because everyone else was > waiting for me, and I have been > slow because I just moved houses at the same time as term started here. > Sorry about that. > > The example ran for me and I saw your problem. The local-tp-global map is > missing for some reason. > I am tracking it down now. It should be made by DMCreateMatrix(), so this > is mysterious. I hope to have > this fixed by early next week. > > Thanks, > > Matt > > >> We have created a small example problem that demonstrates the error in >> the matrix assembly. >> >> Thanks, >> Morten >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Sep 23 07:46:48 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Sep 2016 07:46:48 -0500 Subject: [petsc-users] DMPlex problem In-Reply-To: References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> Message-ID: On Fri, Sep 23, 2016 at 7:45 AM, Matthew Knepley wrote: > On Fri, Sep 23, 2016 at 3:48 AM, Morten Nobel-J?rgensen > wrote: > >> Dear PETSc developers >> >> Any update on this issue regarding DMPlex? Or is there any obvious >> workaround that we are unaware of? >> > > I have fixed this bug. It did not come up in nightly tests because we are > not using MatSetValuesLocal(). Instead we > use MatSetValuesClosure() which translates differently. > > Here is the branch > > https://bitbucket.org/petsc/petsc/branch/knepley/fix-dm-ltog-bs > > and I have merged it to next. It will go to master in a day or two. > Also, here is the cleaned up source with no memory leaks. Matt > Also should we additionally register the issue on Bitbucket or is >> reporting the issue on the mailing list enough? >> > > Normally we are faster, but the start of the semester was hard this year. > > Thanks, > > Matt > > >> Kind regards, >> Morten >> >> ------------------------------ >> *From:* Matthew Knepley [knepley at gmail.com] >> *Sent:* Friday, September 09, 2016 12:21 PM >> *To:* Morten Nobel-J?rgensen >> *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? >> *Subject:* Re: [petsc-users] DMPlex problem >> >> On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen >> wrote: >> >>> Dear PETSc developers and users, >>> >>> Last week we posted a question regarding an error with DMPlex and >>> multiple dofs and have not gotten any feedback yet. This is uncharted >>> waters for us, since we have gotten used to an extremely fast feedback from >>> the PETSc crew. So - with the chance of sounding impatient and ungrateful - >>> we would like to hear if anybody has any ideas that could point us in the >>> right direction? >>> >> >> This is my fault. You have not gotten a response because everyone else >> was waiting for me, and I have been >> slow because I just moved houses at the same time as term started here. >> Sorry about that. >> >> The example ran for me and I saw your problem. The local-tp-global map is >> missing for some reason. >> I am tracking it down now. It should be made by DMCreateMatrix(), so this >> is mysterious. I hope to have >> this fixed by early next week. >> >> Thanks, >> >> Matt >> >> >>> We have created a small example problem that demonstrates the error in >>> the matrix assembly. >>> >>> Thanks, >>> Morten >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ex18.c Type: application/octet-stream Size: 4946 bytes Desc: not available URL: From ztdepyahoo at 163.com Fri Sep 23 09:38:12 2016 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Fri, 23 Sep 2016 22:38:12 +0800 (CST) Subject: [petsc-users] How to solve the pressure possion equations with four neuman bc Message-ID: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> Dear friends: In the projection method for the solution of incompressible flow, a pressure equation with four neuman bcs need to be solved. but the pressure matrix is singular. it gives divergence solution. how to solve this kind of equations. Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From ml2448 at cornell.edu Fri Sep 23 09:53:32 2016 From: ml2448 at cornell.edu (Melanie Li Sing How) Date: Fri, 23 Sep 2016 10:53:32 -0400 Subject: [petsc-users] Error during compiling my own code Message-ID: <21280B84-3D91-459E-9A03-AF04704F1235@cornell.edu> Hi, I happened to fall on your link (http://lists.mcs.anl.gov/pipermail/petsc-users/2010-April/006170.html ) to the same problem I am having. I am a yellowstone user and I could not find a contact detail for the same error I am having: /error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_TAG] INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR Could you please provide some advice on this? Thank you -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 23 09:59:05 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Sep 2016 09:59:05 -0500 Subject: [petsc-users] Error during compiling my own code In-Reply-To: <21280B84-3D91-459E-9A03-AF04704F1235@cornell.edu> References: <21280B84-3D91-459E-9A03-AF04704F1235@cornell.edu> Message-ID: On Fri, Sep 23, 2016 at 9:53 AM, Melanie Li Sing How wrote: > Hi, > I happened to fall on your link (http://lists.mcs.anl.gov/ > pipermail/petsc-users/2010-April/006170.html) to the same problem I am > having. I am a yellowstone user and I could not find a contact detail for > the same error I am having: > > /error #6401: The attributes of this name conflict with those made > accessible by a USE statement. [MPI_TAG] > INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR > > Could you please provide some advice on this? > As Satish says, show us what you are doing, or at least the whole error message. Matt > Thank you > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 23 10:00:07 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Sep 2016 10:00:07 -0500 Subject: [petsc-users] How to solve the pressure possion equations with four neuman bc In-Reply-To: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> References: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> Message-ID: On Fri, Sep 23, 2016 at 9:38 AM, ??? wrote: > Dear friends: > In the projection method for the solution of incompressible flow, a > pressure equation with four neuman bcs need to be solved. > but the pressure matrix is singular. it gives divergence solution. how to > solve this kind of equations. 
> Tell the matrix it has a nullspace of constants: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetNullSpace.html Thanks, Matt > Regards > > > > > > > > > > > > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ml2448 at cornell.edu Fri Sep 23 10:01:00 2016 From: ml2448 at cornell.edu (Melanie Li Sing How) Date: Fri, 23 Sep 2016 11:01:00 -0400 Subject: [petsc-users] Error during compiling my own code In-Reply-To: References: <21280B84-3D91-459E-9A03-AF04704F1235@cornell.edu> Message-ID: make[1]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src' make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/library' make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/liblibrary.a' is up to date. make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/library' make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/p3dfft' make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libp3dfft.a' is up to date. make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/p3dfft' make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/config' make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libconfig.a' is up to date. make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/config' make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/fluid' make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libfluid.a' is up to date. make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/fluid' make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/particles' mpif90 -O2 -ip -I/glade/u/home/lisingh/hippstr2.0/mod -c particles_global.f90 -o /glade/u/home/lisingh/hippstr2.0/obj/particles_global.o -module /glade/u/home/lisingh/hippstr2.0/mod /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_SOURCE] INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_TAG] INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR ---------------------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERROR] INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR ------------------------------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(11): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_STATUS_SIZE] INTEGER MPI_STATUS_SIZE ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(13): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_STATUS_IGNORE] INTEGER MPI_STATUS_IGNORE(MPI_STATUS_SIZE) ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(14): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_STATUSES_IGNORE] INTEGER MPI_STATUSES_IGNORE(MPI_STATUS_SIZE,1) ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(15): error #6401: The attributes of this name conflict with those made accessible by a USE statement. 
[MPI_ERRCODES_IGNORE] INTEGER MPI_ERRCODES_IGNORE(1) ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(16): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ARGVS_NULL] CHARACTER*1 MPI_ARGVS_NULL(1,1) -------------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(17): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ARGV_NULL] CHARACTER*1 MPI_ARGV_NULL(1) -------------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(18): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_SUCCESS] INTEGER MPI_SUCCESS ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(20): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_OTHER] INTEGER MPI_ERR_OTHER ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(22): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_COUNT] INTEGER MPI_ERR_COUNT ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(24): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_SPAWN] INTEGER MPI_ERR_SPAWN ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(26): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_LOCKTYPE] INTEGER MPI_ERR_LOCKTYPE ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(28): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_OP] INTEGER MPI_ERR_OP ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(30): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_DUP_DATAREP] INTEGER MPI_ERR_DUP_DATAREP ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(32): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_UNSUPPORTED_DATAREP] INTEGER MPI_ERR_UNSUPPORTED_DATAREP ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(34): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_TRUNCATE] INTEGER MPI_ERR_TRUNCATE ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(36): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_INFO_NOKEY] INTEGER MPI_ERR_INFO_NOKEY ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(38): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_ASSERT] INTEGER MPI_ERR_ASSERT ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(40): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_FILE_EXISTS] INTEGER MPI_ERR_FILE_EXISTS ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(42): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_PENDING] INTEGER MPI_ERR_PENDING ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(44): error #6401: The attributes of this name conflict with those made accessible by a USE statement. 
[MPI_ERR_COMM] INTEGER MPI_ERR_COMM ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(46): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_KEYVAL] INTEGER MPI_ERR_KEYVAL ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(48): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_NAME] INTEGER MPI_ERR_NAME ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(50): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_REQUEST] INTEGER MPI_ERR_REQUEST ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(52): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_TYPE] INTEGER MPI_ERR_TYPE ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(54): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_INFO_VALUE] INTEGER MPI_ERR_INFO_VALUE ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(56): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_RMA_SYNC] INTEGER MPI_ERR_RMA_SYNC ---------------^ /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(58): error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_ERR_NO_MEM] INTEGER MPI_ERR_NO_MEM ---------------^ particles_global.f90(655): catastrophic error: Too many errors, exiting compilation aborted for particles_global.f90 (code 1) make[2]: *** [particles_global.o] Error 1 make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/particles' make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/io' make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libio.a' is up to date. make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/io' make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/monitor' make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libmonitor.a' is up to date. make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/monitor' make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/core' make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libcore.a' is up to date. 
make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/core' make[1]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src' make[1]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src' mpif90 /glade/u/home/lisingh/hippstr2.0/obj/driver.o /glade/u/home/lisingh/hippstr2.0/lib/libcore.a /glade/u/home/lisingh/hippstr2.0/lib/libmonitor.a /glade/u/home/lisingh/hippstr2.0/lib/libio.a /glade/u/home/lisingh/hippstr2.0/lib/libparticles.a /glade/u/home/lisingh/hippstr2.0/lib/libfluid.a /glade/u/home/lisingh/hippstr2.0/lib/libconfig.a /glade/u/home/lisingh/hippstr2.0/lib/libp3dfft.a /glade/u/home/lisingh/hippstr2.0/lib/liblibrary.a -o /glade/u/home/lisingh/hippstr2.0/bin/hippstr ld: warning: libhdf5_hl.so.7, needed by /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf_c++4.so.1, may conflict with libhdf5_hl.so.8 ld: warning: libhdf5.so.7, needed by /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf_c++4.so.1, may conflict with libhdf5.so.8 ld: warning: libhdf5_fortran.so.7, needed by /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf.so.7, may conflict with libhdf5_fortran.so.8 ld: warning: libhdf5hl_fortran.so.7, needed by /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf.so.7, may conflict with libhdf5hl_fortran.so.8 ld: warning: libhdf5_cpp.so.7, needed by /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf.so.7, may conflict with libhdf5_cpp.so.8 ld: warning: libhdf5_hl_cpp.so.7, needed by /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf.so.7, may conflict with libhdf5_hl_cpp.so.8 /glade/u/home/lisingh/hippstr2.0/lib/libcore.a(simulation.o): In function `simulation_mp_simulation_run_': simulation.f90:(.text+0x139): undefined reference to `particles_neighborlist_' make[1]: *** [hippstr] Error 1 make[1]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src' make: *** [default] Error 2 Would it be easier to send you the two modules that I think are conflicting? Thank you so much > On Sep 23, 2016, at 10:59 AM, Matthew Knepley wrote: > > On Fri, Sep 23, 2016 at 9:53 AM, Melanie Li Sing How > wrote: > Hi, > I happened to fall on your link (http://lists.mcs.anl.gov/pipermail/petsc-users/2010-April/006170.html ) to the same problem I am having. I am a yellowstone user and I could not find a contact detail for the same error I am having: > > /error #6401: The attributes of this name conflict with those made accessible by a USE statement. [MPI_TAG] > INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR > > Could you please provide some advice on this? > > As Satish says, show us what you are doing, or at least the whole error message. > > Matt > > Thank you > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 23 10:06:00 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Sep 2016 10:06:00 -0500 Subject: [petsc-users] Error during compiling my own code In-Reply-To: References: <21280B84-3D91-459E-9A03-AF04704F1235@cornell.edu> Message-ID: On Fri, Sep 23, 2016 at 10:01 AM, Melanie Li Sing How wrote: > make[1]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src' > make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/library' > make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/liblibrary.a' is up to > date. 
> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/library' > make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/p3dfft' > make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libp3dfft.a' is up to date. > make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/p3dfft' > make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/config' > make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libconfig.a' is up to date. > make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/config' > make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/fluid' > make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libfluid.a' is up to date. > make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/fluid' > make[2]: Entering directory `/glade/u/home/lisingh/ > hippstr2.0/src/particles' > mpif90 -O2 -ip -I/glade/u/home/lisingh/hippstr2.0/mod -c > particles_global.f90 -o /glade/u/home/lisingh/hippstr2.0/obj/particles_global.o > -module /glade/u/home/lisingh/hippstr2.0/mod > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_SOURCE] > INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_TAG] > INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR > ---------------------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERROR] > INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR > ------------------------------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(11): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_STATUS_SIZE] > INTEGER MPI_STATUS_SIZE > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(13): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_STATUS_IGNORE] > INTEGER MPI_STATUS_IGNORE(MPI_STATUS_SIZE) > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(14): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_STATUSES_IGNORE] > INTEGER MPI_STATUSES_IGNORE(MPI_STATUS_SIZE,1) > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(15): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERRCODES_IGNORE] > INTEGER MPI_ERRCODES_IGNORE(1) > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(16): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ARGVS_NULL] > CHARACTER*1 MPI_ARGVS_NULL(1,1) > -------------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(17): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ARGV_NULL] > CHARACTER*1 MPI_ARGV_NULL(1) > -------------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(18): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_SUCCESS] > INTEGER MPI_SUCCESS > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(20): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. 
[MPI_ERR_OTHER] > INTEGER MPI_ERR_OTHER > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(22): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_COUNT] > INTEGER MPI_ERR_COUNT > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(24): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_SPAWN] > INTEGER MPI_ERR_SPAWN > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(26): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_LOCKTYPE] > INTEGER MPI_ERR_LOCKTYPE > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(28): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_OP] > INTEGER MPI_ERR_OP > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(30): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_DUP_DATAREP] > INTEGER MPI_ERR_DUP_DATAREP > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(32): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_UNSUPPORTED_DATAREP] > INTEGER MPI_ERR_UNSUPPORTED_DATAREP > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(34): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_TRUNCATE] > INTEGER MPI_ERR_TRUNCATE > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(36): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_INFO_NOKEY] > INTEGER MPI_ERR_INFO_NOKEY > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(38): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_ASSERT] > INTEGER MPI_ERR_ASSERT > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(40): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_FILE_EXISTS] > INTEGER MPI_ERR_FILE_EXISTS > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(42): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_PENDING] > INTEGER MPI_ERR_PENDING > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(44): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_COMM] > INTEGER MPI_ERR_COMM > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(46): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_KEYVAL] > INTEGER MPI_ERR_KEYVAL > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(48): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_NAME] > INTEGER MPI_ERR_NAME > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(50): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. 
[MPI_ERR_REQUEST] > INTEGER MPI_ERR_REQUEST > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(52): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_TYPE] > INTEGER MPI_ERR_TYPE > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(54): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_INFO_VALUE] > INTEGER MPI_ERR_INFO_VALUE > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(56): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_RMA_SYNC] > INTEGER MPI_ERR_RMA_SYNC > ---------------^ > /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(58): error #6401: The > attributes of this name conflict with those made accessible by a USE > statement. [MPI_ERR_NO_MEM] > INTEGER MPI_ERR_NO_MEM > ---------------^ > particles_global.f90(655): catastrophic error: Too many errors, exiting > compilation aborted for particles_global.f90 (code 1) > make[2]: *** [particles_global.o] Error 1 > make[2]: Leaving directory `/glade/u/home/lisingh/ > hippstr2.0/src/particles' > make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/io' > make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libio.a' is up to date. > make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/io' > make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/monitor' > make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libmonitor.a' is up to > date. > make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/monitor' > make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/core' > make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libcore.a' is up to date. 
> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/core' > make[1]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src' > make[1]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src' > mpif90 /glade/u/home/lisingh/hippstr2.0/obj/driver.o > /glade/u/home/lisingh/hippstr2.0/lib/libcore.a /glade/u/home/lisingh/hippstr2.0/lib/libmonitor.a > /glade/u/home/lisingh/hippstr2.0/lib/libio.a /glade/u/home/lisingh/hippstr2.0/lib/libparticles.a > /glade/u/home/lisingh/hippstr2.0/lib/libfluid.a /glade/u/home/lisingh/hippstr2.0/lib/libconfig.a > /glade/u/home/lisingh/hippstr2.0/lib/libp3dfft.a /glade/u/home/lisingh/hippstr2.0/lib/liblibrary.a > -o /glade/u/home/lisingh/hippstr2.0/bin/hippstr > ld: warning: libhdf5_hl.so.7, needed by /glade/apps/opt/netcdf/4.2/ > intel/default/lib/libnetcdf_c++4.so.1, may conflict with libhdf5_hl.so.8 > ld: warning: libhdf5.so.7, needed by /glade/apps/opt/netcdf/4.2/ > intel/default/lib/libnetcdf_c++4.so.1, may conflict with libhdf5.so.8 > ld: warning: libhdf5_fortran.so.7, needed by /glade/apps/opt/netcdf/4.2/ > intel/default/lib/libnetcdf.so.7, may conflict with libhdf5_fortran.so.8 > ld: warning: libhdf5hl_fortran.so.7, needed by /glade/apps/opt/netcdf/4.2/ > intel/default/lib/libnetcdf.so.7, may conflict with libhdf5hl_fortran.so.8 > ld: warning: libhdf5_cpp.so.7, needed by /glade/apps/opt/netcdf/4.2/ > intel/default/lib/libnetcdf.so.7, may conflict with libhdf5_cpp.so.8 > ld: warning: libhdf5_hl_cpp.so.7, needed by /glade/apps/opt/netcdf/4.2/ > intel/default/lib/libnetcdf.so.7, may conflict with libhdf5_hl_cpp.so.8 > /glade/u/home/lisingh/hippstr2.0/lib/libcore.a(simulation.o): In function > `simulation_mp_simulation_run_': > simulation.f90:(.text+0x139): undefined reference to > `particles_neighborlist_' > make[1]: *** [hippstr] Error 1 > make[1]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src' > make: *** [default] Error 2 > > Would it be easier to send you the two modules that I think are > conflicting? Thank you so much > No. This is not really a PETSc error. You have some module that does something like "use mpi", and another that probably does #include "mpif.h". We can't debug other people's code. Thanks, Matt > On Sep 23, 2016, at 10:59 AM, Matthew Knepley wrote: > > On Fri, Sep 23, 2016 at 9:53 AM, Melanie Li Sing How > wrote: > >> Hi, >> I happened to fall on your link (http://lists.mcs.anl.gov/pipe >> rmail/petsc-users/2010-April/006170.html) to the same problem I am >> having. I am a yellowstone user and I could not find a contact detail for >> the same error I am having: >> >> /error #6401: The attributes of this name conflict with those made >> accessible by a USE statement. [MPI_TAG] >> INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR >> >> Could you please provide some advice on this? >> > > As Satish says, show us what you are doing, or at least the whole error > message. > > Matt > > >> Thank you >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
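A minimal Fortran sketch of the clash Matt describes above, with invented module and routine names (nothing here is taken from the hippstr sources): one module obtains the MPI declarations through "use mpi", and another compilation unit uses that module. Adding include 'mpif.h' to the second unit would declare MPI_SOURCE, MPI_TAG, MPI_STATUS_SIZE and the rest a second time, which is exactly what error #6401 reports; picking one mechanism per compilation unit avoids it.

    ! Sketch only; module and routine names are invented for illustration.
    module comm_sketch
      use mpi              ! MPI_SOURCE, MPI_TAG, MPI_STATUS_SIZE, ... all come from the mpi module
      implicit none
    contains
      subroutine comm_sketch_rank(rank)
        integer, intent(out) :: rank
        integer :: ierr
        call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      end subroutine comm_sketch_rank
    end module comm_sketch

    subroutine uses_comm_sketch()
      use comm_sketch      ! brings in the mpi module transitively
      implicit none
      ! Do NOT also add:  include 'mpif.h'
      ! That include would re-declare MPI_SOURCE, MPI_TAG, MPI_STATUS_SIZE, etc.,
      ! and the compiler then emits error #6401 for each duplicated name.
      integer :: rank
      call comm_sketch_rank(rank)
    end subroutine uses_comm_sketch

Mixing the two mechanisms in a single unit is what produces the long cascade of duplicate-declaration messages quoted above; the fix is purely on the source side, nothing in the MPI installation needs to change.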
URL: From knepley at gmail.com Fri Sep 23 10:10:54 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Sep 2016 10:10:54 -0500 Subject: [petsc-users] Error during compiling my own code In-Reply-To: <473D0490-C775-448F-BB77-595020EB43F3@cornell.edu> References: <21280B84-3D91-459E-9A03-AF04704F1235@cornell.edu> <473D0490-C775-448F-BB77-595020EB43F3@cornell.edu> Message-ID: On Fri, Sep 23, 2016 at 10:08 AM, Melanie Li Sing How wrote: > Ok Thank you. I only have include ?mpif.h? in both modules that are > conflicting but thank you for your help. > Obviously from the message, you have a USE statement that eventually uses mpi. Matt > On Sep 23, 2016, at 11:06 AM, Matthew Knepley wrote: > > On Fri, Sep 23, 2016 at 10:01 AM, Melanie Li Sing How > wrote: > >> make[1]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src' >> make[2]: Entering directory `/glade/u/home/lisingh/hippstr >> 2.0/src/library' >> make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/liblibrary.a' is up to >> date. >> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/library' >> make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/p3dfft' >> make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libp3dfft.a' is up to >> date. >> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/p3dfft' >> make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/config' >> make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libconfig.a' is up to >> date. >> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/config' >> make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/fluid' >> make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libfluid.a' is up to date. >> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/fluid' >> make[2]: Entering directory `/glade/u/home/lisingh/hippstr >> 2.0/src/particles' >> mpif90 -O2 -ip -I/glade/u/home/lisingh/hippstr2.0/mod -c >> particles_global.f90 -o /glade/u/home/lisingh/hippstr2.0/obj/particles_global.o >> -module /glade/u/home/lisingh/hippstr2.0/mod >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The >> attributes of this name conflict with those made accessible by a USE >> statement. [MPI_SOURCE] >> INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The >> attributes of this name conflict with those made accessible by a USE >> statement. [MPI_TAG] >> INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR >> ---------------------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(9): error #6401: The >> attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERROR] >> INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR >> ------------------------------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(11): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_STATUS_SIZE] >> INTEGER MPI_STATUS_SIZE >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(13): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_STATUS_IGNORE] >> INTEGER MPI_STATUS_IGNORE(MPI_STATUS_SIZE) >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(14): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. 
[MPI_STATUSES_IGNORE] >> INTEGER MPI_STATUSES_IGNORE(MPI_STATUS_SIZE,1) >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(15): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERRCODES_IGNORE] >> INTEGER MPI_ERRCODES_IGNORE(1) >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(16): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ARGVS_NULL] >> CHARACTER*1 MPI_ARGVS_NULL(1,1) >> -------------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(17): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ARGV_NULL] >> CHARACTER*1 MPI_ARGV_NULL(1) >> -------------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(18): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_SUCCESS] >> INTEGER MPI_SUCCESS >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(20): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_OTHER] >> INTEGER MPI_ERR_OTHER >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(22): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_COUNT] >> INTEGER MPI_ERR_COUNT >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(24): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_SPAWN] >> INTEGER MPI_ERR_SPAWN >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(26): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_LOCKTYPE] >> INTEGER MPI_ERR_LOCKTYPE >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(28): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_OP] >> INTEGER MPI_ERR_OP >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(30): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_DUP_DATAREP] >> INTEGER MPI_ERR_DUP_DATAREP >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(32): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_UNSUPPORTED_DATAREP] >> INTEGER MPI_ERR_UNSUPPORTED_DATAREP >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(34): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_TRUNCATE] >> INTEGER MPI_ERR_TRUNCATE >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(36): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_INFO_NOKEY] >> INTEGER MPI_ERR_INFO_NOKEY >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(38): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_ASSERT] >> INTEGER MPI_ERR_ASSERT >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(40): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. 
[MPI_ERR_FILE_EXISTS] >> INTEGER MPI_ERR_FILE_EXISTS >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(42): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_PENDING] >> INTEGER MPI_ERR_PENDING >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(44): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_COMM] >> INTEGER MPI_ERR_COMM >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(46): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_KEYVAL] >> INTEGER MPI_ERR_KEYVAL >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(48): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_NAME] >> INTEGER MPI_ERR_NAME >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(50): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_REQUEST] >> INTEGER MPI_ERR_REQUEST >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(52): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_TYPE] >> INTEGER MPI_ERR_TYPE >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(54): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_INFO_VALUE] >> INTEGER MPI_ERR_INFO_VALUE >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(56): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_RMA_SYNC] >> INTEGER MPI_ERR_RMA_SYNC >> ---------------^ >> /opt/ibmhpc/pecurrent/mpich2/intel/include64/mpif.h(58): error #6401: >> The attributes of this name conflict with those made accessible by a USE >> statement. [MPI_ERR_NO_MEM] >> INTEGER MPI_ERR_NO_MEM >> ---------------^ >> particles_global.f90(655): catastrophic error: Too many errors, exiting >> compilation aborted for particles_global.f90 (code 1) >> make[2]: *** [particles_global.o] Error 1 >> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr >> 2.0/src/particles' >> make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/io' >> make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libio.a' is up to date. >> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/io' >> make[2]: Entering directory `/glade/u/home/lisingh/hippstr >> 2.0/src/monitor' >> make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libmonitor.a' is up to >> date. >> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/monitor' >> make[2]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src/core' >> make[2]: `/glade/u/home/lisingh/hippstr2.0/lib/libcore.a' is up to date. 
>> make[2]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src/core' >> make[1]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src' >> make[1]: Entering directory `/glade/u/home/lisingh/hippstr2.0/src' >> mpif90 /glade/u/home/lisingh/hippstr2.0/obj/driver.o >> /glade/u/home/lisingh/hippstr2.0/lib/libcore.a >> /glade/u/home/lisingh/hippstr2.0/lib/libmonitor.a >> /glade/u/home/lisingh/hippstr2.0/lib/libio.a >> /glade/u/home/lisingh/hippstr2.0/lib/libparticles.a >> /glade/u/home/lisingh/hippstr2.0/lib/libfluid.a >> /glade/u/home/lisingh/hippstr2.0/lib/libconfig.a >> /glade/u/home/lisingh/hippstr2.0/lib/libp3dfft.a >> /glade/u/home/lisingh/hippstr2.0/lib/liblibrary.a -o >> /glade/u/home/lisingh/hippstr2.0/bin/hippstr >> ld: warning: libhdf5_hl.so.7, needed by /glade/apps/opt/netcdf/4.2/int >> el/default/lib/libnetcdf_c++4.so.1, may conflict with libhdf5_hl.so.8 >> ld: warning: libhdf5.so.7, needed by /glade/apps/opt/netcdf/4.2/int >> el/default/lib/libnetcdf_c++4.so.1, may conflict with libhdf5.so.8 >> ld: warning: libhdf5_fortran.so.7, needed by >> /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf.so.7, may >> conflict with libhdf5_fortran.so.8 >> ld: warning: libhdf5hl_fortran.so.7, needed by >> /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf.so.7, may >> conflict with libhdf5hl_fortran.so.8 >> ld: warning: libhdf5_cpp.so.7, needed by /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf.so.7, >> may conflict with libhdf5_cpp.so.8 >> ld: warning: libhdf5_hl_cpp.so.7, needed by /glade/apps/opt/netcdf/4.2/intel/default/lib/libnetcdf.so.7, >> may conflict with libhdf5_hl_cpp.so.8 >> /glade/u/home/lisingh/hippstr2.0/lib/libcore.a(simulation.o): In >> function `simulation_mp_simulation_run_': >> simulation.f90:(.text+0x139): undefined reference to >> `particles_neighborlist_' >> make[1]: *** [hippstr] Error 1 >> make[1]: Leaving directory `/glade/u/home/lisingh/hippstr2.0/src' >> make: *** [default] Error 2 >> >> Would it be easier to send you the two modules that I think are >> conflicting? Thank you so much >> > > No. This is not really a PETSc error. You have some module that does > something like "use mpi", and another > that probably does #include "mpif.h". We can't debug other people's code. > > Thanks, > > Matt > > >> On Sep 23, 2016, at 10:59 AM, Matthew Knepley wrote: >> >> On Fri, Sep 23, 2016 at 9:53 AM, Melanie Li Sing How >> wrote: >> >>> Hi, >>> I happened to fall on your link (http://lists.mcs.anl.gov/pipe >>> rmail/petsc-users/2010-April/006170.html) to the same problem I am >>> having. I am a yellowstone user and I could not find a contact detail for >>> the same error I am having: >>> >>> /error #6401: The attributes of this name conflict with those made >>> accessible by a USE statement. [MPI_TAG] >>> INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR >>> >>> Could you please provide some advice on this? >>> >> >> As Satish says, show us what you are doing, or at least the whole error >> message. >> >> Matt >> >> >>> Thank you >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Fri Sep 23 10:37:25 2016 From: hzhang at mcs.anl.gov (Hong) Date: Fri, 23 Sep 2016 10:37:25 -0500 Subject: [petsc-users] Write binary to matrix In-Reply-To: <0ab6d96e-2056-5683-2d9f-6aac46fd86b8@xgm.de> References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> <2f78a83c-650c-b6b7-914b-8ad76cb785b0@xgm.de> <0ab6d96e-2056-5683-2d9f-6aac46fd86b8@xgm.de> Message-ID: Florian: I can reproduce this error. This is a bug in PETSc library. I'll fix it and get back to you soon. Hong > > Am 22.09.2016 um 18:34 schrieb Hong: > > Florian: > > Would it work if replacing MATSBAIJ to MATAIJ or MATMPISBAIJ? > > MATAIJ works, but is not an option for my actual application. > > MATMPISBAIJ does not work. Not very suprisingly, since afaik setting it to > MATSBAIJ and executing it on multiple MPI > ranks actually results in MATMPISBAIJ. > > Best, > Florian > > > > > > Hong > > > > Hey, > > > > this code reproduces the error when run with 2 or more ranks. > > > > #include > > #include > > > > int main(int argc, char *argv[]) > > { > > PetscInitialize(&argc, &argv, "", NULL); > > > > Mat matrix; > > MatCreate(PETSC_COMM_WORLD, &matrix); > > MatSetType(matrix, MATSBAIJ); > > MatSetSizes(matrix, 10, 10, PETSC_DETERMINE, PETSC_DETERMINE); > > MatSetFromOptions(matrix); > > MatSetUp(matrix); > > > > MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); > > MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); > > > > PetscViewer viewer; > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "test.mat", > FILE_MODE_WRITE, &viewer); > > MatView(matrix, viewer); > > PetscViewerDestroy(&viewer); > > MatDestroy(&matrix); > > > > PetscFinalize(); > > } > > > > > > The complete output is: > > > > > > lindnefn at neon /data/scratch/lindnefn/aste (git)-[master] % mpic++ > petsc.cpp -lpetsc && mpirun -n 2 ./a.out > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: No support for this operation for this object type > > [0]PETSC ERROR: Cannot get subcomm viewer for binary files or > sockets unless SubViewer contains the rank 0 process > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html > > for trouble > shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named neon by > lindnefn Thu Sep 22 16:10:34 2016 > > [0]PETSC ERROR: Configure options --with-debugging=1 > --download-petsc4py=yes --download-mpi4py=yes > > --download-superlu_dist --download-parmetis --download-metis > > [0]PETSC ERROR: #1 PetscViewerGetSubViewer_Binary() line 46 in > > /data/scratch/lindnefn/software/petsc/src/sys/ > classes/viewer/impls/binary/binv.c > > [0]PETSC ERROR: #2 PetscViewerGetSubViewer() line 43 in > > /data/scratch/lindnefn/software/petsc/src/sys/ > classes/viewer/interface/dupl.c > > [0]PETSC ERROR: #3 MatView_MPISBAIJ_ASCIIorDraworSocket() line 900 > in > > /data/scratch/lindnefn/software/petsc/src/mat/impls/ > sbaij/mpi/mpisbaij.c > > [0]PETSC ERROR: #4 MatView_MPISBAIJ() line 926 in > > /data/scratch/lindnefn/software/petsc/src/mat/impls/ > sbaij/mpi/mpisbaij.c > > [0]PETSC ERROR: #5 MatView() line 901 in /data/scratch/lindnefn/ > software/petsc/src/mat/interface/matrix.c > > WARNING! There are options you set that were not used! > > WARNING! could be spelling mistake, etc! > > Option left: name:-ksp_converged_reason (no value) > > Option left: name:-ksp_final_residual (no value) > > Option left: name:-ksp_view (no value) > > [neon:113111] *** Process received signal *** > > [neon:113111] Signal: Aborted (6) > > [neon:113111] Signal code: (-6) > > [neon:113111] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) > [0x7feed8958cb0] > > [neon:113111] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) > [0x7feed8958c37] > > [neon:113111] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) > [0x7feed895c028] > > [neon:113111] [ 3] > > /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(PetscTraceBackErrorHandler+0x563) > > [0x7feed8d8db31] > > [neon:113111] [ 4] /data/scratch/lindnefn/ > software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscError+0x374) > > [0x7feed8d88750] > > [neon:113111] [ 5] /data/scratch/lindnefn/ > software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x19b2f6) > > [0x7feed8e822f6] > > [neon:113111] [ 6] > > /data/scratch/lindnefn/software/petsc/arch-linux2-c- > debug/lib/libpetsc.so.3.7(PetscViewerGetSubViewer+0x4f1) > > [0x7feed8e803cb] > > [neon:113111] [ 7] /data/scratch/lindnefn/ > software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x860c95) > > [0x7feed9547c95] > > [neon:113111] [ 8] /data/scratch/lindnefn/ > software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x861494) > > [0x7feed9548494] > > [neon:113111] [ 9] /data/scratch/lindnefn/ > software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(MatView+0x12b6) > > [0x7feed971c08f] > > [neon:113111] [10] ./a.out() [0x400b8b] > > [neon:113111] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) > [0x7feed8943f45] > > [neon:113111] [12] ./a.out() [0x4009e9] > > [neon:113111] *** End of error message *** > > ------------------------------------------------------------ > -------------- > > mpirun noticed that process rank 1 with PID 113111 on node neon > exited on signal 6 (Aborted). > > ------------------------------------------------------------ > -------------- > > > > Thanks, > > Florian > > > > > > > > Am 22.09.2016 um 13:32 schrieb Matthew Knepley: > > > On Thu, Sep 22, 2016 at 5:42 AM, Florian Lindner < > mailinglists at xgm.de > > >> wrote: > > > > > > Hello, > > > > > > I want to write a MATSBAIJ to a file in binary, so that I can > load it later using MatLoad. 
> > > > > > However, I keep getting the error: > > > > > > [5]PETSC ERROR: No support for this operation for this object > type! > > > [5]PETSC ERROR: Cannot get subcomm viewer for binary files or > sockets unless SubViewer contains the rank 0 process > > > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in > > > /data/scratch/lindnefn/software/petsc/src/sys/ > classes/viewer/impls/binary/binv.c > > > > > > > > > Do not truncate the stack. > > > > > > Run under valgrind. > > > > > > Thanks, > > > > > > Matt > > > > > > > > > The rank 0 is included, as you can see below, I use > PETSC_COMM_WORLD and the matrix is also created like that. > > > > > > The code looks like: > > > > > > PetscErrorCode ierr = 0; > > > PetscViewer viewer; > > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), > FILE_MODE_WRITE, &viewer); CHKERRV(ierr); > > > MatView(matrix, viewer); CHKERRV(ierr); > > > PetscViewerDestroy(&viewer); > > > > > > Thanks, > > > Florian > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any > > > results to which their experiments lead. > > > -- Norbert Wiener > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Fri Sep 23 10:56:44 2016 From: hzhang at mcs.anl.gov (Hong) Date: Fri, 23 Sep 2016 10:56:44 -0500 Subject: [petsc-users] Write binary to matrix In-Reply-To: References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> <2f78a83c-650c-b6b7-914b-8ad76cb785b0@xgm.de> <0ab6d96e-2056-5683-2d9f-6aac46fd86b8@xgm.de> Message-ID: Florian: I pushed a fix in branch hzhang/fix_matview_mpisbaij (off petsc-maint) https://bitbucket.org/petsc/petsc/commits/d1654148bc9f02cde4d336bb9518a18cfb35148e After it is tested in our regression tests, it will be merged to petsc-maint and petsc-master. Thanks for reporting it! Hong On Fri, Sep 23, 2016 at 10:37 AM, Hong wrote: > Florian: > I can reproduce this error. > This is a bug in PETSc library. I'll fix it and get back to you soon. > Hong > > >> Am 22.09.2016 um 18:34 schrieb Hong: >> > Florian: >> > Would it work if replacing MATSBAIJ to MATAIJ or MATMPISBAIJ? >> >> MATAIJ works, but is not an option for my actual application. >> >> MATMPISBAIJ does not work. Not very suprisingly, since afaik setting it >> to MATSBAIJ and executing it on multiple MPI >> ranks actually results in MATMPISBAIJ. >> >> Best, >> Florian >> >> >> > >> > Hong >> > >> > Hey, >> > >> > this code reproduces the error when run with 2 or more ranks. 
>> > >> > #include >> > #include >> > >> > int main(int argc, char *argv[]) >> > { >> > PetscInitialize(&argc, &argv, "", NULL); >> > >> > Mat matrix; >> > MatCreate(PETSC_COMM_WORLD, &matrix); >> > MatSetType(matrix, MATSBAIJ); >> > MatSetSizes(matrix, 10, 10, PETSC_DETERMINE, PETSC_DETERMINE); >> > MatSetFromOptions(matrix); >> > MatSetUp(matrix); >> > >> > MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); >> > MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); >> > >> > PetscViewer viewer; >> > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "test.mat", >> FILE_MODE_WRITE, &viewer); >> > MatView(matrix, viewer); >> > PetscViewerDestroy(&viewer); >> > MatDestroy(&matrix); >> > >> > PetscFinalize(); >> > } >> > >> > >> > The complete output is: >> > >> > >> > lindnefn at neon /data/scratch/lindnefn/aste (git)-[master] % mpic++ >> petsc.cpp -lpetsc && mpirun -n 2 ./a.out >> > >> > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > [0]PETSC ERROR: No support for this operation for this object type >> > [0]PETSC ERROR: Cannot get subcomm viewer for binary files or >> sockets unless SubViewer contains the rank 0 process >> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/d >> ocumentation/faq.html >> > for trouble >> shooting. >> > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown >> > [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named neon by >> lindnefn Thu Sep 22 16:10:34 2016 >> > [0]PETSC ERROR: Configure options --with-debugging=1 >> --download-petsc4py=yes --download-mpi4py=yes >> > --download-superlu_dist --download-parmetis --download-metis >> > [0]PETSC ERROR: #1 PetscViewerGetSubViewer_Binary() line 46 in >> > /data/scratch/lindnefn/software/petsc/src/sys/classes/ >> viewer/impls/binary/binv.c >> > [0]PETSC ERROR: #2 PetscViewerGetSubViewer() line 43 in >> > /data/scratch/lindnefn/software/petsc/src/sys/classes/ >> viewer/interface/dupl.c >> > [0]PETSC ERROR: #3 MatView_MPISBAIJ_ASCIIorDraworSocket() line 900 >> in >> > /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/ >> mpi/mpisbaij.c >> > [0]PETSC ERROR: #4 MatView_MPISBAIJ() line 926 in >> > /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/ >> mpi/mpisbaij.c >> > [0]PETSC ERROR: #5 MatView() line 901 in >> /data/scratch/lindnefn/software/petsc/src/mat/interface/matrix.c >> > WARNING! There are options you set that were not used! >> > WARNING! could be spelling mistake, etc! 
>> > Option left: name:-ksp_converged_reason (no value) >> > Option left: name:-ksp_final_residual (no value) >> > Option left: name:-ksp_view (no value) >> > [neon:113111] *** Process received signal *** >> > [neon:113111] Signal: Aborted (6) >> > [neon:113111] Signal code: (-6) >> > [neon:113111] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) >> [0x7feed8958cb0] >> > [neon:113111] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) >> [0x7feed8958c37] >> > [neon:113111] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) >> [0x7feed895c028] >> > [neon:113111] [ 3] >> > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/ >> lib/libpetsc.so.3.7(PetscTraceBackErrorHandler+0x563) >> > [0x7feed8d8db31] >> > [neon:113111] [ 4] /data/scratch/lindnefn/softwar >> e/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscError+0x374) >> > [0x7feed8d88750] >> > [neon:113111] [ 5] /data/scratch/lindnefn/softwar >> e/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x19b2f6) >> > [0x7feed8e822f6] >> > [neon:113111] [ 6] >> > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/ >> lib/libpetsc.so.3.7(PetscViewerGetSubViewer+0x4f1) >> > [0x7feed8e803cb] >> > [neon:113111] [ 7] /data/scratch/lindnefn/softwar >> e/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x860c95) >> > [0x7feed9547c95] >> > [neon:113111] [ 8] /data/scratch/lindnefn/softwar >> e/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x861494) >> > [0x7feed9548494] >> > [neon:113111] [ 9] /data/scratch/lindnefn/softwar >> e/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(MatView+0x12b6) >> > [0x7feed971c08f] >> > [neon:113111] [10] ./a.out() [0x400b8b] >> > [neon:113111] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) >> [0x7feed8943f45] >> > [neon:113111] [12] ./a.out() [0x4009e9] >> > [neon:113111] *** End of error message *** >> > ----------------------------------------------------------- >> --------------- >> > mpirun noticed that process rank 1 with PID 113111 on node neon >> exited on signal 6 (Aborted). >> > ----------------------------------------------------------- >> --------------- >> > >> > Thanks, >> > Florian >> > >> > >> > >> > Am 22.09.2016 um 13:32 schrieb Matthew Knepley: >> > > On Thu, Sep 22, 2016 at 5:42 AM, Florian Lindner < >> mailinglists at xgm.de >> > >> wrote: >> > > >> > > Hello, >> > > >> > > I want to write a MATSBAIJ to a file in binary, so that I can >> load it later using MatLoad. >> > > >> > > However, I keep getting the error: >> > > >> > > [5]PETSC ERROR: No support for this operation for this object >> type! >> > > [5]PETSC ERROR: Cannot get subcomm viewer for binary files or >> sockets unless SubViewer contains the rank 0 process >> > > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() line 46 in >> > > /data/scratch/lindnefn/software/petsc/src/sys/classes/ >> viewer/impls/binary/binv.c >> > > >> > > >> > > Do not truncate the stack. >> > > >> > > Run under valgrind. >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > >> > > The rank 0 is included, as you can see below, I use >> PETSC_COMM_WORLD and the matrix is also created like that. 
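For reference, the intended write/read round trip reads roughly as follows in Fortran. This is only a sketch with placeholder names (the file laplacian.bin, matrices A and B), not code from this thread; per the exchange above the binary write already works with MATAIJ, and the MATSBAIJ path is what the fix branch addresses.

    subroutine dump_and_reload(A, B)
      implicit none
    #include <petsc/finclude/petscsys.h>
    #include <petsc/finclude/petscmat.h>
    #include <petsc/finclude/petscviewer.h>
      Mat A, B
      PetscViewer viewer
      PetscErrorCode ierr

      ! write the assembled matrix A to a binary file
      call PetscViewerBinaryOpen(PETSC_COMM_WORLD, 'laplacian.bin', FILE_MODE_WRITE, viewer, ierr)
      call MatView(A, viewer, ierr)
      call PetscViewerDestroy(viewer, ierr)

      ! read it back into B with MatLoad
      call MatCreate(PETSC_COMM_WORLD, B, ierr)
      call MatSetType(B, MATAIJ, ierr)   ! MATAIJ is reported above to work for this
      call PetscViewerBinaryOpen(PETSC_COMM_WORLD, 'laplacian.bin', FILE_MODE_READ, viewer, ierr)
      call MatLoad(B, viewer, ierr)
      call PetscViewerDestroy(viewer, ierr)
    end subroutine dump_and_reload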
>> > > >> > > The code looks like: >> > > >> > > PetscErrorCode ierr = 0; >> > > PetscViewer viewer; >> > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename.c_str(), >> FILE_MODE_WRITE, &viewer); CHKERRV(ierr); >> > > MatView(matrix, viewer); CHKERRV(ierr); >> > > PetscViewerDestroy(&viewer); >> > > >> > > Thanks, >> > > Florian >> > > >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any >> > > results to which their experiments lead. >> > > -- Norbert Wiener >> > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Sep 23 10:58:49 2016 From: jed at jedbrown.org (Jed Brown) Date: Fri, 23 Sep 2016 09:58:49 -0600 Subject: [petsc-users] How to solve the pressure possion equations with four neuman bc In-Reply-To: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> References: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> Message-ID: <87zimyzjye.fsf@jedbrown.org> ??? writes: > Dear friends: > In the projection method for the solution of incompressible flow, a pressure equation with four neuman bcs need to be solved. > but the pressure matrix is singular. it gives divergence solution. how to solve this kind of equations. See the users manual section on solving singular systems. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From fande.kong at inl.gov Fri Sep 23 11:01:08 2016 From: fande.kong at inl.gov (Kong, Fande) Date: Fri, 23 Sep 2016 10:01:08 -0600 Subject: [petsc-users] How to solve the pressure possion equations with four neuman bc In-Reply-To: <87zimyzjye.fsf@jedbrown.org> References: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> <87zimyzjye.fsf@jedbrown.org> Message-ID: Any references on this topic except the users manual? I am interested in mathematics theory on this topic. Fande Kong, On Fri, Sep 23, 2016 at 9:58 AM, Jed Brown wrote: > ??? writes: > > > Dear friends: > > In the projection method for the solution of incompressible flow, > a pressure equation with four neuman bcs need to be solved. > > but the pressure matrix is singular. it gives divergence solution. how > to solve this kind of equations. > > See the users manual section on solving singular systems. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Sep 23 11:25:09 2016 From: jed at jedbrown.org (Jed Brown) Date: Fri, 23 Sep 2016 10:25:09 -0600 Subject: [petsc-users] How to solve the pressure possion equations with four neuman bc In-Reply-To: References: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> <87zimyzjye.fsf@jedbrown.org> Message-ID: <87wpi2ziqi.fsf@jedbrown.org> "Kong, Fande" writes: > Any references on this topic except the users manual? I am interested in > mathematics theory on this topic. These are relevant. I've only skimmed them briefly, but they might suggest ways to improve PETSc's handling of singular systems. The technique is ancient. https://doi.org/10.1137/S0895479803437803 https://doi.org/10.1007/s10543-009-0247-7 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From mvalera at mail.sdsu.edu Fri Sep 23 12:47:21 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Fri, 23 Sep 2016 10:47:21 -0700 Subject: [petsc-users] Loading Laplacian as Module Message-ID: Hello all, I'm trying to load my laplacian matrix into a fortran module, and i have implemented it and it works for the first iteration of laplacian solver, but when starts the second step the laplacian matrix object becomes corrupts and looks like it loses one of it's dimensions. Can you help me understand whats happening? The modules are attached, the error i get is the following, i bolded the lines where i detected corruption: ucmsSeamount Entering MAIN loop. RHS loaded, size: 213120 / 213120 *CSRMAt loaded, sizes: 213120 x 213120* 8.39198399 s solveP pass: 1 !Iteration number RHS loaded, size: 213120 / 213120 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Wrong type of object: Parameter # 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c * CSRMAt loaded, sizes: 213120 x 0* [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Wrong type of object: Parameter # 2 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [0]PETSC ERROR: #2 KSPSetOperators() line 531 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Nonconforming object sizes [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal resulting vector number of rows 213120 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Mat object's type is not set: Argument # 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: LoadPetscMatrix.f90 Type: text/x-fortran Size: 4269 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: SolvePetscLinear.f90 Type: text/x-fortran Size: 6709 bytes Desc: not available URL: From bsmith at mcs.anl.gov Fri Sep 23 12:53:02 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 23 Sep 2016 12:53:02 -0500 Subject: [petsc-users] Loading Laplacian as Module In-Reply-To: References: Message-ID: <8214E1F2-FAEB-49AD-97FC-319AB37A2AC9@mcs.anl.gov> Run with valgrind to find the exact location of the first memory corruption. http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > On Sep 23, 2016, at 12:47 PM, Manuel Valera wrote: > > Hello all, > > I'm trying to load my laplacian matrix into a fortran module, and i have implemented it and it works for the first iteration of laplacian solver, but when starts the second step the laplacian matrix object becomes corrupts and looks like it loses one of it's dimensions. > > Can you help me understand whats happening? > > The modules are attached, the error i get is the following, i bolded the lines where i detected corruption: > > ucmsSeamount Entering MAIN loop. > RHS loaded, size: 213120 / 213120 > CSRMAt loaded, sizes: 213120 x 213120 > 8.39198399 s > solveP pass: 1 !Iteration number > RHS loaded, size: 213120 / 213120 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > CSRMAt loaded, sizes: 213120 x 0 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Nonconforming object sizes > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal resulting vector number of rows 213120 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > From mvalera at mail.sdsu.edu Fri Sep 23 13:09:53 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Fri, 23 Sep 2016 11:09:53 -0700 Subject: [petsc-users] Loading Laplacian as Module In-Reply-To: <8214E1F2-FAEB-49AD-97FC-319AB37A2AC9@mcs.anl.gov> References: <8214E1F2-FAEB-49AD-97FC-319AB37A2AC9@mcs.anl.gov> Message-ID: Thanks Barry, for the quick reply, I tried doing that once recently, not for this problem though, but it looks like the model i'm working on isn't optimized at all for memory leaks, and valgrind stopped with thousands of errors before reaching this part of the execution. Is there maybe an alternative approach ? or it would be better to just get the model in better shape already ? Thanks On Fri, Sep 23, 2016 at 10:53 AM, Barry Smith wrote: > > Run with valgrind to find the exact location of the first memory > corruption. 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > On Sep 23, 2016, at 12:47 PM, Manuel Valera > wrote: > > > > Hello all, > > > > I'm trying to load my laplacian matrix into a fortran module, and i have > implemented it and it works for the first iteration of laplacian solver, > but when starts the second step the laplacian matrix object becomes > corrupts and looks like it loses one of it's dimensions. > > > > Can you help me understand whats happening? > > > > The modules are attached, the error i get is the following, i bolded the > lines where i detected corruption: > > > > ucmsSeamount Entering MAIN loop. > > RHS loaded, size: 213120 / 213120 > > CSRMAt loaded, sizes: 213120 x 213120 > > 8.39198399 s > > solveP pass: 1 !Iteration number > > RHS loaded, size: 213120 / 213120 > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/ > petsc/src/mat/interface/matrix.c > > CSRMAt loaded, sizes: 213120 x 0 > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Nonconforming object sizes > > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal > resulting vector number of rows 213120 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? 
on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 23 13:15:06 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 23 Sep 2016 13:15:06 -0500 Subject: [petsc-users] Loading Laplacian as Module In-Reply-To: References: <8214E1F2-FAEB-49AD-97FC-319AB37A2AC9@mcs.anl.gov> Message-ID: > On Sep 23, 2016, at 1:09 PM, Manuel Valera wrote: > > Thanks Barry, for the quick reply, > > I tried doing that once recently, not for this problem though, but it looks like the model i'm working on isn't optimized at all for memory leaks, and valgrind stopped with thousands of errors before reaching this part of the execution. Some MPI implementations by default produce many meaningless valgrind messages. So make sure you ./configure PETSc with --download-mpich this version will not produce any meaningless valgrind messages about MPI. You are not concerned with "memory leaks" in this exercise, only with using uninitialized memory or overwriting memory you should not overwrite. So you want valgrind arguments like -q --tool=memcheck --num-callers=20 --track-origins=yes you do not need --leak-check=yes So run with valgrind and email use the output and we may have suggestions on the cause. Barry > > Is there maybe an alternative approach ? or it would be better to just get the model in better shape already ? > > Thanks > > On Fri, Sep 23, 2016 at 10:53 AM, Barry Smith wrote: > > Run with valgrind to find the exact location of the first memory corruption. 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > On Sep 23, 2016, at 12:47 PM, Manuel Valera wrote: > > > > Hello all, > > > > I'm trying to load my laplacian matrix into a fortran module, and i have implemented it and it works for the first iteration of laplacian solver, but when starts the second step the laplacian matrix object becomes corrupts and looks like it loses one of it's dimensions. > > > > Can you help me understand whats happening? > > > > The modules are attached, the error i get is the following, i bolded the lines where i detected corruption: > > > > ucmsSeamount Entering MAIN loop. > > RHS loaded, size: 213120 / 213120 > > CSRMAt loaded, sizes: 213120 x 213120 > > 8.39198399 s > > solveP pass: 1 !Iteration number > > RHS loaded, size: 213120 / 213120 > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > CSRMAt loaded, sizes: 213120 x 0 > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Nonconforming object sizes > > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal resulting vector number of rows 213120 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > > > From mvalera at mail.sdsu.edu Fri Sep 23 14:07:26 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Fri, 23 Sep 2016 12:07:26 -0700 Subject: [petsc-users] Loading Laplacian as Module In-Reply-To: References: <8214E1F2-FAEB-49AD-97FC-319AB37A2AC9@mcs.anl.gov> Message-ID: Barry, that was awesome, all the valgrind error dissappeared after using the mpiexec from petsc folder, the more you know... Anyway this is my output from valgrind running with those options: Last Update: 9/23/2016 12: 5:12 ucmsSeamount Entering MAIN loop. RHS loaded, size: 213120 / 213120 CSRMAt loaded, sizes: 213120 x 213120 8.32709217 s solveP pass: 1 RHS loaded, size: 213120 / 213120 CSRMAt loaded, sizes: 213120 x 0 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Wrong type of object: Parameter # 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Wrong type of object: Parameter # 2 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [0]PETSC ERROR: #2 KSPSetOperators() line 531 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Nonconforming object sizes [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal resulting vector number of rows 213120 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Mat object's type is not set: Argument # 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 [cli_0]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = PID 6490 RUNNING AT valera-HP-xw4600-Workstation = EXIT CODE: 73 = CLEANING UP REMAINING PROCESSES = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES =================================================================================== ==6488== ==6488== HEAP SUMMARY: ==6488== in use at exit: 131,120 bytes in 2 blocks ==6488== total heap usage: 1,224 allocs, 1,222 frees, 249,285 bytes allocated ==6488== ==6488== LEAK SUMMARY: ==6488== definitely lost: 0 bytes in 0 blocks ==6488== indirectly lost: 0 bytes in 0 blocks ==6488== possibly lost: 0 bytes in 0 blocks ==6488== still reachable: 131,120 bytes in 2 blocks ==6488== suppressed: 0 bytes in 0 blocks ==6488== Rerun with --leak-check=full to see details of leaked memory ==6488== ==6488== For counts of detected and suppressed errors, rerun with: -v ==6488== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) On Fri, Sep 23, 2016 at 11:15 AM, Barry Smith wrote: > > > On Sep 23, 2016, at 1:09 PM, Manuel Valera > wrote: > > > > Thanks Barry, for the quick reply, > > > > I tried doing that once recently, not for this problem though, but it > looks like the model i'm working on isn't optimized at all for memory > leaks, and valgrind stopped with thousands of errors before reaching this > part of the execution. > > Some MPI implementations by default produce many meaningless valgrind > messages. So make sure you ./configure PETSc with --download-mpich this > version will not produce any meaningless valgrind messages about MPI. > > You are not concerned with "memory leaks" in this exercise, only with > using uninitialized memory or overwriting memory you should not overwrite. 
> So you want valgrind arguments like -q --tool=memcheck --num-callers=20 > --track-origins=yes you do not need --leak-check=yes > > So run with valgrind and email use the output and we may have > suggestions on the cause. > > Barry > > > > > > > Is there maybe an alternative approach ? or it would be better to just > get the model in better shape already ? > > > > Thanks > > > > On Fri, Sep 23, 2016 at 10:53 AM, Barry Smith > wrote: > > > > Run with valgrind to find the exact location of the first memory > corruption. http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > On Sep 23, 2016, at 12:47 PM, Manuel Valera > wrote: > > > > > > Hello all, > > > > > > I'm trying to load my laplacian matrix into a fortran module, and i > have implemented it and it works for the first iteration of laplacian > solver, but when starts the second step the laplacian matrix object becomes > corrupts and looks like it loses one of it's dimensions. > > > > > > Can you help me understand whats happening? > > > > > > The modules are attached, the error i get is the following, i bolded > the lines where i detected corruption: > > > > > > ucmsSeamount Entering MAIN loop. > > > RHS loaded, size: 213120 / 213120 > > > CSRMAt loaded, sizes: 213120 x 213120 > > > 8.39198399 s > > > solveP pass: 1 !Iteration number > > > RHS loaded, size: 213120 / 213120 > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Invalid argument > > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > CSRMAt loaded, sizes: 213120 x 0 > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Invalid argument > > > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Nonconforming object sizes > > > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal > resulting vector number of rows 213120 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? 
on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Object is in wrong state > > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > > > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > [unset]: aborting job: > > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 23 14:18:24 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 23 Sep 2016 14:18:24 -0500 Subject: [petsc-users] Loading Laplacian as Module In-Reply-To: References: <8214E1F2-FAEB-49AD-97FC-319AB37A2AC9@mcs.anl.gov> Message-ID: <9B695F87-F73F-4BA6-A0E3-0DA144F22E69@mcs.anl.gov> Ok, so the problem is not memory corruption. > 0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c So it looks like the matrix has not been created yet in this call. You can run with -start_in_debugger noxterm and then type cont in the debugger and it should stop at this error so you can look at the mat object to see what its value is. Barry > On Sep 23, 2016, at 2:07 PM, Manuel Valera wrote: > > Barry, that was awesome, all the valgrind error dissappeared after using the mpiexec from petsc folder, the more you know... > > Anyway this is my output from valgrind running with those options: > > Last Update: 9/23/2016 12: 5:12 > ucmsSeamount Entering MAIN loop. > RHS loaded, size: 213120 / 213120 > CSRMAt loaded, sizes: 213120 x 213120 > 8.32709217 s > solveP pass: 1 > RHS loaded, size: 213120 / 213120 > CSRMAt loaded, sizes: 213120 x 0 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Nonconforming object sizes > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal resulting vector number of rows 213120 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [0]PETSC ERROR: ./ucmsSeamount ?J? 
on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > [cli_0]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = PID 6490 RUNNING AT valera-HP-xw4600-Workstation > = EXIT CODE: 73 > = CLEANING UP REMAINING PROCESSES > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > =================================================================================== > ==6488== > ==6488== HEAP SUMMARY: > ==6488== in use at exit: 131,120 bytes in 2 blocks > ==6488== total heap usage: 1,224 allocs, 1,222 frees, 249,285 bytes allocated > ==6488== > ==6488== LEAK SUMMARY: > ==6488== definitely lost: 0 bytes in 0 blocks > ==6488== indirectly lost: 0 bytes in 0 blocks > ==6488== possibly lost: 0 bytes in 0 blocks > ==6488== still reachable: 131,120 bytes in 2 blocks > ==6488== suppressed: 0 bytes in 0 blocks > ==6488== Rerun with --leak-check=full to see details of leaked memory > ==6488== > ==6488== For counts of detected and suppressed errors, rerun with: -v > ==6488== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) > > > On Fri, Sep 23, 2016 at 11:15 AM, Barry Smith wrote: > > > On Sep 23, 2016, at 1:09 PM, Manuel Valera wrote: > > > > Thanks Barry, for the quick reply, > > > > I tried doing that once recently, not for this problem though, but it looks like the model i'm working on isn't optimized at all for memory leaks, and valgrind stopped with thousands of errors before reaching this part of the execution. > > Some MPI implementations by default produce many meaningless valgrind messages. So make sure you ./configure PETSc with --download-mpich this version will not produce any meaningless valgrind messages about MPI. > > You are not concerned with "memory leaks" in this exercise, only with using uninitialized memory or overwriting memory you should not overwrite. So you want valgrind arguments like -q --tool=memcheck --num-callers=20 --track-origins=yes you do not need --leak-check=yes > > So run with valgrind and email use the output and we may have suggestions on the cause. > > Barry > > > > > > > Is there maybe an alternative approach ? or it would be better to just get the model in better shape already ? > > > > Thanks > > > > On Fri, Sep 23, 2016 at 10:53 AM, Barry Smith wrote: > > > > Run with valgrind to find the exact location of the first memory corruption. 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > On Sep 23, 2016, at 12:47 PM, Manuel Valera wrote: > > > > > > Hello all, > > > > > > I'm trying to load my laplacian matrix into a fortran module, and i have implemented it and it works for the first iteration of laplacian solver, but when starts the second step the laplacian matrix object becomes corrupts and looks like it loses one of it's dimensions. > > > > > > Can you help me understand whats happening? > > > > > > The modules are attached, the error i get is the following, i bolded the lines where i detected corruption: > > > > > > ucmsSeamount Entering MAIN loop. > > > RHS loaded, size: 213120 / 213120 > > > CSRMAt loaded, sizes: 213120 x 213120 > > > 8.39198399 s > > > solveP pass: 1 !Iteration number > > > RHS loaded, size: 213120 / 213120 > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Invalid argument > > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > CSRMAt loaded, sizes: 213120 x 0 > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Invalid argument > > > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Nonconforming object sizes > > > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal resulting vector number of rows 213120 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [0]PETSC ERROR: ./ucmsSeamount ?J? 
on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Object is in wrong state > > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > > > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > [unset]: aborting job: > > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > > > > > > > > > From mvalera at mail.sdsu.edu Fri Sep 23 14:31:00 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Fri, 23 Sep 2016 12:31:00 -0700 Subject: [petsc-users] Loading Laplacian as Module In-Reply-To: <9B695F87-F73F-4BA6-A0E3-0DA144F22E69@mcs.anl.gov> References: <8214E1F2-FAEB-49AD-97FC-319AB37A2AC9@mcs.anl.gov> <9B695F87-F73F-4BA6-A0E3-0DA144F22E69@mcs.anl.gov> Message-ID: Ok, i got this: RHS loaded, size: 213120 / 213120 CSRMAt loaded, sizes: 213120 x 213120 8.43036175 s solveP pass: 1 RHS loaded, size: 213120 / 213120 [0]PETSC ERROR: MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c Wrong type of object: Parameter # 1 Program received signal SIGABRT: Process abort signal. Backtrace for this error: #0 0x7F2A35AEA777 #1 0x7F2A35AEAD7E #2 0x7F2A34FC6CAF #3 0x7F2A34FC6C37 #4 0x7F2A34FCA027 #5 0x7F2A35F6F6AA #6 0x7F2A35F6A2EA #7 0x7F2A362E2FEF #8 0x7F2A36326681 #9 0x799AFF in solvepetsclinear_ at SolvePetscLinear.f90:137 (discriminator 2) #10 0x798F6A in solvep_rhs_ at SolveP_Rhs.f90:284 #11 0x80D028 in ucmsmain at ucmsMain.f90:472 .-.-.-.-.-.-.-.- What is weird for me is why it loads everything as it should for the first timestep of the problem and then it breaks on the second one, shouldnt the matrix be loaded at modules and shared with all subroutines? also, shouldnt the matrix be locked after assembly_final was used ? that matrix call is Ap which is inside LoadPetscMatrix module, and it looks like its changed after the first timestep. On Fri, Sep 23, 2016 at 12:18 PM, Barry Smith wrote: > > Ok, so the problem is not memory corruption. 
> > > 0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/ > petsc/src/mat/interface/matrix.c > > So it looks like the matrix has not been created yet in this call. You > can run with -start_in_debugger noxterm and then type cont in the debugger > and it should stop at this error so you can look at the mat object to see > what its value is. > > Barry > > > > > On Sep 23, 2016, at 2:07 PM, Manuel Valera > wrote: > > > > Barry, that was awesome, all the valgrind error dissappeared after using > the mpiexec from petsc folder, the more you know... > > > > Anyway this is my output from valgrind running with those options: > > > > Last Update: 9/23/2016 12: 5:12 > > ucmsSeamount Entering MAIN loop. > > RHS loaded, size: 213120 / 213120 > > CSRMAt loaded, sizes: 213120 x 213120 > > 8.32709217 s > > solveP pass: 1 > > RHS loaded, size: 213120 / 213120 > > CSRMAt loaded, sizes: 213120 x 0 > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 12:05:03 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/ > petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 12:05:03 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Nonconforming object sizes > > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal > resulting vector number of rows 213120 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? 
on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 12:05:03 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 12:05:03 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > [cli_0]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > ============================================================ > ======================= > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > > = PID 6490 RUNNING AT valera-HP-xw4600-Workstation > > = EXIT CODE: 73 > > = CLEANING UP REMAINING PROCESSES > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > > ============================================================ > ======================= > > ==6488== > > ==6488== HEAP SUMMARY: > > ==6488== in use at exit: 131,120 bytes in 2 blocks > > ==6488== total heap usage: 1,224 allocs, 1,222 frees, 249,285 bytes > allocated > > ==6488== > > ==6488== LEAK SUMMARY: > > ==6488== definitely lost: 0 bytes in 0 blocks > > ==6488== indirectly lost: 0 bytes in 0 blocks > > ==6488== possibly lost: 0 bytes in 0 blocks > > ==6488== still reachable: 131,120 bytes in 2 blocks > > ==6488== suppressed: 0 bytes in 0 blocks > > ==6488== Rerun with --leak-check=full to see details of leaked memory > > ==6488== > > ==6488== For counts of detected and suppressed errors, rerun with: -v > > ==6488== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) > > > > > > On Fri, Sep 23, 2016 at 11:15 AM, Barry Smith > wrote: > > > > > On Sep 23, 2016, at 1:09 PM, Manuel Valera > wrote: > > > > > > Thanks Barry, for the quick reply, > > > > > > I tried doing that once recently, not for this problem though, but it > looks like the model i'm working on isn't optimized at all for memory > leaks, and valgrind stopped with thousands of errors before reaching this > part of the execution. > > > > Some MPI implementations by default produce many meaningless valgrind > messages. So make sure you ./configure PETSc with --download-mpich this > version will not produce any meaningless valgrind messages about MPI. 
> > > > You are not concerned with "memory leaks" in this exercise, only with > using uninitialized memory or overwriting memory you should not overwrite. > So you want valgrind arguments like -q --tool=memcheck --num-callers=20 > --track-origins=yes you do not need --leak-check=yes > > > > So run with valgrind and email use the output and we may have > suggestions on the cause. > > > > Barry > > > > > > > > > > > > Is there maybe an alternative approach ? or it would be better to just > get the model in better shape already ? > > > > > > Thanks > > > > > > On Fri, Sep 23, 2016 at 10:53 AM, Barry Smith > wrote: > > > > > > Run with valgrind to find the exact location of the first memory > corruption. http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > > > On Sep 23, 2016, at 12:47 PM, Manuel Valera > wrote: > > > > > > > > Hello all, > > > > > > > > I'm trying to load my laplacian matrix into a fortran module, and i > have implemented it and it works for the first iteration of laplacian > solver, but when starts the second step the laplacian matrix object becomes > corrupts and looks like it loses one of it's dimensions. > > > > > > > > Can you help me understand whats happening? > > > > > > > > The modules are attached, the error i get is the following, i bolded > the lines where i detected corruption: > > > > > > > > ucmsSeamount Entering MAIN loop. > > > > RHS loaded, size: 213120 / 213120 > > > > CSRMAt loaded, sizes: 213120 x 213120 > > > > 8.39198399 s > > > > solveP pass: 1 !Iteration number > > > > RHS loaded, size: 213120 / 213120 > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Invalid argument > > > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > CSRMAt loaded, sizes: 213120 x 0 > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Invalid argument > > > > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? 
on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Nonconforming object sizes > > > > [0]PETSC ERROR: Preconditioner number of local rows -1 does not > equal resulting vector number of rows 213120 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/pc/interface/precon.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Object is in wrong state > > > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [0]PETSC ERROR: ./ucmsSeamount > > > ?J? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 > 10:27:21 2016 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > > > > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > > > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > [unset]: aborting job: > > > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 23 14:39:26 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 23 Sep 2016 14:39:26 -0500 Subject: [petsc-users] Loading Laplacian as Module In-Reply-To: References: <8214E1F2-FAEB-49AD-97FC-319AB37A2AC9@mcs.anl.gov> <9B695F87-F73F-4BA6-A0E3-0DA144F22E69@mcs.anl.gov> Message-ID: I don't know much about modules so can't help, but PETSc variables are just like any other variables and so should behave in the same way. 
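A minimal, hypothetical C sketch of that point (none of this is Manuel's code; the names Ap and BuildLaplacianOnce are invented for illustration): a Mat handle kept in a long-lived variable is created and assembled once and then reused on every timestep. If its type later shows up as unset, the handle was overwritten or the object destroyed somewhere in between; assembling the matrix does not by itself protect the variable that holds it.

    /* Hypothetical illustration only (not the code from this thread): the Mat
       handle Ap lives in a file-scope variable, is built and assembled once,
       and is reused unchanged for every "timestep". */
    #include <petscksp.h>

    static Mat Ap = NULL;   /* stands in for a matrix kept in a module/global */

    static PetscErrorCode BuildLaplacianOnce(PetscInt n)
    {
      PetscInt i, Istart, Iend;

      PetscFunctionBeginUser;
      if (Ap) PetscFunctionReturn(0);          /* already built: just reuse it */
      MatCreate(PETSC_COMM_WORLD, &Ap);
      MatSetSizes(Ap, PETSC_DECIDE, PETSC_DECIDE, n, n);
      MatSetFromOptions(Ap);
      MatSetUp(Ap);
      MatGetOwnershipRange(Ap, &Istart, &Iend);
      for (i = Istart; i < Iend; i++) {        /* toy 1-D Laplacian stencil */
        if (i > 0)     MatSetValue(Ap, i, i-1, -1.0, INSERT_VALUES);
        if (i < n - 1) MatSetValue(Ap, i, i+1, -1.0, INSERT_VALUES);
        MatSetValue(Ap, i, i, 2.0, INSERT_VALUES);
      }
      MatAssemblyBegin(Ap, MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(Ap, MAT_FINAL_ASSEMBLY);
      PetscFunctionReturn(0);
    }

    int main(int argc, char **argv)
    {
      Vec      x, b;
      KSP      ksp;
      PetscInt step;

      PetscInitialize(&argc, &argv, NULL, NULL);
      BuildLaplacianOnce(100);
      MatCreateVecs(Ap, &x, &b);
      VecSet(b, 1.0);
      KSPCreate(PETSC_COMM_WORLD, &ksp);
      KSPSetOperators(ksp, Ap, Ap);            /* same, still-assembled matrix */
      KSPSetFromOptions(ksp);
      for (step = 0; step < 3; step++) {       /* several "timesteps" */
        KSPSolve(ksp, b, x);                   /* Ap remains valid each time   */
      }
      KSPDestroy(&ksp);
      VecDestroy(&x);
      VecDestroy(&b);
      MatDestroy(&Ap);                         /* destroy only once, at the end */
      PetscFinalize();
      return 0;
    }

The same pattern carries over to a Fortran module variable holding the Mat: as long as nothing reassigns the variable or destroys the object, it stays usable across timesteps.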
Barry > On Sep 23, 2016, at 2:31 PM, Manuel Valera wrote: > > Ok, i got this: > > RHS loaded, size: 213120 / 213120 > CSRMAt loaded, sizes: 213120 x 213120 > 8.43036175 s > solveP pass: 1 > RHS loaded, size: 213120 / 213120 > [0]PETSC ERROR: MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c Wrong type of object: Parameter # 1 > > Program received signal SIGABRT: Process abort signal. > > Backtrace for this error: > #0 0x7F2A35AEA777 > #1 0x7F2A35AEAD7E > #2 0x7F2A34FC6CAF > #3 0x7F2A34FC6C37 > #4 0x7F2A34FCA027 > #5 0x7F2A35F6F6AA > #6 0x7F2A35F6A2EA > #7 0x7F2A362E2FEF > #8 0x7F2A36326681 > #9 0x799AFF in solvepetsclinear_ at SolvePetscLinear.f90:137 (discriminator 2) > #10 0x798F6A in solvep_rhs_ at SolveP_Rhs.f90:284 > #11 0x80D028 in ucmsmain at ucmsMain.f90:472 > > .-.-.-.-.-.-.-.- > > What is weird for me is why it loads everything as it should for the first timestep of the problem and then it breaks on the second one, shouldnt the matrix be loaded at modules and shared with all subroutines? also, shouldnt the matrix be locked after assembly_final was used ? that matrix call is Ap which is inside LoadPetscMatrix module, and it looks like its changed after the first timestep. > > > On Fri, Sep 23, 2016 at 12:18 PM, Barry Smith wrote: > > Ok, so the problem is not memory corruption. > > > 0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > So it looks like the matrix has not been created yet in this call. You can run with -start_in_debugger noxterm and then type cont in the debugger and it should stop at this error so you can look at the mat object to see what its value is. > > Barry > > > > > On Sep 23, 2016, at 2:07 PM, Manuel Valera wrote: > > > > Barry, that was awesome, all the valgrind error dissappeared after using the mpiexec from petsc folder, the more you know... > > > > Anyway this is my output from valgrind running with those options: > > > > Last Update: 9/23/2016 12: 5:12 > > ucmsSeamount Entering MAIN loop. > > RHS loaded, size: 213120 / 213120 > > CSRMAt loaded, sizes: 213120 x 213120 > > 8.32709217 s > > solveP pass: 1 > > RHS loaded, size: 213120 / 213120 > > CSRMAt loaded, sizes: 213120 x 0 > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Nonconforming object sizes > > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal resulting vector number of rows 213120 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 12:05:03 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > [cli_0]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > =================================================================================== > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > > = PID 6490 RUNNING AT valera-HP-xw4600-Workstation > > = EXIT CODE: 73 > > = CLEANING UP REMAINING PROCESSES > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > > =================================================================================== > > ==6488== > > ==6488== HEAP SUMMARY: > > ==6488== in use at exit: 131,120 bytes in 2 blocks > > ==6488== total heap usage: 1,224 allocs, 1,222 frees, 249,285 bytes allocated > > ==6488== > > ==6488== LEAK SUMMARY: > > ==6488== definitely lost: 0 bytes in 0 blocks > > ==6488== indirectly lost: 0 bytes in 0 blocks > > ==6488== possibly lost: 0 bytes in 0 blocks > > ==6488== still reachable: 131,120 bytes in 2 blocks > > ==6488== suppressed: 0 bytes in 0 blocks > > ==6488== Rerun with --leak-check=full to see details of leaked memory > > ==6488== > > ==6488== For counts 
of detected and suppressed errors, rerun with: -v > > ==6488== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) > > > > > > On Fri, Sep 23, 2016 at 11:15 AM, Barry Smith wrote: > > > > > On Sep 23, 2016, at 1:09 PM, Manuel Valera wrote: > > > > > > Thanks Barry, for the quick reply, > > > > > > I tried doing that once recently, not for this problem though, but it looks like the model i'm working on isn't optimized at all for memory leaks, and valgrind stopped with thousands of errors before reaching this part of the execution. > > > > Some MPI implementations by default produce many meaningless valgrind messages. So make sure you ./configure PETSc with --download-mpich this version will not produce any meaningless valgrind messages about MPI. > > > > You are not concerned with "memory leaks" in this exercise, only with using uninitialized memory or overwriting memory you should not overwrite. So you want valgrind arguments like -q --tool=memcheck --num-callers=20 --track-origins=yes you do not need --leak-check=yes > > > > So run with valgrind and email use the output and we may have suggestions on the cause. > > > > Barry > > > > > > > > > > > > Is there maybe an alternative approach ? or it would be better to just get the model in better shape already ? > > > > > > Thanks > > > > > > On Fri, Sep 23, 2016 at 10:53 AM, Barry Smith wrote: > > > > > > Run with valgrind to find the exact location of the first memory corruption. http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > > > On Sep 23, 2016, at 12:47 PM, Manuel Valera wrote: > > > > > > > > Hello all, > > > > > > > > I'm trying to load my laplacian matrix into a fortran module, and i have implemented it and it works for the first iteration of laplacian solver, but when starts the second step the laplacian matrix object becomes corrupts and looks like it loses one of it's dimensions. > > > > > > > > Can you help me understand whats happening? > > > > > > > > The modules are attached, the error i get is the following, i bolded the lines where i detected corruption: > > > > > > > > ucmsSeamount Entering MAIN loop. > > > > RHS loaded, size: 213120 / 213120 > > > > CSRMAt loaded, sizes: 213120 x 213120 > > > > 8.39198399 s > > > > solveP pass: 1 !Iteration number > > > > RHS loaded, size: 213120 / 213120 > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Invalid argument > > > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > > [0]PETSC ERROR: #1 MatGetSize() line 6295 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > CSRMAt loaded, sizes: 213120 x 0 > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Invalid argument > > > > [0]PETSC ERROR: Wrong type of object: Parameter # 2 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > > [0]PETSC ERROR: #2 KSPSetOperators() line 531 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcreate.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Nonconforming object sizes > > > > [0]PETSC ERROR: Preconditioner number of local rows -1 does not equal resulting vector number of rows 213120 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > > [0]PETSC ERROR: #3 PCApply() line 474 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Object is in wrong state > > > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [0]PETSC ERROR: ./ucmsSeamount ?J? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Fri Sep 23 10:27:21 2016 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > > [0]PETSC ERROR: #4 MatGetFactorAvailable() line 4286 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #5 PCGetDefaultType_Private() line 28 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > [0]PETSC ERROR: #6 PCSetFromOptions() line 159 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/pcset.c > > > > [0]PETSC ERROR: #7 KSPSetFromOptions() line 400 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itcl.c > > > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > [unset]: aborting job: > > > > application called MPI_Abort(MPI_COMM_WORLD, 73) - process 0 > > > > > > > > > > > > > > > > > > > > From mailinglists at xgm.de Sun Sep 25 04:07:51 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Sun, 25 Sep 2016 11:07:51 +0200 Subject: [petsc-users] Write binary to matrix In-Reply-To: References: <80fd8841-d427-2e48-b126-2ea6e00887ea@xgm.de> <2f78a83c-650c-b6b7-914b-8ad76cb785b0@xgm.de> <0ab6d96e-2056-5683-2d9f-6aac46fd86b8@xgm.de> Message-ID: <9ebc840d-de37-d14a-c462-abf93a4f042b@xgm.de> Great! Thanks to you! Am 23.09.2016 um 17:56 schrieb Hong: > Florian: > I pushed a fix in branch hzhang/fix_matview_mpisbaij (off petsc-maint) > https://bitbucket.org/petsc/petsc/commits/d1654148bc9f02cde4d336bb9518a18cfb35148e > > After it is tested in our regression tests, it will be merged to > petsc-maint and petsc-master. > Thanks for reporting it! 
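Since the original goal in this thread was to write the matrix so it can be read back with MatLoad, a minimal read-back sketch may be useful. It is hypothetical rather than code from the thread: the file name "test.mat" matches the reproducer quoted further down, and error checking is omitted.

    /* Hypothetical read-back sketch, not code from this thread: load a matrix
       that was previously written with MatView() on a binary viewer.
       "test.mat" is the file name used in the reproducer quoted below. */
    #include <petscmat.h>

    int main(int argc, char *argv[])
    {
      Mat         A;
      PetscViewer viewer;

      PetscInitialize(&argc, &argv, NULL, NULL);
      PetscViewerBinaryOpen(PETSC_COMM_WORLD, "test.mat", FILE_MODE_READ, &viewer);
      MatCreate(PETSC_COMM_WORLD, &A);
      MatSetType(A, MATSBAIJ);                 /* load into an SBAIJ matrix */
      MatLoad(A, viewer);
      PetscViewerDestroy(&viewer);
      MatView(A, PETSC_VIEWER_STDOUT_WORLD);   /* quick sanity check */
      MatDestroy(&A);
      PetscFinalize();
      return 0;
    }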
> > Hong > > On Fri, Sep 23, 2016 at 10:37 AM, Hong > wrote: > > Florian: > I can reproduce this error. > This is a bug in PETSc library. I'll fix it and get back to you soon. > Hong > > > Am 22.09.2016 um 18:34 schrieb Hong: > > Florian: > > Would it work if replacing MATSBAIJ to MATAIJ or MATMPISBAIJ? > > MATAIJ works, but is not an option for my actual application. > > MATMPISBAIJ does not work. Not very suprisingly, since afaik > setting it to MATSBAIJ and executing it on multiple MPI > ranks actually results in MATMPISBAIJ. > > Best, > Florian > > > > > > Hong > > > > Hey, > > > > this code reproduces the error when run with 2 or more ranks. > > > > #include > > #include > > > > int main(int argc, char *argv[]) > > { > > PetscInitialize(&argc, &argv, "", NULL); > > > > Mat matrix; > > MatCreate(PETSC_COMM_WORLD, &matrix); > > MatSetType(matrix, MATSBAIJ); > > MatSetSizes(matrix, 10, 10, PETSC_DETERMINE, > PETSC_DETERMINE); > > MatSetFromOptions(matrix); > > MatSetUp(matrix); > > > > MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); > > MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); > > > > PetscViewer viewer; > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "test.mat", > FILE_MODE_WRITE, &viewer); > > MatView(matrix, viewer); > > PetscViewerDestroy(&viewer); > > MatDestroy(&matrix); > > > > PetscFinalize(); > > } > > > > > > The complete output is: > > > > > > lindnefn at neon /data/scratch/lindnefn/aste (git)-[master] % > mpic++ petsc.cpp -lpetsc && mpirun -n 2 ./a.out > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: No support for this operation for this > object type > > [0]PETSC ERROR: Cannot get subcomm viewer for binary files > or sockets unless SubViewer contains the rank 0 process > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html > > > > for > trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named > neon by lindnefn Thu Sep 22 16:10:34 2016 > > [0]PETSC ERROR: Configure options --with-debugging=1 > --download-petsc4py=yes --download-mpi4py=yes > > --download-superlu_dist --download-parmetis --download-metis > > [0]PETSC ERROR: #1 PetscViewerGetSubViewer_Binary() line 46 in > > > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/binv.c > > [0]PETSC ERROR: #2 PetscViewerGetSubViewer() line 43 in > > > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/interface/dupl.c > > [0]PETSC ERROR: #3 MatView_MPISBAIJ_ASCIIorDraworSocket() > line 900 in > > > /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/mpi/mpisbaij.c > > [0]PETSC ERROR: #4 MatView_MPISBAIJ() line 926 in > > > /data/scratch/lindnefn/software/petsc/src/mat/impls/sbaij/mpi/mpisbaij.c > > [0]PETSC ERROR: #5 MatView() line 901 in > /data/scratch/lindnefn/software/petsc/src/mat/interface/matrix.c > > WARNING! There are options you set that were not used! > > WARNING! could be spelling mistake, etc! 
> > Option left: name:-ksp_converged_reason (no value) > > Option left: name:-ksp_final_residual (no value) > > Option left: name:-ksp_view (no value) > > [neon:113111] *** Process received signal *** > > [neon:113111] Signal: Aborted (6) > > [neon:113111] Signal code: (-6) > > [neon:113111] [ 0] > /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7feed8958cb0] > > [neon:113111] [ 1] > /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7feed8958c37] > > [neon:113111] [ 2] > /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7feed895c028] > > [neon:113111] [ 3] > > > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscTraceBackErrorHandler+0x563) > > [0x7feed8d8db31] > > [neon:113111] [ 4] > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscError+0x374) > > [0x7feed8d88750] > > [neon:113111] [ 5] > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x19b2f6) > > [0x7feed8e822f6] > > [neon:113111] [ 6] > > > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(PetscViewerGetSubViewer+0x4f1) > > [0x7feed8e803cb] > > [neon:113111] [ 7] > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x860c95) > > [0x7feed9547c95] > > [neon:113111] [ 8] > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(+0x861494) > > [0x7feed9548494] > > [neon:113111] [ 9] > /data/scratch/lindnefn/software/petsc/arch-linux2-c-debug/lib/libpetsc.so.3.7(MatView+0x12b6) > > [0x7feed971c08f] > > [neon:113111] [10] ./a.out() [0x400b8b] > > [neon:113111] [11] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) > [0x7feed8943f45] > > [neon:113111] [12] ./a.out() [0x4009e9] > > [neon:113111] *** End of error message *** > > > -------------------------------------------------------------------------- > > mpirun noticed that process rank 1 with PID 113111 on node > neon exited on signal 6 (Aborted). > > > -------------------------------------------------------------------------- > > > > Thanks, > > Florian > > > > > > > > Am 22.09.2016 um 13:32 schrieb Matthew Knepley: > > > On Thu, Sep 22, 2016 at 5:42 AM, Florian Lindner > > > > > > >>> wrote: > > > > > > Hello, > > > > > > I want to write a MATSBAIJ to a file in binary, so > that I can load it later using MatLoad. > > > > > > However, I keep getting the error: > > > > > > [5]PETSC ERROR: No support for this operation for > this object type! > > > [5]PETSC ERROR: Cannot get subcomm viewer for binary > files or sockets unless SubViewer contains the rank 0 process > > > [6]PETSC ERROR: PetscViewerGetSubViewer_Binary() > line 46 in > > > > /data/scratch/lindnefn/software/petsc/src/sys/classes/viewer/impls/binary/binv.c > > > > > > > > > Do not truncate the stack. > > > > > > Run under valgrind. > > > > > > Thanks, > > > > > > Matt > > > > > > > > > The rank 0 is included, as you can see below, I use > PETSC_COMM_WORLD and the matrix is also created like that. > > > > > > The code looks like: > > > > > > PetscErrorCode ierr = 0; > > > PetscViewer viewer; > > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, > filename.c_str(), FILE_MODE_WRITE, &viewer); CHKERRV(ierr); > > > MatView(matrix, viewer); CHKERRV(ierr); > > > PetscViewerDestroy(&viewer); > > > > > > Thanks, > > > Florian > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they > begin their experiments is infinitely more interesting than any > > > results to which their experiments lead. 
> > > -- Norbert Wiener > > > > > > > From ztdepyahoo at 163.com Sun Sep 25 04:06:59 2016 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Sun, 25 Sep 2016 17:06:59 +0800 (CST) Subject: [petsc-users] How to solve the pressure possion equations with four neuman bc In-Reply-To: <87wpi2ziqi.fsf@jedbrown.org> References: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> <87zimyzjye.fsf@jedbrown.org> <87wpi2ziqi.fsf@jedbrown.org> Message-ID: <57ab0a96.4495.1576098f34a.Coremail.ztdepyahoo@163.com> Dear professor: In the ksp example ex50.c, i do not understand the meaning of KSPSolve(ksp,NULL,NULL); nMatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,0,&nullspace); What is the meaning of NULL in the kspsolve. and two "0" in the MatNullSpaceCreate. Regards At 2016-09-24 00:25:09, "Jed Brown" wrote: >"Kong, Fande" writes: > >> Any references on this topic except the users manual? I am interested in >> mathematics theory on this topic. > >These are relevant. I've only skimmed them briefly, but they might >suggest ways to improve PETSc's handling of singular systems. The >technique is ancient. > >https://doi.org/10.1137/S0895479803437803 > >https://doi.org/10.1007/s10543-009-0247-7 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mono at dtu.dk Sun Sep 25 04:15:34 2016 From: mono at dtu.dk (=?utf-8?B?TW9ydGVuIE5vYmVsLUrDuHJnZW5zZW4=?=) Date: Sun, 25 Sep 2016 09:15:34 +0000 Subject: [petsc-users] DMPlex problem In-Reply-To: References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> , Message-ID: <6B03D347796DED499A2696FC095CE81A05B4BB2B@ait-pex02mbx04.win.dtu.dk> Hi Matthew Thank you for the bug-fix :) I can confirm that it works :) And thanks for your hard work on PETSc - your work is very much appreciated! Kind regards, Morten ________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Friday, September 23, 2016 2:46 PM To: Morten Nobel-J?rgensen Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem On Fri, Sep 23, 2016 at 7:45 AM, Matthew Knepley > wrote: On Fri, Sep 23, 2016 at 3:48 AM, Morten Nobel-J?rgensen > wrote: Dear PETSc developers Any update on this issue regarding DMPlex? Or is there any obvious workaround that we are unaware of? I have fixed this bug. It did not come up in nightly tests because we are not using MatSetValuesLocal(). Instead we use MatSetValuesClosure() which translates differently. Here is the branch https://bitbucket.org/petsc/petsc/branch/knepley/fix-dm-ltog-bs and I have merged it to next. It will go to master in a day or two. Also, here is the cleaned up source with no memory leaks. Matt Also should we additionally register the issue on Bitbucket or is reporting the issue on the mailing list enough? Normally we are faster, but the start of the semester was hard this year. Thanks, Matt Kind regards, Morten ________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Friday, September 09, 2016 12:21 PM To: Morten Nobel-J?rgensen Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen > wrote: Dear PETSc developers and users, Last week we posted a question regarding an error with DMPlex and multiple dofs and have not gotten any feedback yet. This is uncharted waters for us, since we have gotten used to an extremely fast feedback from the PETSc crew. 
So - with the chance of sounding impatient and ungrateful - we would like to hear if anybody has any ideas that could point us in the right direction? This is my fault. You have not gotten a response because everyone else was waiting for me, and I have been slow because I just moved houses at the same time as term started here. Sorry about that. The example ran for me and I saw your problem. The local-tp-global map is missing for some reason. I am tracking it down now. It should be made by DMCreateMatrix(), so this is mysterious. I hope to have this fixed by early next week. Thanks, Matt We have created a small example problem that demonstrates the error in the matrix assembly. Thanks, Morten -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Sep 25 08:35:08 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 25 Sep 2016 08:35:08 -0500 Subject: [petsc-users] How to solve the pressure possion equations with four neuman bc In-Reply-To: <57ab0a96.4495.1576098f34a.Coremail.ztdepyahoo@163.com> References: <78fe5014.ea37.157577b7a83.Coremail.ztdepyahoo@163.com> <87zimyzjye.fsf@jedbrown.org> <87wpi2ziqi.fsf@jedbrown.org> <57ab0a96.4495.1576098f34a.Coremail.ztdepyahoo@163.com> Message-ID: On Sun, Sep 25, 2016 at 4:06 AM, ??? wrote: > Dear professor: > In the ksp example ex50.c, i do not understand the meaning of > *KSPSolve* > > *(ksp,NULL,NULL);* > > * n**MatNullSpaceCreate* *(**PETSC_COMM_WORLD* *,**PETSC_TRUE* *,0,0,&nullspace); * > > What is the meaning of NULL in the kspsolve. and two "0" in the > MatNullSpaceCrea > > te. > 1) The NULL for KSPSolve() means that http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetComputeOperators.html http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetComputeRHS.html#KSPSetComputeRHS are used to make the problem. 2) The NullSpace is the space of constant functions with these arguments Matt Regards > > > > > > > At 2016-09-24 00:25:09, "Jed Brown" wrote: > >"Kong, Fande" writes: > > > >> Any references on this topic except the users manual? I am interested in > >> mathematics theory on this topic. > > > >These are relevant. I've only skimmed them briefly, but they might > >suggest ways to improve PETSc's handling of singular systems. The > >technique is ancient. > > > >https://doi.org/10.1137/S0895479803437803 > > > >https://doi.org/10.1007/s10543-009-0247-7 > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aurelien.ponte at ifremer.fr Sun Sep 25 23:47:03 2016 From: aurelien.ponte at ifremer.fr (Aurelien Ponte) Date: Sun, 25 Sep 2016 21:47:03 -0700 Subject: [petsc-users] petsc4py with complex numbers? 
Message-ID: <8b889009-6cfd-7d43-589b-31b0f3d91fde@ifremer.fr> Hi, I am trying to solve a linear problem whose operator has complex coefficients via petsc4py but keep running into the following error: File "/Users/aponte/Current_projects/people/kraig_marine/wd_response/solver/set_L.py", line 52, in set_L L.setValueStencil(row, col, value) File "PETSc/Mat.pyx", line 882, in petsc4py.PETSc.Mat.setValueStencil (src/petsc4py.PETSc.c:124785) File "PETSc/petscmat.pxi", line 1017, in petsc4py.PETSc.matsetvaluestencil (src/petsc4py.PETSc.c:31469) File "PETSc/arraynpy.pxi", line 140, in petsc4py.PETSc.iarray_s (src/petsc4py.PETSc.c:8811) File "PETSc/arraynpy.pxi", line 121, in petsc4py.PETSc.iarray (src/petsc4py.PETSc.c:8542) TypeError: can't convert complex to float The code looks like: for j in range(ys, ye): for i in range(xs, xe): row.index = (i, j, kx) row.field = 0 col.index = (i, j, kx) col.field=0 L.setValueStencil(row, col, 1j) Any idea about what am I doing wrong? cheers aurelien -- Aur?lien Ponte Tel: (+33) 2 98 22 40 73 Fax: (+33) 2 98 22 44 96 UMR 6523, IFREMER ZI de la Pointe du Diable CS 10070 29280 Plouzan? From dalcinl at gmail.com Mon Sep 26 01:00:25 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 26 Sep 2016 09:00:25 +0300 Subject: [petsc-users] petsc4py with complex numbers? In-Reply-To: <8b889009-6cfd-7d43-589b-31b0f3d91fde@ifremer.fr> References: <8b889009-6cfd-7d43-589b-31b0f3d91fde@ifremer.fr> Message-ID: Are you sure you built petsc4py with a PETSc build with complex scalars? Whats the output of "print(PETSc.ScalarType)" ? On 26 September 2016 at 07:47, Aurelien Ponte wrote: > Hi, > > I am trying to solve a linear problem whose operator has complex > coefficients via petsc4py > but keep running into the following error: > > File "/Users/aponte/Current_projects/people/kraig_marine/wd_response/solver/set_L.py", > line 52, in set_L > L.setValueStencil(row, col, value) > File "PETSc/Mat.pyx", line 882, in petsc4py.PETSc.Mat.setValueStencil > (src/petsc4py.PETSc.c:124785) > File "PETSc/petscmat.pxi", line 1017, in petsc4py.PETSc.matsetvaluestencil > (src/petsc4py.PETSc.c:31469) > File "PETSc/arraynpy.pxi", line 140, in petsc4py.PETSc.iarray_s > (src/petsc4py.PETSc.c:8811) > File "PETSc/arraynpy.pxi", line 121, in petsc4py.PETSc.iarray > (src/petsc4py.PETSc.c:8542) > TypeError: can't convert complex to float > > The code looks like: > > for j in range(ys, ye): > for i in range(xs, xe): > row.index = (i, j, kx) > row.field = 0 > col.index = (i, j, kx) col.field=0 > L.setValueStencil(row, col, 1j) > > Any idea about what am I doing wrong? > > cheers > > aurelien > > > > -- > Aur?lien Ponte > Tel: (+33) 2 98 22 40 73 > Fax: (+33) 2 98 22 44 96 > UMR 6523, IFREMER > ZI de la Pointe du Diable > CS 10070 > 29280 Plouzan? > > -- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 0109 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Giang.Bui at ruhr-uni-bochum.de Mon Sep 26 05:02:56 2016 From: Giang.Bui at ruhr-uni-bochum.de (Hoang-Giang Bui) Date: Mon, 26 Sep 2016 12:02:56 +0200 Subject: [petsc-users] Position for Junior Professorship in High Performance Computing, Ruhr University Bochum, Germany In-Reply-To: <00c801d21416$2dfc2120$89f46360$@rub.de> References: <00c801d21416$2dfc2120$89f46360$@rub.de> Message-ID: Dear colleagues There is the position opening for Junior Professorship in HPC at my department. Please take a look if you are interested in such the position or forward this information to your colleagues. Thank you very much. --- MSc. Hoang-Giang Bui Ruhr University Bochum Institute for Structural Mechanics IC6/163 Universitaetsstr. 150 44801 Bochum Germany Phone: +49 234 32 29056 e-mail: giang.bui at rub.de On 2016-09-21 16:41, Svenja Sch?tzner wrote: > Dear Sir or Madam, > > the Faculty of Civil and Environmental Engineering Sciences at Ruhr > University Bochum, Germany, in conjunction with the interdepartmental > Research Department "Subsurface Modeling and Engineering" and the > Collaborative Research Center SFB 837 invites applications for the > position of a JUNIOR PROFESSORSHIP (W1) > > ?HIGH PERFORMANCE COMPUTING IN THE ENGINEERING SCIENCES? > > (for more information see attachment). On behalf of Prof. G?nther > Meschke I kindly ask you to post this announcement in your institution > and to distribute it to interested candidates? > > Thank you cordially in advance for your help! > > Freundliche Gr??e / kind regards > > Svenja Sch?tzner > > -Sekretariat- > > RUHR-UNIVERSIT?T BOCHUM > > Fakult?t f?r Bau- und > > Umweltingenieurwissenschaften > > Lehrstuhl f?r Statik und Dynamik > > Universit?tsstr. 150 > > 44780 Bochum > > Tel.: +49 (0) 234 / 32 29051 > > Fax: +49 (0) 234 / 32 14149 > > E-Mail: svenja.schuetzner at rub.de > > Homepage: http://www.sd.rub.de [1] > > > > Links: > ------ > [1] http://www.sd.rub.de/ -------------- next part -------------- A non-text attachment was scrubbed... Name: Ausschreibung-Jun-Prof-HPC-engl.pdf Type: application/pdf Size: 64855 bytes Desc: not available URL: From mono at dtu.dk Mon Sep 26 07:00:10 2016 From: mono at dtu.dk (=?utf-8?B?TW9ydGVuIE5vYmVsLUrDuHJnZW5zZW4=?=) Date: Mon, 26 Sep 2016 12:00:10 +0000 Subject: [petsc-users] DMPlex problem In-Reply-To: <6B03D347796DED499A2696FC095CE81A05B4BB2B@ait-pex02mbx04.win.dtu.dk> References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> , , <6B03D347796DED499A2696FC095CE81A05B4BB2B@ait-pex02mbx04.win.dtu.dk> Message-ID: <6B03D347796DED499A2696FC095CE81A05B4C01C@ait-pex02mbx04.win.dtu.dk> Hi Matthew It seems like the problem is not fully fixed. I have changed the code to now run on with both 2,3 and 4 cells. When I run the code using NP = 1..3 I get different result both for NP=1 to NP=2/3 when cell count is larger than 2. 
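For reference, the advice that follows in this thread is to let the DM translate element indices instead of building the local-to-global mapping by hand, since the point numbering is permuted when the mesh is distributed. Below is a minimal sketch of that style of assembly, not taken from the attached ex18k.cc: it assumes the DM's default PetscSection is already set up with 3 dofs per vertex on hexahedral cells (so 24 closure dofs per cell, consistent with the Loc size values reported below), that K comes from DMCreateMatrix(), and it uses DMPlexMatSetClosure(), the plex-level routine the replies refer to as MatSetValuesClosure().

#include <petscdmplex.h>

/* Sketch: assemble a constant dummy element matrix into K through the DM,
   so the closure-to-global translation follows the distributed ordering. */
PetscErrorCode AssembleDummyStiffness(DM dm, Mat K)
{
  PetscScalar    elemMat[24*24]; /* 8 vertices x 3 dofs per vertex (assumption) */
  PetscInt       cStart, cEnd, c, i;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  for (i = 0; i < 24*24; ++i) elemMat[i] = 1.0; /* placeholder element values */
  ierr = DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);CHKERRQ(ierr); /* cells */
  for (c = cStart; c < cEnd; ++c) {
    /* NULL sections mean: use the DM's default local and global PetscSection */
    ierr = DMPlexMatSetClosure(dm, NULL, NULL, K, c, elemMat, ADD_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Assembled this way, the trace should not depend on the number of ranks, because the row indices come from the same global section on every process.
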
Kind regards, Morten ____ mpiexec -np 1 ./ex18k cells 2 Loc size: 36 Trace of matrix: 132.000000 cells 3 Loc size: 48 Trace of matrix: 192.000000 cells 4 Loc size: 60 Trace of matrix: 258.000000 mpiexec -np 2 ./ex18k cells 2 Loc size: 24 Loc size: 24 Trace of matrix: 132.000000 cells 3 Loc size: 36 Loc size: 24 Trace of matrix: 198.000000 cells 4 Loc size: 36 Loc size: 36 Trace of matrix: 264.000000 mpiexec -np 3 ./ex18k cells 2 Loc size: 24 Loc size: 24 Loc size: 0 Trace of matrix: 132.000000 cells 3 Loc size: 24 Loc size: 24 Loc size: 24 Trace of matrix: 198.000000 cells 4 Loc size: 36 Loc size: 24 Loc size: 24 Trace of matrix: 264.000000 ________________________________ From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] on behalf of Morten Nobel-J?rgensen [mono at dtu.dk] Sent: Sunday, September 25, 2016 11:15 AM To: Matthew Knepley Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem Hi Matthew Thank you for the bug-fix :) I can confirm that it works :) And thanks for your hard work on PETSc - your work is very much appreciated! Kind regards, Morten ________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Friday, September 23, 2016 2:46 PM To: Morten Nobel-J?rgensen Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem On Fri, Sep 23, 2016 at 7:45 AM, Matthew Knepley > wrote: On Fri, Sep 23, 2016 at 3:48 AM, Morten Nobel-J?rgensen > wrote: Dear PETSc developers Any update on this issue regarding DMPlex? Or is there any obvious workaround that we are unaware of? I have fixed this bug. It did not come up in nightly tests because we are not using MatSetValuesLocal(). Instead we use MatSetValuesClosure() which translates differently. Here is the branch https://bitbucket.org/petsc/petsc/branch/knepley/fix-dm-ltog-bs and I have merged it to next. It will go to master in a day or two. Also, here is the cleaned up source with no memory leaks. Matt Also should we additionally register the issue on Bitbucket or is reporting the issue on the mailing list enough? Normally we are faster, but the start of the semester was hard this year. Thanks, Matt Kind regards, Morten ________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Friday, September 09, 2016 12:21 PM To: Morten Nobel-J?rgensen Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen > wrote: Dear PETSc developers and users, Last week we posted a question regarding an error with DMPlex and multiple dofs and have not gotten any feedback yet. This is uncharted waters for us, since we have gotten used to an extremely fast feedback from the PETSc crew. So - with the chance of sounding impatient and ungrateful - we would like to hear if anybody has any ideas that could point us in the right direction? This is my fault. You have not gotten a response because everyone else was waiting for me, and I have been slow because I just moved houses at the same time as term started here. Sorry about that. The example ran for me and I saw your problem. The local-tp-global map is missing for some reason. I am tracking it down now. It should be made by DMCreateMatrix(), so this is mysterious. I hope to have this fixed by early next week. Thanks, Matt We have created a small example problem that demonstrates the error in the matrix assembly. 
Thanks, Morten -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex18k.cc Type: text/x-c++src Size: 4782 bytes Desc: ex18k.cc URL: From knepley at gmail.com Mon Sep 26 07:19:07 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 26 Sep 2016 07:19:07 -0500 Subject: [petsc-users] DMPlex problem In-Reply-To: <6B03D347796DED499A2696FC095CE81A05B4C01C@ait-pex02mbx04.win.dtu.dk> References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4BB2B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4C01C@ait-pex02mbx04.win.dtu.dk> Message-ID: On Mon, Sep 26, 2016 at 7:00 AM, Morten Nobel-J?rgensen wrote: > Hi Matthew > > It seems like the problem is not fully fixed. I have changed the code to > now run on with both 2,3 and 4 cells. When I run the code using NP = 1..3 I > get different result both for NP=1 to NP=2/3 when cell count is larger than > 2. > Do you mean the trace? I have no idea what you are actually putting in. I have a lot of debugging when you use, MatSetValuesClosure(), but when you directly use MatSetValuesLocal(), you are handling things yourself. Thanks, Matt > Kind regards, > Morten > ____ > mpiexec -np 1 ./ex18k > cells 2 > Loc size: 36 > Trace of matrix: 132.000000 > cells 3 > Loc size: 48 > Trace of matrix: 192.000000 > cells 4 > Loc size: 60 > Trace of matrix: 258.000000 > mpiexec -np 2 ./ex18k > cells 2 > Loc size: 24 > Loc size: 24 > Trace of matrix: 132.000000 > cells 3 > Loc size: 36 > Loc size: 24 > Trace of matrix: 198.000000 > cells 4 > Loc size: 36 > Loc size: 36 > Trace of matrix: 264.000000 > mpiexec -np 3 ./ex18k > cells 2 > Loc size: 24 > Loc size: 24 > Loc size: 0 > Trace of matrix: 132.000000 > cells 3 > Loc size: 24 > Loc size: 24 > Loc size: 24 > Trace of matrix: 198.000000 > cells 4 > Loc size: 36 > Loc size: 24 > Loc size: 24 > Trace of matrix: 264.000000 > > > > > ------------------------------ > *From:* petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] > on behalf of Morten Nobel-J?rgensen [mono at dtu.dk] > *Sent:* Sunday, September 25, 2016 11:15 AM > *To:* Matthew Knepley > *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? > *Subject:* Re: [petsc-users] DMPlex problem > > Hi Matthew > > Thank you for the bug-fix :) I can confirm that it works :) > > And thanks for your hard work on PETSc - your work is very much > appreciated! > > Kind regards, > Morten > ------------------------------ > *From:* Matthew Knepley [knepley at gmail.com] > *Sent:* Friday, September 23, 2016 2:46 PM > *To:* Morten Nobel-J?rgensen > *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? 
> *Subject:* Re: [petsc-users] DMPlex problem > > On Fri, Sep 23, 2016 at 7:45 AM, Matthew Knepley > wrote: > >> On Fri, Sep 23, 2016 at 3:48 AM, Morten Nobel-J?rgensen >> wrote: >> >>> Dear PETSc developers >>> >>> Any update on this issue regarding DMPlex? Or is there any obvious >>> workaround that we are unaware of? >>> >> >> I have fixed this bug. It did not come up in nightly tests because we are >> not using MatSetValuesLocal(). Instead we >> use MatSetValuesClosure() which translates differently. >> >> Here is the branch >> >> https://bitbucket.org/petsc/petsc/branch/knepley/fix-dm-ltog-bs >> >> and I have merged it to next. It will go to master in a day or two. >> > > Also, here is the cleaned up source with no memory leaks. > > Matt > > >> Also should we additionally register the issue on Bitbucket or is >>> reporting the issue on the mailing list enough? >>> >> >> Normally we are faster, but the start of the semester was hard this year. >> >> Thanks, >> >> Matt >> >> >>> Kind regards, >>> Morten >>> >>> ------------------------------ >>> *From:* Matthew Knepley [knepley at gmail.com] >>> *Sent:* Friday, September 09, 2016 12:21 PM >>> *To:* Morten Nobel-J?rgensen >>> *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? >>> *Subject:* Re: [petsc-users] DMPlex problem >>> >>> On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen >>> wrote: >>> >>>> Dear PETSc developers and users, >>>> >>>> Last week we posted a question regarding an error with DMPlex and >>>> multiple dofs and have not gotten any feedback yet. This is uncharted >>>> waters for us, since we have gotten used to an extremely fast feedback from >>>> the PETSc crew. So - with the chance of sounding impatient and ungrateful - >>>> we would like to hear if anybody has any ideas that could point us in the >>>> right direction? >>>> >>> >>> This is my fault. You have not gotten a response because everyone else >>> was waiting for me, and I have been >>> slow because I just moved houses at the same time as term started here. >>> Sorry about that. >>> >>> The example ran for me and I saw your problem. The local-tp-global map >>> is missing for some reason. >>> I am tracking it down now. It should be made by DMCreateMatrix(), so >>> this is mysterious. I hope to have >>> this fixed by early next week. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> We have created a small example problem that demonstrates the error in >>>> the matrix assembly. >>>> >>>> Thanks, >>>> Morten >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mono at dtu.dk Mon Sep 26 08:41:26 2016 From: mono at dtu.dk (=?utf-8?B?TW9ydGVuIE5vYmVsLUrDuHJnZW5zZW4=?=) Date: Mon, 26 Sep 2016 13:41:26 +0000 Subject: [petsc-users] DMPlex problem In-Reply-To: References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4BB2B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4C01C@ait-pex02mbx04.win.dtu.dk>, Message-ID: <6B03D347796DED499A2696FC095CE81A05B4D0AB@ait-pex02mbx04.win.dtu.dk> Hi Matt We are trying to do a simple FE using DMPlex, but when assemble the global stiffness matrix we get in problems when running NP>1 - that is the global matrix differs when we move to a distributed system, where it should not. In pseudo-code our CreateGlobalStiffnessMatrix does the following create a local stiffness matrix ke with some values for each local cell/element e (using result from DMPlexGetHeightStratum(..,0,..) for each of its vertices (using DMPlexGetTransitiveClosure(..,e,..) set local/global mapping to edof update the global stiffness matrix K using the local mapping edof and values ke The code we have sent is a simplified version, which just builds a dummy stiffness matrix - but we believe this matrix should still the same independent of NP. (That is why we use trace). I'm not familiar with MatSetValuesClosure(). Is that the missing piece? Kind regards, Morten ________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Monday, September 26, 2016 2:19 PM To: Morten Nobel-J?rgensen Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem On Mon, Sep 26, 2016 at 7:00 AM, Morten Nobel-J?rgensen > wrote: Hi Matthew It seems like the problem is not fully fixed. I have changed the code to now run on with both 2,3 and 4 cells. When I run the code using NP = 1..3 I get different result both for NP=1 to NP=2/3 when cell count is larger than 2. Do you mean the trace? I have no idea what you are actually putting in. I have a lot of debugging when you use, MatSetValuesClosure(), but when you directly use MatSetValuesLocal(), you are handling things yourself. Thanks, Matt Kind regards, Morten ____ mpiexec -np 1 ./ex18k cells 2 Loc size: 36 Trace of matrix: 132.000000 cells 3 Loc size: 48 Trace of matrix: 192.000000 cells 4 Loc size: 60 Trace of matrix: 258.000000 mpiexec -np 2 ./ex18k cells 2 Loc size: 24 Loc size: 24 Trace of matrix: 132.000000 cells 3 Loc size: 36 Loc size: 24 Trace of matrix: 198.000000 cells 4 Loc size: 36 Loc size: 36 Trace of matrix: 264.000000 mpiexec -np 3 ./ex18k cells 2 Loc size: 24 Loc size: 24 Loc size: 0 Trace of matrix: 132.000000 cells 3 Loc size: 24 Loc size: 24 Loc size: 24 Trace of matrix: 198.000000 cells 4 Loc size: 36 Loc size: 24 Loc size: 24 Trace of matrix: 264.000000 ________________________________ From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] on behalf of Morten Nobel-J?rgensen [mono at dtu.dk] Sent: Sunday, September 25, 2016 11:15 AM To: Matthew Knepley Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem Hi Matthew Thank you for the bug-fix :) I can confirm that it works :) And thanks for your hard work on PETSc - your work is very much appreciated! Kind regards, Morten ________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Friday, September 23, 2016 2:46 PM To: Morten Nobel-J?rgensen Cc: PETSc ?[petsc-users at mcs.anl.gov]? 
Subject: Re: [petsc-users] DMPlex problem On Fri, Sep 23, 2016 at 7:45 AM, Matthew Knepley > wrote: On Fri, Sep 23, 2016 at 3:48 AM, Morten Nobel-J?rgensen > wrote: Dear PETSc developers Any update on this issue regarding DMPlex? Or is there any obvious workaround that we are unaware of? I have fixed this bug. It did not come up in nightly tests because we are not using MatSetValuesLocal(). Instead we use MatSetValuesClosure() which translates differently. Here is the branch https://bitbucket.org/petsc/petsc/branch/knepley/fix-dm-ltog-bs and I have merged it to next. It will go to master in a day or two. Also, here is the cleaned up source with no memory leaks. Matt Also should we additionally register the issue on Bitbucket or is reporting the issue on the mailing list enough? Normally we are faster, but the start of the semester was hard this year. Thanks, Matt Kind regards, Morten ________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Friday, September 09, 2016 12:21 PM To: Morten Nobel-J?rgensen Cc: PETSc ?[petsc-users at mcs.anl.gov]? Subject: Re: [petsc-users] DMPlex problem On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen > wrote: Dear PETSc developers and users, Last week we posted a question regarding an error with DMPlex and multiple dofs and have not gotten any feedback yet. This is uncharted waters for us, since we have gotten used to an extremely fast feedback from the PETSc crew. So - with the chance of sounding impatient and ungrateful - we would like to hear if anybody has any ideas that could point us in the right direction? This is my fault. You have not gotten a response because everyone else was waiting for me, and I have been slow because I just moved houses at the same time as term started here. Sorry about that. The example ran for me and I saw your problem. The local-tp-global map is missing for some reason. I am tracking it down now. It should be made by DMCreateMatrix(), so this is mysterious. I hope to have this fixed by early next week. Thanks, Matt We have created a small example problem that demonstrates the error in the matrix assembly. Thanks, Morten -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Sep 26 09:04:56 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 26 Sep 2016 09:04:56 -0500 Subject: [petsc-users] DMPlex problem In-Reply-To: <6B03D347796DED499A2696FC095CE81A05B4D0AB@ait-pex02mbx04.win.dtu.dk> References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4BB2B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4C01C@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4D0AB@ait-pex02mbx04.win.dtu.dk> Message-ID: On Mon, Sep 26, 2016 at 8:41 AM, Morten Nobel-J?rgensen wrote: > Hi Matt > > We are trying to do a simple FE using DMPlex, but when assemble the global > stiffness matrix we get in problems when running NP>1 - that is the global > matrix differs when we move to a distributed system, where it should not. > > In pseudo-code our CreateGlobalStiffnessMatrix does the following > > create a local stiffness matrix ke with some values > for each local cell/element e (using result from > DMPlexGetHeightStratum(..,0,..) > for each of its vertices (using DMPlexGetTransitiveClosure(..,e,..) > set local/global mapping to edof > update the global stiffness matrix K using the local mapping edof and > values ke > > > The code we have sent is a simplified version, which just builds a dummy > stiffness matrix - but we believe this matrix should still the same > independent of NP. (That is why we use trace). > I am not sure what is wrong there, but there are a bunch of Plex FEM examples, like SNES ex12, ex62, ex77. > I'm not familiar with MatSetValuesClosure(). Is that the missing piece? > I use this to do indexing since it is so error prone to do it by yourself. I think this could be the problem with your example. When you distribute the mesh, ordering changes, and we do permutations to keep all local indices contiguous. The MatSetValuesClosure() is a convenience function for FEM which takes in a mesh point (could be cell, face, etc.) and translates that point number to a set of row indices using a) the transitive closure of that point in the mesh DAG and b) the default PetscSection for the DM which maps mesh points to sets of dofs (row indices) For it to work, you need to setup the default DM section, which it looks like you have done. Thanks, Matt > Kind regards, > Morten > > > ------------------------------ > *From:* Matthew Knepley [knepley at gmail.com] > *Sent:* Monday, September 26, 2016 2:19 PM > *To:* Morten Nobel-J?rgensen > *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? > *Subject:* Re: [petsc-users] DMPlex problem > > On Mon, Sep 26, 2016 at 7:00 AM, Morten Nobel-J?rgensen > wrote: > >> Hi Matthew >> >> It seems like the problem is not fully fixed. I have changed the code to >> now run on with both 2,3 and 4 cells. When I run the code using NP = 1..3 I >> get different result both for NP=1 to NP=2/3 when cell count is larger than >> 2. >> > > Do you mean the trace? I have no idea what you are actually putting in. > > I have a lot of debugging when you use, MatSetValuesClosure(), but when > you directly use > MatSetValuesLocal(), you are handling things yourself. 
> > Thanks, > > Matt > > >> Kind regards, >> Morten >> ____ >> mpiexec -np 1 ./ex18k >> cells 2 >> Loc size: 36 >> Trace of matrix: 132.000000 >> cells 3 >> Loc size: 48 >> Trace of matrix: 192.000000 >> cells 4 >> Loc size: 60 >> Trace of matrix: 258.000000 >> mpiexec -np 2 ./ex18k >> cells 2 >> Loc size: 24 >> Loc size: 24 >> Trace of matrix: 132.000000 >> cells 3 >> Loc size: 36 >> Loc size: 24 >> Trace of matrix: 198.000000 >> cells 4 >> Loc size: 36 >> Loc size: 36 >> Trace of matrix: 264.000000 >> mpiexec -np 3 ./ex18k >> cells 2 >> Loc size: 24 >> Loc size: 24 >> Loc size: 0 >> Trace of matrix: 132.000000 >> cells 3 >> Loc size: 24 >> Loc size: 24 >> Loc size: 24 >> Trace of matrix: 198.000000 >> cells 4 >> Loc size: 36 >> Loc size: 24 >> Loc size: 24 >> Trace of matrix: 264.000000 >> >> >> >> >> ------------------------------ >> *From:* petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] >> on behalf of Morten Nobel-J?rgensen [mono at dtu.dk] >> *Sent:* Sunday, September 25, 2016 11:15 AM >> *To:* Matthew Knepley >> *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? >> *Subject:* Re: [petsc-users] DMPlex problem >> >> Hi Matthew >> >> Thank you for the bug-fix :) I can confirm that it works :) >> >> And thanks for your hard work on PETSc - your work is very much >> appreciated! >> >> Kind regards, >> Morten >> ------------------------------ >> *From:* Matthew Knepley [knepley at gmail.com] >> *Sent:* Friday, September 23, 2016 2:46 PM >> *To:* Morten Nobel-J?rgensen >> *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? >> *Subject:* Re: [petsc-users] DMPlex problem >> >> On Fri, Sep 23, 2016 at 7:45 AM, Matthew Knepley >> wrote: >> >>> On Fri, Sep 23, 2016 at 3:48 AM, Morten Nobel-J?rgensen >>> wrote: >>> >>>> Dear PETSc developers >>>> >>>> Any update on this issue regarding DMPlex? Or is there any obvious >>>> workaround that we are unaware of? >>>> >>> >>> I have fixed this bug. It did not come up in nightly tests because we >>> are not using MatSetValuesLocal(). Instead we >>> use MatSetValuesClosure() which translates differently. >>> >>> Here is the branch >>> >>> https://bitbucket.org/petsc/petsc/branch/knepley/fix-dm-ltog-bs >>> >>> and I have merged it to next. It will go to master in a day or two. >>> >> >> Also, here is the cleaned up source with no memory leaks. >> >> Matt >> >> >>> Also should we additionally register the issue on Bitbucket or is >>>> reporting the issue on the mailing list enough? >>>> >>> >>> Normally we are faster, but the start of the semester was hard this year. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Kind regards, >>>> Morten >>>> >>>> ------------------------------ >>>> *From:* Matthew Knepley [knepley at gmail.com] >>>> *Sent:* Friday, September 09, 2016 12:21 PM >>>> *To:* Morten Nobel-J?rgensen >>>> *Cc:* PETSc ?[petsc-users at mcs.anl.gov]? >>>> *Subject:* Re: [petsc-users] DMPlex problem >>>> >>>> On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen >>>> wrote: >>>> >>>>> Dear PETSc developers and users, >>>>> >>>>> Last week we posted a question regarding an error with DMPlex and >>>>> multiple dofs and have not gotten any feedback yet. This is uncharted >>>>> waters for us, since we have gotten used to an extremely fast feedback from >>>>> the PETSc crew. So - with the chance of sounding impatient and ungrateful - >>>>> we would like to hear if anybody has any ideas that could point us in the >>>>> right direction? >>>>> >>>> >>>> This is my fault. 
You have not gotten a response because everyone else >>>> was waiting for me, and I have been >>>> slow because I just moved houses at the same time as term started here. >>>> Sorry about that. >>>> >>>> The example ran for me and I saw your problem. The local-tp-global map >>>> is missing for some reason. >>>> I am tracking it down now. It should be made by DMCreateMatrix(), so >>>> this is mysterious. I hope to have >>>> this fixed by early next week. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> We have created a small example problem that demonstrates the error in >>>>> the matrix assembly. >>>>> >>>>> Thanks, >>>>> Morten >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Mon Sep 26 09:19:57 2016 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Mon, 26 Sep 2016 17:19:57 +0300 Subject: [petsc-users] DMPlex problem In-Reply-To: References: <6B03D347796DED499A2696FC095CE81A05B3A99B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4A38D@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4BB2B@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4C01C@ait-pex02mbx04.win.dtu.dk> <6B03D347796DED499A2696FC095CE81A05B4D0AB@ait-pex02mbx04.win.dtu.dk> Message-ID: Morten, have a look at the local-to-global mapping that is created for the MATIS case in DMCreateMatrix_Plex in src/dm/impls/plex/plex.c. and how DMPlexMatSetClosure is used in src/snes/utils/dmplexsnes.c in the MATIS case. MATIS is a format for non-overlapping domain decomposition when you assemble your stiffness matrix on each subdomain (MPI proc) separately. On Sep 26, 2016, at 5:04 PM, Matthew Knepley wrote: > On Mon, Sep 26, 2016 at 8:41 AM, Morten Nobel-J?rgensen wrote: > Hi Matt > > We are trying to do a simple FE using DMPlex, but when assemble the global stiffness matrix we get in problems when running NP>1 - that is the global matrix differs when we move to a distributed system, where it should not. > > In pseudo-code our CreateGlobalStiffnessMatrix does the following > create a local stiffness matrix ke with some values > for each local cell/element e (using result from DMPlexGetHeightStratum(..,0,..) > for each of its vertices (using DMPlexGetTransitiveClosure(..,e,..) 
> set local/global mapping to edof > update the global stiffness matrix K using the local mapping edof and values ke > > The code we have sent is a simplified version, which just builds a dummy stiffness matrix - but we believe this matrix should still the same independent of NP. (That is why we use trace). > > I am not sure what is wrong there, but there are a bunch of Plex FEM examples, like SNES ex12, ex62, ex77. > > I'm not familiar with MatSetValuesClosure(). Is that the missing piece? > > I use this to do indexing since it is so error prone to do it by yourself. I think this could be the problem with your example. > > When you distribute the mesh, ordering changes, and we do permutations to keep all local indices contiguous. The > MatSetValuesClosure() is a convenience function for FEM which takes in a mesh point (could be cell, face, etc.) and > translates that point number to a set of row indices using > > a) the transitive closure of that point in the mesh DAG > > and > > b) the default PetscSection for the DM which maps mesh points to sets of dofs (row indices) > > For it to work, you need to setup the default DM section, which it looks like you have done. > > Thanks, > > Matt > > Kind regards, > Morten > > > From: Matthew Knepley [knepley at gmail.com] > Sent: Monday, September 26, 2016 2:19 PM > To: Morten Nobel-J?rgensen > Cc: PETSc ?[petsc-users at mcs.anl.gov]? > Subject: Re: [petsc-users] DMPlex problem > > On Mon, Sep 26, 2016 at 7:00 AM, Morten Nobel-J?rgensen wrote: > Hi Matthew > > It seems like the problem is not fully fixed. I have changed the code to now run on with both 2,3 and 4 cells. When I run the code using NP = 1..3 I get different result both for NP=1 to NP=2/3 when cell count is larger than 2. > > Do you mean the trace? I have no idea what you are actually putting in. > > I have a lot of debugging when you use, MatSetValuesClosure(), but when you directly use > MatSetValuesLocal(), you are handling things yourself. > > Thanks, > > Matt > > Kind regards, > Morten > ____ > mpiexec -np 1 ./ex18k > cells 2 > Loc size: 36 > Trace of matrix: 132.000000 > cells 3 > Loc size: 48 > Trace of matrix: 192.000000 > cells 4 > Loc size: 60 > Trace of matrix: 258.000000 > mpiexec -np 2 ./ex18k > cells 2 > Loc size: 24 > Loc size: 24 > Trace of matrix: 132.000000 > cells 3 > Loc size: 36 > Loc size: 24 > Trace of matrix: 198.000000 > cells 4 > Loc size: 36 > Loc size: 36 > Trace of matrix: 264.000000 > mpiexec -np 3 ./ex18k > cells 2 > Loc size: 24 > Loc size: 24 > Loc size: 0 > Trace of matrix: 132.000000 > cells 3 > Loc size: 24 > Loc size: 24 > Loc size: 24 > Trace of matrix: 198.000000 > cells 4 > Loc size: 36 > Loc size: 24 > Loc size: 24 > Trace of matrix: 264.000000 > > > > > From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] on behalf of Morten Nobel-J?rgensen [mono at dtu.dk] > Sent: Sunday, September 25, 2016 11:15 AM > To: Matthew Knepley > Cc: PETSc ?[petsc-users at mcs.anl.gov]? > Subject: Re: [petsc-users] DMPlex problem > > Hi Matthew > > Thank you for the bug-fix :) I can confirm that it works :) > > And thanks for your hard work on PETSc - your work is very much appreciated! > > Kind regards, > Morten > From: Matthew Knepley [knepley at gmail.com] > Sent: Friday, September 23, 2016 2:46 PM > To: Morten Nobel-J?rgensen > Cc: PETSc ?[petsc-users at mcs.anl.gov]? 
> Subject: Re: [petsc-users] DMPlex problem > > On Fri, Sep 23, 2016 at 7:45 AM, Matthew Knepley wrote: > On Fri, Sep 23, 2016 at 3:48 AM, Morten Nobel-J?rgensen wrote: > Dear PETSc developers > > Any update on this issue regarding DMPlex? Or is there any obvious workaround that we are unaware of? > > I have fixed this bug. It did not come up in nightly tests because we are not using MatSetValuesLocal(). Instead we > use MatSetValuesClosure() which translates differently. > > Here is the branch > > https://bitbucket.org/petsc/petsc/branch/knepley/fix-dm-ltog-bs > > and I have merged it to next. It will go to master in a day or two. > > Also, here is the cleaned up source with no memory leaks. > > Matt > > Also should we additionally register the issue on Bitbucket or is reporting the issue on the mailing list enough? > > Normally we are faster, but the start of the semester was hard this year. > > Thanks, > > Matt > > Kind regards, > Morten > > From: Matthew Knepley [knepley at gmail.com] > Sent: Friday, September 09, 2016 12:21 PM > To: Morten Nobel-J?rgensen > Cc: PETSc ?[petsc-users at mcs.anl.gov]? > Subject: Re: [petsc-users] DMPlex problem > > On Fri, Sep 9, 2016 at 4:04 AM, Morten Nobel-J?rgensen wrote: > Dear PETSc developers and users, > > Last week we posted a question regarding an error with DMPlex and multiple dofs and have not gotten any feedback yet. This is uncharted waters for us, since we have gotten used to an extremely fast feedback from the PETSc crew. So - with the chance of sounding impatient and ungrateful - we would like to hear if anybody has any ideas that could point us in the right direction? > > This is my fault. You have not gotten a response because everyone else was waiting for me, and I have been > slow because I just moved houses at the same time as term started here. Sorry about that. > > The example ran for me and I saw your problem. The local-tp-global map is missing for some reason. > I am tracking it down now. It should be made by DMCreateMatrix(), so this is mysterious. I hope to have > this fixed by early next week. > > Thanks, > > Matt > > We have created a small example problem that demonstrates the error in the matrix assembly. > > Thanks, > Morten > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aurelien.ponte at ifremer.fr Mon Sep 26 10:23:04 2016 From: aurelien.ponte at ifremer.fr (Aurelien Ponte) Date: Mon, 26 Sep 2016 08:23:04 -0700 Subject: [petsc-users] petsc4py with complex numbers? 
In-Reply-To: References: <8b889009-6cfd-7d43-589b-31b0f3d91fde@ifremer.fr> Message-ID: The output is: I installed petsc via macport and wrongly assumed complex number support would have been default. I've just reinstalled petsc with it and it seems to be working. thanks Lisandro ! aurelien Le 25/09/16 ? 23:00, Lisandro Dalcin a ?crit : > Are you sure you built petsc4py with a PETSc build with complex > scalars? Whats the output of "print(PETSc.ScalarType)" ? > > > On 26 September 2016 at 07:47, Aurelien Ponte > > wrote: > > Hi, > > I am trying to solve a linear problem whose operator has complex > coefficients via petsc4py > but keep running into the following error: > > File > "/Users/aponte/Current_projects/people/kraig_marine/wd_response/solver/set_L.py", > line 52, in set_L > L.setValueStencil(row, col, value) > File "PETSc/Mat.pyx", line 882, in > petsc4py.PETSc.Mat.setValueStencil (src/petsc4py.PETSc.c:124785) > File "PETSc/petscmat.pxi", line 1017, in > petsc4py.PETSc.matsetvaluestencil (src/petsc4py.PETSc.c:31469) > File "PETSc/arraynpy.pxi", line 140, in petsc4py.PETSc.iarray_s > (src/petsc4py.PETSc.c:8811) > File "PETSc/arraynpy.pxi", line 121, in petsc4py.PETSc.iarray > (src/petsc4py.PETSc.c:8542) > TypeError: can't convert complex to float > > The code looks like: > > for j in range(ys, ye): > for i in range(xs, xe): > row.index = (i, j, kx) > row.field = 0 > col.index = (i, j, kx) col.field=0 > L.setValueStencil(row, col, 1j) > > Any idea about what am I doing wrong? > > cheers > > aurelien > > > > -- > Aur?lien Ponte > Tel: (+33) 2 98 22 40 73 > Fax: (+33) 2 98 22 44 96 > UMR 6523, IFREMER > ZI de la Pointe du Diable > CS 10070 > 29280 Plouzan? > > > > > -- > Lisandro Dalcin > ============ > Research Scientist > Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) > Extreme Computing Research Center (ECRC) > King Abdullah University of Science and Technology (KAUST) > http://ecrc.kaust.edu.sa/ > > 4700 King Abdullah University of Science and Technology > al-Khawarizmi Bldg (Bldg 1), Office # 0109 > Thuwal 23955-6900, Kingdom of Saudi Arabia > http://www.kaust.edu.sa > > Office Phone: +966 12 808-0459 -- Aur?lien Ponte Tel: (+33) 2 98 22 40 73 Fax: (+33) 2 98 22 44 96 UMR 6523, IFREMER ZI de la Pointe du Diable CS 10070 29280 Plouzan? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera at mail.sdsu.edu Mon Sep 26 15:42:52 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Mon, 26 Sep 2016 13:42:52 -0700 Subject: [petsc-users] Solve KSP in parallel. Message-ID: Hello, I'm working on solve a linear system in parallel, following ex12 of the ksp tutorial i don't see major complication on doing so, so for a working linear system solver with PCJACOBI and KSPGCR i did only the following changes: call MatCreate(PETSC_COMM_WORLD,Ap,ierr) ! call MatSetType(Ap,MATSEQAIJ,ierr) call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) call MatSetFromOptions(Ap,ierr) ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp,iapi,japi,app,Ap,ierr) call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real(nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) I grayed out the changes from sequential implementation. 
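For reference, MatCreateMPIAIJWithArrays() expects each rank to pass only its own block of rows in CSR form: m is the number of locally owned rows, the row-pointer array has m+1 entries and starts at 0 for that block, and the column indices are global. Handing every rank the full global CSR arrays, or a local row count that does not match the arrays, is read differently than intended, which is one way to end up with a "missing diagonal entry" error like the one below. A minimal sketch in C, with hypothetical names nlocal, ia_loc, ja_loc and a_loc for the per-rank arrays:

#include <petscmat.h>

/* Sketch: build the parallel AIJ matrix from this rank's row block only. */
PetscErrorCode BuildParallelMatrix(MPI_Comm comm, PetscInt nlocal, PetscInt N,
                                   const PetscInt ia_loc[], const PetscInt ja_loc[],
                                   const PetscScalar a_loc[], Mat *Ap)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* ia_loc: nlocal+1 entries with ia_loc[0] = 0; ja_loc: GLOBAL column indices
     for the nlocal rows owned by this rank; a_loc: the matching values. */
  ierr = MatCreateMPIAIJWithArrays(comm, nlocal, PETSC_DECIDE,
                                   PETSC_DETERMINE, N,
                                   ia_loc, ja_loc, a_loc, Ap);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

No preallocation, MatSetType() or MatSetSizes() call is needed before this; the routine creates and fills the MPIAIJ matrix in one step.
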
So, it does not complain at runtime until it reaches KSPSolve(), with the following error: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Object is in wrong state [1]PETSC ERROR: Matrix is missing diagonal entry 0 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown [1]PETSC ERROR: ./solvelinearmgPETSc ? ? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 13:35:15 2016 [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c At line 333 of file solvelinearmgPETSc.f90 Fortran runtime error: Array bound mismatch for dimension 1 of array 'sol' (213120/106560) This code works for -n 1 cores, but it gives this error when using more than one core. What am i missing? Regards, Manuel. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: solvelinearmgPETSc.f90 Type: text/x-fortran Size: 14416 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Sep 26 16:02:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 26 Sep 2016 16:02:14 -0500 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: References: Message-ID: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> The call to MatCreateMPIAIJWithArrays() is likely interpreting the values you pass in different than you expect. Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. > On Sep 26, 2016, at 3:42 PM, Manuel Valera wrote: > > Hello, > > I'm working on solve a linear system in parallel, following ex12 of the ksp tutorial i don't see major complication on doing so, so for a working linear system solver with PCJACOBI and KSPGCR i did only the following changes: > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > ! call MatSetType(Ap,MATSEQAIJ,ierr) > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > call MatSetFromOptions(Ap,ierr) Note that none of the lines above are needed (or do anything) because the MatCreateMPIAIJWithArrays() creates the matrix from scratch itself. Barry > ! 
call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp,iapi,japi,app,Ap,ierr) > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real(nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > I grayed out the changes from sequential implementation. > > So, it does not complain at runtime until it reaches KSPSolve(), with the following error: > > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Object is in wrong state > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > [1]PETSC ERROR: ./solvelinearmgPETSc ? ? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 13:35:15 2016 > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > At line 333 of file solvelinearmgPETSc.f90 > Fortran runtime error: Array bound mismatch for dimension 1 of array 'sol' (213120/106560) > > > This code works for -n 1 cores, but it gives this error when using more than one core. > > What am i missing? > > Regards, > > Manuel. > > From mvalera at mail.sdsu.edu Mon Sep 26 16:34:05 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Mon, 26 Sep 2016 14:34:05 -0700 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> Message-ID: Indeed there is something wrong with that call, it hangs out indefinitely showing only: Mat Object: 1 MPI processes type: mpiaij It draws my attention that this program works for 1 processor but not more, but it doesnt show anything for that viewer in either case. Thanks for the insight on the redundant calls, this is not very clear on documentation, which calls are included in others. On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith wrote: > > The call to MatCreateMPIAIJWithArrays() is likely interpreting the > values you pass in different than you expect. > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after the > MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. 
> > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera > wrote: > > > > Hello, > > > > I'm working on solve a linear system in parallel, following ex12 of the > ksp tutorial i don't see major complication on doing so, so for a working > linear system solver with PCJACOBI and KSPGCR i did only the following > changes: > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > call MatSetFromOptions(Ap,ierr) > > Note that none of the lines above are needed (or do anything) because > the MatCreateMPIAIJWithArrays() creates the matrix from scratch itself. > > Barry > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp, > iapi,japi,app,Ap,ierr) > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real( > nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > I grayed out the changes from sequential implementation. > > > > So, it does not complain at runtime until it reaches KSPSolve(), with > the following error: > > > > > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Object is in wrong state > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > ? ? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 > 13:35:15 2016 > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in > /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/pc/interface/precon.c > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/ksp/interface/itfunc.c > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/ksp/interface/itfunc.c > > At line 333 of file solvelinearmgPETSc.f90 > > Fortran runtime error: Array bound mismatch for dimension 1 of array > 'sol' (213120/106560) > > > > > > This code works for -n 1 cores, but it gives this error when using more > than one core. > > > > What am i missing? > > > > Regards, > > > > Manuel. > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 26 16:40:50 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 26 Sep 2016 16:40:50 -0500 Subject: [petsc-users] Solve KSP in parallel. 
In-Reply-To: References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> Message-ID: How large is the matrix? It will take a very long time if the matrix is large. Debug with a very small matrix. Barry > On Sep 26, 2016, at 4:34 PM, Manuel Valera wrote: > > Indeed there is something wrong with that call, it hangs out indefinitely showing only: > > Mat Object: 1 MPI processes > type: mpiaij > > It draws my attention that this program works for 1 processor but not more, but it doesnt show anything for that viewer in either case. > > Thanks for the insight on the redundant calls, this is not very clear on documentation, which calls are included in others. > > > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith wrote: > > The call to MatCreateMPIAIJWithArrays() is likely interpreting the values you pass in different than you expect. > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. > > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera wrote: > > > > Hello, > > > > I'm working on solve a linear system in parallel, following ex12 of the ksp tutorial i don't see major complication on doing so, so for a working linear system solver with PCJACOBI and KSPGCR i did only the following changes: > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > call MatSetFromOptions(Ap,ierr) > > Note that none of the lines above are needed (or do anything) because the MatCreateMPIAIJWithArrays() creates the matrix from scratch itself. > > Barry > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real(nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > I grayed out the changes from sequential implementation. > > > > So, it does not complain at runtime until it reaches KSPSolve(), with the following error: > > > > > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Object is in wrong state > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > [1]PETSC ERROR: ./solvelinearmgPETSc ? ? 
on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 13:35:15 2016 > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > At line 333 of file solvelinearmgPETSc.f90 > > Fortran runtime error: Array bound mismatch for dimension 1 of array 'sol' (213120/106560) > > > > > > This code works for -n 1 cores, but it gives this error when using more than one core. > > > > What am i missing? > > > > Regards, > > > > Manuel. > > > > > > From mvalera at mail.sdsu.edu Mon Sep 26 17:07:46 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Mon, 26 Sep 2016 15:07:46 -0700 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> Message-ID: Ok i was using a big matrix before, from a smaller testcase i got the output and effectively, it looks like is not well read at all, results are attached for DRAW viewer, output is too big to use STDOUT even in the small testcase. n# is the number of processors requested. is there a way to create the matrix in one node and the distribute it as needed on the rest ? maybe that would work. Thanks On Mon, Sep 26, 2016 at 2:40 PM, Barry Smith wrote: > > How large is the matrix? It will take a very long time if the matrix > is large. Debug with a very small matrix. > > Barry > > > On Sep 26, 2016, at 4:34 PM, Manuel Valera > wrote: > > > > Indeed there is something wrong with that call, it hangs out > indefinitely showing only: > > > > Mat Object: 1 MPI processes > > type: mpiaij > > > > It draws my attention that this program works for 1 processor but not > more, but it doesnt show anything for that viewer in either case. > > > > Thanks for the insight on the redundant calls, this is not very clear on > documentation, which calls are included in others. > > > > > > > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith wrote: > > > > The call to MatCreateMPIAIJWithArrays() is likely interpreting the > values you pass in different than you expect. > > > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after the > MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. 
> > > > > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera > wrote: > > > > > > Hello, > > > > > > I'm working on solve a linear system in parallel, following ex12 of > the ksp tutorial i don't see major complication on doing so, so for a > working linear system solver with PCJACOBI and KSPGCR i did only the > following changes: > > > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > > call MatSetFromOptions(Ap,ierr) > > > > Note that none of the lines above are needed (or do anything) > because the MatCreateMPIAIJWithArrays() creates the matrix from scratch > itself. > > > > Barry > > > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp, > iapi,japi,app,Ap,ierr) > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real( > nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > > > > I grayed out the changes from sequential implementation. > > > > > > So, it does not complain at runtime until it reaches KSPSolve(), with > the following error: > > > > > > > > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: Object is in wrong state > > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > ? ? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 > 13:35:15 2016 > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in > /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/pc/interface/precon.c > > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/ksp/interface/itfunc.c > > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/ksp/interface/itfunc.c > > > At line 333 of file solvelinearmgPETSc.f90 > > > Fortran runtime error: Array bound mismatch for dimension 1 of array > 'sol' (213120/106560) > > > > > > > > > This code works for -n 1 cores, but it gives this error when using > more than one core. > > > > > > What am i missing? > > > > > > Regards, > > > > > > Manuel. > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: n4.png Type: image/png Size: 1867 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: n2.png Type: image/png Size: 1949 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: n1.png Type: image/png Size: 1985 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Sep 26 17:12:25 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 26 Sep 2016 17:12:25 -0500 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> Message-ID: <174EAC3B-DA31-4FEA-8321-FE7000E74D41@mcs.anl.gov> > On Sep 26, 2016, at 5:07 PM, Manuel Valera wrote: > > Ok i was using a big matrix before, from a smaller testcase i got the output and effectively, it looks like is not well read at all, results are attached for DRAW viewer, output is too big to use STDOUT even in the small testcase. n# is the number of processors requested. You need to construct a very small test case so you can determine why the values do not end up where you expect them. There is no way around it. > > is there a way to create the matrix in one node and the distribute it as needed on the rest ? maybe that would work. No the is not scalable. You become limited by the memory of the one node. > > Thanks > > On Mon, Sep 26, 2016 at 2:40 PM, Barry Smith wrote: > > How large is the matrix? It will take a very long time if the matrix is large. Debug with a very small matrix. > > Barry > > > On Sep 26, 2016, at 4:34 PM, Manuel Valera wrote: > > > > Indeed there is something wrong with that call, it hangs out indefinitely showing only: > > > > Mat Object: 1 MPI processes > > type: mpiaij > > > > It draws my attention that this program works for 1 processor but not more, but it doesnt show anything for that viewer in either case. > > > > Thanks for the insight on the redundant calls, this is not very clear on documentation, which calls are included in others. > > > > > > > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith wrote: > > > > The call to MatCreateMPIAIJWithArrays() is likely interpreting the values you pass in different than you expect. > > > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. > > > > > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera wrote: > > > > > > Hello, > > > > > > I'm working on solve a linear system in parallel, following ex12 of the ksp tutorial i don't see major complication on doing so, so for a working linear system solver with PCJACOBI and KSPGCR i did only the following changes: > > > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > > call MatSetFromOptions(Ap,ierr) > > > > Note that none of the lines above are needed (or do anything) because the MatCreateMPIAIJWithArrays() creates the matrix from scratch itself. > > > > Barry > > > > > ! 
call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real(nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > > > > I grayed out the changes from sequential implementation. > > > > > > So, it does not complain at runtime until it reaches KSPSolve(), with the following error: > > > > > > > > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [1]PETSC ERROR: Object is in wrong state > > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > [1]PETSC ERROR: ./solvelinearmgPETSc ? ? on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 13:35:15 2016 > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > At line 333 of file solvelinearmgPETSc.f90 > > > Fortran runtime error: Array bound mismatch for dimension 1 of array 'sol' (213120/106560) > > > > > > > > > This code works for -n 1 cores, but it gives this error when using more than one core. > > > > > > What am i missing? > > > > > > Regards, > > > > > > Manuel. > > > > > > > > > > > > > From mvalera at mail.sdsu.edu Mon Sep 26 17:51:17 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Mon, 26 Sep 2016 15:51:17 -0700 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: <174EAC3B-DA31-4FEA-8321-FE7000E74D41@mcs.anl.gov> References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> <174EAC3B-DA31-4FEA-8321-FE7000E74D41@mcs.anl.gov> Message-ID: Ok, i created a tiny testcase just for this, The output from n# calls are as follows: n1: Mat Object: 1 MPI processes type: mpiaij row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) row 2: (0, 4.) (1, 3.) (2, 1.) (3, 2.) row 3: (0, 3.) (1, 4.) (2, 2.) (3, 1.) n2: Mat Object: 2 MPI processes type: mpiaij row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) row 3: (0, 2.) (1, 1.) (2, 3.) (3, 4.) n4: Mat Object: 4 MPI processes type: mpiaij row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) row 1: (0, 1.) (1, 2.) (2, 4.) (3, 3.) row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) row 3: (0, 1.) 
(1, 2.) (2, 4.) (3, 3.) It really gets messed, no idea what's happening. On Mon, Sep 26, 2016 at 3:12 PM, Barry Smith wrote: > > > On Sep 26, 2016, at 5:07 PM, Manuel Valera > wrote: > > > > Ok i was using a big matrix before, from a smaller testcase i got the > output and effectively, it looks like is not well read at all, results are > attached for DRAW viewer, output is too big to use STDOUT even in the small > testcase. n# is the number of processors requested. > > You need to construct a very small test case so you can determine why > the values do not end up where you expect them. There is no way around it. > > > > is there a way to create the matrix in one node and the distribute it as > needed on the rest ? maybe that would work. > > No the is not scalable. You become limited by the memory of the one > node. > > > > > Thanks > > > > On Mon, Sep 26, 2016 at 2:40 PM, Barry Smith wrote: > > > > How large is the matrix? It will take a very long time if the matrix > is large. Debug with a very small matrix. > > > > Barry > > > > > On Sep 26, 2016, at 4:34 PM, Manuel Valera > wrote: > > > > > > Indeed there is something wrong with that call, it hangs out > indefinitely showing only: > > > > > > Mat Object: 1 MPI processes > > > type: mpiaij > > > > > > It draws my attention that this program works for 1 processor but not > more, but it doesnt show anything for that viewer in either case. > > > > > > Thanks for the insight on the redundant calls, this is not very clear > on documentation, which calls are included in others. > > > > > > > > > > > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith > wrote: > > > > > > The call to MatCreateMPIAIJWithArrays() is likely interpreting the > values you pass in different than you expect. > > > > > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after > the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. > > > > > > > > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera > wrote: > > > > > > > > Hello, > > > > > > > > I'm working on solve a linear system in parallel, following ex12 of > the ksp tutorial i don't see major complication on doing so, so for a > working linear system solver with PCJACOBI and KSPGCR i did only the > following changes: > > > > > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > > > call MatSetFromOptions(Ap,ierr) > > > > > > Note that none of the lines above are needed (or do anything) > because the MatCreateMPIAIJWithArrays() creates the matrix from scratch > itself. > > > > > > Barry > > > > > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp, > iapi,japi,app,Ap,ierr) > > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real( > nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > > > > > > > I grayed out the changes from sequential implementation. > > > > > > > > So, it does not complain at runtime until it reaches KSPSolve(), > with the following error: > > > > > > > > > > > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [1]PETSC ERROR: Object is in wrong state > > > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. 
> > > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > ? ? on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 > 13:35:15 2016 > > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in > /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > > [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/pc/interface/precon.c > > > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/ksp/interface/itfunc.c > > > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 > in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/ > petsc/src/ksp/ksp/interface/itfunc.c > > > > At line 333 of file solvelinearmgPETSc.f90 > > > > Fortran runtime error: Array bound mismatch for dimension 1 of array > 'sol' (213120/106560) > > > > > > > > > > > > This code works for -n 1 cores, but it gives this error when using > more than one core. > > > > > > > > What am i missing? > > > > > > > > Regards, > > > > > > > > Manuel. > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera at mail.sdsu.edu Mon Sep 26 18:40:21 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Mon, 26 Sep 2016 16:40:21 -0700 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> <174EAC3B-DA31-4FEA-8321-FE7000E74D41@mcs.anl.gov> Message-ID: Ok, last output was from simulated multicores, in an actual cluster the errors are of the kind: [valera at cinci CSRMatrix]$ petsc -n 2 ./solvelinearmgPETSc TrivSoln loaded, size: 4 / 4 TrivSoln loaded, size: 4 / 4 RHS loaded, size: 4 / 4 RHS loaded, size: 4 / 4 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Comm must be of size 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: Comm must be of size 1 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [1]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: #2 MatSetType() line 94 in /home/valera/petsc-3.7.2/src/mat/interface/matreg.c [1]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c local size: 2 local size: 2 Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [0]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: #2 MatSetType() line 94 in /home/valera/petsc-3.7.2/src/mat/interface/matreg.c [0]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: [0]PETSC ERROR: Nonconforming object sizes [0]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, my local length 4 likely a call to VecSetSizes() or MatSetSizes() is wrong. See http://www.mcs.anl.gov/petsc/documentation/faq.html#split [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. Nonconforming object sizes [1]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, my local length 4 likely a call to VecSetSizes() or MatSetSizes() is wrong. See http://www.mcs.anl.gov/petsc/documentation/faq.html#split [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [0]PETSC ERROR: #4 PetscSplitOwnership() line 93 in /home/valera/petsc-3.7.2/src/sys/utils/psplit.c [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [1]PETSC ERROR: #4 PetscSplitOwnership() line 93 in /home/valera/petsc-3.7.2/src/sys/utils/psplit.c [0]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c [0]PETSC ERROR: #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [1]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c [1]PETSC ERROR: [0]PETSC ERROR: #7 MatMPIAIJSetPreallocation() line 3505 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [1]PETSC ERROR: [0]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c #7 MatMPIAIJSetPreallocation() line 3505 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [1]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [1]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatSetNearNullSpace() [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 Object is in wrong state [1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatSetNearNullSpace() [1]PETSC ERROR: [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [0]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [1]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatAssemblyBegin() [0]PETSC ERROR: [1]PETSC ERROR: Object is in wrong state [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatAssemblyBegin() [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function is given. 
[0]PETSC ERROR: [0] MatAssemblyEnd line 5185 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [0]PETSC ERROR: [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] MatAssemblyEnd line 5185 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [0] MatAssemblyBegin line 5090 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatSetNearNullSpace line 8191 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [0]PETSC ERROR: [1]PETSC ERROR: [1] MatAssemblyBegin line 5090 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [1]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/valera/petsc-3.7.2/src/sys/utils/psplit.c [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation_MPIAIJ line 2767 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [1] MatSetNearNullSpace line 8191 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [1]PETSC ERROR: [1] PetscSplitOwnership line 80 /home/valera/petsc-3.7.2/src/sys/utils/psplit.c [1]PETSC ERROR: [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation line 3502 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [0] MatSetUp_MPIAIJ line 2152 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [1] PetscLayoutSetUp line 129 /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation_MPIAIJ line 2767 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [0] MatSetUp line 727 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatCreate_SeqAIJ line 3956 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation line 3502 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [1]PETSC ERROR: [1] MatSetUp_MPIAIJ line 2152 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [0] MatSetType line 44 /home/valera/petsc-3.7.2/src/mat/interface/matreg.c [0]PETSC ERROR: [0] MatCreateSeqAIJWithArrays line 4295 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: [1] MatSetUp line 727 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c [1]PETSC ERROR: [1] MatCreate_SeqAIJ line 3956 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [1]PETSC ERROR: [1] MatSetType line 44 /home/valera/petsc-3.7.2/src/mat/interface/matreg.c [1]PETSC ERROR: [1] MatCreateSeqAIJWithArrays line 4295 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [0]PETSC ERROR: Signal received [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: #12 User provided function() line 0 in unknown file Petsc Release Version 3.7.2, Jun, 05, 2016 [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich [1]PETSC ERROR: #12 User provided function() line 0 in unknown file application called MPI_Abort(comm=0x84000004, 59) - process 0 [cli_0]: aborting job: application called MPI_Abort(comm=0x84000004, 59) - process 0 application called MPI_Abort(comm=0x84000002, 59) - process 1 [cli_1]: aborting job: application called MPI_Abort(comm=0x84000002, 59) - process 1 =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = PID 10266 RUNNING AT cinci = EXIT CODE: 59 = CLEANING UP REMAINING PROCESSES = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES =================================================================================== On Mon, Sep 26, 2016 at 3:51 PM, Manuel Valera wrote: > Ok, i created a tiny testcase just for this, > > The output from n# calls are as follows: > > n1: > Mat Object: 1 MPI processes > type: mpiaij > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > row 2: (0, 4.) (1, 3.) (2, 1.) (3, 2.) > row 3: (0, 3.) (1, 4.) (2, 2.) (3, 1.) > > n2: > Mat Object: 2 MPI processes > type: mpiaij > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 3: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > n4: > Mat Object: 4 MPI processes > type: mpiaij > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 1: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 3: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > > It really gets messed, no idea what's happening. > > > > > On Mon, Sep 26, 2016 at 3:12 PM, Barry Smith wrote: > >> >> > On Sep 26, 2016, at 5:07 PM, Manuel Valera >> wrote: >> > >> > Ok i was using a big matrix before, from a smaller testcase i got the >> output and effectively, it looks like is not well read at all, results are >> attached for DRAW viewer, output is too big to use STDOUT even in the small >> testcase. n# is the number of processors requested. >> >> You need to construct a very small test case so you can determine why >> the values do not end up where you expect them. There is no way around it. >> > >> > is there a way to create the matrix in one node and the distribute it >> as needed on the rest ? maybe that would work. >> >> No the is not scalable. You become limited by the memory of the one >> node. >> >> > >> > Thanks >> > >> > On Mon, Sep 26, 2016 at 2:40 PM, Barry Smith >> wrote: >> > >> > How large is the matrix? It will take a very long time if the >> matrix is large. Debug with a very small matrix. >> > >> > Barry >> > >> > > On Sep 26, 2016, at 4:34 PM, Manuel Valera >> wrote: >> > > >> > > Indeed there is something wrong with that call, it hangs out >> indefinitely showing only: >> > > >> > > Mat Object: 1 MPI processes >> > > type: mpiaij >> > > >> > > It draws my attention that this program works for 1 processor but not >> more, but it doesnt show anything for that viewer in either case. >> > > >> > > Thanks for the insight on the redundant calls, this is not very clear >> on documentation, which calls are included in others. 
>> > > >> > > >> > > >> > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith >> wrote: >> > > >> > > The call to MatCreateMPIAIJWithArrays() is likely interpreting the >> values you pass in different than you expect. >> > > >> > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after >> the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. >> > > >> > > >> > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera >> wrote: >> > > > >> > > > Hello, >> > > > >> > > > I'm working on solve a linear system in parallel, following ex12 of >> the ksp tutorial i don't see major complication on doing so, so for a >> working linear system solver with PCJACOBI and KSPGCR i did only the >> following changes: >> > > > >> > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) >> > > > ! call MatSetType(Ap,MATSEQAIJ,ierr) >> > > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization >> > > > >> > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); >> > > > >> > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) >> > > > call MatSetFromOptions(Ap,ierr) >> > > >> > > Note that none of the lines above are needed (or do anything) >> because the MatCreateMPIAIJWithArrays() creates the matrix from scratch >> itself. >> > > >> > > Barry >> > > >> > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp,iapi, >> japi,app,Ap,ierr) >> > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real(nbdp)/ >> sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) >> > > > >> > > > >> > > > I grayed out the changes from sequential implementation. >> > > > >> > > > So, it does not complain at runtime until it reaches KSPSolve(), >> with the following error: >> > > > >> > > > >> > > > [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [1]PETSC ERROR: Object is in wrong state >> > > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 >> > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/d >> ocumentation/faq.html for trouble shooting. >> > > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown >> > > > [1]PETSC ERROR: ./solvelinearmgPETSc >> >> >> ? ? 
on a >> arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 >> 13:35:15 2016 >> > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 >> --download-ml?=1 >> > > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in >> /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c >> > > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in >> /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c >> > > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in >> /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c >> > > > [1]PETSC ERROR: #4 PCSetUp() line 968 in >> /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c >> > > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in >> /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c >> > > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 >> in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c >> > > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in >> /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c >> > > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in >> /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c >> > > > [1]PETSC ERROR: #9 KSPSolve() line 600 in >> /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c >> > > > At line 333 of file solvelinearmgPETSc.f90 >> > > > Fortran runtime error: Array bound mismatch for dimension 1 of >> array 'sol' (213120/106560) >> > > > >> > > > >> > > > This code works for -n 1 cores, but it gives this error when using >> more than one core. >> > > > >> > > > What am i missing? >> > > > >> > > > Regards, >> > > > >> > > > Manuel. >> > > > >> > > > >> > > >> > > >> > >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- 1.000000 2.000000 4.000000 3.000000 2.000000 1.000000 3.000000 4.000000 4.000000 3.000000 1.000000 2.000000 3.000000 4.000000 2.000000 1.000000 -------------- next part -------------- A non-text attachment was scrubbed... Name: solvelinearmgPETSc.f90 Type: text/x-fortran Size: 15072 bytes Desc: not available URL: -------------- next part -------------- 1.000000 2.000000 4.000000 3.000000 1.000000 2.000000 4.000000 3.000000 1.000000 2.000000 4.000000 3.000000 1.000000 2.000000 4.000000 3.000000 -------------- next part -------------- 0 4 8 12 16 -------------- next part -------------- 0 1 2 3 1 0 3 2 2 3 0 1 3 2 1 0 From cyrill.von.planta at usi.ch Tue Sep 27 04:54:18 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Tue, 27 Sep 2016 09:54:18 +0000 Subject: [petsc-users] Example for MatInvertBlockDiagonal In-Reply-To: References: <136AA62C-B799-46D7-8DA2-DE5542114A67@mcs.anl.gov> <2D8FC5B4-6117-479D-836F-280945B76210@mcs.anl.gov> <905ECA30-E8BC-4083-AE15-F51284E0E814@usi.ch> Message-ID: <74D077E5-07FD-4575-A92D-9EE85FED5D26@usi.ch> Thanks. Just to wrap it up: In the end I took the respective SOR-code and added a block-manipulation routine to it after the update of the x_i's. That way I also get to use the "MatInvertBlockDiagonal()? functionality. Cyrill > On 19 Sep 2016, at 21:38, Barry Smith wrote: > > >> On Sep 19, 2016, at 2:21 PM, Cyrill Vonplanta wrote: >> >> >>> block size > 1 really only makes sense if the block size is really greater than one. So if A has blocks of size 3 you should create A as BAIJ and thus never need to call the convert routine. 
>> >> Unfortunately A is not created by my part of the program and comes with blocksize 1. > > Ok, copy the code for MatInvertBlockDiagonal_SeqAIJ() into your source code with a different name and modify it to serve your purpose and call it directly instead of calling MatInvertBlockDiagonal > >> >>> >>> You can also set the block size for AIJ matrix to 3 and use MatInvertBlockDiagonal() on that matrix and not use the BAIJ matrix. >> >> If I run: >> ierr = MatSetBlockSize(A, 3); CHKERRQ(ierr); >> >> It doesn?t work for me. I get: >> >> [1;31m[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0;39m[0;49m[0]PETSC ERROR: Arguments are incompatible >> [0]PETSC ERROR: Cannot change block size 1 to 3 >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >> . >> . >> >> What are the constraints for block size? > > You need to set it early in the life of the matrix. > > Barry > >> >> >> >> > From cyrill.von.planta at usi.ch Tue Sep 27 06:48:26 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Tue, 27 Sep 2016 11:48:26 +0000 Subject: [petsc-users] MatSOR and GaussSeidel In-Reply-To: <87k2f7tu1v.fsf@jedbrown.org> References: <67876D44-A778-487C-9761-3DAA10356F6A@usi.ch> <87k2f7tu1v.fsf@jedbrown.org> Message-ID: <4C5EBEFD-EF1D-4637-91AB-1148DB43E5A1@usi.ch> I am finished with what I set up to do and I just want to leave this note to other potential PETSc-newbies that browse the mailing list: The important thing to point out here, is that PETSc (with the configuration below) in general does NOT do a Gauss-Seidel step. Instead it might do a block (!) Gauss-Seidel step using the inodes of the matrix. This leads to iterates that look different from what you would expect if the SOR-step is done coordinate-wise and you get a different convergence history. (It?s mentioned on the documentation page, but one quickly overreads it) Cyrill > On 23 Aug 2016, at 16:54, Jed Brown wrote: > > Cyrill Vonplanta writes: > >> Dear PETSc-Users, >> >> I am debugging a smoother of ours and i was wondering what settings of MatSOR exactly form one ordinary Gauss Seidel smoothing step. Currently I use: >> >> ierr = MatSOR(A,b,1.0,(MatSORType)(SOR_ZERO_INITIAL_GUESS | SOR_FORWARD_SWEEP),0,1,1,x); CHKERRV(ierr); > > Yes, this is a standard forward sweep of GS. Note that your code below > computes half zeros (because the vector starts as 0), but that it > handles the diagonal incorrectly if you were to use a nonzero initial > guess. > >> I expect this to be the same as this na?ve Gauss-Seidel step: >> >> >> for (int i=0;i> >> sum_i = 0; >> >> sum_i += ps_b_values[i]; >> >> for (int j=0;j> >> sum_i -= ps_A_values[i+j*m]*ps_x_values[j]; >> >> } >> >> ps_x_values[i] += sum_i/ps_A_values[i*m +i]; >> >> } >> >> The ps_* refer to the data parts of PETSc types (everything is serial and dense in my toy example. Initial x is zero.m is dimension of A). However the convergence history looks different. Am I missing something here? 
>> >> Best Cyrill From knepley at gmail.com Tue Sep 27 07:14:57 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 27 Sep 2016 07:14:57 -0500 Subject: [petsc-users] MatSOR and GaussSeidel In-Reply-To: <4C5EBEFD-EF1D-4637-91AB-1148DB43E5A1@usi.ch> References: <67876D44-A778-487C-9761-3DAA10356F6A@usi.ch> <87k2f7tu1v.fsf@jedbrown.org> <4C5EBEFD-EF1D-4637-91AB-1148DB43E5A1@usi.ch> Message-ID: On Tue, Sep 27, 2016 at 6:48 AM, Cyrill Vonplanta wrote: > I am finished with what I set up to do and I just want to leave this note > to other potential PETSc-newbies that browse the mailing list: > > The important thing to point out here, is that PETSc (with the > configuration below) in general does NOT do a Gauss-Seidel step. Instead it > might do a block (!) Gauss-Seidel step using the inodes of the matrix. This > leads to iterates that look different from what you would expect if the > SOR-step is done coordinate-wise and you get a different convergence > history. > Note that you can disable this using -mat_no_inodes Thanks, Matt > (It?s mentioned on the documentation page, but one quickly overreads it) > > Cyrill > > > On 23 Aug 2016, at 16:54, Jed Brown wrote: > > > > Cyrill Vonplanta writes: > > > >> Dear PETSc-Users, > >> > >> I am debugging a smoother of ours and i was wondering what settings of > MatSOR exactly form one ordinary Gauss Seidel smoothing step. Currently I > use: > >> > >> ierr = MatSOR(A,b,1.0,(MatSORType)(SOR_ZERO_INITIAL_GUESS | > SOR_FORWARD_SWEEP),0,1,1,x); CHKERRV(ierr); > > > > Yes, this is a standard forward sweep of GS. Note that your code below > > computes half zeros (because the vector starts as 0), but that it > > handles the diagonal incorrectly if you were to use a nonzero initial > > guess. > > > >> I expect this to be the same as this na?ve Gauss-Seidel step: > >> > >> > >> for (int i=0;i >> > >> sum_i = 0; > >> > >> sum_i += ps_b_values[i]; > >> > >> for (int j=0;j >> > >> sum_i -= ps_A_values[i+j*m]*ps_x_values[j]; > >> > >> } > >> > >> ps_x_values[i] += sum_i/ps_A_values[i*m +i]; > >> > >> } > >> > >> The ps_* refer to the data parts of PETSc types (everything is serial > and dense in my toy example. Initial x is zero.m is dimension of A). > However the convergence history looks different. Am I missing something > here? > >> > >> Best Cyrill > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gcfrai at gmail.com Tue Sep 27 13:05:32 2016 From: gcfrai at gmail.com (Amit Itagi) Date: Tue, 27 Sep 2016 14:05:32 -0400 Subject: [petsc-users] FFT using Petsc4py Message-ID: Hello, I am looking at the Petsc FFT interfaces. I was wondering if a parallel FFT can be performed within a Petsc4Py code. If not, what would be the best way to use the Petsc interfaces for FFT from Petsc4Py ? Thanks Amit -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 27 14:53:40 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 27 Sep 2016 14:53:40 -0500 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> <174EAC3B-DA31-4FEA-8321-FE7000E74D41@mcs.anl.gov> Message-ID: <5F3C5343-DF36-4121-ADF0-9D3224CC89D9@mcs.anl.gov> Are you loading a matrix from an ASCII file? If so don't do that. 
You should write a simple sequential PETSc program that reads in the ASCII file and saves the matrix as a PETSc binary file with MatView(). Then write your parallel code that reads in the binary file with MatLoad() and solves the system. You can read in the right hand side from ASCII and save it in the binary file also. Trying to read an ASCII file in parallel and set it into a PETSc parallel matrix is just a totally thankless task that is unnecessary. Barry > On Sep 26, 2016, at 6:40 PM, Manuel Valera wrote: > > Ok, last output was from simulated multicores, in an actual cluster the errors are of the kind: > > [valera at cinci CSRMatrix]$ petsc -n 2 ./solvelinearmgPETSc > TrivSoln loaded, size: 4 / 4 > TrivSoln loaded, size: 4 / 4 > RHS loaded, size: 4 / 4 > RHS loaded, size: 4 / 4 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Comm must be of size 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Argument out of range > [1]PETSC ERROR: Comm must be of size 1 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [1]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: #2 MatSetType() line 94 in /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > [1]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > local size: 2 > local size: 2 > Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [0]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: #2 MatSetType() line 94 in /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > [0]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: [0]PETSC ERROR: Nonconforming object sizes > [0]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, my local length 4 > likely a call to VecSetSizes() or MatSetSizes() is wrong. > See http://www.mcs.anl.gov/petsc/documentation/faq.html#split > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > Nonconforming object sizes > [1]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, my local length 4 > likely a call to VecSetSizes() or MatSetSizes() is wrong. 
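[A sketch of the two-step workflow described above -- free-form Fortran fragments; the file name 'system.petsc' and the variable names are placeholders, and declarations of the Mat/Vec/PetscViewer objects are omitted:

    ! step 1 -- small sequential converter: read the ASCII data, assemble A and b
    ! on PETSC_COMM_SELF as usual, then dump both into one PETSc binary file
    call PetscViewerBinaryOpen(PETSC_COMM_SELF, 'system.petsc', &
                               FILE_MODE_WRITE, viewer, ierr)
    call MatView(A, viewer, ierr)
    call VecView(b, viewer, ierr)
    call PetscViewerDestroy(viewer, ierr)

    ! step 2 -- parallel solver: all ranks open the same file; MatLoad()/VecLoad()
    ! pick the parallel row distribution themselves, no hand-built CSR splitting
    call MatCreate(PETSC_COMM_WORLD, Ap, ierr)
    call MatSetType(Ap, MATMPIAIJ, ierr)
    call VecCreate(PETSC_COMM_WORLD, bp, ierr)
    call PetscViewerBinaryOpen(PETSC_COMM_WORLD, 'system.petsc', &
                               FILE_MODE_READ, viewer, ierr)
    call MatLoad(Ap, viewer, ierr)
    call VecLoad(bp, viewer, ierr)
    call PetscViewerDestroy(viewer, ierr)

The loaded Ap and bp can then go straight into KSPSetOperators()/KSPSolve().]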
> See http://www.mcs.anl.gov/petsc/documentation/faq.html#split > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [0]PETSC ERROR: #4 PetscSplitOwnership() line 93 in /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [1]PETSC ERROR: #4 PetscSplitOwnership() line 93 in /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > [0]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > [0]PETSC ERROR: #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [1]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > [1]PETSC ERROR: [0]PETSC ERROR: #7 MatMPIAIJSetPreallocation() line 3505 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [1]PETSC ERROR: [0]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > #7 MatMPIAIJSetPreallocation() line 3505 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [1]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [1]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatSetNearNullSpace() > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > Object is in wrong state > [1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatSetNearNullSpace() > [1]PETSC ERROR: [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [0]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [1]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatAssemblyBegin() > [0]PETSC ERROR: [1]PETSC ERROR: Object is in wrong state > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatAssemblyBegin() > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [1]PETSC ERROR: [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > is given. 
> [0]PETSC ERROR: [0] MatAssemblyEnd line 5185 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [0]PETSC ERROR: [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] MatAssemblyEnd line 5185 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [0] MatAssemblyBegin line 5090 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatSetNearNullSpace line 8191 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [0]PETSC ERROR: [1]PETSC ERROR: [1] MatAssemblyBegin line 5090 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [1]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation_MPIAIJ line 2767 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [1] MatSetNearNullSpace line 8191 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [1]PETSC ERROR: [1] PetscSplitOwnership line 80 /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > [1]PETSC ERROR: [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation line 3502 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: [0] MatSetUp_MPIAIJ line 2152 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [1] PetscLayoutSetUp line 129 /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation_MPIAIJ line 2767 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: [0] MatSetUp line 727 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatCreate_SeqAIJ line 3956 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation line 3502 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [1]PETSC ERROR: [1] MatSetUp_MPIAIJ line 2152 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: [0] MatSetType line 44 /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > [0]PETSC ERROR: [0] MatCreateSeqAIJWithArrays line 4295 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: [1] MatSetUp line 727 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > [1]PETSC ERROR: [1] MatCreate_SeqAIJ line 3956 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [1]PETSC ERROR: [1] MatSetType line 44 /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > [1]PETSC ERROR: [1] MatCreateSeqAIJWithArrays line 4295 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [0]PETSC ERROR: Signal received > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: #12 User provided function() line 0 in unknown file > Petsc Release Version 3.7.2, Jun, 05, 2016 > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > [1]PETSC ERROR: #12 User provided function() line 0 in unknown file > application called MPI_Abort(comm=0x84000004, 59) - process 0 > [cli_0]: aborting job: > application called MPI_Abort(comm=0x84000004, 59) - process 0 > application called MPI_Abort(comm=0x84000002, 59) - process 1 > [cli_1]: aborting job: > application called MPI_Abort(comm=0x84000002, 59) - process 1 > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = PID 10266 RUNNING AT cinci > = EXIT CODE: 59 > = CLEANING UP REMAINING PROCESSES > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > =================================================================================== > > > On Mon, Sep 26, 2016 at 3:51 PM, Manuel Valera wrote: > Ok, i created a tiny testcase just for this, > > The output from n# calls are as follows: > > n1: > Mat Object: 1 MPI processes > type: mpiaij > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > row 2: (0, 4.) (1, 3.) (2, 1.) (3, 2.) > row 3: (0, 3.) (1, 4.) (2, 2.) (3, 1.) > > n2: > Mat Object: 2 MPI processes > type: mpiaij > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 3: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > n4: > Mat Object: 4 MPI processes > type: mpiaij > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 1: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > row 3: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > > It really gets messed, no idea what's happening. > > > > > On Mon, Sep 26, 2016 at 3:12 PM, Barry Smith wrote: > > > On Sep 26, 2016, at 5:07 PM, Manuel Valera wrote: > > > > Ok i was using a big matrix before, from a smaller testcase i got the output and effectively, it looks like is not well read at all, results are attached for DRAW viewer, output is too big to use STDOUT even in the small testcase. n# is the number of processors requested. > > You need to construct a very small test case so you can determine why the values do not end up where you expect them. There is no way around it. > > > > is there a way to create the matrix in one node and the distribute it as needed on the rest ? maybe that would work. > > No the is not scalable. You become limited by the memory of the one node. > > > > > Thanks > > > > On Mon, Sep 26, 2016 at 2:40 PM, Barry Smith wrote: > > > > How large is the matrix? It will take a very long time if the matrix is large. Debug with a very small matrix. > > > > Barry > > > > > On Sep 26, 2016, at 4:34 PM, Manuel Valera wrote: > > > > > > Indeed there is something wrong with that call, it hangs out indefinitely showing only: > > > > > > Mat Object: 1 MPI processes > > > type: mpiaij > > > > > > It draws my attention that this program works for 1 processor but not more, but it doesnt show anything for that viewer in either case. > > > > > > Thanks for the insight on the redundant calls, this is not very clear on documentation, which calls are included in others. 
> > > > > > > > > > > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith wrote: > > > > > > The call to MatCreateMPIAIJWithArrays() is likely interpreting the values you pass in different than you expect. > > > > > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. > > > > > > > > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera wrote: > > > > > > > > Hello, > > > > > > > > I'm working on solve a linear system in parallel, following ex12 of the ksp tutorial i don't see major complication on doing so, so for a working linear system solver with PCJACOBI and KSPGCR i did only the following changes: > > > > > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > > > call MatSetFromOptions(Ap,ierr) > > > > > > Note that none of the lines above are needed (or do anything) because the MatCreateMPIAIJWithArrays() creates the matrix from scratch itself. > > > > > > Barry > > > > > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real(nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > > > > > > > I grayed out the changes from sequential implementation. > > > > > > > > So, it does not complain at runtime until it reaches KSPSolve(), with the following error: > > > > > > > > > > > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [1]PETSC ERROR: Object is in wrong state > > > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > [1]PETSC ERROR: ./solvelinearmgPETSc ? ? 
on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 13:35:15 2016 > > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > > [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > At line 333 of file solvelinearmgPETSc.f90 > > > > Fortran runtime error: Array bound mismatch for dimension 1 of array 'sol' (213120/106560) > > > > > > > > > > > > This code works for -n 1 cores, but it gives this error when using more than one core. > > > > > > > > What am i missing? > > > > > > > > Regards, > > > > > > > > Manuel. > > > > > > > > > > > > > > > > > > > > > > > > From mvalera at mail.sdsu.edu Tue Sep 27 15:13:28 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Tue, 27 Sep 2016 13:13:28 -0700 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: <5F3C5343-DF36-4121-ADF0-9D3224CC89D9@mcs.anl.gov> References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> <174EAC3B-DA31-4FEA-8321-FE7000E74D41@mcs.anl.gov> <5F3C5343-DF36-4121-ADF0-9D3224CC89D9@mcs.anl.gov> Message-ID: Barry, thanks for your insight, This standalone script must be translated into a much bigger model, which uses AIJ matrices to define the laplacian in the form of the 3 usual arrays, the ascii files in the script take the place of the arrays which are passed to the solving routine in the model. So, can i use the approach you mention to create the MPIAIJ from the petsc binary file ? would this be a better solution than reading the three arrays directly? In the model, even the smallest matrix is 10^5x10^5 elements Thanks. On Tue, Sep 27, 2016 at 12:53 PM, Barry Smith wrote: > > Are you loading a matrix from an ASCII file? If so don't do that. You > should write a simple sequential PETSc program that reads in the ASCII file > and saves the matrix as a PETSc binary file with MatView(). Then write your > parallel code that reads in the binary file with MatLoad() and solves the > system. You can read in the right hand side from ASCII and save it in the > binary file also. Trying to read an ASCII file in parallel and set it into > a PETSc parallel matrix is just a totally thankless task that is > unnecessary. 
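For reference, a minimal sketch in C of the binary-file workflow quoted above (the same routines are callable from Fortran). The file name "matrix.dat", the tiny 4x4 CSR arrays and the variable names are illustrative only, not taken from the code discussed in this thread, and the CSR indices are assumed to be 0-based.

/* Step 1: sequential conversion program, run once. It builds the matrix from
   CSR arrays already in memory and saves it in PETSc binary format. */
#include <petscmat.h>
int main(int argc,char **argv)
{
  Mat            A;
  PetscViewer    viewer;
  PetscErrorCode ierr;
  PetscInt       n    = 4;                 /* global size (illustrative)   */
  PetscInt       ia[] = {0,1,2,3,4};       /* CSR row pointers             */
  PetscInt       ja[] = {0,1,2,3};         /* CSR column indices           */
  PetscScalar    va[] = {1.0,2.0,3.0,4.0}; /* CSR values (diagonal here)   */

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);
  ierr = MatCreateSeqAIJWithArrays(PETSC_COMM_SELF,n,n,ia,ja,va,&A);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PETSC_COMM_SELF,"matrix.dat",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
  ierr = MatView(A,viewer);CHKERRQ(ierr);  /* a binary viewer makes MatView() write MatLoad()-readable output */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

/* Step 2: parallel solver. Every process opens the same binary file and
   MatLoad() distributes the rows; nothing is read from ASCII in parallel. */
#include <petscksp.h>
int main(int argc,char **argv)
{
  Mat            A;
  Vec            x,b;
  KSP            ksp;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.dat",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatLoad(A,viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = MatCreateVecs(A,&x,&b);CHKERRQ(ierr);
  ierr = VecSet(b,1.0);CHKERRQ(ierr);      /* or VecLoad() a right-hand side saved the same way */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

The conversion step runs once on a single process; after that the parallel runs never touch the ASCII data, and MatLoad() takes care of the row distribution instead of an explicit local-size calculation.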
> > Barry > > > On Sep 26, 2016, at 6:40 PM, Manuel Valera > wrote: > > > > Ok, last output was from simulated multicores, in an actual cluster the > errors are of the kind: > > > > [valera at cinci CSRMatrix]$ petsc -n 2 ./solvelinearmgPETSc > > TrivSoln loaded, size: 4 / 4 > > TrivSoln loaded, size: 4 / 4 > > RHS loaded, size: 4 / 4 > > RHS loaded, size: 4 / 4 > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Argument out of range > > [0]PETSC ERROR: Comm must be of size 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Argument out of range > > [1]PETSC ERROR: Comm must be of size 1 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: #2 MatSetType() line 94 in /home/valera/petsc-3.7.2/src/ > mat/interface/matreg.c > > [1]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > local size: 2 > > local size: 2 > > Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran > --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: #2 MatSetType() line 94 in /home/valera/petsc-3.7.2/src/ > mat/interface/matreg.c > > [0]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: [0]PETSC ERROR: Nonconforming object sizes > > [0]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, > my local length 4 > > likely a call to VecSetSizes() or MatSetSizes() is wrong. > > See http://www.mcs.anl.gov/petsc/documentation/faq.html#split > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > Nonconforming object sizes > > [1]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, > my local length 4 > > likely a call to VecSetSizes() or MatSetSizes() is wrong. > > See http://www.mcs.anl.gov/petsc/documentation/faq.html#split > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: #4 PetscSplitOwnership() line 93 in > /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #4 PetscSplitOwnership() line 93 in > /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > [0]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in > /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > [0]PETSC ERROR: #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in > /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > [1]PETSC ERROR: [0]PETSC ERROR: #7 MatMPIAIJSetPreallocation() line 3505 > in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1]PETSC ERROR: [0]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > #7 MatMPIAIJSetPreallocation() line 3505 in /home/valera/petsc-3.7.2/src/ > mat/impls/aij/mpi/mpiaij.c > > [1]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > [1]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatSetNearNullSpace() > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > Object is in wrong state > > [1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatSetNearNullSpace() > > [1]PETSC ERROR: [0]PETSC ERROR: Configure options --with-cc=gcc > --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatAssemblyBegin() > > [0]PETSC ERROR: [1]PETSC ERROR: Object is in wrong state > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatAssemblyBegin() > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran > --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [1]PETSC ERROR: ------------------------------ > ------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [1]PETSC ERROR: [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > [1]PETSC ERROR: likely location of problem given in stack below > > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > [0]PETSC ERROR: [1]PETSC ERROR: 
Note: The EXACT line numbers in the > stack are not available, > > [1]PETSC ERROR: INSTEAD the line number of the start of the > function > > is given. > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5185 > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: [1]PETSC ERROR: is given. > > [1]PETSC ERROR: [1] MatAssemblyEnd line 5185 > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0] MatAssemblyBegin line 5090 /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetNearNullSpace line 8191 > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: [1]PETSC ERROR: [1] MatAssemblyBegin line 5090 > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [1]PETSC ERROR: [0] PetscSplitOwnership line 80 > /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 > /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation_MPIAIJ line 2767 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1] MatSetNearNullSpace line 8191 /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > [1]PETSC ERROR: [1] PetscSplitOwnership line 80 > /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > [1]PETSC ERROR: [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation line 3502 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatSetUp_MPIAIJ line 2152 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1] PetscLayoutSetUp line 129 /home/valera/petsc-3.7.2/src/ > vec/is/utils/pmap.c > > [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation_MPIAIJ line 2767 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatSetUp line 727 /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatCreate_SeqAIJ line 3956 > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation line 3502 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1]PETSC ERROR: [1] MatSetUp_MPIAIJ line 2152 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatSetType line 44 /home/valera/petsc-3.7.2/src/ > mat/interface/matreg.c > > [0]PETSC ERROR: [0] MatCreateSeqAIJWithArrays line 4295 > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: [1] MatSetUp line 727 /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > [1]PETSC ERROR: [1] MatCreate_SeqAIJ line 3956 > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [1]PETSC ERROR: [1] MatSetType line 44 /home/valera/petsc-3.7.2/src/ > mat/interface/matreg.c > > [1]PETSC ERROR: [1] MatCreateSeqAIJWithArrays line 4295 > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: Signal received > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [1]PETSC ERROR: #12 User provided function() line 0 in unknown file > > Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a arch-linux2-c-debug > named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #12 User provided function() line 0 in unknown file > > application called MPI_Abort(comm=0x84000004, 59) - process 0 > > [cli_0]: aborting job: > > application called MPI_Abort(comm=0x84000004, 59) - process 0 > > application called MPI_Abort(comm=0x84000002, 59) - process 1 > > [cli_1]: aborting job: > > application called MPI_Abort(comm=0x84000002, 59) - process 1 > > > > ============================================================ > ======================= > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > > = PID 10266 RUNNING AT cinci > > = EXIT CODE: 59 > > = CLEANING UP REMAINING PROCESSES > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > > ============================================================ > ======================= > > > > > > On Mon, Sep 26, 2016 at 3:51 PM, Manuel Valera > wrote: > > Ok, i created a tiny testcase just for this, > > > > The output from n# calls are as follows: > > > > n1: > > Mat Object: 1 MPI processes > > type: mpiaij > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > row 2: (0, 4.) (1, 3.) (2, 1.) (3, 2.) > > row 3: (0, 3.) (1, 4.) (2, 2.) (3, 1.) > > > > n2: > > Mat Object: 2 MPI processes > > type: mpiaij > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 3: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > > > n4: > > Mat Object: 4 MPI processes > > type: mpiaij > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 1: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 3: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > > > > > > It really gets messed, no idea what's happening. > > > > > > > > > > On Mon, Sep 26, 2016 at 3:12 PM, Barry Smith wrote: > > > > > On Sep 26, 2016, at 5:07 PM, Manuel Valera > wrote: > > > > > > Ok i was using a big matrix before, from a smaller testcase i got the > output and effectively, it looks like is not well read at all, results are > attached for DRAW viewer, output is too big to use STDOUT even in the small > testcase. n# is the number of processors requested. > > > > You need to construct a very small test case so you can determine why > the values do not end up where you expect them. There is no way around it. > > > > > > is there a way to create the matrix in one node and the distribute it > as needed on the rest ? maybe that would work. > > > > No the is not scalable. You become limited by the memory of the one > node. 
> > > > > > > > Thanks > > > > > > On Mon, Sep 26, 2016 at 2:40 PM, Barry Smith > wrote: > > > > > > How large is the matrix? It will take a very long time if the > matrix is large. Debug with a very small matrix. > > > > > > Barry > > > > > > > On Sep 26, 2016, at 4:34 PM, Manuel Valera > wrote: > > > > > > > > Indeed there is something wrong with that call, it hangs out > indefinitely showing only: > > > > > > > > Mat Object: 1 MPI processes > > > > type: mpiaij > > > > > > > > It draws my attention that this program works for 1 processor but > not more, but it doesnt show anything for that viewer in either case. > > > > > > > > Thanks for the insight on the redundant calls, this is not very > clear on documentation, which calls are included in others. > > > > > > > > > > > > > > > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith > wrote: > > > > > > > > The call to MatCreateMPIAIJWithArrays() is likely interpreting > the values you pass in different than you expect. > > > > > > > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after > the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. > > > > > > > > > > > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera > wrote: > > > > > > > > > > Hello, > > > > > > > > > > I'm working on solve a linear system in parallel, following ex12 > of the ksp tutorial i don't see major complication on doing so, so for a > working linear system solver with PCJACOBI and KSPGCR i did only the > following changes: > > > > > > > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > > > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > > > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > > > > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > > > > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > > > > call MatSetFromOptions(Ap,ierr) > > > > > > > > Note that none of the lines above are needed (or do anything) > because the MatCreateMPIAIJWithArrays() creates the matrix from scratch > itself. > > > > > > > > Barry > > > > > > > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp, > iapi,japi,app,Ap,ierr) > > > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real( > nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > > > > > > > > > > I grayed out the changes from sequential implementation. > > > > > > > > > > So, it does not complain at runtime until it reaches KSPSolve(), > with the following error: > > > > > > > > > > > > > > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [1]PETSC ERROR: Object is in wrong state > > > > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > ? ? 
on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 > 13:35:15 2016 > > > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in > /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > > > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > > > [1]PETSC ERROR: #4 PCSetUp() line 968 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 > in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > > > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > [1]PETSC ERROR: #9 KSPSolve() line 600 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > At line 333 of file solvelinearmgPETSc.f90 > > > > > Fortran runtime error: Array bound mismatch for dimension 1 of > array 'sol' (213120/106560) > > > > > > > > > > > > > > > This code works for -n 1 cores, but it gives this error when using > more than one core. > > > > > > > > > > What am i missing? > > > > > > > > > > Regards, > > > > > > > > > > Manuel. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 27 18:07:22 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 27 Sep 2016 18:07:22 -0500 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> <174EAC3B-DA31-4FEA-8321-FE7000E74D41@mcs.anl.gov> <5F3C5343-DF36-4121-ADF0-9D3224CC89D9@mcs.anl.gov> Message-ID: Yes, always use the binary file > On Sep 27, 2016, at 3:13 PM, Manuel Valera wrote: > > Barry, thanks for your insight, > > This standalone script must be translated into a much bigger model, which uses AIJ matrices to define the laplacian in the form of the 3 usual arrays, the ascii files in the script take the place of the arrays which are passed to the solving routine in the model. > > So, can i use the approach you mention to create the MPIAIJ from the petsc binary file ? would this be a better solution than reading the three arrays directly? In the model, even the smallest matrix is 10^5x10^5 elements > > Thanks. > > > On Tue, Sep 27, 2016 at 12:53 PM, Barry Smith wrote: > > Are you loading a matrix from an ASCII file? If so don't do that. You should write a simple sequential PETSc program that reads in the ASCII file and saves the matrix as a PETSc binary file with MatView(). Then write your parallel code that reads in the binary file with MatLoad() and solves the system. You can read in the right hand side from ASCII and save it in the binary file also. Trying to read an ASCII file in parallel and set it into a PETSc parallel matrix is just a totally thankless task that is unnecessary. 
> > Barry > > > On Sep 26, 2016, at 6:40 PM, Manuel Valera wrote: > > > > Ok, last output was from simulated multicores, in an actual cluster the errors are of the kind: > > > > [valera at cinci CSRMatrix]$ petsc -n 2 ./solvelinearmgPETSc > > TrivSoln loaded, size: 4 / 4 > > TrivSoln loaded, size: 4 / 4 > > RHS loaded, size: 4 / 4 > > RHS loaded, size: 4 / 4 > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Argument out of range > > [0]PETSC ERROR: Comm must be of size 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Argument out of range > > [1]PETSC ERROR: Comm must be of size 1 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: #2 MatSetType() line 94 in /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > > [1]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > local size: 2 > > local size: 2 > > Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: #2 MatSetType() line 94 in /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > > [0]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: [0]PETSC ERROR: Nonconforming object sizes > > [0]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, my local length 4 > > likely a call to VecSetSizes() or MatSetSizes() is wrong. > > See http://www.mcs.anl.gov/petsc/documentation/faq.html#split > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > Nonconforming object sizes > > [1]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, my local length 4 > > likely a call to VecSetSizes() or MatSetSizes() is wrong. > > See http://www.mcs.anl.gov/petsc/documentation/faq.html#split > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: #4 PetscSplitOwnership() line 93 in /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #4 PetscSplitOwnership() line 93 in /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > [0]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > [0]PETSC ERROR: #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > [1]PETSC ERROR: [0]PETSC ERROR: #7 MatMPIAIJSetPreallocation() line 3505 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1]PETSC ERROR: [0]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > #7 MatMPIAIJSetPreallocation() line 3505 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [1]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatSetNearNullSpace() > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > Object is in wrong state > > [1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatSetNearNullSpace() > > [1]PETSC ERROR: [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatAssemblyBegin() > > [0]PETSC ERROR: [1]PETSC ERROR: Object is in wrong state > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatAssemblyBegin() > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [1]PETSC ERROR: ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [1]PETSC ERROR: [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [1]PETSC ERROR: likely location of problem given in stack below > > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > [0]PETSC ERROR: [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [1]PETSC ERROR: 
INSTEAD the line number of the start of the function > > is given. > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5185 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: [1]PETSC ERROR: is given. > > [1]PETSC ERROR: [1] MatAssemblyEnd line 5185 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0] MatAssemblyBegin line 5090 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetNearNullSpace line 8191 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: [1]PETSC ERROR: [1] MatAssemblyBegin line 5090 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [1]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation_MPIAIJ line 2767 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1] MatSetNearNullSpace line 8191 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [1]PETSC ERROR: [1] PetscSplitOwnership line 80 /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > [1]PETSC ERROR: [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation line 3502 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatSetUp_MPIAIJ line 2152 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1] PetscLayoutSetUp line 129 /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation_MPIAIJ line 2767 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatSetUp line 727 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatCreate_SeqAIJ line 3956 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation line 3502 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [1]PETSC ERROR: [1] MatSetUp_MPIAIJ line 2152 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatSetType line 44 /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > > [0]PETSC ERROR: [0] MatCreateSeqAIJWithArrays line 4295 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: [1] MatSetUp line 727 /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > [1]PETSC ERROR: [1] MatCreate_SeqAIJ line 3956 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [1]PETSC ERROR: [1] MatSetType line 44 /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > > [1]PETSC ERROR: [1] MatCreateSeqAIJWithArrays line 4295 /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [0]PETSC ERROR: Signal received > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [1]PETSC ERROR: #12 User provided function() line 0 in unknown file > > Petsc Release Version 3.7.2, Jun, 05, 2016 > > [1]PETSC ERROR: ./solvelinearmgPETSc P on a arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > [1]PETSC ERROR: #12 User provided function() line 0 in unknown file > > application called MPI_Abort(comm=0x84000004, 59) - process 0 > > [cli_0]: aborting job: > > application called MPI_Abort(comm=0x84000004, 59) - process 0 > > application called MPI_Abort(comm=0x84000002, 59) - process 1 > > [cli_1]: aborting job: > > application called MPI_Abort(comm=0x84000002, 59) - process 1 > > > > =================================================================================== > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > > = PID 10266 RUNNING AT cinci > > = EXIT CODE: 59 > > = CLEANING UP REMAINING PROCESSES > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > > =================================================================================== > > > > > > On Mon, Sep 26, 2016 at 3:51 PM, Manuel Valera wrote: > > Ok, i created a tiny testcase just for this, > > > > The output from n# calls are as follows: > > > > n1: > > Mat Object: 1 MPI processes > > type: mpiaij > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > row 2: (0, 4.) (1, 3.) (2, 1.) (3, 2.) > > row 3: (0, 3.) (1, 4.) (2, 2.) (3, 1.) > > > > n2: > > Mat Object: 2 MPI processes > > type: mpiaij > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 3: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > > > n4: > > Mat Object: 4 MPI processes > > type: mpiaij > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 1: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > row 3: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > > > > > > It really gets messed, no idea what's happening. > > > > > > > > > > On Mon, Sep 26, 2016 at 3:12 PM, Barry Smith wrote: > > > > > On Sep 26, 2016, at 5:07 PM, Manuel Valera wrote: > > > > > > Ok i was using a big matrix before, from a smaller testcase i got the output and effectively, it looks like is not well read at all, results are attached for DRAW viewer, output is too big to use STDOUT even in the small testcase. n# is the number of processors requested. > > > > You need to construct a very small test case so you can determine why the values do not end up where you expect them. There is no way around it. > > > > > > is there a way to create the matrix in one node and the distribute it as needed on the rest ? maybe that would work. > > > > No the is not scalable. You become limited by the memory of the one node. > > > > > > > > Thanks > > > > > > On Mon, Sep 26, 2016 at 2:40 PM, Barry Smith wrote: > > > > > > How large is the matrix? It will take a very long time if the matrix is large. Debug with a very small matrix. > > > > > > Barry > > > > > > > On Sep 26, 2016, at 4:34 PM, Manuel Valera wrote: > > > > > > > > Indeed there is something wrong with that call, it hangs out indefinitely showing only: > > > > > > > > Mat Object: 1 MPI processes > > > > type: mpiaij > > > > > > > > It draws my attention that this program works for 1 processor but not more, but it doesnt show anything for that viewer in either case. 
> > > > > > > > Thanks for the insight on the redundant calls, this is not very clear on documentation, which calls are included in others. > > > > > > > > > > > > > > > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith wrote: > > > > > > > > The call to MatCreateMPIAIJWithArrays() is likely interpreting the values you pass in different than you expect. > > > > > > > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) after the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. > > > > > > > > > > > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera wrote: > > > > > > > > > > Hello, > > > > > > > > > > I'm working on solve a linear system in parallel, following ex12 of the ksp tutorial i don't see major complication on doing so, so for a working linear system solver with PCJACOBI and KSPGCR i did only the following changes: > > > > > > > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > > > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > > > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > > > > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > > > > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > > > > call MatSetFromOptions(Ap,ierr) > > > > > > > > Note that none of the lines above are needed (or do anything) because the MatCreateMPIAIJWithArrays() creates the matrix from scratch itself. > > > > > > > > Barry > > > > > > > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real(nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > > > > > > > > > > I grayed out the changes from sequential implementation. > > > > > > > > > > So, it does not complain at runtime until it reaches KSPSolve(), with the following error: > > > > > > > > > > > > > > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [1]PETSC ERROR: Object is in wrong state > > > > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > > [1]PETSC ERROR: ./solvelinearmgPETSc ? ? 
on a arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 13:35:15 2016 > > > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 --download-ml?=1 > > > > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > > > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > > > [1]PETSC ERROR: #4 PCSetUp() line 968 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line 650 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > > > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > [1]PETSC ERROR: #9 KSPSolve() line 600 in /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > At line 333 of file solvelinearmgPETSc.f90 > > > > > Fortran runtime error: Array bound mismatch for dimension 1 of array 'sol' (213120/106560) > > > > > > > > > > > > > > > This code works for -n 1 cores, but it gives this error when using more than one core. > > > > > > > > > > What am i missing? > > > > > > > > > > Regards, > > > > > > > > > > Manuel. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From ztdepyahoo at 163.com Tue Sep 27 22:31:12 2016 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Wed, 28 Sep 2016 11:31:12 +0800 (CST) Subject: [petsc-users] Slep test error Message-ID: <3736de11.5abf.1576ed89d5d.Coremail.ztdepyahoo@163.com> Dear professor: I have sucessfully make the slep 3.7.2 package. But the final step "make test" gives me the following error: I do not the reason. make test makefile:31: /lib/slepc/conf/slepc_common: No such file or directory make: *** No rule to make target '/lib/slepc/conf/slepc_common'. Stop. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Wed Sep 28 01:13:14 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 28 Sep 2016 08:13:14 +0200 Subject: [petsc-users] Slep test error In-Reply-To: <3736de11.5abf.1576ed89d5d.Coremail.ztdepyahoo@163.com> References: <3736de11.5abf.1576ed89d5d.Coremail.ztdepyahoo@163.com> Message-ID: > El 28 sept 2016, a las 5:31, ??? escribi?: > > Dear professor: > I have sucessfully make the slep 3.7.2 package. But the final step "make test" gives me the following error: > I do not the reason. > > make test > makefile:31: /lib/slepc/conf/slepc_common: No such file or directory > make: *** No rule to make target '/lib/slepc/conf/slepc_common'. Stop. > > Probably you forgot to export the PETSC_ARCH variable. Follow the instructions in section 1.2.1 of the users manual. 
Jose From bknaepen at ulb.ac.be Wed Sep 28 13:44:06 2016 From: bknaepen at ulb.ac.be (Bernard Knaepen) Date: Wed, 28 Sep 2016 20:44:06 +0200 Subject: [petsc-users] MATELEMENTAL strange behaviour Message-ID: <76625A6D-63A1-4289-B351-FB39C888843E@ulb.ac.be> Hello, Here is a trimmed down piece of code that illustrates a strange behaviour that I am desperately trying to understand. The problem is that I would expect successive calls to MatView(C,?) to display the same numbers. However, the numbers displayed are different, even though the content of the matrix C should be the same. Any help to resolve this would be appreciated. Cheers, Bernard. Code: ****** static char help[] = "Elemental test\n\n"; /*T T*/ #include PetscScalar pi=3.141592653589793; /* Prototypes */ PetscErrorCode matrixC(Mat C, PetscScalar t); #undef __FUNCT__ #define __FUNCT__ "main" int main(int argc, char **args){ Mat C; PetscInt N = 8; PetscScalar t=0.1, dt=0.0001; PetscErrorCode ierr; /* parameters */ ierr = PetscInitialize(&argc, &args, (char*) 0, help) ;CHKERRQ(ierr); ierr = MatCreate(PETSC_COMM_WORLD, &C); CHKERRQ(ierr); ierr = MatSetSizes(C, PETSC_DECIDE, PETSC_DECIDE, N, N); CHKERRQ(ierr); ierr = MatSetType(C, MATELEMENTAL); CHKERRQ(ierr); ierr = MatSetFromOptions(C); CHKERRQ(ierr); ierr = MatSetUp(C); CHKERRQ(ierr); ierr=matrixC(C,t); ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD); ierr=matrixC(C,t); ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD); ierr = MatDestroy(&C);CHKERRQ(ierr); ierr = PetscFinalize(); return ierr; } /* Matrix C*/ PetscErrorCode matrixC(Mat C, PetscScalar t){ PetscErrorCode ierr; IS isrows,iscols; const PetscInt *rows,*cols; PetscScalar *v; PetscInt i,j,nrows,ncols; PetscInt n,m; /* Set local matrix entries */ ierr = MatGetOwnershipIS(C,&isrows,&iscols);CHKERRQ(ierr); ierr = ISGetLocalSize(isrows,&nrows);CHKERRQ(ierr); ierr = ISGetIndices(isrows,&rows);CHKERRQ(ierr); ierr = ISGetLocalSize(iscols,&ncols);CHKERRQ(ierr); ierr = ISGetIndices(iscols,&cols);CHKERRQ(ierr); ierr = PetscMalloc1(nrows*ncols,&v);CHKERRQ(ierr); for (i=0; i From hzhang at mcs.anl.gov Wed Sep 28 20:04:58 2016 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 28 Sep 2016 20:04:58 -0500 Subject: [petsc-users] MATELEMENTAL strange behaviour In-Reply-To: <76625A6D-63A1-4289-B351-FB39C888843E@ulb.ac.be> References: <76625A6D-63A1-4289-B351-FB39C888843E@ulb.ac.be> Message-ID: Bernard: With your code, I reproduced error and found that the default MAT_ROW_ORIENTED for MatSetValues() is changed to MAT_COLUMN_ORIENTED. This is a bug in our library. I'll fix it. You can add MatSetOption(C,MAT_ROW_ORIENTED,PETSC_TRUE); before 2nd call of matrixC(C,t). Thanks for reporting the bug. Hong Hello, > > Here is a trimmed down piece of code that illustrates a strange behaviour > that I am desperately trying to understand. > > The problem is that I would expect successive calls to MatView(C,?) to > display the same numbers. However, the numbers displayed are different, > even though the content of the matrix C should be the same. > > Any help to resolve this would be appreciated. > > Cheers, > Bernard. 
> > Code: > ****** > > static char help[] = "Elemental test\n\n"; > > /*T > > T*/ > > #include > > PetscScalar pi=3.141592653589793; > > /* Prototypes */ > PetscErrorCode matrixC(Mat C, PetscScalar t); > > #undef __FUNCT__ > #define __FUNCT__ "main" > int main(int argc, char **args){ > > Mat C; > > PetscInt N = 8; > PetscScalar t=0.1, dt=0.0001; > PetscErrorCode ierr; > > /* parameters */ > ierr = PetscInitialize(&argc, &args, (char*) 0, help) ;CHKERRQ(ierr); > > ierr = MatCreate(PETSC_COMM_WORLD, &C); CHKERRQ(ierr); > ierr = MatSetSizes(C, PETSC_DECIDE, PETSC_DECIDE, N, N); CHKERRQ(ierr); > ierr = MatSetType(C, MATELEMENTAL); CHKERRQ(ierr); > ierr = MatSetFromOptions(C); CHKERRQ(ierr); > ierr = MatSetUp(C); CHKERRQ(ierr); > > ierr=matrixC(C,t); > ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD); > > ierr=matrixC(C,t); > ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD); > > ierr = MatDestroy(&C);CHKERRQ(ierr); > > ierr = PetscFinalize(); > > return ierr; > } > > /* Matrix C*/ > PetscErrorCode matrixC(Mat C, PetscScalar t){ > > PetscErrorCode ierr; > IS isrows,iscols; > const PetscInt *rows,*cols; > PetscScalar *v; > PetscInt i,j,nrows,ncols; > PetscInt n,m; > > /* Set local matrix entries */ > ierr = MatGetOwnershipIS(C,&isrows,&iscols);CHKERRQ(ierr); > ierr = ISGetLocalSize(isrows,&nrows);CHKERRQ(ierr); > ierr = ISGetIndices(isrows,&rows);CHKERRQ(ierr); > ierr = ISGetLocalSize(iscols,&ncols);CHKERRQ(ierr); > ierr = ISGetIndices(iscols,&cols);CHKERRQ(ierr); > ierr = PetscMalloc1(nrows*ncols,&v);CHKERRQ(ierr); > > for (i=0; i n=rows[i]; > for (j=0; j m=cols[j]; > v[i*ncols+j] = -0.5*(exp(-(m+n+1.5)*(m+n+1.5) > *pi*pi*t)-exp(-(m-n+0.5)*(m-n+0.5)*pi*pi*t)); > } > } > > ierr = MatSetValues(C,nrows,rows,ncols,cols,v,INSERT_VALUES); > CHKERRQ(ierr); > ierr = MatAssemblyBegin(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = ISRestoreIndices(isrows,&rows);CHKERRQ(ierr); > ierr = ISRestoreIndices(iscols,&cols);CHKERRQ(ierr); > ierr = ISDestroy(&isrows);CHKERRQ(ierr); > ierr = ISDestroy(&iscols);CHKERRQ(ierr); > ierr = PetscFree(v);CHKERRQ(ierr); > > PetscFunctionReturn(0); > } > > Output: > ******** > > mpirun -n 2 ./elemental > > Mat Object: 2 MPI processes > type: elemental > Elemental matrix (cyclic ordering) > 0.336403 2.8059e-06 0.0532215 1.04511e-09 0.00104438 > 0.00104718 0.390672 0.0542687 0.0542687 0.390672 > 0.389625 0.00104718 0.390669 2.80695e-06 0.0542687 > 2.80695e-06 0.390672 0.00104718 0.390672 0.0542687 > 0.0542659 0.0542687 0.390672 0.00104718 0.390672 > > Elemental matrix (explicit ordering) > Mat Object: 2 MPI processes > type: mpidense > 3.3640319378153799e-01 5.3221486768492490e-02 1.0443777750910219e-03 > 2.8059034408741489e-06 1.0451056906250189e-09 > 3.8962468055003047e-01 3.9066905832512150e-01 5.4268670447024388e-02 > 1.0471847236375868e-03 2.8069486006233780e-06 > 5.4265864543583515e-02 3.9067186422856237e-01 3.9067186527366804e-01 > 5.4268671492184138e-02 1.0471847236916457e-03 > 1.0471836785318962e-03 5.4268671492130077e-02 3.9067186527372211e-01 > 3.9067186527372211e-01 5.4268671492184138e-02 > 2.8069485465647739e-06 1.0471847236916453e-03 5.4268671492184138e-02 > 3.9067186527372211e-01 3.9067186527372211e-01 > Mat Object: 2 MPI processes > type: elemental > Elemental matrix (cyclic ordering) > 0.336403 2.8059e-06 0.390672 2.80695e-06 0.00104718 > 0.0532215 1.04511e-09 0.389625 0.390672 0.0542687 > 0.00104438 0.390672 0.390669 0.390672 0.0542659 > 0.00104718 0.0542687 0.0542687 0.0542687 0.390672 > 0.0542687 0.00104718 
2.80695e-06 0.00104718 0.390672 > > Elemental matrix (explicit ordering) > Mat Object: 2 MPI processes > type: mpidense > 3.3640319378153799e-01 3.9067186527372211e-01 1.0471847236916453e-03 > 2.8059034408741489e-06 2.8069486006233780e-06 > 1.0443777750910219e-03 3.9066905832512150e-01 5.4265864543583515e-02 > 3.9067186527372211e-01 3.9067186527372211e-01 > 5.4268671492130077e-02 2.8069485465647739e-06 3.9067186527366804e-01 > 1.0471847236375868e-03 1.0471847236916457e-03 > 5.3221486768492490e-02 3.8962468055003047e-01 5.4268671492184138e-02 > 1.0451056906250189e-09 3.9067186527372211e-01 > 1.0471836785318962e-03 5.4268670447024388e-02 3.9067186422856237e-01 > 5.4268671492184138e-02 5.4268671492184138e-02 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gotofd at gmail.com Thu Sep 29 03:56:20 2016 From: gotofd at gmail.com (Ji Zhang) Date: Thu, 29 Sep 2016 16:56:20 +0800 Subject: [petsc-users] Why the code needs much longer time when I use MPI Message-ID: Dear all, I'm using KSP included in PETSc to solve some linear equations. The code runs well if I do not use mpirun. However, the running duration since increase linearly with the number of CUPs. The solve method is gmres and without precondition method. For a same case, it need about 4.9s with 'mpirun -n 1', which is about 7.9s if I use 'mpirun -n 2', and 12.5s when I use 'mpirun -n 4'. It looks like that I need double time when I use double CUPs. Is there any one could give me some suggestion? Thanks. Best, Regards, Zhang Ji, PhD student Beijing Computational Science Research Center Zhongguancun Software Park II, No. 10 Dongbeiwang West Road, Haidian District, Beijing 100193, China -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Thu Sep 29 04:12:33 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Thu, 29 Sep 2016 11:12:33 +0200 Subject: [petsc-users] Why the code needs much longer time when I use MPI In-Reply-To: References: Message-ID: Is the number of iterations to convergence also increasing with the number of processors? A couple of possibly relevant FAQs: http://www.mcs.anl.gov/petsc/documentation/faq.html#differentiterations http://www.mcs.anl.gov/petsc/documentation/faq.html#slowerparallel On Thu, Sep 29, 2016 at 10:56 AM, Ji Zhang wrote: > Dear all, > I'm using KSP included in PETSc to solve some linear equations. The code > runs well if I do not use mpirun. However, the running duration since > increase linearly with the number of CUPs. > > The solve method is gmres and without precondition method. For a same case, > it need about 4.9s with 'mpirun -n 1', which is about 7.9s if I use 'mpirun > -n 2', and 12.5s when I use 'mpirun -n 4'. It looks like that I need double > time when I use double CUPs. Is there any one could give me some suggestion? > Thanks. > > Best, > Regards, > Zhang Ji, PhD student > Beijing Computational Science Research Center > Zhongguancun Software Park II, No. 10 Dongbeiwang West Road, Haidian > District, Beijing 100193, China From hzhang at mcs.anl.gov Thu Sep 29 10:57:30 2016 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 29 Sep 2016 10:57:30 -0500 Subject: [petsc-users] MATELEMENTAL strange behaviour In-Reply-To: References: <76625A6D-63A1-4289-B351-FB39C888843E@ulb.ac.be> Message-ID: Bernard, The bug is fixed in branch hzhang/fix-elementalSetOption. https://bitbucket.org/petsc/petsc/commits/891a05710665f381fc66132810d0b09973b0e049 It will be merged to petsc-release after night tests. 
Thanks for reporting it! Hong On Wed, Sep 28, 2016 at 8:04 PM, Hong wrote: > Bernard: > With your code, I reproduced error and found that the default > MAT_ROW_ORIENTED for MatSetValues() is changed to MAT_COLUMN_ORIENTED. > This is a bug in our library. I'll fix it. > > You can add > MatSetOption(C,MAT_ROW_ORIENTED,PETSC_TRUE); > before 2nd call of matrixC(C,t). > > Thanks for reporting the bug. > > Hong > > Hello, >> >> Here is a trimmed down piece of code that illustrates a strange behaviour >> that I am desperately trying to understand. >> >> The problem is that I would expect successive calls to MatView(C,?) to >> display the same numbers. However, the numbers displayed are different, >> even though the content of the matrix C should be the same. >> >> Any help to resolve this would be appreciated. >> >> Cheers, >> Bernard. >> >> Code: >> ****** >> >> static char help[] = "Elemental test\n\n"; >> >> /*T >> >> T*/ >> >> #include >> >> PetscScalar pi=3.141592653589793; >> >> /* Prototypes */ >> PetscErrorCode matrixC(Mat C, PetscScalar t); >> >> #undef __FUNCT__ >> #define __FUNCT__ "main" >> int main(int argc, char **args){ >> >> Mat C; >> >> PetscInt N = 8; >> PetscScalar t=0.1, dt=0.0001; >> PetscErrorCode ierr; >> >> /* parameters */ >> ierr = PetscInitialize(&argc, &args, (char*) 0, help) ;CHKERRQ(ierr); >> >> ierr = MatCreate(PETSC_COMM_WORLD, &C); CHKERRQ(ierr); >> ierr = MatSetSizes(C, PETSC_DECIDE, PETSC_DECIDE, N, N); CHKERRQ(ierr); >> ierr = MatSetType(C, MATELEMENTAL); CHKERRQ(ierr); >> ierr = MatSetFromOptions(C); CHKERRQ(ierr); >> ierr = MatSetUp(C); CHKERRQ(ierr); >> >> ierr=matrixC(C,t); >> ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD); >> >> ierr=matrixC(C,t); >> ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD); >> >> ierr = MatDestroy(&C);CHKERRQ(ierr); >> >> ierr = PetscFinalize(); >> >> return ierr; >> } >> >> /* Matrix C*/ >> PetscErrorCode matrixC(Mat C, PetscScalar t){ >> >> PetscErrorCode ierr; >> IS isrows,iscols; >> const PetscInt *rows,*cols; >> PetscScalar *v; >> PetscInt i,j,nrows,ncols; >> PetscInt n,m; >> >> /* Set local matrix entries */ >> ierr = MatGetOwnershipIS(C,&isrows,&iscols);CHKERRQ(ierr); >> ierr = ISGetLocalSize(isrows,&nrows);CHKERRQ(ierr); >> ierr = ISGetIndices(isrows,&rows);CHKERRQ(ierr); >> ierr = ISGetLocalSize(iscols,&ncols);CHKERRQ(ierr); >> ierr = ISGetIndices(iscols,&cols);CHKERRQ(ierr); >> ierr = PetscMalloc1(nrows*ncols,&v);CHKERRQ(ierr); >> >> for (i=0; i> n=rows[i]; >> for (j=0; j> m=cols[j]; >> v[i*ncols+j] = -0.5*(exp(-(m+n+1.5)*(m+n+1.5) >> *pi*pi*t)-exp(-(m-n+0.5)*(m-n+0.5)*pi*pi*t)); >> } >> } >> >> ierr = MatSetValues(C,nrows,rows,ncols,cols,v,INSERT_VALUES);CHKERR >> Q(ierr); >> ierr = MatAssemblyBegin(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = MatAssemblyEnd(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ierr = ISRestoreIndices(isrows,&rows);CHKERRQ(ierr); >> ierr = ISRestoreIndices(iscols,&cols);CHKERRQ(ierr); >> ierr = ISDestroy(&isrows);CHKERRQ(ierr); >> ierr = ISDestroy(&iscols);CHKERRQ(ierr); >> ierr = PetscFree(v);CHKERRQ(ierr); >> >> PetscFunctionReturn(0); >> } >> >> Output: >> ******** >> >> mpirun -n 2 ./elemental >> >> Mat Object: 2 MPI processes >> type: elemental >> Elemental matrix (cyclic ordering) >> 0.336403 2.8059e-06 0.0532215 1.04511e-09 0.00104438 >> 0.00104718 0.390672 0.0542687 0.0542687 0.390672 >> 0.389625 0.00104718 0.390669 2.80695e-06 0.0542687 >> 2.80695e-06 0.390672 0.00104718 0.390672 0.0542687 >> 0.0542659 0.0542687 0.390672 0.00104718 0.390672 >> >> Elemental matrix (explicit ordering) >> Mat Object: 
2 MPI processes >> type: mpidense >> 3.3640319378153799e-01 5.3221486768492490e-02 1.0443777750910219e-03 >> 2.8059034408741489e-06 1.0451056906250189e-09 >> 3.8962468055003047e-01 3.9066905832512150e-01 5.4268670447024388e-02 >> 1.0471847236375868e-03 2.8069486006233780e-06 >> 5.4265864543583515e-02 3.9067186422856237e-01 3.9067186527366804e-01 >> 5.4268671492184138e-02 1.0471847236916457e-03 >> 1.0471836785318962e-03 5.4268671492130077e-02 3.9067186527372211e-01 >> 3.9067186527372211e-01 5.4268671492184138e-02 >> 2.8069485465647739e-06 1.0471847236916453e-03 5.4268671492184138e-02 >> 3.9067186527372211e-01 3.9067186527372211e-01 >> Mat Object: 2 MPI processes >> type: elemental >> Elemental matrix (cyclic ordering) >> 0.336403 2.8059e-06 0.390672 2.80695e-06 0.00104718 >> 0.0532215 1.04511e-09 0.389625 0.390672 0.0542687 >> 0.00104438 0.390672 0.390669 0.390672 0.0542659 >> 0.00104718 0.0542687 0.0542687 0.0542687 0.390672 >> 0.0542687 0.00104718 2.80695e-06 0.00104718 0.390672 >> >> Elemental matrix (explicit ordering) >> Mat Object: 2 MPI processes >> type: mpidense >> 3.3640319378153799e-01 3.9067186527372211e-01 1.0471847236916453e-03 >> 2.8059034408741489e-06 2.8069486006233780e-06 >> 1.0443777750910219e-03 3.9066905832512150e-01 5.4265864543583515e-02 >> 3.9067186527372211e-01 3.9067186527372211e-01 >> 5.4268671492130077e-02 2.8069485465647739e-06 3.9067186527366804e-01 >> 1.0471847236375868e-03 1.0471847236916457e-03 >> 5.3221486768492490e-02 3.8962468055003047e-01 5.4268671492184138e-02 >> 1.0451056906250189e-09 3.9067186527372211e-01 >> 1.0471836785318962e-03 5.4268670447024388e-02 3.9067186422856237e-01 >> 5.4268671492184138e-02 5.4268671492184138e-02 >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpraveen at gmail.com Thu Sep 29 11:57:01 2016 From: cpraveen at gmail.com (Praveen C) Date: Thu, 29 Sep 2016 22:27:01 +0530 Subject: [petsc-users] Example for hdf5 output and visualization Message-ID: Dear all Is there an example to save hdf5 file on cartesian mesh with time dependent solution, that I can visualize in VisIt ? I am able to save in hdf5 and open in VisIt but I cannot get the actual mesh coordinates or time dependent data. I have seen a script petsc_gen_xdmf.py but that needs lot of information in the hdf5 file which I do not know how to create. An example for 2d Cartesian mesh would be very useful to learn this. Thanks praveen -------------- next part -------------- An HTML attachment was scrubbed... URL: From sospinar at unal.edu.co Thu Sep 29 12:33:00 2016 From: sospinar at unal.edu.co (Santiago Ospina De Los Rios) Date: Thu, 29 Sep 2016 19:33:00 +0200 Subject: [petsc-users] Example for hdf5 output and visualization In-Reply-To: References: Message-ID: 2016-09-29 18:57 GMT+02:00 Praveen C : > Dear all > > Is there an example to save hdf5 file on cartesian mesh with time > dependent solution, that I can visualize in VisIt ? > > If you save the file with a successive numbering at the final of the name, Visit will recognize it as a time-dependent. I am able to save in hdf5 and open in VisIt but I cannot get the actual > mesh coordinates or time dependent data. > > To get the mesh you have to save a vector associated to a DM object, otherwise, you will get a row of values at the visualization. > I have seen a script petsc_gen_xdmf.py but that needs lot of information > in the hdf5 file which I do not know how to create. An example for 2d > Cartesian mesh would be very useful to learn this. 
> > Anyway, it's a good question: how to visualize time-dependent hdf5 files in other viewers such as Paraview just using PETSc calls. Is there a way? Santiago O. > Thanks > praveen > -- -- Att: Santiago Ospina De Los R?os National University of Colombia -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 29 12:36:41 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 Sep 2016 12:36:41 -0500 Subject: [petsc-users] Why the code needs much longer time when I use MPI In-Reply-To: References: Message-ID: <6A3E4F6F-54B3-4ABF-B791-CE5000874F1E@mcs.anl.gov> What is the output of -ksp_monitor -ksp_view Barry > On Sep 29, 2016, at 3:56 AM, Ji Zhang wrote: > > Dear all, > I'm using KSP included in PETSc to solve some linear equations. The code runs well if I do not use mpirun. However, the running duration since increase linearly with the number of CUPs. > > The solve method is gmres and without precondition method. For a same case, it need about 4.9s with 'mpirun -n 1', which is about 7.9s if I use 'mpirun -n 2', and 12.5s when I use 'mpirun -n 4'. It looks like that I need double time when I use double CUPs. Is there any one could give me some suggestion? Thanks. > > Best, > Regards, > Zhang Ji, PhD student > Beijing Computational Science Research Center > Zhongguancun Software Park II, No. 10 Dongbeiwang West Road, Haidian District, Beijing 100193, China From cpraveen at gmail.com Thu Sep 29 22:08:06 2016 From: cpraveen at gmail.com (Praveen C) Date: Fri, 30 Sep 2016 08:38:06 +0530 Subject: [petsc-users] Example for hdf5 output and visualization In-Reply-To: References: Message-ID: Hello I am creating my vector using DMDA, so the hdf file knows something about the grid. But it uses (i,j) indices as (x,y) coordinates to plot. Here is what I do ierr = DMCreateGlobalVector(da, &ug); CHKERRQ(ierr); // set values into ug ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "sol.h5",FILE_MODE_WRITE,&viewer); CHKERRQ(ierr); ierr = PetscViewerHDF5SetTimestep(viewer, 0); CHKERRQ(ierr); ierr = VecView(ug, viewer); CHKERRQ(ierr); // update ug and save it again ierr = PetscViewerHDF5IncrementTimestep(viewer); CHKERRQ(ierr); ierr = VecView(ug, viewer); CHKERRQ(ierr); I have not saved mesh coordinates, so I can understand that it uses (i,j) coordinates. How can I save x,y coordinates and time information so that VisIt will be able to know the mesh and time information ? I can save solution at different times into different .h5 files but is that really necessary ? What I understood is that the h5 file only contains some data, and one has to create a xdmf file that tells VisIt how to use the data in the h5 file. The h5 file should contain "geometry" and "topology" sections. I could add the "geometry" section by saving the mesh coordinates. But how to add the "topology" section which contains cell information ? Is there an easy way to add this o the hdf file when I use DMDA ? Thanks praveen On Thu, Sep 29, 2016 at 11:03 PM, Santiago Ospina De Los Rios < sospinar at unal.edu.co> wrote: > > > 2016-09-29 18:57 GMT+02:00 Praveen C : > >> Dear all >> >> Is there an example to save hdf5 file on cartesian mesh with time >> dependent solution, that I can visualize in VisIt ? >> >> > If you save the file with a successive numbering at the final of the name, > Visit will recognize it as a time-dependent. > > I am able to save in hdf5 and open in VisIt but I cannot get the actual >> mesh coordinates or time dependent data. 
>> >> > To get the mesh you have to save a vector associated to a DM object, > otherwise, you will get a row of values at the visualization. > > >> I have seen a script petsc_gen_xdmf.py but that needs lot of information >> in the hdf5 file which I do not know how to create. An example for 2d >> Cartesian mesh would be very useful to learn this. >> >> > Anyway, it's a good question: how to visualize time-dependent hdf5 files > in other viewers such as Paraview just using PETSc calls. Is there a way? > > Santiago O. > > >> Thanks >> praveen >> > > > > -- > > -- > Att: > > Santiago Ospina De Los R?os > National University of Colombia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alena.kopanicakova13 at gmail.com Fri Sep 30 00:46:40 2016 From: alena.kopanicakova13 at gmail.com (alena kopanicakova) Date: Fri, 30 Sep 2016 07:46:40 +0200 Subject: [petsc-users] transformation of Mat into BlockMat Message-ID: Hello, I have parallel matrix, let's say a b c d I would like to extend it to block matrix with structure as (bs = 3): a 0 0 b 0 0 0 a 0 0 b 0 0 0 a 0 0 b c 0 0 d 0 0 0 c 0 0 d 0 0 0 c 0 0 d Result is basically kron. product with ident. At the moment, I am just creating new mat with size bs-times bigger than initial and MatSetValues to proper places. It seems, that this is very inefficient and time consuming. I wonder, is there any trick, how to perform assembly efficiently? thanks for suggestions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Fri Sep 30 03:16:08 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 30 Sep 2016 11:16:08 +0300 Subject: [petsc-users] FFT using Petsc4py In-Reply-To: References: Message-ID: On 27 September 2016 at 21:05, Amit Itagi wrote: > Hello, > > I am looking at the Petsc FFT interfaces. I was wondering if a parallel > FFT can be performed within a Petsc4Py code. If not, what would be the best > way to use the Petsc interfaces for FFT from Petsc4Py ? > > It should work out of the box by using mat.setType(Mat.Type.FFTW) before setup of your matrix. -- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 0109 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Sep 30 07:58:02 2016 From: jed at jedbrown.org (Jed Brown) Date: Fri, 30 Sep 2016 06:58:02 -0600 Subject: [petsc-users] transformation of Mat into BlockMat In-Reply-To: References: Message-ID: <87k2dta6jp.fsf@jedbrown.org> alena kopanicakova writes: > Hello, > > I have parallel matrix, let's say > > a b > c d > > I would like to extend it to block matrix with structure as (bs = 3): > > a 0 0 b 0 0 > 0 a 0 0 b 0 > 0 0 a 0 0 b > c 0 0 d 0 0 > 0 c 0 0 d 0 > 0 0 c 0 0 d > > Result is basically kron. product with ident. MatCreateMAIJ > At the moment, I am just creating new mat with size bs-times bigger than > initial and MatSetValues to proper places. It seems, that this is very > inefficient and time consuming. > I wonder, is there any trick, how to perform assembly efficiently? 
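As a rough illustration of the MatCreateMAIJ() route mentioned above: a standalone sketch, not code from this thread -- the 2x2 entries, bs=3 and the little MatMult test are placeholders I picked for the example.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A, Amaij;
  Vec            x, y;
  PetscInt       bs = 3, rstart, rend, i;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* a small parallel AIJ matrix standing in for the original one */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 2, 2);CHKERRQ(ierr);
  ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    PetscInt    cols[2] = {0, 1};
    PetscScalar vals[2] = {1.0 + i, 2.0 + i};   /* placeholder entries */
    ierr = MatSetValues(A, 1, &i, 2, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Amaij acts like kron(A, I_bs) in the interlaced ordering shown above,
     but is never assembled: MatMult reuses the entries of A directly     */
  ierr = MatCreateMAIJ(A, bs, &Amaij);CHKERRQ(ierr);

  ierr = MatCreateVecs(Amaij, &x, &y);CHKERRQ(ierr);
  ierr = VecSet(x, 1.0);CHKERRQ(ierr);
  ierr = MatMult(Amaij, x, y);CHKERRQ(ierr);
  ierr = VecView(y, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = MatDestroy(&Amaij);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Since the MAIJ matrix only wraps A, it supports MatMult()-type operations but is not something you fill with MatSetValues(); if the product with vectors is all you need, that is usually all you want. For the assembly-efficiency part of the question: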
http://www.mcs.anl.gov/petsc/documentation/faq.html#efficient-assembly -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From bknaepen at ulb.ac.be Fri Sep 30 14:20:25 2016 From: bknaepen at ulb.ac.be (Bernard Knaepen) Date: Fri, 30 Sep 2016 21:20:25 +0200 Subject: [petsc-users] MATELEMENTAL strange behaviour In-Reply-To: References: <76625A6D-63A1-4289-B351-FB39C888843E@ulb.ac.be> Message-ID: <317C5C17-BA1D-47DE-9B5D-F9913815087F@ulb.ac.be> Hi Hong, I tested the fix and it works perfectly. Thanks for this rapid feedback. Best regards, Bernard. > On 29 Sep 2016, at 17:57, Hong > wrote: > > Bernard, > The bug is fixed in branch hzhang/fix-elementalSetOption. > https://bitbucket.org/petsc/petsc/commits/891a05710665f381fc66132810d0b09973b0e049 > > It will be merged to petsc-release after night tests. > Thanks for reporting it! > > Hong > > On Wed, Sep 28, 2016 at 8:04 PM, Hong > wrote: > Bernard: > With your code, I reproduced error and found that the default > MAT_ROW_ORIENTED for MatSetValues() is changed to MAT_COLUMN_ORIENTED. This is a bug in our library. I'll fix it. > > You can add > MatSetOption(C,MAT_ROW_ORIENTED,PETSC_TRUE); > before 2nd call of matrixC(C,t). > > Thanks for reporting the bug. > > Hong > > Hello, > > Here is a trimmed down piece of code that illustrates a strange behaviour that I am desperately trying to understand. > > The problem is that I would expect successive calls to MatView(C,?) to display the same numbers. However, the numbers displayed are different, even though the content of the matrix C should be the same. > > Any help to resolve this would be appreciated. > > Cheers, > Bernard. > > Code: > ****** > > static char help[] = "Elemental test\n\n"; > > /*T > > T*/ > > #include > > PetscScalar pi=3.141592653589793; > > /* Prototypes */ > PetscErrorCode matrixC(Mat C, PetscScalar t); > > #undef __FUNCT__ > #define __FUNCT__ "main" > int main(int argc, char **args){ > > Mat C; > > PetscInt N = 8; > PetscScalar t=0.1, dt=0.0001; > PetscErrorCode ierr; > > /* parameters */ > ierr = PetscInitialize(&argc, &args, (char*) 0, help) ;CHKERRQ(ierr); > > ierr = MatCreate(PETSC_COMM_WORLD, &C); CHKERRQ(ierr); > ierr = MatSetSizes(C, PETSC_DECIDE, PETSC_DECIDE, N, N); CHKERRQ(ierr); > ierr = MatSetType(C, MATELEMENTAL); CHKERRQ(ierr); > ierr = MatSetFromOptions(C); CHKERRQ(ierr); > ierr = MatSetUp(C); CHKERRQ(ierr); > > ierr=matrixC(C,t); > ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD); > > ierr=matrixC(C,t); > ierr = MatView(C,PETSC_VIEWER_STDOUT_WORLD); > > ierr = MatDestroy(&C);CHKERRQ(ierr); > > ierr = PetscFinalize(); > > return ierr; > } > > /* Matrix C*/ > PetscErrorCode matrixC(Mat C, PetscScalar t){ > > PetscErrorCode ierr; > IS isrows,iscols; > const PetscInt *rows,*cols; > PetscScalar *v; > PetscInt i,j,nrows,ncols; > PetscInt n,m; > > /* Set local matrix entries */ > ierr = MatGetOwnershipIS(C,&isrows,&iscols);CHKERRQ(ierr); > ierr = ISGetLocalSize(isrows,&nrows);CHKERRQ(ierr); > ierr = ISGetIndices(isrows,&rows);CHKERRQ(ierr); > ierr = ISGetLocalSize(iscols,&ncols);CHKERRQ(ierr); > ierr = ISGetIndices(iscols,&cols);CHKERRQ(ierr); > ierr = PetscMalloc1(nrows*ncols,&v);CHKERRQ(ierr); > > for (i=0; i n=rows[i]; > for (j=0; j m=cols[j]; > v[i*ncols+j] = -0.5*(exp(-(m+n+1.5)*(m+n+1.5)*pi*pi*t)-exp(-(m-n+0.5)*(m-n+0.5)*pi*pi*t)); > } > } > > ierr = MatSetValues(C,nrows,rows,ncols,cols,v,INSERT_VALUES);CHKERRQ(ierr); > ierr = 
MatAssemblyBegin(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = ISRestoreIndices(isrows,&rows);CHKERRQ(ierr); > ierr = ISRestoreIndices(iscols,&cols);CHKERRQ(ierr); > ierr = ISDestroy(&isrows);CHKERRQ(ierr); > ierr = ISDestroy(&iscols);CHKERRQ(ierr); > ierr = PetscFree(v);CHKERRQ(ierr); > > PetscFunctionReturn(0); > } > > Output: > ******** > > mpirun -n 2 ./elemental > > Mat Object: 2 MPI processes > type: elemental > Elemental matrix (cyclic ordering) > 0.336403 2.8059e-06 0.0532215 1.04511e-09 0.00104438 > 0.00104718 0.390672 0.0542687 0.0542687 0.390672 > 0.389625 0.00104718 0.390669 2.80695e-06 0.0542687 > 2.80695e-06 0.390672 0.00104718 0.390672 0.0542687 > 0.0542659 0.0542687 0.390672 0.00104718 0.390672 > > Elemental matrix (explicit ordering) > Mat Object: 2 MPI processes > type: mpidense > 3.3640319378153799e-01 5.3221486768492490e-02 1.0443777750910219e-03 2.8059034408741489e-06 1.0451056906250189e-09 > 3.8962468055003047e-01 3.9066905832512150e-01 5.4268670447024388e-02 1.0471847236375868e-03 2.8069486006233780e-06 > 5.4265864543583515e-02 3.9067186422856237e-01 3.9067186527366804e-01 5.4268671492184138e-02 1.0471847236916457e-03 > 1.0471836785318962e-03 5.4268671492130077e-02 3.9067186527372211e-01 3.9067186527372211e-01 5.4268671492184138e-02 > 2.8069485465647739e-06 1.0471847236916453e-03 5.4268671492184138e-02 3.9067186527372211e-01 3.9067186527372211e-01 > Mat Object: 2 MPI processes > type: elemental > Elemental matrix (cyclic ordering) > 0.336403 2.8059e-06 0.390672 2.80695e-06 0.00104718 > 0.0532215 1.04511e-09 0.389625 0.390672 0.0542687 > 0.00104438 0.390672 0.390669 0.390672 0.0542659 > 0.00104718 0.0542687 0.0542687 0.0542687 0.390672 > 0.0542687 0.00104718 2.80695e-06 0.00104718 0.390672 > > Elemental matrix (explicit ordering) > Mat Object: 2 MPI processes > type: mpidense > 3.3640319378153799e-01 3.9067186527372211e-01 1.0471847236916453e-03 2.8059034408741489e-06 2.8069486006233780e-06 > 1.0443777750910219e-03 3.9066905832512150e-01 5.4265864543583515e-02 3.9067186527372211e-01 3.9067186527372211e-01 > 5.4268671492130077e-02 2.8069485465647739e-06 3.9067186527366804e-01 1.0471847236375868e-03 1.0471847236916457e-03 > 5.3221486768492490e-02 3.8962468055003047e-01 5.4268671492184138e-02 1.0451056906250189e-09 3.9067186527372211e-01 > 1.0471836785318962e-03 5.4268670447024388e-02 3.9067186422856237e-01 5.4268671492184138e-02 5.4268671492184138e-02 > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera at mail.sdsu.edu Fri Sep 30 21:13:53 2016 From: mvalera at mail.sdsu.edu (Manuel Valera) Date: Fri, 30 Sep 2016 19:13:53 -0700 Subject: [petsc-users] Solve KSP in parallel. In-Reply-To: References: <0002BCB5-855B-4A7A-A31D-3566CC6F80D7@mcs.anl.gov> <174EAC3B-DA31-4FEA-8321-FE7000E74D41@mcs.anl.gov> <5F3C5343-DF36-4121-ADF0-9D3224CC89D9@mcs.anl.gov> Message-ID: Hi Barry and all, I was successful on creating the parallel version to solve my big system, it is scaling accordingly, but i noticed the error norm increasing too, i don't know if this is because the output is duplicated or if its really increasing. Is this expected ? 
Thanks On Tue, Sep 27, 2016 at 4:07 PM, Barry Smith wrote: > > Yes, always use the binary file > > > On Sep 27, 2016, at 3:13 PM, Manuel Valera > wrote: > > > > Barry, thanks for your insight, > > > > This standalone script must be translated into a much bigger model, > which uses AIJ matrices to define the laplacian in the form of the 3 usual > arrays, the ascii files in the script take the place of the arrays which > are passed to the solving routine in the model. > > > > So, can i use the approach you mention to create the MPIAIJ from the > petsc binary file ? would this be a better solution than reading the three > arrays directly? In the model, even the smallest matrix is 10^5x10^5 > elements > > > > Thanks. > > > > > > On Tue, Sep 27, 2016 at 12:53 PM, Barry Smith > wrote: > > > > Are you loading a matrix from an ASCII file? If so don't do that. You > should write a simple sequential PETSc program that reads in the ASCII file > and saves the matrix as a PETSc binary file with MatView(). Then write your > parallel code that reads in the binary file with MatLoad() and solves the > system. You can read in the right hand side from ASCII and save it in the > binary file also. Trying to read an ASCII file in parallel and set it into > a PETSc parallel matrix is just a totally thankless task that is > unnecessary. > > > > Barry > > > > > On Sep 26, 2016, at 6:40 PM, Manuel Valera > wrote: > > > > > > Ok, last output was from simulated multicores, in an actual cluster > the errors are of the kind: > > > > > > [valera at cinci CSRMatrix]$ petsc -n 2 ./solvelinearmgPETSc > > > TrivSoln loaded, size: 4 / 4 > > > TrivSoln loaded, size: 4 / 4 > > > RHS loaded, size: 4 / 4 > > > RHS loaded, size: 4 / 4 > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Argument out of range > > > [0]PETSC ERROR: Comm must be of size 1 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [0]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: Argument out of range > > > [1]PETSC ERROR: Comm must be of size 1 > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. 
> > > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > > [1]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > > [1]PETSC ERROR: #2 MatSetType() line 94 in > /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > > > [1]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > > local size: 2 > > > local size: 2 > > > Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran > --download-fblaslapack=1 --download-mpich > > > [0]PETSC ERROR: #1 MatCreate_SeqAIJ() line 3958 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > > [0]PETSC ERROR: #2 MatSetType() line 94 in > /home/valera/petsc-3.7.2/src/mat/interface/matreg.c > > > [0]PETSC ERROR: #3 MatCreateSeqAIJWithArrays() line 4300 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: [0]PETSC ERROR: Nonconforming object sizes > > > [0]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, > my local length 4 > > > likely a call to VecSetSizes() or MatSetSizes() is wrong. > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html#split > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > Nonconforming object sizes > > > [1]PETSC ERROR: Sum of local lengths 8 does not equal global length 4, > my local length 4 > > > likely a call to VecSetSizes() or MatSetSizes() is wrong. > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html#split > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [0]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > > [0]PETSC ERROR: #4 PetscSplitOwnership() line 93 in > /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > > [1]PETSC ERROR: #4 PetscSplitOwnership() line 93 in > /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > > [0]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in > /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > > [0]PETSC ERROR: #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [1]PETSC ERROR: #5 PetscLayoutSetUp() line 143 in > /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > > [1]PETSC ERROR: [0]PETSC ERROR: #7 MatMPIAIJSetPreallocation() line > 3505 in /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > #6 MatMPIAIJSetPreallocation_MPIAIJ() line 2768 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [1]PETSC ERROR: [0]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > #7 MatMPIAIJSetPreallocation() line 3505 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [1]PETSC ERROR: #8 MatSetUp_MPIAIJ() line 2153 in > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > > [1]PETSC ERROR: #9 MatSetUp() line 739 in /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Object is in wrong state > > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatSetNearNullSpace() > > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [0]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > Object is in wrong state > > > [1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatSetNearNullSpace() > > > [1]PETSC ERROR: [0]PETSC ERROR: Configure options --with-cc=gcc > --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > > [0]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. 
> > > [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > > [1]PETSC ERROR: #10 MatSetNearNullSpace() line 8195 in > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Object is in wrong state > > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatAssemblyBegin() > > > [0]PETSC ERROR: [1]PETSC ERROR: Object is in wrong state > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatAssemblyBegin() > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > > Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran > --download-fblaslapack=1 --download-mpich > > > [1]PETSC ERROR: #11 MatAssemblyBegin() line 5093 in > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [1]PETSC ERROR: ------------------------------ > ------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [1]PETSC ERROR: [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind > > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > > or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > [1]PETSC ERROR: likely location of problem given in stack below > > > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > [0]PETSC ERROR: INSTEAD the line 
number of the start of the > function > > > [0]PETSC ERROR: [1]PETSC ERROR: Note: The EXACT line numbers in the > stack are not available, > > > [1]PETSC ERROR: INSTEAD the line number of the start of the > function > > > is given. > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5185 > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [1]PETSC ERROR: is given. > > > [1]PETSC ERROR: [1] MatAssemblyEnd line 5185 > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > > [0] MatAssemblyBegin line 5090 /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetNearNullSpace line 8191 > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [1]PETSC ERROR: [1] MatAssemblyBegin line 5090 > /home/valera/petsc-3.7.2/src/mat/interface/matrix.c > > > [1]PETSC ERROR: [0] PetscSplitOwnership line 80 > /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 > /home/valera/petsc-3.7.2/src/vec/is/utils/pmap.c > > > [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation_MPIAIJ line 2767 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [1] MatSetNearNullSpace line 8191 /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > > [1]PETSC ERROR: [1] PetscSplitOwnership line 80 > /home/valera/petsc-3.7.2/src/sys/utils/psplit.c > > > [1]PETSC ERROR: [0]PETSC ERROR: [0] MatMPIAIJSetPreallocation line > 3502 /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatSetUp_MPIAIJ line 2152 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [1] PetscLayoutSetUp line 129 /home/valera/petsc-3.7.2/src/ > vec/is/utils/pmap.c > > > [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation_MPIAIJ line 2767 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatSetUp line 727 /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatCreate_SeqAIJ line 3956 > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > > [1]PETSC ERROR: [1] MatMPIAIJSetPreallocation line 3502 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [1]PETSC ERROR: [1] MatSetUp_MPIAIJ line 2152 > /home/valera/petsc-3.7.2/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatSetType line 44 /home/valera/petsc-3.7.2/src/ > mat/interface/matreg.c > > > [0]PETSC ERROR: [0] MatCreateSeqAIJWithArrays line 4295 > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > > [1]PETSC ERROR: [1] MatSetUp line 727 /home/valera/petsc-3.7.2/src/ > mat/interface/matrix.c > > > [1]PETSC ERROR: [1] MatCreate_SeqAIJ line 3956 > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [1]PETSC ERROR: [1] MatSetType line 44 /home/valera/petsc-3.7.2/src/ > mat/interface/matreg.c > > > [1]PETSC ERROR: [1] MatCreateSeqAIJWithArrays line 4295 > /home/valera/petsc-3.7.2/src/mat/impls/aij/seq/aij.c > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > > [0]PETSC ERROR: Signal received > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > [1]PETSC ERROR: #12 User provided function() line 0 in unknown file > > > Petsc Release Version 3.7.2, Jun, 05, 2016 > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > P on a > arch-linux2-c-debug named cinci by valera Mon Sep 26 16:39:02 2016 > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich > > > [1]PETSC ERROR: #12 User provided function() line 0 in unknown file > > > application called MPI_Abort(comm=0x84000004, 59) - process 0 > > > [cli_0]: aborting job: > > > application called MPI_Abort(comm=0x84000004, 59) - process 0 > > > application called MPI_Abort(comm=0x84000002, 59) - process 1 > > > [cli_1]: aborting job: > > > application called MPI_Abort(comm=0x84000002, 59) - process 1 > > > > > > ============================================================ > ======================= > > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > > > = PID 10266 RUNNING AT cinci > > > = EXIT CODE: 59 > > > = CLEANING UP REMAINING PROCESSES > > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > > > ============================================================ > ======================= > > > > > > > > > On Mon, Sep 26, 2016 at 3:51 PM, Manuel Valera > wrote: > > > Ok, i created a tiny testcase just for this, > > > > > > The output from n# calls are as follows: > > > > > > n1: > > > Mat Object: 1 MPI processes > > > type: mpiaij > > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > > row 2: (0, 4.) (1, 3.) (2, 1.) (3, 2.) > > > row 3: (0, 3.) (1, 4.) (2, 2.) (3, 1.) > > > > > > n2: > > > Mat Object: 2 MPI processes > > > type: mpiaij > > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > row 1: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > row 3: (0, 2.) (1, 1.) (2, 3.) (3, 4.) > > > > > > n4: > > > Mat Object: 4 MPI processes > > > type: mpiaij > > > row 0: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > row 1: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > row 2: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > row 3: (0, 1.) (1, 2.) (2, 4.) (3, 3.) > > > > > > > > > > > > It really gets messed, no idea what's happening. > > > > > > > > > > > > > > > On Mon, Sep 26, 2016 at 3:12 PM, Barry Smith > wrote: > > > > > > > On Sep 26, 2016, at 5:07 PM, Manuel Valera > wrote: > > > > > > > > Ok i was using a big matrix before, from a smaller testcase i got > the output and effectively, it looks like is not well read at all, results > are attached for DRAW viewer, output is too big to use STDOUT even in the > small testcase. n# is the number of processors requested. > > > > > > You need to construct a very small test case so you can determine > why the values do not end up where you expect them. There is no way around > it. > > > > > > > > is there a way to create the matrix in one node and the distribute > it as needed on the rest ? maybe that would work. 
> > > > > > No the is not scalable. You become limited by the memory of the one > node. > > > > > > > > > > > Thanks > > > > > > > > On Mon, Sep 26, 2016 at 2:40 PM, Barry Smith > wrote: > > > > > > > > How large is the matrix? It will take a very long time if the > matrix is large. Debug with a very small matrix. > > > > > > > > Barry > > > > > > > > > On Sep 26, 2016, at 4:34 PM, Manuel Valera > wrote: > > > > > > > > > > Indeed there is something wrong with that call, it hangs out > indefinitely showing only: > > > > > > > > > > Mat Object: 1 MPI processes > > > > > type: mpiaij > > > > > > > > > > It draws my attention that this program works for 1 processor but > not more, but it doesnt show anything for that viewer in either case. > > > > > > > > > > Thanks for the insight on the redundant calls, this is not very > clear on documentation, which calls are included in others. > > > > > > > > > > > > > > > > > > > > On Mon, Sep 26, 2016 at 2:02 PM, Barry Smith > wrote: > > > > > > > > > > The call to MatCreateMPIAIJWithArrays() is likely interpreting > the values you pass in different than you expect. > > > > > > > > > > Put a call to MatView(Ap,PETSC_VIEWER_STDOUT_WORLD,ierr) > after the MatCreateMPIAIJWithArray() to see what PETSc thinks the matrix is. > > > > > > > > > > > > > > > > On Sep 26, 2016, at 3:42 PM, Manuel Valera < > mvalera at mail.sdsu.edu> wrote: > > > > > > > > > > > > Hello, > > > > > > > > > > > > I'm working on solve a linear system in parallel, following ex12 > of the ksp tutorial i don't see major complication on doing so, so for a > working linear system solver with PCJACOBI and KSPGCR i did only the > following changes: > > > > > > > > > > > > call MatCreate(PETSC_COMM_WORLD,Ap,ierr) > > > > > > ! call MatSetType(Ap,MATSEQAIJ,ierr) > > > > > > call MatSetType(Ap,MATMPIAIJ,ierr) !paralellization > > > > > > > > > > > > call MatSetSizes(Ap,PETSC_DECIDE,PETSC_DECIDE,nbdp,nbdp,ierr); > > > > > > > > > > > > ! call MatSeqAIJSetPreallocationCSR(Ap,iapi,japi,app,ierr) > > > > > > call MatSetFromOptions(Ap,ierr) > > > > > > > > > > Note that none of the lines above are needed (or do anything) > because the MatCreateMPIAIJWithArrays() creates the matrix from scratch > itself. > > > > > > > > > > Barry > > > > > > > > > > > ! call MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD,nbdp,nbdp, > iapi,japi,app,Ap,ierr) > > > > > > call MatCreateMPIAIJWithArrays(PETSC_COMM_WORLD,floor(real( > nbdp)/sizel),PETSC_DECIDE,nbdp,nbdp,iapi,japi,app,Ap,ierr) > > > > > > > > > > > > > > > > > > I grayed out the changes from sequential implementation. > > > > > > > > > > > > So, it does not complain at runtime until it reaches KSPSolve(), > with the following error: > > > > > > > > > > > > > > > > > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > > [1]PETSC ERROR: Object is in wrong state > > > > > > [1]PETSC ERROR: Matrix is missing diagonal entry 0 > > > > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/ > documentation/faq.html for trouble shooting. > > > > > > [1]PETSC ERROR: Petsc Release Version 3.7.3, unknown > > > > > > [1]PETSC ERROR: ./solvelinearmgPETSc > > > ? ? 
on a > arch-linux2-c-debug named valera-HP-xw4600-Workstation by valera Mon Sep 26 > 13:35:15 2016 > > > > > > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack=1 --download-mpich=1 > --download-ml?=1 > > > > > > [1]PETSC ERROR: #1 MatILUFactorSymbolic_SeqAIJ() line 1733 in > /home/valera/v5PETSc/petsc/petsc/src/mat/impls/aij/seq/aijfact.c > > > > > > [1]PETSC ERROR: #2 MatILUFactorSymbolic() line 6579 in > /home/valera/v5PETSc/petsc/petsc/src/mat/interface/matrix.c > > > > > > [1]PETSC ERROR: #3 PCSetUp_ILU() line 212 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/factor/ilu/ilu.c > > > > > > [1]PETSC ERROR: #4 PCSetUp() line 968 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > > > [1]PETSC ERROR: #5 KSPSetUp() line 390 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > > [1]PETSC ERROR: #6 PCSetUpOnBlocks_BJacobi_Singleblock() line > 650 in /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c > > > > > > [1]PETSC ERROR: #7 PCSetUpOnBlocks() line 1001 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/pc/interface/precon.c > > > > > > [1]PETSC ERROR: #8 KSPSetUpOnBlocks() line 220 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > > [1]PETSC ERROR: #9 KSPSolve() line 600 in > /home/valera/v5PETSc/petsc/petsc/src/ksp/ksp/interface/itfunc.c > > > > > > At line 333 of file solvelinearmgPETSc.f90 > > > > > > Fortran runtime error: Array bound mismatch for dimension 1 of > array 'sol' (213120/106560) > > > > > > > > > > > > > > > > > > This code works for -n 1 cores, but it gives this error when > using more than one core. > > > > > > > > > > > > What am i missing? > > > > > > > > > > > > Regards, > > > > > > > > > > > > Manuel. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: