From D.Lathouwers at tudelft.nl Thu May 1 02:35:35 2014 From: D.Lathouwers at tudelft.nl (Danny Lathouwers - TNW) Date: Thu, 1 May 2014 07:35:35 +0000 Subject: [petsc-users] singular matrix solve using MAT_SHIFT_POSITIVE_DEFINITE option In-Reply-To: References: <4E6B33F4128CED4DB307BA83146E9A64258E28C4@SRV362.tudelft.net> Message-ID: <4E6B33F4128CED4DB307BA83146E9A64258E2E72@SRV362.tudelft.net> Thank you Matt. I?ll try that soon. Will let you know if this works for me. From: Matthew Knepley [mailto:knepley at gmail.com] Sent: donderdag 1 mei 2014 0:09 To: Danny Lathouwers - TNW Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] singular matrix solve using MAT_SHIFT_POSITIVE_DEFINITE option On Wed, Apr 30, 2014 at 2:53 PM, Danny Lathouwers - TNW > wrote: Dear users, I encountered a strange problem. I have a singular matrix P (Poisson, Neumann boundary conditions, N=4). The rhs b sums to 0. If I hand-fill the matrix with the right entries (non-zeroes only) things work with KSPCG and ICC preconditioning and using the MAT_SHIFT_POSITIVE_DEFINITE option. Convergence in 2 iterations to (a) correct solution. So far for the debugging problem. That option changes the preconditioner matrix to (alpha I + P). I don't know of a theoretical reason that this should be a good preconditioner, but perhaps it exists. Certainly ICC is exquisitely sensitive (you can easily write down matrices where an epsilon change destroys convergence). Yes, you should use null space, and here it is really easy -ksp_constant_null_space Its possible that this fixes your convergence, if the ICC perturbation was introducing components in the null space to your solution. Matt My real problem computes P from D * M * D^T. If I do this I get the same matrix (on std out I do not see the difference to all digits). The system P * x = b now does NOT converge. More strange is that is if I remove the zeroes from D then things do work again. Either things are overly sensitive or I am misusing petsc. It does work when using e.g. the AMG preconditioner (again it is a correct but different solution). So system really seems OK. Should I also use the Null space commands as I have seen in some of the examples as well? But, I recall from many years ago when using MICCG (alpha) preconditioning that no such tricks were needed for CG with Poisson-Neumann. I am supposing the MAT_SHIFT_POSITIVE_DEFINITE option does something similar as MICCG. For clarity I have included the code (unfortunately this is the smallest I could get it; it?s quite straightforward though). By setting the value of option to 1 in main.f90 the code use P = D * M * D^T otherwise it will use the hand-filled matrix. The code prints the matrix P and solution etc. Anyone any hints on this? What other preconditioners (serial) are suitable for this problem besides ICC/AMG? Thanks very much. Danny Lathouwers -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Thu May 1 10:32:44 2014 From: epscodes at gmail.com (Xiangdong) Date: Thu, 1 May 2014 11:32:44 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: Under what condition, SNESGetFunctionNorm() will output different results from SENEGetFunction + VecNorm (with NORM_2)? 
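A minimal C sketch (against the petsc-3.4-era API used in this thread, with snes assumed to be an already configured SNES object) of the two quantities being compared in this question: the norm cached by the solver via SNESGetFunctionNorm() versus the 2-norm recomputed from the residual vector returned by SNESGetFunction(). The helper name is made up for illustration.

#include <petscsnes.h>

/* Sketch: compare the solver's cached residual norm with the norm
   recomputed from the residual vector itself. Assumes `snes` has been
   set up and solved. SNESGetFunctionNorm() is the convenience routine
   that, as noted later in this thread, is slated for removal. */
PetscErrorCode CompareResidualNorms(SNES snes)
{
  Vec            r;
  PetscReal      cached, recomputed;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetFunctionNorm(snes, &cached);CHKERRQ(ierr);
  ierr = SNESGetFunction(snes, &r, NULL, NULL);CHKERRQ(ierr);  /* residual vector F(x) */
  ierr = VecNorm(r, NORM_2, &recomputed);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "cached norm %g, recomputed norm %g\n",
                     (double)cached, (double)recomputed);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

As the replies later in this thread explain, the two values agree for ordinary Newton solves but can differ when the solver has worked with a reduced (active-set) system, e.g. with the SNES VI solvers.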
For most of my test cases, it is the same. However, when I have some special (trivial) initial guess to the SNES problem, I see different norms. Another phenomenon I noticed with this is that KSP in SNES squeeze my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, and the rhs and solution is with length 25. Do you have any clue on what triggered this? To my surprise, when I output the Jacobian inside the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct numerical entries. Why does the operator obtained from KSP is different and got rows eliminated? These rows got eliminated have only one entries per row, but the rhs in that row is not zero. Eliminating these rows would give wrong solutions. Thank you. Xiangdong On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley wrote: > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > >> It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo >> *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize >> the array f. Zero the array f solved the problem and gave consistent result. >> >> Just curious, why does not petsc initialize the array f to zero by >> default inside petsc when passing the f array to FormFunctionLocal? >> > > If you directly set entires, you might not want us to spend the time > writing those zeros. > > >> I have another quick question about the array x passed to >> FormFunctionLocal. If I want to know the which x is evaluated, how can I >> output x in a vector format? Currently, I created a global vector vecx and >> a local vector vecx_local, get the array of vecx_local_array, copy the x to >> vecx_local_array, scatter to global vecx and output vecx. Is there a quick >> way to restore the array x to a vector and output? >> > > I cannot think of a better way than that. > > Matt > > >> Thank you. >> >> Best, >> Xiangdong >> >> >> >> On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith wrote: >> >>> >>> On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: >>> >>> > Hello everyone, >>> > >>> > When I run snes program, >>> >>> ^^^^ what SNES program?? >>> >>> > it outputs "SNES Function norm 1.23456789e+10". It seems that this >>> norm is different from residue norm (even if solving F(x)=0) >>> >>> Please send the full output where you see this. >>> >>> > and also differ from norm of the Jacobian. What is the definition of >>> this "SNES Function Norm?? >>> >>> The SNES Function Norm as printed by PETSc is suppose to the 2-norm >>> of F(x) - b (where b is usually zero) and this is also the same thing as >>> the ?residue norm? >>> >>> Barry >>> >>> > >>> > Thank you. >>> > >>> > Best, >>> > Xiangdong >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hemak at asu.edu Thu May 1 12:51:44 2014 From: hemak at asu.edu (Hema Krishnamurthy) Date: Thu, 1 May 2014 17:51:44 +0000 Subject: [petsc-users] SVD Implementation Message-ID: <842702CC6788EE46BFA492B17777501B0C66FE17@exmbw01.asurite.ad.asu.edu> Hi, Could someone please explain as to why the input data to SVD is being scaled in PETSc? HANDLER(MatScale(Y,1./sqrt(nColsGlobal-1))); // data'T / sqrt(N-1) is being done before the call to SVDCreate() -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu May 1 12:58:27 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 May 2014 12:58:27 -0500 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: On May 1, 2014, at 10:32 AM, Xiangdong wrote: > Under what condition, SNESGetFunctionNorm() will output different results from SENEGetFunction + VecNorm (with NORM_2)? > > For most of my test cases, it is the same. However, when I have some special (trivial) initial guess to the SNES problem, I see different norms. Please send more details on your ?trivial? case where the values are different. It could be that we are not setting the function norm properly on early exit from the solvers. > > Another phenomenon I noticed with this is that KSP in SNES squeeze my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, and the rhs and solution is with length 25. Do you have any clue on what triggered this? To my surprise, when I output the Jacobian inside the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct numerical entries. Why does the operator obtained from KSP is different and got rows eliminated? These rows got eliminated have only one entries per row, but the rhs in that row is not zero. Eliminating these rows would give wrong solutions. Hmm, we never squeeze out rows/columns from the Jacobian. The size of the Jacobian set with SNESSetJacobian() should always match that obtained with KSPGetOperators() on the linear system. Please send more details on how you get this. Are you calling the KSPGetOperators() inside a preconditioner where the the preconditioner has chopped up the operator? Barry > > Thank you. > > Xiangdong > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley wrote: > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize the array f. Zero the array f solved the problem and gave consistent result. > > Just curious, why does not petsc initialize the array f to zero by default inside petsc when passing the f array to FormFunctionLocal? > > If you directly set entires, you might not want us to spend the time writing those zeros. > > I have another quick question about the array x passed to FormFunctionLocal. If I want to know the which x is evaluated, how can I output x in a vector format? Currently, I created a global vector vecx and a local vector vecx_local, get the array of vecx_local_array, copy the x to vecx_local_array, scatter to global vecx and output vecx. Is there a quick way to restore the array x to a vector and output? > > I cannot think of a better way than that. > > Matt > > Thank you. > > Best, > Xiangdong > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith wrote: > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > Hello everyone, > > > > When I run snes program, > > ^^^^ what SNES program?? > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this norm is different from residue norm (even if solving F(x)=0) > > Please send the full output where you see this. > > > and also differ from norm of the Jacobian. What is the definition of this "SNES Function Norm?? 
> > The SNES Function Norm as printed by PETSc is suppose to the 2-norm of F(x) - b (where b is usually zero) and this is also the same thing as the ?residue norm? > > Barry > > > > > Thank you. > > > > Best, > > Xiangdong > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From epscodes at gmail.com Thu May 1 14:43:09 2014 From: epscodes at gmail.com (Xiangdong) Date: Thu, 1 May 2014 15:43:09 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: Here is the order of functions I called: DMDACreate3d(); SNESCreate(); SNESSetDM(); (DM with dof=2); DMSetApplicationContext(); DMDASNESSetFunctionLocal(); SNESVISetVariableBounds(); DMDASNESetJacobianLocal(); SNESSetFromOptions(); SNESSolve(); SNESGetKSP(); KSPGetSolution(); KSPGetRhs(); KSPGetOperators(); //get operator kspA, kspx, kspb; SNESGetFunctionNorm(); ==> get norm fnorma; SNESGetFunction(); VecNorm(); ==> get norm fnormb; SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the solution x and get norm fnormc; Inside the FormJacobianLocal(), I output the matrix jac and preB; I found that fnorma matches the default SNES monitor output "SNES Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx obtained by snescomputefunction, mat jac and preB are length 50 or 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; x(2:2:end)=0; It seems that it completely ignores the second degree of freedom (setting it to zero). I saw this for (close to) constant initial guess, while for heterogeneous initial guess, it works fine and the matrix and vector size are correct, and the solution is correct. So this eliminating row behavior seems to be initial guess dependent. I saw this even if I use snes_fd, so we can rule out the possibility of wrong Jacobian. For the FormFunctionLocal(), I checked via SNESComputeFunction and it output the correct vector of residue. Are the orders of function calls correct? Thank you. Xiangdong On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > > On May 1, 2014, at 10:32 AM, Xiangdong wrote: > > > Under what condition, SNESGetFunctionNorm() will output different > results from SENEGetFunction + VecNorm (with NORM_2)? > > > > For most of my test cases, it is the same. However, when I have some > special (trivial) initial guess to the SNES problem, I see different norms. > > Please send more details on your ?trivial? case where the values are > different. It could be that we are not setting the function norm properly > on early exit from the solvers. > > > > Another phenomenon I noticed with this is that KSP in SNES squeeze my > matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When > I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, > and the rhs and solution is with length 25. Do you have any clue on what > triggered this? To my surprise, when I output the Jacobian inside the > FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct > numerical entries. Why does the operator obtained from KSP is different and > got rows eliminated? These rows got eliminated have only one entries per > row, but the rhs in that row is not zero. Eliminating these rows would give > wrong solutions. 
> > Hmm, we never squeeze out rows/columns from the Jacobian. The size of > the Jacobian set with SNESSetJacobian() should always match that obtained > with KSPGetOperators() on the linear system. Please send more details on > how you get this. Are you calling the KSPGetOperators() inside a > preconditioner where the the preconditioner has chopped up the operator? > > Barry > > > > > Thank you. > > > > Xiangdong > > > > > > > > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley > wrote: > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo > *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize > the array f. Zero the array f solved the problem and gave consistent result. > > > > Just curious, why does not petsc initialize the array f to zero by > default inside petsc when passing the f array to FormFunctionLocal? > > > > If you directly set entires, you might not want us to spend the time > writing those zeros. > > > > I have another quick question about the array x passed to > FormFunctionLocal. If I want to know the which x is evaluated, how can I > output x in a vector format? Currently, I created a global vector vecx and > a local vector vecx_local, get the array of vecx_local_array, copy the x to > vecx_local_array, scatter to global vecx and output vecx. Is there a quick > way to restore the array x to a vector and output? > > > > I cannot think of a better way than that. > > > > Matt > > > > Thank you. > > > > Best, > > Xiangdong > > > > > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith > wrote: > > > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > > > Hello everyone, > > > > > > When I run snes program, > > > > ^^^^ what SNES program?? > > > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this > norm is different from residue norm (even if solving F(x)=0) > > > > Please send the full output where you see this. > > > > > and also differ from norm of the Jacobian. What is the definition of > this "SNES Function Norm?? > > > > The SNES Function Norm as printed by PETSc is suppose to the 2-norm > of F(x) - b (where b is usually zero) and this is also the same thing as > the ?residue norm? > > > > Barry > > > > > > > > Thank you. > > > > > > Best, > > > Xiangdong > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jingyue at gmail.com Thu May 1 15:32:48 2014 From: jingyue at gmail.com (Jingyue Wang) Date: Thu, 01 May 2014 15:32:48 -0500 Subject: [petsc-users] SLEPc configuration problem Message-ID: <5362AF70.9050909@gmail.com> Hi, Can anyone please help me on how to configure SLEPc? I have installed PETSc 3.4.4 (compiled with MKL) and downloaded and extracted the source of SLEPc 3.4.4. I set up export SLEPC_DIR="/home/jwang/opt/slepc-3.4.4" export PETSC_DIR="/home/jwang/opt/petsc-3.4.4" export PETSC_ARCH=linux-amd64-opt However, after I enter the source directory of SLEPc and type ./configure, I got the error messages that I append at the end of the email. 
I tried to read the Python configuration code, and the cause appears to be that self.framework is None. It is None because, in script.py in my petsc-3.4.4/config/BuildSystem directory, the following code in the function loadConfigure(self, argDB = None): ..... if not 'configureCache' in argDB: self.logPrint('No cached configure in RDict at '+str(argDB.saveFilename)) return None ..... returns a None value. It seems that SLEPc cannot find a cached configuration in PETSc, but I don't know how to enable such a cached configuration in PETSc... ***********************Error messages***************************************** Checking environment... Checking PETSc installation... Checking LAPACK library...
> > Traceback (most recent call last): > File "./configure", line 10, in > execfile(os.path.join(os.path.dirname(__file__), 'config', 'configure.py')) > File "./config/configure.py", line 401, in > cmakeok = cmakeboot.main(slepcdir,petscdir,petscarch=petscconf.ARCH,log=log) > File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 172, in main > return PETScMaker(slepcdir,petscdir,petscarch,argDB,framework).cmakeboot(args,log) > File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 87, in cmakeboot > self.setup() > File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 83, in setup > self.setupModules() > File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 51, in setupModules > self.mpi = self.framework.require('config.packages.MPI', None) > AttributeError: 'NoneType' object has no attribute 'require' > This problem has been reported before and it may happen occasionally. Check the file $PETSC_DIR/$PETSC_ARCH/conf/RDict.db - see if it has a smaller size than usual. If this is the case, then the problem is that PETSc's configuration did not write this file completely, I don't know the reason. Suggest to reconfigure PETSc. Jose From popov at uni-mainz.de Thu May 1 17:25:06 2014 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 2 May 2014 00:25:06 +0200 Subject: [petsc-users] Assembling a matrix for a DMComposite vector In-Reply-To: References: <535F6D64.5040700@uni-mainz.de> Message-ID: <5362C9C2.40602@uni-mainz.de> On 5/1/14 10:39 PM, Anush Krishnan wrote: > Hi Anton, > > On 29 April 2014 05:14, Anton Popov > wrote: > > > You can do the whole thing much easier (to my opinion). > Since you created two DMDA anyway, just do: > > - find first index on every processor using MPI_Scan > - create two global vectors (no ghosts) > - put proper global indicies to global vectors > - create two local vectors (with ghosts) and set ALL entries to -1 > (to have what you need in boundary ghosts) > - call global-to-local scatter > > Done! > > > Won't the vectors contain floating point values? Are you storing your > indices as real numbers? YES, exactly. And then I cast them to PetscInt when I compose stencils. Something like this: idx[0] = (PetscInt) ivx[k][j][i]; idx[1] = (PetscInt) ivx[k][j][i+1]; idx[2] = (PetscInt) ivy[k][j][i]; ... and so on, where ivx, ivy, ... are the index arrays in x, y .. directions Then I insert (actually add) stencils using MatSetValues. By the way, you can ideally preallocate in parallel with MatMPIAIJSetPreallocation. To count precisely number entries in the diagonal & off-diagonal blocks use the same mechanism to easily access global indices, and then compare them with the local row range, which is also known: - within the range -> d_nnz[i]++; - outside the range -> o_nnz[i]++; Anton > > > The advantage is that you can access global indices (including > ghosts) in every block using i-j-k indexing scheme. > I personally find this way quite easy to implement with PETSc > > Anton > > > Thank you, > Anush -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Thu May 1 18:14:30 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Thu, 1 May 2014 18:14:30 -0500 Subject: [petsc-users] Adaptive mesh refinement in Petsc Message-ID: Hello everybody I want to implement an adaptive mesh refinement library in a code written in petsc. 
I have checked out some of the available libraries, but I want to work with the latest petsc-dev version and I am sure there will be many incompatibilities. So far I think I'll end up working with one of these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start checking out each of them and learn how to use them I though I would ask you guys which one you would recommend. My code would be a finite element analysis in solid mechanics. I would like to take full advantage of petsc capabilities, but I would not mind start with some restrictions. I hope my question is not too broad. Take care Miguel -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 1 19:19:32 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 1 May 2014 19:19:32 -0500 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: References: Message-ID: On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > Hello everybody > > I want to implement an adaptive mesh refinement library in a code written > in petsc. I have checked out some of the available libraries, but I want to > work with the latest petsc-dev version and I am sure there will be many > incompatibilities. So far I think I'll end up working with one of these > libraries: SAMRAI, Chombo, libMesh and deal II. Before I start checking out > each of them and learn how to use them I though I would ask you guys which > one you would recommend. My code would be a finite element analysis in > solid mechanics. I would like to take full advantage of petsc capabilities, > but I would not mind start with some restrictions. I hope my question is > not too broad. > SAMRAI, Chombo, and Deal II are all structured adaptive refinement codes, whereas LibMesh is unstructured. If you want unstructured, there is really no other game in town. If you use deal II, I would suggest trying out p4est underneath which gives great scalability. My understanding is that Chombo is mostly used for finite volume and SAMRAI and deal II for finite element, but this could be out of date. Matt > Take care > Miguel > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From griffith at cims.nyu.edu Thu May 1 19:25:33 2014 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 1 May 2014 20:25:33 -0400 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: References: Message-ID: > On May 1, 2014, at 8:19 PM, Matthew Knepley wrote: > >> On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya wrote: >> Hello everybody >> >> I want to implement an adaptive mesh refinement library in a code written in petsc. I have checked out some of the available libraries, but I want to work with the latest petsc-dev version and I am sure there will be many incompatibilities. 
So far I think I'll end up working with one of these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start checking out each of them and learn how to use them I though I would ask you guys which one you would recommend. My code would be a finite element analysis in solid mechanics. I would like to take full advantage of petsc capabilities, but I would not mind start with some restrictions. I hope my question is not too broad. > > SAMRAI, Chombo, and Deal II are all structured adaptive refinement codes, whereas LibMesh is unstructured. If you want unstructured, there is > really no other game in town. If you use deal II, I would suggest trying out p4est underneath which gives great scalability. My understanding > is that Chombo is mostly used for finite volume and SAMRAI and deal II for finite element, but this could be out of date. SAMRAI is definitely much better suited to finite volume, although it does have basic features needed for structured-grid FE. -- Boyce > > Matt > >> Take care >> Miguel >> >> -- >> Miguel Angel Salazar de Troya >> Graduate Research Assistant >> Department of Mechanical Science and Engineering >> University of Illinois at Urbana-Champaign >> (217) 550-2360 >> salaza11 at illinois.edu > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 1 19:31:51 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 May 2014 19:31:51 -0500 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: References: Message-ID: <48DFB171-4663-4E30-96A6-80103400ACD2@mcs.anl.gov> You also could likely benefit from Moose http://www.mooseframework.org it sits on top of libMesh which sits on top of PETSc and manages almost all of what you need for finite element analysis. Barry On May 1, 2014, at 7:19 PM, Matthew Knepley wrote: > On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya wrote: > Hello everybody > > I want to implement an adaptive mesh refinement library in a code written in petsc. I have checked out some of the available libraries, but I want to work with the latest petsc-dev version and I am sure there will be many incompatibilities. So far I think I'll end up working with one of these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start checking out each of them and learn how to use them I though I would ask you guys which one you would recommend. My code would be a finite element analysis in solid mechanics. I would like to take full advantage of petsc capabilities, but I would not mind start with some restrictions. I hope my question is not too broad. > > SAMRAI, Chombo, and Deal II are all structured adaptive refinement codes, whereas LibMesh is unstructured. If you want unstructured, there is > really no other game in town. If you use deal II, I would suggest trying out p4est underneath which gives great scalability. My understanding > is that Chombo is mostly used for finite volume and SAMRAI and deal II for finite element, but this could be out of date. 
> > Matt > > Take care > Miguel > > -- > Miguel Angel Salazar de Troya > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From friedmud at gmail.com Thu May 1 20:04:32 2014 From: friedmud at gmail.com (Derek Gaston) Date: Thu, 1 May 2014 19:04:32 -0600 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: <48DFB171-4663-4E30-96A6-80103400ACD2@mcs.anl.gov> References: <48DFB171-4663-4E30-96A6-80103400ACD2@mcs.anl.gov> Message-ID: Miguel, I'm the lead for the MOOSE Framework project Barry spoke of... we would love to help you get up and running with adaptive finite elements for solid mechanics with MOOSE. If you are doing fairly normal solid mechanics using small or large strain formulations with some plasticity... most of what you need is already there. You may need to plug in your particular material model but that's about it. Mesh adaptivity is built-in and should work out of the box. The major benefit of using MOOSE is that you can easily couple in other physics (like heat conduction, chemistry and more) and of course you have full access to all the power of PETSc. I recommend going through the Getting Started material on http://www.mooseframework.org to get set up... and go ahead and create yourself a new Application using these instructions: http://mooseframework.org/create-an-app/ . That Application will already have full access to our solid mechanics capabilities (as well as tons of other stuff like heat conduction, chemistry, etc.). After that - join up on the moose-users mailing list and you can get in touch with everyone else doing solid mechanics with MOOSE who can point you in the right direction depending on your particular application. Let me know if you have any questions... Derek On Thu, May 1, 2014 at 6:31 PM, Barry Smith wrote: > > You also could likely benefit from Moose http://www.mooseframework.orgit sits on top of libMesh which sits on top of PETSc and manages almost all > of what you need for finite element analysis. > > Barry > > On May 1, 2014, at 7:19 PM, Matthew Knepley wrote: > > > On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya < > salazardetroya at gmail.com> wrote: > > Hello everybody > > > > I want to implement an adaptive mesh refinement library in a code > written in petsc. I have checked out some of the available libraries, but I > want to work with the latest petsc-dev version and I am sure there will be > many incompatibilities. So far I think I'll end up working with one of > these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start > checking out each of them and learn how to use them I though I would ask > you guys which one you would recommend. My code would be a finite element > analysis in solid mechanics. I would like to take full advantage of petsc > capabilities, but I would not mind start with some restrictions. I hope my > question is not too broad. > > > > SAMRAI, Chombo, and Deal II are all structured adaptive refinement > codes, whereas LibMesh is unstructured. If you want unstructured, there is > > really no other game in town. If you use deal II, I would suggest trying > out p4est underneath which gives great scalability. 
My understanding > > is that Chombo is mostly used for finite volume and SAMRAI and deal II > for finite element, but this could be out of date. > > > > Matt > > > > Take care > > Miguel > > > > -- > > Miguel Angel Salazar de Troya > > Graduate Research Assistant > > Department of Mechanical Science and Engineering > > University of Illinois at Urbana-Champaign > > (217) 550-2360 > > salaza11 at illinois.edu > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Thu May 1 21:12:38 2014 From: epscodes at gmail.com (Xiangdong) Date: Thu, 1 May 2014 22:12:38 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: I came up with a simple example to demonstrate this "eliminating row" behavior. It happens when the solution x to the linearized equation Ax=b is out of the bound set by SNESVISetVariableBounds(); In the attached example, I use snes to solve a simple function x-b=0. When you run it, it outputs the matrix as 25 rows, while the real Jacobian should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be -inf, it will output 50 rows for the Jacobian. In the first case, the norm given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. In solving the nonlinear equations, it is likely that the solution of the linearized equation is out of bound, but then we can reset the out-of-bound solution to be lower or upper bound instead of eliminating the variables (the rows). Any suggestions on doing this in petsc? Thank you. Best, Xiangdong P.S. If we change the lower bound of field u (line 124) to be zero, then the Jacobian matrix is set to be NULL by petsc. On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: > Here is the order of functions I called: > > DMDACreate3d(); > > SNESCreate(); > > SNESSetDM(); (DM with dof=2); > > DMSetApplicationContext(); > > DMDASNESSetFunctionLocal(); > > SNESVISetVariableBounds(); > > DMDASNESetJacobianLocal(); > > SNESSetFromOptions(); > > SNESSolve(); > > SNESGetKSP(); > KSPGetSolution(); > KSPGetRhs(); > KSPGetOperators(); //get operator kspA, kspx, kspb; > > SNESGetFunctionNorm(); ==> get norm fnorma; > SNESGetFunction(); VecNorm(); ==> get norm fnormb; > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the > solution x and get norm fnormc; > > Inside the FormJacobianLocal(), I output the matrix jac and preB; > > I found that fnorma matches the default SNES monitor output "SNES Function > norm", but fnormb=fnormc != fnorma. The solution x, the residue fx obtained > by snescomputefunction, mat jac and preB are length 50 or 50-by-50, while > the kspA, kspx, kspb are 25-by-25 or length 25. > > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; > x(2:2:end)=0; It seems that it completely ignores the second degree of > freedom (setting it to zero). I saw this for (close to) constant initial > guess, while for heterogeneous initial guess, it works fine and the matrix > and vector size are correct, and the solution is correct. So this > eliminating row behavior seems to be initial guess dependent. > > I saw this even if I use snes_fd, so we can rule out the possibility of > wrong Jacobian. 
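A short C sketch (a hypothetical helper, not code from the attached exdemo.c) of the size check being described here, using the petsc-3.4 calling sequences from this thread: compare the Jacobian held by the SNES with the operator the inner KSP actually used. With the VI solver the KSP may be handed a reduced (active-set) system, so the two sizes can legitimately differ.

#include <petscsnes.h>

/* Hypothetical helper: print the size of the SNES Jacobian and of the
   operator attached to the inner KSP. Uses the petsc-3.4 signature of
   KSPGetOperators(), which still has a trailing MatStructure argument. */
PetscErrorCode CheckOperatorSizes(SNES snes)
{
  KSP            ksp;
  Mat            J, A;
  PetscInt       m, n, km, kn;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetJacobian(snes, &J, NULL, NULL, NULL);CHKERRQ(ierr);
  ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
  ierr = KSPGetOperators(ksp, &A, NULL, NULL);CHKERRQ(ierr);
  ierr = MatGetSize(J, &m, &n);CHKERRQ(ierr);
  ierr = MatGetSize(A, &km, &kn);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "SNES Jacobian is %D x %D, KSP operator is %D x %D\n",
                     m, n, km, kn);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}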
For the FormFunctionLocal(), I checked via > SNESComputeFunction and it output the correct vector of residue. > > Are the orders of function calls correct? > > Thank you. > > Xiangdong > > > > > > > > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > >> >> On May 1, 2014, at 10:32 AM, Xiangdong wrote: >> >> > Under what condition, SNESGetFunctionNorm() will output different >> results from SENEGetFunction + VecNorm (with NORM_2)? >> > >> > For most of my test cases, it is the same. However, when I have some >> special (trivial) initial guess to the SNES problem, I see different norms. >> >> Please send more details on your ?trivial? case where the values are >> different. It could be that we are not setting the function norm properly >> on early exit from the solvers. >> > >> > Another phenomenon I noticed with this is that KSP in SNES squeeze my >> matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When >> I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, >> and the rhs and solution is with length 25. Do you have any clue on what >> triggered this? To my surprise, when I output the Jacobian inside the >> FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct >> numerical entries. Why does the operator obtained from KSP is different and >> got rows eliminated? These rows got eliminated have only one entries per >> row, but the rhs in that row is not zero. Eliminating these rows would give >> wrong solutions. >> >> Hmm, we never squeeze out rows/columns from the Jacobian. The size of >> the Jacobian set with SNESSetJacobian() should always match that obtained >> with KSPGetOperators() on the linear system. Please send more details on >> how you get this. Are you calling the KSPGetOperators() inside a >> preconditioner where the the preconditioner has chopped up the operator? >> >> Barry >> >> > >> > Thank you. >> > >> > Xiangdong >> > >> > >> > >> > >> > >> > >> > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley >> wrote: >> > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: >> > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo >> *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize >> the array f. Zero the array f solved the problem and gave consistent result. >> > >> > Just curious, why does not petsc initialize the array f to zero by >> default inside petsc when passing the f array to FormFunctionLocal? >> > >> > If you directly set entires, you might not want us to spend the time >> writing those zeros. >> > >> > I have another quick question about the array x passed to >> FormFunctionLocal. If I want to know the which x is evaluated, how can I >> output x in a vector format? Currently, I created a global vector vecx and >> a local vector vecx_local, get the array of vecx_local_array, copy the x to >> vecx_local_array, scatter to global vecx and output vecx. Is there a quick >> way to restore the array x to a vector and output? >> > >> > I cannot think of a better way than that. >> > >> > Matt >> > >> > Thank you. >> > >> > Best, >> > Xiangdong >> > >> > >> > >> > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith >> wrote: >> > >> > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: >> > >> > > Hello everyone, >> > > >> > > When I run snes program, >> > >> > ^^^^ what SNES program?? >> > >> > > it outputs "SNES Function norm 1.23456789e+10". It seems that this >> norm is different from residue norm (even if solving F(x)=0) >> > >> > Please send the full output where you see this. 
>> > >> > > and also differ from norm of the Jacobian. What is the definition of >> this "SNES Function Norm?? >> > >> > The SNES Function Norm as printed by PETSc is suppose to the 2-norm >> of F(x) - b (where b is usually zero) and this is also the same thing as >> the ?residue norm? >> > >> > Barry >> > >> > > >> > > Thank you. >> > > >> > > Best, >> > > Xiangdong >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: exdemo.c Type: text/x-csrc Size: 4875 bytes Desc: not available URL: From bsmith at mcs.anl.gov Thu May 1 21:21:45 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 May 2014 21:21:45 -0500 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> On May 1, 2014, at 9:12 PM, Xiangdong wrote: > I came up with a simple example to demonstrate this "eliminating row" behavior. It happens when the solution x to the linearized equation Ax=b is out of the bound set by SNESVISetVariableBounds(); > > In the attached example, I use snes to solve a simple function x-b=0. When you run it, it outputs the matrix as 25 rows, while the real Jacobian should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be -inf, it will output 50 rows for the Jacobian. In the first case, the norm given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. > > In solving the nonlinear equations, it is likely that the solution of the linearized equation is out of bound, but then we can reset the out-of-bound solution to be lower or upper bound instead of eliminating the variables (the rows). Any suggestions on doing this in petsc? This is what PETSc is doing. It is using the "active set method". Variables that are at their bounds are ?frozen? and then a smaller system is solved (involving just the variables not a that bounds) to get the next search direction. Based on the next search direction some of the variables on the bounds may be unfrozen and other variables may be frozen. There is a huge literature on this topic. See for example our buddies ? Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, New York: Springer-Verlag. ISBN 978-0-387-30303-1.. The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return different values with the SNES VI solver. If you care about the function value just use SNESGetFunction() and compute the norm that way. We are eliminating SNESGetFunctionNorm() from PETSc because it is problematic. If you think the SNES VI solver is actually not solving the problem, or giving the wrong answer than please send us the entire simple code and we?ll see if we have introduced any bugs into our solver. But note that the linear system being of different sizes is completely normal for the solver. Barry > > Thank you. > > Best, > Xiangdong > > P.S. If we change the lower bound of field u (line 124) to be zero, then the Jacobian matrix is set to be NULL by petsc. 
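To make the bound-setting step in this exchange concrete, here is a minimal, hedged C sketch of attaching constant bounds to a DMDA-based SNES before SNESSolve(). The helper name and the bound values (zero below, effectively unbounded above) are placeholders, not the values used in the attached exdemo.c.

#include <petscsnes.h>

/* Placeholder sketch: switch to a VI-capable SNES type and attach
   constant variable bounds built on the same DMDA as the solution. */
PetscErrorCode SetConstantBounds(SNES snes, DM da)
{
  Vec            xl, xu;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESSetType(snes, SNESVINEWTONRSLS);CHKERRQ(ierr);  /* reduced-space active-set VI solver */
  ierr = DMCreateGlobalVector(da, &xl);CHKERRQ(ierr);
  ierr = VecDuplicate(xl, &xu);CHKERRQ(ierr);
  ierr = VecSet(xl, 0.0);CHKERRQ(ierr);             /* example lower bound on every dof */
  ierr = VecSet(xu, PETSC_MAX_REAL);CHKERRQ(ierr);  /* effectively no upper bound */
  ierr = SNESVISetVariableBounds(snes, xl, xu);CHKERRQ(ierr);
  ierr = VecDestroy(&xl);CHKERRQ(ierr);             /* the SNES keeps its own reference */
  ierr = VecDestroy(&xu);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}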
> > > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: > Here is the order of functions I called: > > DMDACreate3d(); > > SNESCreate(); > > SNESSetDM(); (DM with dof=2); > > DMSetApplicationContext(); > > DMDASNESSetFunctionLocal(); > > SNESVISetVariableBounds(); > > DMDASNESetJacobianLocal(); > > SNESSetFromOptions(); > > SNESSolve(); > > SNESGetKSP(); > KSPGetSolution(); > KSPGetRhs(); > KSPGetOperators(); //get operator kspA, kspx, kspb; > > SNESGetFunctionNorm(); ==> get norm fnorma; > SNESGetFunction(); VecNorm(); ==> get norm fnormb; > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the solution x and get norm fnormc; > > Inside the FormJacobianLocal(), I output the matrix jac and preB; > > I found that fnorma matches the default SNES monitor output "SNES Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx obtained by snescomputefunction, mat jac and preB are length 50 or 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. > > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; x(2:2:end)=0; It seems that it completely ignores the second degree of freedom (setting it to zero). I saw this for (close to) constant initial guess, while for heterogeneous initial guess, it works fine and the matrix and vector size are correct, and the solution is correct. So this eliminating row behavior seems to be initial guess dependent. > > I saw this even if I use snes_fd, so we can rule out the possibility of wrong Jacobian. For the FormFunctionLocal(), I checked via SNESComputeFunction and it output the correct vector of residue. > > Are the orders of function calls correct? > > Thank you. > > Xiangdong > > > > > > > > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > > On May 1, 2014, at 10:32 AM, Xiangdong wrote: > > > Under what condition, SNESGetFunctionNorm() will output different results from SENEGetFunction + VecNorm (with NORM_2)? > > > > For most of my test cases, it is the same. However, when I have some special (trivial) initial guess to the SNES problem, I see different norms. > > Please send more details on your ?trivial? case where the values are different. It could be that we are not setting the function norm properly on early exit from the solvers. > > > > Another phenomenon I noticed with this is that KSP in SNES squeeze my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, and the rhs and solution is with length 25. Do you have any clue on what triggered this? To my surprise, when I output the Jacobian inside the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct numerical entries. Why does the operator obtained from KSP is different and got rows eliminated? These rows got eliminated have only one entries per row, but the rhs in that row is not zero. Eliminating these rows would give wrong solutions. > > Hmm, we never squeeze out rows/columns from the Jacobian. The size of the Jacobian set with SNESSetJacobian() should always match that obtained with KSPGetOperators() on the linear system. Please send more details on how you get this. Are you calling the KSPGetOperators() inside a preconditioner where the the preconditioner has chopped up the operator? > > Barry > > > > > Thank you. 
> > > > Xiangdong > > > > > > > > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley wrote: > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize the array f. Zero the array f solved the problem and gave consistent result. > > > > Just curious, why does not petsc initialize the array f to zero by default inside petsc when passing the f array to FormFunctionLocal? > > > > If you directly set entires, you might not want us to spend the time writing those zeros. > > > > I have another quick question about the array x passed to FormFunctionLocal. If I want to know the which x is evaluated, how can I output x in a vector format? Currently, I created a global vector vecx and a local vector vecx_local, get the array of vecx_local_array, copy the x to vecx_local_array, scatter to global vecx and output vecx. Is there a quick way to restore the array x to a vector and output? > > > > I cannot think of a better way than that. > > > > Matt > > > > Thank you. > > > > Best, > > Xiangdong > > > > > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith wrote: > > > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > > > Hello everyone, > > > > > > When I run snes program, > > > > ^^^^ what SNES program?? > > > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this norm is different from residue norm (even if solving F(x)=0) > > > > Please send the full output where you see this. > > > > > and also differ from norm of the Jacobian. What is the definition of this "SNES Function Norm?? > > > > The SNES Function Norm as printed by PETSc is suppose to the 2-norm of F(x) - b (where b is usually zero) and this is also the same thing as the ?residue norm? > > > > Barry > > > > > > > > Thank you. > > > > > > Best, > > > Xiangdong > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > From jingyue at gmail.com Thu May 1 23:44:24 2014 From: jingyue at gmail.com (Jingyue Wang) Date: Thu, 01 May 2014 23:44:24 -0500 Subject: [petsc-users] SLEPc configuration problem In-Reply-To: <126D95D7-DDFF-4ED4-995C-D20D07DB5E4F@dsic.upv.es> References: <5362AF70.9050909@gmail.com> <126D95D7-DDFF-4ED4-995C-D20D07DB5E4F@dsic.upv.es> Message-ID: <536322A8.4050504@gmail.com> Dear Jose, Thank you for the suggestion and it works after I removed a few external packages. Now the compilation is successful. Best regards, Jingyue On 05/01/2014 03:48 PM, Jose E. Roman wrote: > El 01/05/2014, a las 22:32, Jingyue Wang escribi?: > >> Hi, >> >> Can anyone please help me on how to configure SLEPc? I have installed PETSc 3.4.4 (compiled with MKL) and downloaded and extracted the source of SLEPc 3.4.4. I set up >> >> export SLEPC_DIR="/home/jwang/opt/slepc-3.4.4" >> export PETSC_DIR="/home/jwang/opt/petsc-3.4.4" >> export PETSC_ARCH=linux-amd64-opt >> >> However, after I enter the source directory of SLEPc and type ./configure, I got the error messages that I append at the end of the email. I tried to read the python configuration code and it seems that the reason is self.framework is None and the reason for self.framework is None is in the script.py in my petsc-3.4.4/config/BuildSystem directory, the following code in function loadConfigure(self, argDB = None): >> ..... 
>> if not 'configureCache' in argDB: >> self.logPrint('No cached configure in RDict at '+str(argDB.saveFilename)) >> return None >> ..... >> returns a None value. >> >> It seems the reason is SLEPc can not find cached configuration in PETSc, but I don't know how to enable such cached configuration in PETSc... >> >> >> ***********************Error messages***************************************** >> >> Checking environment... >> Checking PETSc installation... >> Checking LAPACK library... >> >> Traceback (most recent call last): >> File "./configure", line 10, in >> execfile(os.path.join(os.path.dirname(__file__), 'config', 'configure.py')) >> File "./config/configure.py", line 401, in >> cmakeok = cmakeboot.main(slepcdir,petscdir,petscarch=petscconf.ARCH,log=log) >> File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 172, in main >> return PETScMaker(slepcdir,petscdir,petscarch,argDB,framework).cmakeboot(args,log) >> File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 87, in cmakeboot >> self.setup() >> File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 83, in setup >> self.setupModules() >> File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 51, in setupModules >> self.mpi = self.framework.require('config.packages.MPI', None) >> AttributeError: 'NoneType' object has no attribute 'require' >> > This problem has been reported before and it may happen occasionally. > Check the file $PETSC_DIR/$PETSC_ARCH/conf/RDict.db - see if it has a smaller size than usual. If this is the case, then the problem is that PETSc's configuration did not write this file completely, I don't know the reason. > > Suggest to reconfigure PETSc. > > Jose > From lfreret at arrow.utias.utoronto.ca Fri May 2 09:38:42 2014 From: lfreret at arrow.utias.utoronto.ca (Lucie Freret) Date: Fri, 02 May 2014 10:38:42 -0400 Subject: [petsc-users] Petsc with ML and ILU Message-ID: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> Hello, I would like to solve linear systems using Gmres preconditioned by ML and use ILU(0) on all levels (mg_coarse and mg_levels_x). As I have MATMPIAIJ matrix, I'm using -ksp_type gmres -pc_type ml -mg_levels_ksp_type preonly (-mg_coarse_ksp_type preonly) -mg_levels_pc_type asm (-mg_coarse_pc_type asm) but I get: "Running KSP of preonly doesn't make sense with nonzero initial guess" I tried different keyword to have a zero initial guess but unfortunately, I can't solve this problem. Should I use an other mg_levels_ksp solver of each level or is there a way to initialize guess on each level? Thanks, Lucie ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. From knepley at gmail.com Fri May 2 09:43:34 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 May 2014 09:43:34 -0500 Subject: [petsc-users] Petsc with ML and ILU In-Reply-To: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> References: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> Message-ID: On Fri, May 2, 2014 at 9:38 AM, Lucie Freret < lfreret at arrow.utias.utoronto.ca> wrote: > Hello, > > I would like to solve linear systems using Gmres preconditioned by ML and > use ILU(0) on all levels (mg_coarse and mg_levels_x). 
> As I have MATMPIAIJ matrix, I'm using > -ksp_type gmres > -pc_type ml > -mg_levels_ksp_type preonly (-mg_coarse_ksp_type preonly) > -mg_levels_pc_type asm (-mg_coarse_pc_type asm) > but I get: > "Running KSP of preonly doesn't make sense with nonzero initial guess" > I tried different keyword to have a zero initial guess but unfortunately, > I can't solve this problem. > Should I use an other mg_levels_ksp solver of each level or is there a way > to initialize guess on each level? > You want "richardson" instead of preonly, since you are doing defect correction in MG. Matt > Thanks, > Lucie > > ---------------------------------------------------------------- > This message was sent using IMP, the Internet Messaging Program. > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri May 2 09:46:35 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 02 May 2014 08:46:35 -0600 Subject: [petsc-users] Petsc with ML and ILU In-Reply-To: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> References: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> Message-ID: <8761low850.fsf@jedbrown.org> Lucie Freret writes: > Hello, > > I would like to solve linear systems using Gmres preconditioned by ML > and use ILU(0) on all levels (mg_coarse and mg_levels_x). > As I have MATMPIAIJ matrix, I'm using > -ksp_type gmres > -pc_type ml > -mg_levels_ksp_type preonly (-mg_coarse_ksp_type preonly) This should be -mg_levels_ksp_type richardson (the default when using ML), which will compute a residual as necessary before applying the preconditioner. Note that this may need damping, or you could use -mg_levels_ksp_type chebyshev to compute a spectral estimate to combine damping and targeting a range of the spectrum. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From salazardetroya at gmail.com Fri May 2 10:03:35 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Fri, 2 May 2014 10:03:35 -0500 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: References: <48DFB171-4663-4E30-96A6-80103400ACD2@mcs.anl.gov> Message-ID: Thanks a lot for your responses. I will get started with MOOSE. On Thu, May 1, 2014 at 8:04 PM, Derek Gaston wrote: > Miguel, > > I'm the lead for the MOOSE Framework project Barry spoke of... we would > love to help you get up and running with adaptive finite elements for solid > mechanics with MOOSE. If you are doing fairly normal solid mechanics using > small or large strain formulations with some plasticity... most of what you > need is already there. You may need to plug in your particular material > model but that's about it. Mesh adaptivity is built-in and should work out > of the box. The major benefit of using MOOSE is that you can easily couple > in other physics (like heat conduction, chemistry and more) and of course > you have full access to all the power of PETSc. > > I recommend going through the Getting Started material on > http://www.mooseframework.org to get set up... and go ahead and create > yourself a new Application using these instructions: > http://mooseframework.org/create-an-app/ . 
That Application will > already have full access to our solid mechanics capabilities (as well as > tons of other stuff like heat conduction, chemistry, etc.). > > After that - join up on the moose-users mailing list and you can get in > touch with everyone else doing solid mechanics with MOOSE who can point you > in the right direction depending on your particular application. > > Let me know if you have any questions... > > Derek > > > > > On Thu, May 1, 2014 at 6:31 PM, Barry Smith wrote: > >> >> You also could likely benefit from Moose http://www.mooseframework.orgit sits on top of libMesh which sits on top of PETSc and manages almost all >> of what you need for finite element analysis. >> >> Barry >> >> On May 1, 2014, at 7:19 PM, Matthew Knepley wrote: >> >> > On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya < >> salazardetroya at gmail.com> wrote: >> > Hello everybody >> > >> > I want to implement an adaptive mesh refinement library in a code >> written in petsc. I have checked out some of the available libraries, but I >> want to work with the latest petsc-dev version and I am sure there will be >> many incompatibilities. So far I think I'll end up working with one of >> these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start >> checking out each of them and learn how to use them I though I would ask >> you guys which one you would recommend. My code would be a finite element >> analysis in solid mechanics. I would like to take full advantage of petsc >> capabilities, but I would not mind start with some restrictions. I hope my >> question is not too broad. >> > >> > SAMRAI, Chombo, and Deal II are all structured adaptive refinement >> codes, whereas LibMesh is unstructured. If you want unstructured, there is >> > really no other game in town. If you use deal II, I would suggest >> trying out p4est underneath which gives great scalability. My understanding >> > is that Chombo is mostly used for finite volume and SAMRAI and deal II >> for finite element, but this could be out of date. >> > >> > Matt >> > >> > Take care >> > Miguel >> > >> > -- >> > Miguel Angel Salazar de Troya >> > Graduate Research Assistant >> > Department of Mechanical Science and Engineering >> > University of Illinois at Urbana-Champaign >> > (217) 550-2360 >> > salaza11 at illinois.edu >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> >> > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From lu_qin_2000 at yahoo.com Fri May 2 10:27:13 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Fri, 2 May 2014 08:27:13 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc Message-ID: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> Hello, I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf?that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? Many thanks, Qin?? 
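For what it is worth, the replies to this ILUTP question further down in the digest point at two routes: hypre (through PCHYPRE) and the ILUTP in sequential SuperLU. A rough sketch of the runtime options, with ./yourprogram as a placeholder and assuming PETSc was configured with --download-hypre and/or --download-superlu (the exact option names should be confirmed with -help against the installed versions):

  mpiexec -n 4 ./yourprogram -ksp_type gmres -pc_type hypre -pc_hypre_type pilut
  mpiexec -n 4 ./yourprogram -ksp_type gmres -pc_type hypre -pc_hypre_type euclid
  ./yourprogram -ksp_type gmres -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-4

Adding -help to any of these prints the hypre- or SuperLU-specific options that are actually available in the build.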
From k.anush at gmail.com Fri May 2 11:34:33 2014 From: k.anush at gmail.com (Anush Krishnan) Date: Fri, 2 May 2014 12:34:33 -0400 Subject: [petsc-users] Assembling a matrix for a DMComposite vector In-Reply-To: <5362C9C2.40602@uni-mainz.de> References: <535F6D64.5040700@uni-mainz.de> <5362C9C2.40602@uni-mainz.de> Message-ID: On 1 May 2014 18:25, Anton Popov wrote: > On 5/1/14 10:39 PM, Anush Krishnan wrote: > > Hi Anton, > > On 29 April 2014 05:14, Anton Popov wrote: > >> >> You can do the whole thing much easier (to my opinion). >> Since you created two DMDA anyway, just do: >> >> - find first index on every processor using MPI_Scan >> - create two global vectors (no ghosts) >> - put proper global indicies to global vectors >> - create two local vectors (with ghosts) and set ALL entries to -1 (to >> have what you need in boundary ghosts) >> - call global-to-local scatter >> >> Done! >> > > Won't the vectors contain floating point values? Are you storing your > indices as real numbers? > > YES, exactly. And then I cast them to PetscInt when I compose stencils. > > Something like this: > idx[0] = (PetscInt) ivx[k][j][i]; > idx[1] = (PetscInt) ivx[k][j][i+1]; > idx[2] = (PetscInt) ivy[k][j][i]; > ... and so on, where ivx, ivy, ... are the index arrays in x, y .. > directions > > Then I insert (actually add) stencils using MatSetValues. > > By the way, you can ideally preallocate in parallel with > MatMPIAIJSetPreallocation. To count precisely number entries in the > diagonal & off-diagonal blocks use the same mechanism to easily access > global indices, and then compare them with the local row range, which is > also known: > - within the range -> d_nnz[i]++; > - outside the range -> o_nnz[i]++; > Thanks a lot for the help. I did exactly that and it worked perfectly. But just to clarify: if I was using 32-bit floats, would I start having trouble when my matrix size reaches ~10 million due to the floating point precision? > > Anton > > > >> >> The advantage is that you can access global indices (including ghosts) in >> every block using i-j-k indexing scheme. >> I personally find this way quite easy to implement with PETSc >> >> Anton >> > > Thank you, > Anush > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anush at bu.edu Fri May 2 11:36:25 2014 From: anush at bu.edu (Anush Krishnan) Date: Fri, 2 May 2014 12:36:25 -0400 Subject: [petsc-users] Assembling a matrix for a DMComposite vector In-Reply-To: <5362C9C2.40602@uni-mainz.de> References: <535F6D64.5040700@uni-mainz.de> <5362C9C2.40602@uni-mainz.de> Message-ID: On 1 May 2014 18:25, Anton Popov wrote: > On 5/1/14 10:39 PM, Anush Krishnan wrote: > > Hi Anton, > > On 29 April 2014 05:14, Anton Popov wrote: > >> >> You can do the whole thing much easier (to my opinion). >> Since you created two DMDA anyway, just do: >> >> - find first index on every processor using MPI_Scan >> - create two global vectors (no ghosts) >> - put proper global indicies to global vectors >> - create two local vectors (with ghosts) and set ALL entries to -1 (to >> have what you need in boundary ghosts) >> - call global-to-local scatter >> >> Done! >> > > Won't the vectors contain floating point values? Are you storing your > indices as real numbers? > > YES, exactly. And then I cast them to PetscInt when I compose stencils. > > Something like this: > idx[0] = (PetscInt) ivx[k][j][i]; > idx[1] = (PetscInt) ivx[k][j][i+1]; > idx[2] = (PetscInt) ivy[k][j][i]; > ... and so on, where ivx, ivy, ... are the index arrays in x, y .. 
> directions > > Then I insert (actually add) stencils using MatSetValues. > > By the way, you can ideally preallocate in parallel with > MatMPIAIJSetPreallocation. To count precisely number entries in the > diagonal & off-diagonal blocks use the same mechanism to easily access > global indices, and then compare them with the local row range, which is > also known: > - within the range -> d_nnz[i]++; > - outside the range -> o_nnz[i]++; > Thanks a lot for the help. I did exactly that and it worked perfectly. But just to clarify: if I was using 32-bit floats, would I start having trouble when my matrix size reaches ~10 million due to the floating point precision? > > Anton > > > >> >> The advantage is that you can access global indices (including ghosts) in >> every block using i-j-k indexing scheme. >> I personally find this way quite easy to implement with PETSc >> >> Anton >> > > Thank you, > Anush > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Fri May 2 12:53:52 2014 From: epscodes at gmail.com (Xiangdong) Date: Fri, 2 May 2014 13:53:52 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> Message-ID: On Thu, May 1, 2014 at 10:21 PM, Barry Smith wrote: > > On May 1, 2014, at 9:12 PM, Xiangdong wrote: > > > I came up with a simple example to demonstrate this "eliminating row" > behavior. It happens when the solution x to the linearized equation Ax=b is > out of the bound set by SNESVISetVariableBounds(); > > > > In the attached example, I use snes to solve a simple function x-b=0. > When you run it, it outputs the matrix as 25 rows, while the real Jacobian > should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be > -inf, it will output 50 rows for the Jacobian. In the first case, the norm > given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. > > > > In solving the nonlinear equations, it is likely that the solution of > the linearized equation is out of bound, but then we can reset the > out-of-bound solution to be lower or upper bound instead of eliminating the > variables (the rows). Any suggestions on doing this in petsc? > > This is what PETSc is doing. It is using the "active set method". > Variables that are at their bounds are ?frozen? and then a smaller system > is solved (involving just the variables not a that bounds) to get the next > search direction. Based on the next search direction some of the variables > on the bounds may be unfrozen and other variables may be frozen. There is a > huge literature on this topic. See for example our buddies ? Nocedal, > Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, > New York: Springer-Verlag. ISBN 978-0-387-30303-1.. > > The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return > different values with the SNES VI solver. If you care about the function > value just use SNESGetFunction() and compute the norm that way. We are > eliminating SNESGetFunctionNorm() from PETSc because it is problematic. > > If you think the SNES VI solver is actually not solving the problem, or > giving the wrong answer than please send us the entire simple code and > we?ll see if we have introduced any bugs into our solver. But note that the > linear system being of different sizes is completely normal for the solver. 
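For readers following the thread, here is a minimal C sketch of the two suggestions quoted above: leave unconstrained degrees of freedom at infinite bounds when calling SNESVISetVariableBounds(), and compute the residual norm through SNESGetFunction() plus VecNorm() rather than SNESGetFunctionNorm(). It is only a fragment; snes and the solution vector x are assumed to exist and be fully set up already.

  #include <petscsnes.h>

  Vec            xl, xu, r;
  PetscReal      fnorm;
  PetscErrorCode ierr;

  /* bounds: start fully unconstrained, then tighten only the entries that need it */
  ierr = VecDuplicate(x, &xl);CHKERRQ(ierr);
  ierr = VecDuplicate(x, &xu);CHKERRQ(ierr);
  ierr = VecSet(xl, PETSC_NINFINITY);CHKERRQ(ierr);
  ierr = VecSet(xu, PETSC_INFINITY);CHKERRQ(ierr);
  /* ... overwrite the entries of xl (and/or xu) that really are constrained ... */
  ierr = SNESVISetVariableBounds(snes, xl, xu);CHKERRQ(ierr);

  ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);

  /* 2-norm of F(x) at the final iterate, as recommended above */
  ierr = SNESGetFunction(snes, &r, NULL, NULL);CHKERRQ(ierr);
  ierr = VecNorm(r, NORM_2, &fnorm);CHKERRQ(ierr);

With bounds present, the linear systems seen through KSPGetOperators() are the reduced active-set systems, so they can legitimately be smaller than the full Jacobian, exactly as described above.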
> Here is an example I do not quite understand. I have a simple function F(X) = [x1+x2+100 ; x1-x2; x3+x4; x3-x4+50]. If I solve this F(X)=0 with no constraint, the exact solution is [x1=-50; x2=-50; x3=-25; x4=25]. If I specify the constraint as x2>=0 and x4>=0, I expect the solution from one iteration of SNES is [-50, 0,-25,25], since the constraint on x2 should be active now. However, the petsc outputs the solution [-50, 0, 0, 0]. Since x3 and x4 does not violate the constraint, why does the solution of x3 and x4 change (note that x3 and x3 are decoupled from x1 and x2)? In this case, the matrix obtained from KSPGetOperators is only 2-by-2, so two variables or constraints are eliminated. Another thing I noticed is that constraints x2>-1e-7 and x4>-1e-7 gives solution [-50,-1e-7,-25,25]; however, constraints x2>-1e-8 and x4>-1e-8 gives the solution [-50,0,0,0]. Attached please find the simple 130-line code showing this behavior. Simply commenting the line 37 to remove the constraints and modifying line 92 to change the lower bounds of x2 and x4. Thanks a lot for your time and help. Best, Xiangdong > > > Barry > > > > > > Thank you. > > > > Best, > > Xiangdong > > > > P.S. If we change the lower bound of field u (line 124) to be zero, then > the Jacobian matrix is set to be NULL by petsc. > > > > > > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: > > Here is the order of functions I called: > > > > DMDACreate3d(); > > > > SNESCreate(); > > > > SNESSetDM(); (DM with dof=2); > > > > DMSetApplicationContext(); > > > > DMDASNESSetFunctionLocal(); > > > > SNESVISetVariableBounds(); > > > > DMDASNESetJacobianLocal(); > > > > SNESSetFromOptions(); > > > > SNESSolve(); > > > > SNESGetKSP(); > > KSPGetSolution(); > > KSPGetRhs(); > > KSPGetOperators(); //get operator kspA, kspx, kspb; > > > > SNESGetFunctionNorm(); ==> get norm fnorma; > > SNESGetFunction(); VecNorm(); ==> get norm fnormb; > > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the > solution x and get norm fnormc; > > > > Inside the FormJacobianLocal(), I output the matrix jac and preB; > > > > I found that fnorma matches the default SNES monitor output "SNES > Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx > obtained by snescomputefunction, mat jac and preB are length 50 or > 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. > > > > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; > x(2:2:end)=0; It seems that it completely ignores the second degree of > freedom (setting it to zero). I saw this for (close to) constant initial > guess, while for heterogeneous initial guess, it works fine and the matrix > and vector size are correct, and the solution is correct. So this > eliminating row behavior seems to be initial guess dependent. > > > > I saw this even if I use snes_fd, so we can rule out the possibility of > wrong Jacobian. For the FormFunctionLocal(), I checked via > SNESComputeFunction and it output the correct vector of residue. > > > > Are the orders of function calls correct? > > > > Thank you. > > > > Xiangdong > > > > > > > > > > > > > > > > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > > > > On May 1, 2014, at 10:32 AM, Xiangdong wrote: > > > > > Under what condition, SNESGetFunctionNorm() will output different > results from SENEGetFunction + VecNorm (with NORM_2)? > > > > > > For most of my test cases, it is the same. However, when I have some > special (trivial) initial guess to the SNES problem, I see different norms. 
> > > > Please send more details on your ?trivial? case where the values are > different. It could be that we are not setting the function norm properly > on early exit from the solvers. > > > > > > Another phenomenon I noticed with this is that KSP in SNES squeeze my > matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When > I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, > and the rhs and solution is with length 25. Do you have any clue on what > triggered this? To my surprise, when I output the Jacobian inside the > FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct > numerical entries. Why does the operator obtained from KSP is different and > got rows eliminated? These rows got eliminated have only one entries per > row, but the rhs in that row is not zero. Eliminating these rows would give > wrong solutions. > > > > Hmm, we never squeeze out rows/columns from the Jacobian. The size of > the Jacobian set with SNESSetJacobian() should always match that obtained > with KSPGetOperators() on the linear system. Please send more details on > how you get this. Are you calling the KSPGetOperators() inside a > preconditioner where the the preconditioner has chopped up the operator? > > > > Barry > > > > > > > > Thank you. > > > > > > Xiangdong > > > > > > > > > > > > > > > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley > wrote: > > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo > *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize > the array f. Zero the array f solved the problem and gave consistent result. > > > > > > Just curious, why does not petsc initialize the array f to zero by > default inside petsc when passing the f array to FormFunctionLocal? > > > > > > If you directly set entires, you might not want us to spend the time > writing those zeros. > > > > > > I have another quick question about the array x passed to > FormFunctionLocal. If I want to know the which x is evaluated, how can I > output x in a vector format? Currently, I created a global vector vecx and > a local vector vecx_local, get the array of vecx_local_array, copy the x to > vecx_local_array, scatter to global vecx and output vecx. Is there a quick > way to restore the array x to a vector and output? > > > > > > I cannot think of a better way than that. > > > > > > Matt > > > > > > Thank you. > > > > > > Best, > > > Xiangdong > > > > > > > > > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith > wrote: > > > > > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > > > > > Hello everyone, > > > > > > > > When I run snes program, > > > > > > ^^^^ what SNES program?? > > > > > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this > norm is different from residue norm (even if solving F(x)=0) > > > > > > Please send the full output where you see this. > > > > > > > and also differ from norm of the Jacobian. What is the definition of > this "SNES Function Norm?? > > > > > > The SNES Function Norm as printed by PETSc is suppose to the 2-norm > of F(x) - b (where b is usually zero) and this is also the same thing as > the ?residue norm? > > > > > > Barry > > > > > > > > > > > Thank you. 
> > > > > > > > Best, > > > > Xiangdong > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: exdemosingle.c Type: text/x-csrc Size: 3398 bytes Desc: not available URL: From yuqing.xia at colorado.edu Fri May 2 14:02:07 2014 From: yuqing.xia at colorado.edu (yuqing xia) Date: Fri, 2 May 2014 13:02:07 -0600 Subject: [petsc-users] About left eigenvector for general eigenvalue problem Message-ID: Hello everyone I am trying to solve a general eigenvalue problem. A x=\lambda B x I also need to get the left eigenvector y A=\lambda y B I tested the result for a special case where A and B are real and symmetric. The left and right eigenvector should be the same. However, there are not. Then I tried to solve the problem A x =\lambda x The left and right eigenvectors are the same in such case. So I am wondering what is the reason. Thanks. Best Yuqing Xia -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 2 14:25:53 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 14:25:53 -0500 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> Message-ID: <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid you can also add -help to see what options are available. Both pretty much suck and I can?t image much reason for using them. Barry On May 2, 2014, at 10:27 AM, Qin Lu wrote: > Hello, > > I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? > > Many thanks, > Qin From bsmith at mcs.anl.gov Fri May 2 14:36:09 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 14:36:09 -0500 Subject: [petsc-users] About left eigenvector for general eigenvalue problem In-Reply-To: References: Message-ID: <8F3F5B5B-2565-4409-B3C7-0DD216FC9925@mcs.anl.gov> Please send more information about how you tried to compute this and the matrix you used (if small just send the binary matrix or code that generates it). Vague questions like ?why doesn?t it work as I expect?? are really hard to answer. With specifics about what was done and how the answer was different make the question easier and easier to answer. Barry On May 2, 2014, at 2:02 PM, yuqing xia wrote: > Hello everyone > > I am trying to solve a general eigenvalue problem. > A x=\lambda B x > I also need to get the left eigenvector > y A=\lambda y B > > I tested the result for a special case where A and B are real and symmetric. The left and right eigenvector should be the same. However, there are not. > Then I tried to solve the problem > A x =\lambda x > The left and right eigenvectors are the same in such case. So I am wondering what is the reason. Thanks. 
> > > Best > Yuqing Xia > From bsmith at mcs.anl.gov Fri May 2 14:49:56 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 14:49:56 -0500 Subject: [petsc-users] Assembling a matrix for a DMComposite vector In-Reply-To: References: <535F6D64.5040700@uni-mainz.de> <5362C9C2.40602@uni-mainz.de> Message-ID: On May 2, 2014, at 11:36 AM, Anush Krishnan wrote: > > > > On 1 May 2014 18:25, Anton Popov wrote: > On 5/1/14 10:39 PM, Anush Krishnan wrote: >> Hi Anton, >> >> On 29 April 2014 05:14, Anton Popov wrote: >> >> You can do the whole thing much easier (to my opinion). >> Since you created two DMDA anyway, just do: >> >> - find first index on every processor using MPI_Scan >> - create two global vectors (no ghosts) >> - put proper global indicies to global vectors >> - create two local vectors (with ghosts) and set ALL entries to -1 (to have what you need in boundary ghosts) >> - call global-to-local scatter >> >> Done! >> >> Won't the vectors contain floating point values? Are you storing your indices as real numbers? > YES, exactly. And then I cast them to PetscInt when I compose stencils. > > Something like this: > idx[0] = (PetscInt) ivx[k][j][i]; > idx[1] = (PetscInt) ivx[k][j][i+1]; > idx[2] = (PetscInt) ivy[k][j][i]; > ... and so on, where ivx, ivy, ... are the index arrays in x, y .. directions > > Then I insert (actually add) stencils using MatSetValues. > > By the way, you can ideally preallocate in parallel with MatMPIAIJSetPreallocation. To count precisely number entries in the diagonal & off-diagonal blocks use the same mechanism to easily access global indices, and then compare them with the local row range, which is also known: > - within the range -> d_nnz[i]++; > - outside the range -> o_nnz[i]++; > > Thanks a lot for the help. I did exactly that and it worked perfectly. > > But just to clarify: if I was using 32-bit floats, would I start having trouble when my matrix size reaches ~10 million due to the floating point precision? If you ./configure PETSc with ?with-64-bit-indices=1 then PetscInt will not fit in a float and the code will not work. As soon as you switch you 64 bit indices you would need to use doubles if you hope to store PetscInt in them. Barry > > > Anton > >> >> >> The advantage is that you can access global indices (including ghosts) in every block using i-j-k indexing scheme. >> I personally find this way quite easy to implement with PETSc >> >> Anton >> >> Thank you, >> Anush > > From knepley at gmail.com Fri May 2 15:10:03 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 May 2014 15:10:03 -0500 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> Message-ID: On Fri, May 2, 2014 at 12:53 PM, Xiangdong wrote: > > On Thu, May 1, 2014 at 10:21 PM, Barry Smith wrote: > >> >> On May 1, 2014, at 9:12 PM, Xiangdong wrote: >> >> > I came up with a simple example to demonstrate this "eliminating row" >> behavior. It happens when the solution x to the linearized equation Ax=b is >> out of the bound set by SNESVISetVariableBounds(); >> > >> > In the attached example, I use snes to solve a simple function x-b=0. >> When you run it, it outputs the matrix as 25 rows, while the real Jacobian >> should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be >> -inf, it will output 50 rows for the Jacobian. 
In the first case, the norm >> given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. >> > >> > In solving the nonlinear equations, it is likely that the solution of >> the linearized equation is out of bound, but then we can reset the >> out-of-bound solution to be lower or upper bound instead of eliminating the >> variables (the rows). Any suggestions on doing this in petsc? >> >> This is what PETSc is doing. It is using the "active set method". >> Variables that are at their bounds are ?frozen? and then a smaller system >> is solved (involving just the variables not a that bounds) to get the next >> search direction. Based on the next search direction some of the variables >> on the bounds may be unfrozen and other variables may be frozen. There is a >> huge literature on this topic. See for example our buddies ? Nocedal, >> Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, >> New York: Springer-Verlag. ISBN 978-0-387-30303-1.. >> >> The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return >> different values with the SNES VI solver. If you care about the function >> value just use SNESGetFunction() and compute the norm that way. We are >> eliminating SNESGetFunctionNorm() from PETSc because it is problematic. >> >> If you think the SNES VI solver is actually not solving the problem, >> or giving the wrong answer than please send us the entire simple code and >> we?ll see if we have introduced any bugs into our solver. But note that the >> linear system being of different sizes is completely normal for the solver. >> > > Here is an example I do not quite understand. I have a simple function > F(X) = [x1+x2+100 ; x1-x2; x3+x4; x3-x4+50]. If I solve this F(X)=0 with no > constraint, the exact solution is [x1=-50; x2=-50; x3=-25; x4=25]. > > If I specify the constraint as x2>=0 and x4>=0, I expect the solution from > one iteration of SNES is [-50, 0,-25,25], since the constraint on x2 should > be active now. However, the petsc outputs the solution [-50, 0, 0, 0]. > Since x3 and x4 does not violate the constraint, why does the solution of > x3 and x4 change (note that x3 and x3 are decoupled from x1 and x2)? In > this case, the matrix obtained from KSPGetOperators is only 2-by-2, so two > variables or constraints are eliminated. > This just finds a local solution to the constrained problem, and these need not be unique. Matt > Another thing I noticed is that constraints x2>-1e-7 and x4>-1e-7 gives > solution [-50,-1e-7,-25,25]; however, constraints x2>-1e-8 and x4>-1e-8 > gives the solution [-50,0,0,0]. > > Attached please find the simple 130-line code showing this behavior. > Simply commenting the line 37 to remove the constraints and modifying line > 92 to change the lower bounds of x2 and x4. > > Thanks a lot for your time and help. > > Best, > Xiangdong > > > > >> >> >> Barry >> >> >> > >> > Thank you. >> > >> > Best, >> > Xiangdong >> > >> > P.S. If we change the lower bound of field u (line 124) to be zero, >> then the Jacobian matrix is set to be NULL by petsc. 
>> > >> > >> > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: >> > Here is the order of functions I called: >> > >> > DMDACreate3d(); >> > >> > SNESCreate(); >> > >> > SNESSetDM(); (DM with dof=2); >> > >> > DMSetApplicationContext(); >> > >> > DMDASNESSetFunctionLocal(); >> > >> > SNESVISetVariableBounds(); >> > >> > DMDASNESetJacobianLocal(); >> > >> > SNESSetFromOptions(); >> > >> > SNESSolve(); >> > >> > SNESGetKSP(); >> > KSPGetSolution(); >> > KSPGetRhs(); >> > KSPGetOperators(); //get operator kspA, kspx, kspb; >> > >> > SNESGetFunctionNorm(); ==> get norm fnorma; >> > SNESGetFunction(); VecNorm(); ==> get norm fnormb; >> > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the >> solution x and get norm fnormc; >> > >> > Inside the FormJacobianLocal(), I output the matrix jac and preB; >> > >> > I found that fnorma matches the default SNES monitor output "SNES >> Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx >> obtained by snescomputefunction, mat jac and preB are length 50 or >> 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. >> > >> > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; >> x(2:2:end)=0; It seems that it completely ignores the second degree of >> freedom (setting it to zero). I saw this for (close to) constant initial >> guess, while for heterogeneous initial guess, it works fine and the matrix >> and vector size are correct, and the solution is correct. So this >> eliminating row behavior seems to be initial guess dependent. >> > >> > I saw this even if I use snes_fd, so we can rule out the possibility of >> wrong Jacobian. For the FormFunctionLocal(), I checked via >> SNESComputeFunction and it output the correct vector of residue. >> > >> > Are the orders of function calls correct? >> > >> > Thank you. >> > >> > Xiangdong >> > >> > >> > >> > >> > >> > >> > >> > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: >> > >> > On May 1, 2014, at 10:32 AM, Xiangdong wrote: >> > >> > > Under what condition, SNESGetFunctionNorm() will output different >> results from SENEGetFunction + VecNorm (with NORM_2)? >> > > >> > > For most of my test cases, it is the same. However, when I have some >> special (trivial) initial guess to the SNES problem, I see different norms. >> > >> > Please send more details on your ?trivial? case where the values are >> different. It could be that we are not setting the function norm properly >> on early exit from the solvers. >> > > >> > > Another phenomenon I noticed with this is that KSP in SNES squeeze my >> matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When >> I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, >> and the rhs and solution is with length 25. Do you have any clue on what >> triggered this? To my surprise, when I output the Jacobian inside the >> FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct >> numerical entries. Why does the operator obtained from KSP is different and >> got rows eliminated? These rows got eliminated have only one entries per >> row, but the rhs in that row is not zero. Eliminating these rows would give >> wrong solutions. >> > >> > Hmm, we never squeeze out rows/columns from the Jacobian. The size >> of the Jacobian set with SNESSetJacobian() should always match that >> obtained with KSPGetOperators() on the linear system. Please send more >> details on how you get this. 
Are you calling the KSPGetOperators() inside a >> preconditioner where the the preconditioner has chopped up the operator? >> > >> > Barry >> > >> > > >> > > Thank you. >> > > >> > > Xiangdong >> > > >> > > >> > > >> > > >> > > >> > > >> > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley >> wrote: >> > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong >> wrote: >> > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo >> *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize >> the array f. Zero the array f solved the problem and gave consistent result. >> > > >> > > Just curious, why does not petsc initialize the array f to zero by >> default inside petsc when passing the f array to FormFunctionLocal? >> > > >> > > If you directly set entires, you might not want us to spend the time >> writing those zeros. >> > > >> > > I have another quick question about the array x passed to >> FormFunctionLocal. If I want to know the which x is evaluated, how can I >> output x in a vector format? Currently, I created a global vector vecx and >> a local vector vecx_local, get the array of vecx_local_array, copy the x to >> vecx_local_array, scatter to global vecx and output vecx. Is there a quick >> way to restore the array x to a vector and output? >> > > >> > > I cannot think of a better way than that. >> > > >> > > Matt >> > > >> > > Thank you. >> > > >> > > Best, >> > > Xiangdong >> > > >> > > >> > > >> > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith >> wrote: >> > > >> > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: >> > > >> > > > Hello everyone, >> > > > >> > > > When I run snes program, >> > > >> > > ^^^^ what SNES program?? >> > > >> > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this >> norm is different from residue norm (even if solving F(x)=0) >> > > >> > > Please send the full output where you see this. >> > > >> > > > and also differ from norm of the Jacobian. What is the definition >> of this "SNES Function Norm?? >> > > >> > > The SNES Function Norm as printed by PETSc is suppose to the >> 2-norm of F(x) - b (where b is usually zero) and this is also the same >> thing as the ?residue norm? >> > > >> > > Barry >> > > >> > > > >> > > > Thank you. >> > > > >> > > > Best, >> > > > Xiangdong >> > > >> > > >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > >> > >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Fri May 2 15:40:19 2014 From: xsli at lbl.gov (Xiaoye S. Li) Date: Fri, 2 May 2014 13:40:19 -0700 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> Message-ID: The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. 
In SuperLU distribution: EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) SRC/zgsitrf.c : the actual ILUTP factorization routine Sherry Li On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: > > At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre > > mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid > > you can also add -help to see what options are available. > > Both pretty much suck and I can?t image much reason for using them. > > Barry > > > On May 2, 2014, at 10:27 AM, Qin Lu wrote: > > > Hello, > > > > I am interested in using ILUTP preconditioner with PETSc linear solver. > There is an online doc > https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthat mentioned it is available in PETSc with other packages (page 62-63). > Is there any instructions or examples on how to use it? > > > > Many thanks, > > Qin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From song.gao2 at mail.mcgill.ca Fri May 2 16:41:26 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Fri, 2 May 2014 17:41:26 -0400 Subject: [petsc-users] Question with setting up KSP solver parameters. Message-ID: Dear PETSc users, I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. My codes looks like call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) call KSPSetType ( pet_solv, 'gmres', ierpetsc ) call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) call PCSetType ( pet_precon, 'asm', ierpetsc ) call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) call KSPSetUp ( pet_solv, ierpetsc ) call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) call KSPSetFromOptions ( pet_solv, ierpetsc ) call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) runing with 1 CPU WITHOUT the line with red color and the codes don't converge runtime options: -ksp_monitor_true_residual -ksp_view 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 ....... 
28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT [0] number of local blocks = 1 Local solve info for each block is in the following KSP and PC objects: - - - - - - - - - - - - - - - - - - [0] local block number 0, size = 22905 KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 - - - - - - - - - - - - - - - - - - linear system matrix followed by preconditioner matrix: Matrix Object: 1 MPI processes type: shell rows=22905, cols=22905 Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 WARNING: zero iteration in iterative solver runing with 1 CPU WITH the line with red color and the codes converge runtime options: -ksp_monitor_true_residual -ksp_view 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 ............ 
24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 KSP Object: 1 MPI processes type: gmres GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT [0] number of local blocks = 1 Local solve info for each block is in the following KSP and PC objects: - - - - - - - - - - - - - - - - - - [0] local block number 0, size = 22905 KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 - - - - - - - - - - - - - - - - - - linear system matrix followed by preconditioner matrix: Matrix Object: 1 MPI processes type: shell rows=22905, cols=22905 Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 WARNING: zero iteration in iterative solver What would be my error here? Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 2 17:03:30 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 17:03:30 -0500 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: Message-ID: Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. Barry On May 2, 2014, at 4:41 PM, Song Gao wrote: > Dear PETSc users, > > I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. > > My codes looks like > > call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) > call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > call PCSetType ( pet_precon, 'asm', ierpetsc ) > call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > call KSPSetUp ( pet_solv, ierpetsc ) > call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one > call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > call KSPSetFromOptions ( pet_solv, ierpetsc ) > call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! 
adding this line, the codes converge > call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > > runing with 1 CPU WITHOUT the line with red color and the codes don't converge > > runtime options: -ksp_monitor_true_residual -ksp_view > 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > ....... > 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > [0] number of local blocks = 1 > Local solve info for each block is in the following KSP and PC objects: > - - - - - - - - - - - - - - - - - - > [0] local block number 0, size = 22905 > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > - - - - - - - - - - - - - - - - - - > linear system matrix followed by preconditioner matrix: > Matrix Object: 1 MPI processes > type: shell > rows=22905, cols=22905 > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > WARNING: zero iteration in iterative solver > > runing with 1 CPU WITH the line with red color and the codes converge > > runtime options: -ksp_monitor_true_residual -ksp_view > 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > 
5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > ............ > 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > [0] number of local blocks = 1 > Local solve info for each block is in the following KSP and PC objects: > - - - - - - - - - - - - - - - - - - > [0] local block number 0, size = 22905 > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > - - - - - - - - - - - - - - - - - - > linear system matrix followed by preconditioner matrix: > Matrix Object: 1 MPI processes > type: shell > rows=22905, cols=22905 > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > WARNING: zero iteration in iterative solver > > > What would be my error here? Thank you. From song.gao2 at mail.mcgill.ca Fri May 2 17:29:24 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Fri, 2 May 2014 22:29:24 +0000 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: , Message-ID: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> Thanks for your quick reply. What confused me is that why would the code works fine if I reset the gmres restart number by recalling kspgmressetrestart just before kspsolve? Sent from my iPhone > On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > > > Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. > > Barry > >> On May 2, 2014, at 4:41 PM, Song Gao wrote: >> >> Dear PETSc users, >> >> I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. 
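For reference, the diagnosis quoted just above concerns the shell operator itself. Below is a minimal C sketch (the thread's code is Fortran, and every name here is made up) of what a shell matrix handed to KSPSetOperators() has to provide; if the MATOP_MULT routine does not write every entry of y, or does not compute the same product for the same input on every call, GMRES tends to show exactly the kind of blow-up at a restart reported above.

  typedef struct {
    void *appdata;   /* whatever the matrix-free product needs */
  } ShellCtx;

  PetscErrorCode MyMatMult(Mat A, Vec x, Vec y)
  {
    ShellCtx       *ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatShellGetContext(A, (void**)&ctx);CHKERRQ(ierr);
    /* compute y = A*x from ctx here; every entry of y must be set on every call */
    PetscFunctionReturn(0);
  }

  /* ... during setup, with local size nlocal and global size N ... */
  ierr = MatCreateShell(PETSC_COMM_WORLD, nlocal, nlocal, N, N, &shellctx, &A);CHKERRQ(ierr);
  ierr = MatShellSetOperation(A, MATOP_MULT, (void (*)(void))MyMatMult);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, P, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);

A quick sanity check is to apply the shell operator twice to the same vector and verify that the two results are identical, since GMRES relies on the operator being a fixed linear map over the whole solve.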
>> >> My codes looks like >> >> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) >> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) >> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) >> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) >> call PCSetType ( pet_precon, 'asm', ierpetsc ) >> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) >> call KSPSetUp ( pet_solv, ierpetsc ) >> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one >> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) >> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) >> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) >> call KSPSetFromOptions ( pet_solv, ierpetsc ) >> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge >> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) >> >> runing with 1 CPU WITHOUT the line with red color and the codes don't converge >> >> runtime options: -ksp_monitor_true_residual -ksp_view >> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 >> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 >> ....... >> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 >> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 >> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 >> >> KSP Object: 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 1 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> [0] local block number 0, size = 22905 >> KSP Object: (sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (sub_) 1 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 >> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> - - - - - - - - - - - - - - - - - - >> linear system matrix followed by preconditioner matrix: >> Matrix Object: 1 MPI processes >> type: shell >> rows=22905, cols=22905 >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 
>> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> WARNING: zero iteration in iterative solver >> >> runing with 1 CPU WITH the line with red color and the codes converge >> >> runtime options: -ksp_monitor_true_residual -ksp_view >> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 >> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 >> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 >> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 >> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 >> ............ >> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 >> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 >> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 >> KSP Object: 1 MPI processes >> type: gmres >> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 1 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> [0] local block number 0, size = 22905 >> KSP Object: (sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (sub_) 1 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 >> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> - - - - - - - - - - - - - - - - - - >> linear system matrix followed by preconditioner matrix: >> Matrix Object: 1 MPI processes >> type: shell >> rows=22905, cols=22905 >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 >> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> WARNING: zero iteration in iterative solver >> >> >> What would be my error here? Thank you. > From bsmith at mcs.anl.gov Fri May 2 18:25:50 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 18:25:50 -0500 Subject: [petsc-users] Question with setting up KSP solver parameters. 
In-Reply-To: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> References: , <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> Message-ID: <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> On May 2, 2014, at 5:29 PM, Song Gao wrote: > Thanks for your quick reply. What confused me is that why would the code works fine if I reset the gmres restart number by recalling kspgmressetrestart just before kspsolve? It isn?t really working. Something is going wrong (run with valgrind) and setting that restart number and starting the solver just puts it in a ?happier? state so it seems to make more progress. Barry > > Sent from my iPhone > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: >> >> >> Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. >> >> Barry >> >>> On May 2, 2014, at 4:41 PM, Song Gao wrote: >>> >>> Dear PETSc users, >>> >>> I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. >>> >>> My codes looks like >>> >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) >>> call KSPSetUp ( pet_solv, ierpetsc ) >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) >>> >>> runing with 1 CPU WITHOUT the line with red color and the codes don't converge >>> >>> runtime options: -ksp_monitor_true_residual -ksp_view >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 >>> ....... 
>>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 >>> >>> KSP Object: 1 MPI processes >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 >>> Additive Schwarz: restriction/interpolation type - RESTRICT >>> [0] number of local blocks = 1 >>> Local solve info for each block is in the following KSP and PC objects: >>> - - - - - - - - - - - - - - - - - - >>> [0] local block number 0, size = 22905 >>> KSP Object: (sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (sub_) 1 MPI processes >>> type: jacobi >>> linear system matrix = precond matrix: >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> - - - - - - - - - - - - - - - - - - >>> linear system matrix followed by preconditioner matrix: >>> Matrix Object: 1 MPI processes >>> type: shell >>> rows=22905, cols=22905 >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> WARNING: zero iteration in iterative solver >>> >>> runing with 1 CPU WITH the line with red color and the codes converge >>> >>> runtime options: -ksp_monitor_true_residual -ksp_view >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 >>> ............ 
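A minimal way to test whether the shell operator itself is misbehaving, as suggested above: apply it twice to the same random vector (it must be deterministic) and compare its action against the assembled preconditioning matrix (only rough agreement is expected, but Inf/NaN or a wildly different scale points at the shell). The sketch below uses the C API (the Fortran calls are analogous) and illustrative names A_shell and P for the two matrices passed to KSPSetOperators(); it is a heuristic, not a definitive test.

#include <petscksp.h>

/* Heuristic check of a user MATSHELL; A_shell and P are assumed to be
   the two matrices handed to KSPSetOperators().                        */
static PetscErrorCode CheckShellOperator(Mat A_shell, Mat P)
{
  Vec            x, y1, y2;
  PetscReal      d, n;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetVecs(A_shell, &x, &y1);CHKERRQ(ierr);   /* MatCreateVecs() in newer releases */
  ierr = VecDuplicate(y1, &y2);CHKERRQ(ierr);
  ierr = VecSetRandom(x, NULL);CHKERRQ(ierr);

  /* 1) The shell must be deterministic: applying it twice to the same input
        should give essentially the same output.  NaN or an O(1) difference
        usually means uninitialized memory inside the user MatMult routine.  */
  ierr = MatMult(A_shell, x, y1);CHKERRQ(ierr);
  ierr = MatMult(A_shell, x, y2);CHKERRQ(ierr);
  ierr = VecAXPY(y2, -1.0, y1);CHKERRQ(ierr);
  ierr = VecNorm(y2, NORM_2, &d);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "repeatability |A x - A x| = %g (expect ~0)\n", (double)d);CHKERRQ(ierr);

  /* 2) The shell should agree only roughly with the assembled preconditioning
        matrix, but Inf/NaN or a wildly larger difference points at the shell. */
  ierr = MatMult(P, x, y2);CHKERRQ(ierr);
  ierr = VecAXPY(y2, -1.0, y1);CHKERRQ(ierr);
  ierr = VecNorm(y1, NORM_2, &n);CHKERRQ(ierr);
  ierr = VecNorm(y2, NORM_2, &d);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "|A x| = %g, |(A - P) x| = %g\n", (double)n, (double)d);CHKERRQ(ierr);

  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y1);CHKERRQ(ierr);
  ierr = VecDestroy(&y2);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Called once before KSPSolve(), a check like this usually distinguishes a broken user MatMult from a preconditioning problem, and complements the valgrind run recommended above.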
>>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 >>> KSP Object: 1 MPI processes >>> type: gmres >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 >>> Additive Schwarz: restriction/interpolation type - RESTRICT >>> [0] number of local blocks = 1 >>> Local solve info for each block is in the following KSP and PC objects: >>> - - - - - - - - - - - - - - - - - - >>> [0] local block number 0, size = 22905 >>> KSP Object: (sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (sub_) 1 MPI processes >>> type: jacobi >>> linear system matrix = precond matrix: >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> - - - - - - - - - - - - - - - - - - >>> linear system matrix followed by preconditioner matrix: >>> Matrix Object: 1 MPI processes >>> type: shell >>> rows=22905, cols=22905 >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> WARNING: zero iteration in iterative solver >>> >>> >>> What would be my error here? Thank you. >> From danyang.su at gmail.com Sat May 3 14:33:55 2014 From: danyang.su at gmail.com (Danyang Su) Date: Sat, 03 May 2014 12:33:55 -0700 Subject: [petsc-users] Question on ksp examples ex14f.F Message-ID: <536544A3.5080703@gmail.com> Hi All, The codes can run successfully in release mode, but in debug mode, it causes the following error. forrtl: severe (408): fort: (11): Subscript #1 of the array XX has value -665625807 which is less than the lower bound of 1 I can get rid of this error by replacing VecGetArray to VecGetArrayF90 and do not use idx in XX. The same problem exists in ltog from DMDAGetGlobalIndices(). Is there any other way to avoid this kind of error in fortran since the release mode can run without error? Is this caused by the configuration in Fortran? Thanks and regards, Danyang From bsmith at mcs.anl.gov Sat May 3 18:48:49 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 3 May 2014 18:48:49 -0500 Subject: [petsc-users] Question on ksp examples ex14f.F In-Reply-To: <536544A3.5080703@gmail.com> References: <536544A3.5080703@gmail.com> Message-ID: <7EB1B6E2-6574-47A3-894D-D1BA8E34BEAA@mcs.anl.gov> On May 3, 2014, at 2:33 PM, Danyang Su wrote: > Hi All, > > The codes can run successfully in release mode, but in debug mode, it causes the following error. 
> forrtl: severe (408): fort: (11): Subscript #1 of the array XX has value -665625807 which is less than the lower bound of 1 > > I can get rid of this error by replacing VecGetArray to VecGetArrayF90 and do not use idx in XX. The same problem exists in ltog from DMDAGetGlobalIndices(). > > Is there any other way to avoid this kind of error in fortran since the release mode can run without error? > Is this caused by the configuration in Fortran? Certain Fortran compilers add extra code which check for out of array bounds access. Found out how to turn it off your your compiler. For example https://software.intel.com/en-us/forums/topic/271337 and do some googling. Barry > > Thanks and regards, > > Danyang From lu_qin_2000 at yahoo.com Sat May 3 19:24:30 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Sat, 3 May 2014 17:24:30 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc In-Reply-To: References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> Message-ID: <1399163070.39276.YahooMailNeo@web160204.mail.bf1.yahoo.com> Thanks a lot for both of you! Qin ________________________________ From: Xiaoye S. Li To: Barry Smith Cc: Qin Lu ; "petsc-users at mcs.anl.gov" Sent: Friday, May 2, 2014 3:40 PM Subject: Re: [petsc-users] ILUTP in PETSc The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. ? In SuperLU distribution: ? EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) ? SRC/zgsitrf.c : the actual ILUTP factorization routine Sherry Li On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: >At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html ?there are two listed. ./configure ?download-hypre > >mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid > >you can also add -help to see what options are available. > >? Both pretty much suck and I can?t image much reason for using them. > >? ?Barry > > > >On May 2, 2014, at 10:27 AM, Qin Lu wrote: > >> Hello, >> >> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >> >> Many thanks, >> Qin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Sat May 3 20:01:35 2014 From: danyang.su at gmail.com (Danyang Su) Date: Sat, 03 May 2014 18:01:35 -0700 Subject: [petsc-users] Question on ksp examples ex14f.F In-Reply-To: <7EB1B6E2-6574-47A3-894D-D1BA8E34BEAA@mcs.anl.gov> References: <536544A3.5080703@gmail.com> <7EB1B6E2-6574-47A3-894D-D1BA8E34BEAA@mcs.anl.gov> Message-ID: <5365916F.9010500@gmail.com> Thank, Barry. After turning off "check array bound" option, it can work without any problem. Danyang On 03/05/2014 4:48 PM, Barry Smith wrote: > On May 3, 2014, at 2:33 PM, Danyang Su wrote: > >> Hi All, >> >> The codes can run successfully in release mode, but in debug mode, it causes the following error. >> forrtl: severe (408): fort: (11): Subscript #1 of the array XX has value -665625807 which is less than the lower bound of 1 >> >> I can get rid of this error by replacing VecGetArray to VecGetArrayF90 and do not use idx in XX. The same problem exists in ltog from DMDAGetGlobalIndices(). 
>> >> Is there any other way to avoid this kind of error in fortran since the release mode can run without error? >> Is this caused by the configuration in Fortran? > Certain Fortran compilers add extra code which check for out of array bounds access. Found out how to turn it off your your compiler. For example https://software.intel.com/en-us/forums/topic/271337 and do some googling. > > Barry > >> Thanks and regards, >> >> Danyang From jed at jedbrown.org Sun May 4 08:56:44 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 04 May 2014 07:56:44 -0600 Subject: [petsc-users] How to do the point-block ILU in PETSc In-Reply-To: References: Message-ID: <87ha55sl43.fsf@jedbrown.org> Please use the mailing list for questions like this. Lulu Liu writes: > Dear Jed, > > I saw in man-page of PCILU: > For BAIJ matrices this implements a point block ILU > > Take /src/snes/examples/tutorials/ex19.c for examples, I add the following > lines > ierr = MatCreate(PETSC_COMM_WORLD,&J); > ierr = MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,mx*my,mx*my); This example uses DMDA so creating your own layout won't generally be the partition you want. You should use DMCreateMatrix(). The matrix type is set via DMSetMatType() and -dm_mat_type. > ierr = MatSetType(J,MATBAIJ); > ierr = MatSetFromOptions(J); > ierr = MatSetUp(J); > ierr = MatAssemblyBegin(J,MAT_FINAL_ASSEMBLY); > ierr = MatAssemblyEnd(J,MAT_FINAL_ASSEMBLY); This assembly should not exist. > ierr = SNESSetJacobian(snes,J,J,NULL,NULL); > > but I got errors, could you tell me how to do the point-block ILU in ex19.c > ( the small block should be 4x4). Thanks! > > ./ex19 -da_grid_x 64 -da_grid_y 64 -contours -draw_pause 1 -snes_monitor > -snes_rtol 1.e-6 -pc_type ilu Don't modify the source at all. Instead, run this: $ mpiexec -n 4 ./ex19 -da_grid_x 64 -da_grid_y 64 -snes_monitor -snes_view -dm_mat_type baij lid velocity = 0.000244141, prandtl # = 1, grashof # = 1 0 SNES Function norm 1.573890417811e-02 1 SNES Function norm 1.602010905072e-06 2 SNES Function norm 1.580493963868e-11 SNES Object: 4 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=368 total number of function evaluations=3 SNESLineSearch Object: 4 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 4 MPI processes type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat 
Object: 1 MPI processes type: seqbaij rows=4096, cols=4096, bs=4 package used to perform factorization: petsc total: nonzeros=79872, allocated nonzeros=79872 total number of mallocs used during MatSetValues calls =0 block size is 4 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqbaij rows=4096, cols=4096, bs=4 total: nonzeros=79872, allocated nonzeros=79872 total number of mallocs used during MatSetValues calls =0 block size is 4 linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpibaij rows=16384, cols=16384, bs=4 total: nonzeros=323584, allocated nonzeros=323584 total number of mallocs used during MatSetValues calls =0 Number of SNES iterations = 2 > lid velocity = 0.000244141, prandtl # = 1, grashof # = 1 > 0 SNES Function norm 1.573890417811e-02 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: > or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatFDColoringCreate_SeqAIJ line 20 > src/mat/impls/aij/seq/fdaij.c > [0]PETSC ERROR: [0] MatFDColoringCreate line 367 src/mat/matfd/fdmatrix.c > [0]PETSC ERROR: [0] SNESComputeJacobian_DMDA line 165 > src/snes/utils/dmdasnes.c > [0]PETSC ERROR: [0] SNES user Jacobian function line 2151 > src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESComputeJacobian line 2106 src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_NEWTONLS line 144 src/snes/impls/ls/ls.c > [0]PETSC ERROR: [0] SNESSolve line 3589 src/snes/interface/snes.c > [0]PETSC ERROR: [0] main line 106 src/snes/examples/tutorials/ex19.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.3, unknown > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./ex19 on a arch-darwin-c-debug named kl-12681.local by > liul Sun May 4 15:43:58 2014 > [0]PETSC ERROR: Libraries linked from > /Users/liul/soft/petsc-3.4.3/petsc/arch-darwin-c-debug/lib > [0]PETSC ERROR: Configure run at Sun Mar 9 17:02:57 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack --download-mpich > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > -- > > ------------------------------ > This message and its contents, including attachments are intended solely > for the original recipient. If you are not the intended recipient or have > received this message in error, please notify me immediately and delete > this message from your computer system. Any unauthorized use or > distribution is prohibited. Please consider the environment before printing > this email. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From epscodes at gmail.com Sun May 4 15:32:55 2014 From: epscodes at gmail.com (Xiangdong) Date: Sun, 4 May 2014 16:32:55 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> Message-ID: On Fri, May 2, 2014 at 4:10 PM, Matthew Knepley wrote: > On Fri, May 2, 2014 at 12:53 PM, Xiangdong wrote: >> >> On Thu, May 1, 2014 at 10:21 PM, Barry Smith wrote: >> >>> >>> On May 1, 2014, at 9:12 PM, Xiangdong wrote: >>> >>> > I came up with a simple example to demonstrate this "eliminating row" >>> behavior. It happens when the solution x to the linearized equation Ax=b is >>> out of the bound set by SNESVISetVariableBounds(); >>> > >>> > In the attached example, I use snes to solve a simple function x-b=0. >>> When you run it, it outputs the matrix as 25 rows, while the real Jacobian >>> should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be >>> -inf, it will output 50 rows for the Jacobian. In the first case, the norm >>> given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. >>> > >>> > In solving the nonlinear equations, it is likely that the solution of >>> the linearized equation is out of bound, but then we can reset the >>> out-of-bound solution to be lower or upper bound instead of eliminating the >>> variables (the rows). Any suggestions on doing this in petsc? >>> >>> This is what PETSc is doing. It is using the "active set method". >>> Variables that are at their bounds are ?frozen? and then a smaller system >>> is solved (involving just the variables not a that bounds) to get the next >>> search direction. Based on the next search direction some of the variables >>> on the bounds may be unfrozen and other variables may be frozen. There is a >>> huge literature on this topic. See for example our buddies ? Nocedal, >>> Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, >>> New York: Springer-Verlag. ISBN 978-0-387-30303-1.. 
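To make the freezing and unfreezing described above concrete, here is a minimal sketch of how such a bound-constrained solve is set up with the reduced-space solver. It uses the C API with illustrative names (x, r, xl, xu are Vecs of the full problem size, FormFunction is the usual residual routine, and the Jacobian is set with SNESSetJacobian() exactly as in the unconstrained case, omitted here); it is a sketch, not a drop-in replacement for the code discussed in this thread.

/* Sketch only: bound-constrained solve with the reduced-space active-set
   solver (the implementation in src/snes/impls/vi/rs/virs.c).            */
SNES snes;

SNESCreate(PETSC_COMM_WORLD, &snes);
SNESSetFunction(snes, r, FormFunction, NULL);
SNESSetType(snes, SNESVINEWTONRSLS);

VecSet(xl, PETSC_NINFINITY);             /* lower bounds: -inf everywhere ...            */
VecSetValue(xl, 1, 0.0, INSERT_VALUES);  /* ... except x2 >= 0 (0-based index 1)         */
VecSetValue(xl, 3, 0.0, INSERT_VALUES);  /* ... and    x4 >= 0 (0-based index 3)         */
VecAssemblyBegin(xl); VecAssemblyEnd(xl);
VecSet(xu, PETSC_INFINITY);              /* no upper bounds; some older releases spell
                                            these constants SNES_VI_NINF / SNES_VI_INF   */
SNESVISetVariableBounds(snes, xl, xu);

SNESSetFromOptions(snes);
SNESSolve(snes, NULL, x);

/* Each Newton step freezes the variables sitting on an active bound and
   solves the linear system only for the remaining (inactive) ones, so
   KSPGetOperators() on the inner KSP can legitimately return a matrix
   with fewer rows than the full Jacobian given to SNESSetJacobian().    */

Running with -snes_vi_monitor reports how many constraints are active at each iteration, which is the quickest way to see why the reduced linear system changes size from step to step.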
>>> >>> The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return >>> different values with the SNES VI solver. If you care about the function >>> value just use SNESGetFunction() and compute the norm that way. We are >>> eliminating SNESGetFunctionNorm() from PETSc because it is problematic. >>> >>> If you think the SNES VI solver is actually not solving the problem, >>> or giving the wrong answer than please send us the entire simple code and >>> we?ll see if we have introduced any bugs into our solver. But note that the >>> linear system being of different sizes is completely normal for the solver. >>> >> >> Here is an example I do not quite understand. I have a simple function >> F(X) = [x1+x2+100 ; x1-x2; x3+x4; x3-x4+50]. If I solve this F(X)=0 with no >> constraint, the exact solution is [x1=-50; x2=-50; x3=-25; x4=25]. >> >> If I specify the constraint as x2>=0 and x4>=0, I expect the solution >> from one iteration of SNES is [-50, 0,-25,25], since the constraint on x2 >> should be active now. However, the petsc outputs the solution [-50, 0, 0, >> 0]. Since x3 and x4 does not violate the constraint, why does the solution >> of x3 and x4 change (note that x3 and x3 are decoupled from x1 and x2)? In >> this case, the matrix obtained from KSPGetOperators is only 2-by-2, so two >> variables or constraints are eliminated. >> > > This just finds a local solution to the constrained problem, and these > need not be unique. > This might be trivial, but could you please briefly explain how I can obtain the same answer petsc outputs by hand calculation for this simple four-variable example. What I do not understand is when the constraints get activated and the variables get eliminated (matrix reduced from 4-by-4 to 2-by-2). For example, as I mentioned before, when I added x2>=0 and x4>=0 to the unconstrained problem, why did two of these constraints get eliminated (matrix from KSPGetOperators is 2-by-2)? In particular, the exact solution x4=25 does not violate the newly added x4>=0, but still got changed (x4 is actually decoupled from x1 and x2; changes/constraints on x1 and x2 should not affect x4). > > Matt > > >> Another thing I noticed is that constraints x2>-1e-7 and x4>-1e-7 gives >> solution [-50,-1e-7,-25,25]; however, constraints x2>-1e-8 and x4>-1e-8 >> gives the solution [-50,0,0,0]. >> > Is there a small constant number in petsc that caused the jump of the solution when I simply change the lower bound from -1e-7 to -1e-8? Thanks for your time and help. Best, Xiangdong > Attached please find the simple 130-line code showing this behavior. >> Simply commenting the line 37 to remove the constraints and modifying line >> 92 to change the lower bounds of x2 and x4. >> >> Thanks a lot for your time and help. >> >> Best, >> Xiangdong >> >> >> >> >>> >>> >>> Barry >>> >>> >>> > >>> > Thank you. >>> > >>> > Best, >>> > Xiangdong >>> > >>> > P.S. If we change the lower bound of field u (line 124) to be zero, >>> then the Jacobian matrix is set to be NULL by petsc. 
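On the norm question above: since SNESGetFunctionNorm() is being phased out, the recommended two-call sequence looks like the following in C (a sketch, error checking omitted). With the VI solver the value can legitimately differ from what SNESGetFunctionNorm() used to report.

Vec       F;
PetscReal fnorm;

SNESGetFunction(snes, &F, NULL, NULL);   /* borrow the residual vector SNES already holds */
VecNorm(F, NORM_2, &fnorm);              /* 2-norm of F(x) at the current iterate          */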
>>> > >>> > >>> > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: >>> > Here is the order of functions I called: >>> > >>> > DMDACreate3d(); >>> > >>> > SNESCreate(); >>> > >>> > SNESSetDM(); (DM with dof=2); >>> > >>> > DMSetApplicationContext(); >>> > >>> > DMDASNESSetFunctionLocal(); >>> > >>> > SNESVISetVariableBounds(); >>> > >>> > DMDASNESetJacobianLocal(); >>> > >>> > SNESSetFromOptions(); >>> > >>> > SNESSolve(); >>> > >>> > SNESGetKSP(); >>> > KSPGetSolution(); >>> > KSPGetRhs(); >>> > KSPGetOperators(); //get operator kspA, kspx, kspb; >>> > >>> > SNESGetFunctionNorm(); ==> get norm fnorma; >>> > SNESGetFunction(); VecNorm(); ==> get norm fnormb; >>> > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the >>> solution x and get norm fnormc; >>> > >>> > Inside the FormJacobianLocal(), I output the matrix jac and preB; >>> > >>> > I found that fnorma matches the default SNES monitor output "SNES >>> Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx >>> obtained by snescomputefunction, mat jac and preB are length 50 or >>> 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. >>> > >>> > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; >>> x(2:2:end)=0; It seems that it completely ignores the second degree of >>> freedom (setting it to zero). I saw this for (close to) constant initial >>> guess, while for heterogeneous initial guess, it works fine and the matrix >>> and vector size are correct, and the solution is correct. So this >>> eliminating row behavior seems to be initial guess dependent. >>> > >>> > I saw this even if I use snes_fd, so we can rule out the possibility >>> of wrong Jacobian. For the FormFunctionLocal(), I checked via >>> SNESComputeFunction and it output the correct vector of residue. >>> > >>> > Are the orders of function calls correct? >>> > >>> > Thank you. >>> > >>> > Xiangdong >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > On Thu, May 1, 2014 at 1:58 PM, Barry Smith >>> wrote: >>> > >>> > On May 1, 2014, at 10:32 AM, Xiangdong wrote: >>> > >>> > > Under what condition, SNESGetFunctionNorm() will output different >>> results from SENEGetFunction + VecNorm (with NORM_2)? >>> > > >>> > > For most of my test cases, it is the same. However, when I have some >>> special (trivial) initial guess to the SNES problem, I see different norms. >>> > >>> > Please send more details on your ?trivial? case where the values >>> are different. It could be that we are not setting the function norm >>> properly on early exit from the solvers. >>> > > >>> > > Another phenomenon I noticed with this is that KSP in SNES squeeze >>> my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. >>> When I use KSPGetOperators/rhs/solutions, I found that the operator is >>> 25-by-25, and the rhs and solution is with length 25. Do you have any clue >>> on what triggered this? To my surprise, when I output the Jacobian inside >>> the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct >>> numerical entries. Why does the operator obtained from KSP is different and >>> got rows eliminated? These rows got eliminated have only one entries per >>> row, but the rhs in that row is not zero. Eliminating these rows would give >>> wrong solutions. >>> > >>> > Hmm, we never squeeze out rows/columns from the Jacobian. The size >>> of the Jacobian set with SNESSetJacobian() should always match that >>> obtained with KSPGetOperators() on the linear system. 
Please send more >>> details on how you get this. Are you calling the KSPGetOperators() inside a >>> preconditioner where the the preconditioner has chopped up the operator? >>> > >>> > Barry >>> > >>> > > >>> > > Thank you. >>> > > >>> > > Xiangdong >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley >>> wrote: >>> > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong >>> wrote: >>> > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo >>> *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize >>> the array f. Zero the array f solved the problem and gave consistent result. >>> > > >>> > > Just curious, why does not petsc initialize the array f to zero by >>> default inside petsc when passing the f array to FormFunctionLocal? >>> > > >>> > > If you directly set entires, you might not want us to spend the time >>> writing those zeros. >>> > > >>> > > I have another quick question about the array x passed to >>> FormFunctionLocal. If I want to know the which x is evaluated, how can I >>> output x in a vector format? Currently, I created a global vector vecx and >>> a local vector vecx_local, get the array of vecx_local_array, copy the x to >>> vecx_local_array, scatter to global vecx and output vecx. Is there a quick >>> way to restore the array x to a vector and output? >>> > > >>> > > I cannot think of a better way than that. >>> > > >>> > > Matt >>> > > >>> > > Thank you. >>> > > >>> > > Best, >>> > > Xiangdong >>> > > >>> > > >>> > > >>> > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith >>> wrote: >>> > > >>> > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: >>> > > >>> > > > Hello everyone, >>> > > > >>> > > > When I run snes program, >>> > > >>> > > ^^^^ what SNES program?? >>> > > >>> > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this >>> norm is different from residue norm (even if solving F(x)=0) >>> > > >>> > > Please send the full output where you see this. >>> > > >>> > > > and also differ from norm of the Jacobian. What is the definition >>> of this "SNES Function Norm?? >>> > > >>> > > The SNES Function Norm as printed by PETSc is suppose to the >>> 2-norm of F(x) - b (where b is usually zero) and this is also the same >>> thing as the ?residue norm? >>> > > >>> > > Barry >>> > > >>> > > > >>> > > > Thank you. >>> > > > >>> > > > Best, >>> > > > Xiangdong >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > -- >>> > > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > > -- Norbert Wiener >>> > > >>> > >>> > >>> > >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Sun May 4 15:45:16 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 4 May 2014 15:45:16 -0500 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> Message-ID: You will need to work your way through the code in SNESSolve_VINEWTONRSLS() which is in src/snes/impls/vi/rs http://www.mcs.anl.gov/petsc/petsc-dev/src/snes/impls/vi/rs/virs.c.html It is not a trivial algorithm but it is reasonably straightforward. Barry On May 4, 2014, at 3:32 PM, Xiangdong wrote: > > > > On Fri, May 2, 2014 at 4:10 PM, Matthew Knepley wrote: > On Fri, May 2, 2014 at 12:53 PM, Xiangdong wrote: > On Thu, May 1, 2014 at 10:21 PM, Barry Smith wrote: > > On May 1, 2014, at 9:12 PM, Xiangdong wrote: > > > I came up with a simple example to demonstrate this "eliminating row" behavior. It happens when the solution x to the linearized equation Ax=b is out of the bound set by SNESVISetVariableBounds(); > > > > In the attached example, I use snes to solve a simple function x-b=0. When you run it, it outputs the matrix as 25 rows, while the real Jacobian should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be -inf, it will output 50 rows for the Jacobian. In the first case, the norm given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. > > > > In solving the nonlinear equations, it is likely that the solution of the linearized equation is out of bound, but then we can reset the out-of-bound solution to be lower or upper bound instead of eliminating the variables (the rows). Any suggestions on doing this in petsc? > > This is what PETSc is doing. It is using the "active set method". Variables that are at their bounds are ?frozen? and then a smaller system is solved (involving just the variables not a that bounds) to get the next search direction. Based on the next search direction some of the variables on the bounds may be unfrozen and other variables may be frozen. There is a huge literature on this topic. See for example our buddies ? Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, New York: Springer-Verlag. ISBN 978-0-387-30303-1.. > > The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return different values with the SNES VI solver. If you care about the function value just use SNESGetFunction() and compute the norm that way. We are eliminating SNESGetFunctionNorm() from PETSc because it is problematic. > > If you think the SNES VI solver is actually not solving the problem, or giving the wrong answer than please send us the entire simple code and we?ll see if we have introduced any bugs into our solver. But note that the linear system being of different sizes is completely normal for the solver. > > Here is an example I do not quite understand. I have a simple function F(X) = [x1+x2+100 ; x1-x2; x3+x4; x3-x4+50]. If I solve this F(X)=0 with no constraint, the exact solution is [x1=-50; x2=-50; x3=-25; x4=25]. > > If I specify the constraint as x2>=0 and x4>=0, I expect the solution from one iteration of SNES is [-50, 0,-25,25], since the constraint on x2 should be active now. However, the petsc outputs the solution [-50, 0, 0, 0]. Since x3 and x4 does not violate the constraint, why does the solution of x3 and x4 change (note that x3 and x3 are decoupled from x1 and x2)? 
In this case, the matrix obtained from KSPGetOperators is only 2-by-2, so two variables or constraints are eliminated. > > This just finds a local solution to the constrained problem, and these need not be unique. > > This might be trivial, but could you please briefly explain how I can obtain the same answer petsc outputs by hand calculation for this simple four-variable example. What I do not understand is when the constraints get activated and the variables get eliminated (matrix reduced from 4-by-4 to 2-by-2). > > For example, as I mentioned before, when I added x2>=0 and x4>=0 to the unconstrained problem, why did two of these constraints get eliminated (matrix from KSPGetOperators is 2-by-2)? In particular, the exact solution x4=25 does not violate the newly added x4>=0, but still got changed (x4 is actually decoupled from x1 and x2; changes/constraints on x1 and x2 should not affect x4). > > > > Matt > > Another thing I noticed is that constraints x2>-1e-7 and x4>-1e-7 gives solution [-50,-1e-7,-25,25]; however, constraints x2>-1e-8 and x4>-1e-8 gives the solution [-50,0,0,0]. > > Is there a small constant number in petsc that caused the jump of the solution when I simply change the lower bound from -1e-7 to -1e-8? > > Thanks for your time and help. > > Best, > Xiangdong > > > Attached please find the simple 130-line code showing this behavior. Simply commenting the line 37 to remove the constraints and modifying line 92 to change the lower bounds of x2 and x4. > > Thanks a lot for your time and help. > > Best, > Xiangdong > > > > > > Barry > > > > > > Thank you. > > > > Best, > > Xiangdong > > > > P.S. If we change the lower bound of field u (line 124) to be zero, then the Jacobian matrix is set to be NULL by petsc. > > > > > > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: > > Here is the order of functions I called: > > > > DMDACreate3d(); > > > > SNESCreate(); > > > > SNESSetDM(); (DM with dof=2); > > > > DMSetApplicationContext(); > > > > DMDASNESSetFunctionLocal(); > > > > SNESVISetVariableBounds(); > > > > DMDASNESetJacobianLocal(); > > > > SNESSetFromOptions(); > > > > SNESSolve(); > > > > SNESGetKSP(); > > KSPGetSolution(); > > KSPGetRhs(); > > KSPGetOperators(); //get operator kspA, kspx, kspb; > > > > SNESGetFunctionNorm(); ==> get norm fnorma; > > SNESGetFunction(); VecNorm(); ==> get norm fnormb; > > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the solution x and get norm fnormc; > > > > Inside the FormJacobianLocal(), I output the matrix jac and preB; > > > > I found that fnorma matches the default SNES monitor output "SNES Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx obtained by snescomputefunction, mat jac and preB are length 50 or 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. > > > > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; x(2:2:end)=0; It seems that it completely ignores the second degree of freedom (setting it to zero). I saw this for (close to) constant initial guess, while for heterogeneous initial guess, it works fine and the matrix and vector size are correct, and the solution is correct. So this eliminating row behavior seems to be initial guess dependent. > > > > I saw this even if I use snes_fd, so we can rule out the possibility of wrong Jacobian. For the FormFunctionLocal(), I checked via SNESComputeFunction and it output the correct vector of residue. > > > > Are the orders of function calls correct? > > > > Thank you. 
> > > > Xiangdong > > > > > > > > > > > > > > > > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > > > > On May 1, 2014, at 10:32 AM, Xiangdong wrote: > > > > > Under what condition, SNESGetFunctionNorm() will output different results from SENEGetFunction + VecNorm (with NORM_2)? > > > > > > For most of my test cases, it is the same. However, when I have some special (trivial) initial guess to the SNES problem, I see different norms. > > > > Please send more details on your ?trivial? case where the values are different. It could be that we are not setting the function norm properly on early exit from the solvers. > > > > > > Another phenomenon I noticed with this is that KSP in SNES squeeze my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, and the rhs and solution is with length 25. Do you have any clue on what triggered this? To my surprise, when I output the Jacobian inside the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct numerical entries. Why does the operator obtained from KSP is different and got rows eliminated? These rows got eliminated have only one entries per row, but the rhs in that row is not zero. Eliminating these rows would give wrong solutions. > > > > Hmm, we never squeeze out rows/columns from the Jacobian. The size of the Jacobian set with SNESSetJacobian() should always match that obtained with KSPGetOperators() on the linear system. Please send more details on how you get this. Are you calling the KSPGetOperators() inside a preconditioner where the the preconditioner has chopped up the operator? > > > > Barry > > > > > > > > Thank you. > > > > > > Xiangdong > > > > > > > > > > > > > > > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley wrote: > > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize the array f. Zero the array f solved the problem and gave consistent result. > > > > > > Just curious, why does not petsc initialize the array f to zero by default inside petsc when passing the f array to FormFunctionLocal? > > > > > > If you directly set entires, you might not want us to spend the time writing those zeros. > > > > > > I have another quick question about the array x passed to FormFunctionLocal. If I want to know the which x is evaluated, how can I output x in a vector format? Currently, I created a global vector vecx and a local vector vecx_local, get the array of vecx_local_array, copy the x to vecx_local_array, scatter to global vecx and output vecx. Is there a quick way to restore the array x to a vector and output? > > > > > > I cannot think of a better way than that. > > > > > > Matt > > > > > > Thank you. > > > > > > Best, > > > Xiangdong > > > > > > > > > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith wrote: > > > > > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > > > > > Hello everyone, > > > > > > > > When I run snes program, > > > > > > ^^^^ what SNES program?? > > > > > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this norm is different from residue norm (even if solving F(x)=0) > > > > > > Please send the full output where you see this. > > > > > > > and also differ from norm of the Jacobian. What is the definition of this "SNES Function Norm?? 
> > > > > > The SNES Function Norm as printed by PETSc is suppose to the 2-norm of F(x) - b (where b is usually zero) and this is also the same thing as the ?residue norm? > > > > > > Barry > > > > > > > > > > > Thank you. > > > > > > > > Best, > > > > Xiangdong > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From francium87 at hotmail.com Mon May 5 07:25:22 2014 From: francium87 at hotmail.com (linjing bo) Date: Mon, 5 May 2014 12:25:22 +0000 Subject: [petsc-users] VecValidValues() reports NaN found Message-ID: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. 
The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 5 07:27:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 May 2014 07:27:52 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! > [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > It looks like the vector after preconditioner application is bad. What is the preconditioner? Matt > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. 
The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From francium87 at hotmail.com Mon May 5 07:56:39 2014 From: francium87 at hotmail.com (linjing bo) Date: Mon, 5 May 2014 12:56:39 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , Message-ID: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 5 08:12:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 May 2014 08:12:05 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: > I use JACOBI. The message showed is with JACOBI. > > > Wired situation is that the backtrack information shows the location is > before actually apply PC, so I guess the rhs vec is not changed at this > point. > > Another wired thing is : Because the original code is to complex. I write > out the A matrix in Ax=b, and write a small test code to read in this > matrix and solve it, no error showed. The KSP, PC are all set to be the > same. > > When I try to using ILU, more wired error happens, the backtrack info > shows it died in a Flops logging function: > 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of > range! > [2]PETSC ERROR: Cannot log negative > flops! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [2]PETSC ERROR: See docs/changes/index.html for recent > updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [2]PETSC ERROR: See docs/index.html for manual > pages. 
> [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:51:27 > 2014 > > [2]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [2]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > > [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [2]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > > > ------------------------------ > Date: Mon, 5 May 2014 07:27:52 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! > [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > > > It looks like the vector after preconditioner application is bad. What is > the preconditioner? 
> > Matt > > > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From francium87 at hotmail.com Mon May 5 08:15:38 2014 From: francium87 at hotmail.com (linjing bo) Date: Mon, 5 May 2014 13:15:38 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , Message-ID: Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From song.gao2 at mail.mcgill.ca Mon May 5 08:28:25 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Mon, 5 May 2014 09:28:25 -0400 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> Message-ID: Thanks for reply. What do you mean by a ?happier? state? I check the converged solution (the one which call kspgmressetrestart twice), the solution should be correct. I run with valgrind both codes (one call kspgmressetrestart once and another call kspgmressetrestart twice) Both of them have the errors: what does this mean? Thank you in advance. 
==7858== Conditional jump or move depends on uninitialised value(s) ==7858== at 0xE71DFB: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== ==7858== Conditional jump or move depends on uninitialised value(s) ==7858== at 0xE71E25: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== On Fri, May 2, 2014 at 7:25 PM, Barry Smith wrote: > > On May 2, 2014, at 5:29 PM, Song Gao wrote: > > > Thanks for your quick reply. What confused me is that why would the > code works fine if I reset the gmres restart number by recalling > kspgmressetrestart just before kspsolve? > > It isn?t really working. Something is going wrong (run with valgrind) > and setting that restart number and starting the solver just puts it in a > ?happier? state so it seems to make more progress. > > Barry > > > > > Sent from my iPhone > > > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > >> > >> > >> Your shell matrix is buggy in some way. Whenever the residual norm > jumps like crazy at a restart it means that something is wrong with the > operator. > >> > >> Barry > >> > >>> On May 2, 2014, at 4:41 PM, Song Gao wrote: > >>> > >>> Dear PETSc users, > >>> > >>> I'm solving a linear system in KSP and trying to setup the solver in > codes. But I feel strange because my codes don't converge unless I call > KSPGMRESSetRestart twice. 
> >>> > >>> My codes looks like > >>> > >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, > DIFFERENT_NONZERO_PATTERN, ierpetsc ) > >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) > >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > >>> call KSPSetUp ( pet_solv, ierpetsc ) > >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, > ierpetsc ) ! n_local is one > >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) > >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding > this line, the codes converge > >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > >>> > >>> runing with 1 CPU WITHOUT the line with red color and the codes don't > converge > >>> > >>> runtime options: -ksp_monitor_true_residual -ksp_view > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm > 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm > 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > >>> ....... > >>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm > 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm > 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm > 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > >>> > >>> KSP Object: 1 MPI processes > >>> type: gmres > >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >>> GMRES: happy breakdown tolerance 1e-30 > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using PRECONDITIONED norm type for convergence test > >>> PC Object: 1 MPI processes > >>> type: asm > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > >>> [0] number of local blocks = 1 > >>> Local solve info for each block is in the following KSP and PC > objects: > >>> - - - - - - - - - - - - - - - - - - > >>> [0] local block number 0, size = 22905 > >>> KSP Object: (sub_) 1 MPI processes > >>> type: preonly > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using NONE norm type for convergence test > >>> PC Object: (sub_) 1 MPI processes > >>> type: jacobi > >>> linear system matrix = precond matrix: > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> - - - - - - - - - - - - - - - - - - > >>> linear system matrix followed by preconditioner 
matrix: > >>> Matrix Object: 1 MPI processes > >>> type: shell > >>> rows=22905, cols=22905 > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> WARNING: zero iteration in iterative solver > >>> > >>> runing with 1 CPU WITH the line with red color and the codes converge > >>> > >>> runtime options: -ksp_monitor_true_residual -ksp_view > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm > 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm > 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm > 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm > 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm > 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > >>> ............ > >>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm > 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm > 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm > 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > >>> KSP Object: 1 MPI processes > >>> type: gmres > >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >>> GMRES: happy breakdown tolerance 1e-30 > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using PRECONDITIONED norm type for convergence test > >>> PC Object: 1 MPI processes > >>> type: asm > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > >>> [0] number of local blocks = 1 > >>> Local solve info for each block is in the following KSP and PC > objects: > >>> - - - - - - - - - - - - - - - - - - > >>> [0] local block number 0, size = 22905 > >>> KSP Object: (sub_) 1 MPI processes > >>> type: preonly > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using NONE norm type for convergence test > >>> PC Object: (sub_) 1 MPI processes > >>> type: jacobi > >>> linear system matrix = precond matrix: > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> - - - - - - - - - - - - - - - - - - > >>> linear system matrix followed by preconditioner matrix: > >>> Matrix Object: 1 MPI processes > >>> type: shell > >>> rows=22905, cols=22905 > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> WARNING: zero 
iteration in iterative solver > >>> > >>> > >>> What would be my error here? Thank you. > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 5 09:03:01 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 5 May 2014 09:03:01 -0500 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> Message-ID: <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> If you run valgrind with the debug version of the libraries it will provide more information about the line numbers where the problem occurred, etc. recommend doing that. Either your initial solution or right hand side has garbage in it or the wrong blas may be being linked in. But there is definitely a problem Barry On May 5, 2014, at 8:28 AM, Song Gao wrote: > Thanks for reply. What do you mean by a ?happier? state? I check the converged solution (the one which call kspgmressetrestart twice), the solution should be correct. > > I run with valgrind both codes (one call kspgmressetrestart once and another call kspgmressetrestart twice) > Both of them have the errors: what does this mean? Thank you in advance. > ==7858== Conditional jump or move depends on uninitialised value(s) > ==7858== at 0xE71DFB: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== > ==7858== Conditional jump or move depends on uninitialised value(s) > ==7858== at 0xE71E25: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x4ECD46: flowsol_ng_ (in 
/home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== > > > On Fri, May 2, 2014 at 7:25 PM, Barry Smith wrote: > > On May 2, 2014, at 5:29 PM, Song Gao wrote: > > > Thanks for your quick reply. What confused me is that why would the code works fine if I reset the gmres restart number by recalling kspgmressetrestart just before kspsolve? > > It isn?t really working. Something is going wrong (run with valgrind) and setting that restart number and starting the solver just puts it in a ?happier? state so it seems to make more progress. > > Barry > > > > > Sent from my iPhone > > > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > >> > >> > >> Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. > >> > >> Barry > >> > >>> On May 2, 2014, at 4:41 PM, Song Gao wrote: > >>> > >>> Dear PETSc users, > >>> > >>> I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. > >>> > >>> My codes looks like > >>> > >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) > >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) > >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > >>> call KSPSetUp ( pet_solv, ierpetsc ) > >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one > >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) > >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge > >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > >>> > >>> runing with 1 CPU WITHOUT the line with red color and the codes don't converge > >>> > >>> runtime options: -ksp_monitor_true_residual -ksp_view > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > >>> ....... 
> >>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > >>> > >>> KSP Object: 1 MPI processes > >>> type: gmres > >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > >>> GMRES: happy breakdown tolerance 1e-30 > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using PRECONDITIONED norm type for convergence test > >>> PC Object: 1 MPI processes > >>> type: asm > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > >>> [0] number of local blocks = 1 > >>> Local solve info for each block is in the following KSP and PC objects: > >>> - - - - - - - - - - - - - - - - - - > >>> [0] local block number 0, size = 22905 > >>> KSP Object: (sub_) 1 MPI processes > >>> type: preonly > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using NONE norm type for convergence test > >>> PC Object: (sub_) 1 MPI processes > >>> type: jacobi > >>> linear system matrix = precond matrix: > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> - - - - - - - - - - - - - - - - - - > >>> linear system matrix followed by preconditioner matrix: > >>> Matrix Object: 1 MPI processes > >>> type: shell > >>> rows=22905, cols=22905 > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> WARNING: zero iteration in iterative solver > >>> > >>> runing with 1 CPU WITH the line with red color and the codes converge > >>> > >>> runtime options: -ksp_monitor_true_residual -ksp_view > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > >>> ............ 
> >>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > >>> KSP Object: 1 MPI processes > >>> type: gmres > >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > >>> GMRES: happy breakdown tolerance 1e-30 > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using PRECONDITIONED norm type for convergence test > >>> PC Object: 1 MPI processes > >>> type: asm > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > >>> [0] number of local blocks = 1 > >>> Local solve info for each block is in the following KSP and PC objects: > >>> - - - - - - - - - - - - - - - - - - > >>> [0] local block number 0, size = 22905 > >>> KSP Object: (sub_) 1 MPI processes > >>> type: preonly > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using NONE norm type for convergence test > >>> PC Object: (sub_) 1 MPI processes > >>> type: jacobi > >>> linear system matrix = precond matrix: > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> - - - - - - - - - - - - - - - - - - > >>> linear system matrix followed by preconditioner matrix: > >>> Matrix Object: 1 MPI processes > >>> type: shell > >>> rows=22905, cols=22905 > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> WARNING: zero iteration in iterative solver > >>> > >>> > >>> What would be my error here? Thank you. > >> > > From asmund.ervik at ntnu.no Mon May 5 09:24:57 2014 From: asmund.ervik at ntnu.no (=?ISO-8859-1?Q?=C5smund_Ervik?=) Date: Mon, 05 May 2014 16:24:57 +0200 Subject: [petsc-users] Question with setting up KSP solver parameters In-Reply-To: References: Message-ID: <53679F39.8080905@ntnu.no> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I would suggest also running valgrind with the additional option "--track-origins=yes" which will show you where the uninitialized values are coming from. Regards, ?smund On 05. mai 2014 16:03, petsc-users-request at mcs.anl.gov wrote: > From: Song Gao To: Barry Smith > Cc: petsc-users , > Dario Isola Subject: Re: [petsc-users] > Question with setting up KSP solver parameters. Message-ID: > > > Content-Type: text/plain; charset="utf-8" > > Thanks for reply. What do you mean by a ?happier? state? I check > the converged solution (the one which call kspgmressetrestart > twice), the solution should be correct. > > I run with valgrind both codes (one call kspgmressetrestart once > and another call kspgmressetrestart twice) Both of them have the > errors: what does this mean? Thank you in > advance. 
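As a side note on the --track-origins=yes suggestion above: the kind of report quoted in this thread can be reproduced with a few lines of standalone C (purely an illustration, unrelated to the poster's application), which may make the message easier to read.

/* Build with "gcc -g -O0 repro.c" and run under
   "valgrind --track-origins=yes ./a.out".  Memcheck prints
   "Conditional jump or move depends on uninitialised value(s)" at the
   if-statement and, with --track-origins, "Uninitialised value was
   created by a stack allocation" pointing at the declaration of x.    */
#include <stdio.h>

static int positive(void)
{
  int x;               /* never initialised                            */
  if (x > 0) return 1; /* branch on undefined data: this is what is flagged */
  return 0;
}

int main(void)
{
  printf("%d\n", positive());
  return 0;
}

In the traces quoted in this thread the origin appears to lie in the MKL configuration-file lookup (SearchPath, called from mkl_cfg_file inside DDOT) rather than in PETSc or the application's own routines.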
==7858== Conditional jump or move depends on uninitialised > value(s) ==7858== at 0xE71DFB: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0xE71640: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0xE6E068: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x73281A: VecNorm_Seq (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x730BF4: VecNormalize (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x7BC5A8: KSPSolve_GMRES (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0xB8A06E: KSPSolve (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x7B659F: kspsolve_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x5EAE84: petsolv_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x4ECD46: flowsol_ng_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x507E4E: iterprc_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x51D1B4: solnalg_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTZ585AAoJED+FDAHgGz19680H+wZNRjtbaBMCIAkWTaCjql3N dwMMvBoPezDJFuVBOgBhns+no3FMBFP4lHqcZGEMJasxZSvS4pHXAgXpDZtL+amw WLwK3mEPUMXYq/yT1AW/9HyT9fQx1738jOoKlRIaEL1SR+PfSzL8fnsi/ERpz2Tb hs4wwPczEazRWMzyA3w8jDcWdGamcfO3fXPg6vAXMEG2TTjNUuwivV9tLEBeOy6v GQypVm6hIvgE8fLsmTwYs3fnh8sZrw5QDV67fDnGSe3RrSc3jXbznu/j0JRtj0Rr fRAj4S2kT/NYF07W2I7BeE1kvscgAbupmhAIpkSS8g/vBZRlKir/F7OOanYlP4k= =XUTa -----END PGP SIGNATURE----- From song.gao2 at mail.mcgill.ca Mon May 5 10:00:34 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Mon, 5 May 2014 11:00:34 -0400 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> Message-ID: Thank you. Runing with mpirun -np 1 valgrind --track-origins=yes ~/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG -ksp_monitor_true_residual -ksp_view gives the following information. 
==8222== Conditional jump or move depends on uninitialised value(s) ==8222== at 0x216E9A7: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x216E1EC: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x216AC14: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: VecNormalize (rvector.c:261) ==8222== by 0x1405507: KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised value was created by a stack allocation ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== ==8222== Conditional jump or move depends on uninitialised value(s) ==8222== at 0x216E9D1: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x216E1EC: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x216AC14: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: VecNormalize (rvector.c:261) ==8222== by 0x1405507: KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised value was created by a stack allocation ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== On Mon, May 5, 2014 at 10:03 AM, Barry Smith wrote: > > If you run valgrind with the debug version of the libraries it will > provide more information about the line numbers where the problem occurred, > etc. recommend doing that. > > Either your initial solution or right hand side has garbage in it or > the wrong blas may be being linked in. But there is definitely a problem > > Barry > > On May 5, 2014, at 8:28 AM, Song Gao wrote: > > > Thanks for reply. What do you mean by a ?happier? state? I check the > converged solution (the one which call kspgmressetrestart twice), the > solution should be correct. > > > > I run with valgrind both codes (one call kspgmressetrestart once and > another call kspgmressetrestart twice) > > Both of them have the errors: what does this > mean? Thank you in advance. 
> > ==7858== Conditional jump or move depends on uninitialised value(s) > > ==7858== at 0xE71DFB: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE71640: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE6E068: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x73281A: VecNorm_Seq (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x730BF4: VecNormalize (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xB8A06E: KSPSolve (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7B659F: kspsolve_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x5EAE84: petsolv_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x4ECD46: flowsol_ng_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x507E4E: iterprc_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x51D1B4: solnalg_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== > > ==7858== Conditional jump or move depends on uninitialised value(s) > > ==7858== at 0xE71E25: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE71640: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE6E068: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x73281A: VecNorm_Seq (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x730BF4: VecNormalize (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xB8A06E: KSPSolve (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7B659F: kspsolve_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x5EAE84: petsolv_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x4ECD46: flowsol_ng_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x507E4E: iterprc_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x51D1B4: solnalg_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== > > > > > > On Fri, May 2, 2014 at 7:25 PM, Barry Smith wrote: > > > > On May 2, 2014, at 5:29 PM, Song Gao wrote: > > > > > Thanks for your quick reply. What confused me is that why would the > code works fine if I reset the gmres restart number by recalling > kspgmressetrestart just before kspsolve? > > > > It isn?t really working. Something is going wrong (run with valgrind) > and setting that restart number and starting the solver just puts it in a > ?happier? state so it seems to make more progress. > > > > Barry > > > > > > > > Sent from my iPhone > > > > > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > > >> > > >> > > >> Your shell matrix is buggy in some way. Whenever the residual norm > jumps like crazy at a restart it means that something is wrong with the > operator. 
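Alongside Barry's point just above about the shell matrix, here is a minimal sketch of the usual matrix-free MatShell setup. The names MyShellMult and ShellCtx are invented for illustration; MatCreateShell, MatShellSetOperation, MatShellGetContext and MATOP_MULT are the actual PETSc interface. The MATOP_MULT callback has to define every entry of the output vector on every call; an output that is only partially written is one common way for garbage to reach GMRES and to surface at a restart, when the residual is recomputed through the operator.

#include <petscksp.h>

/* Invented context: whatever data the matrix-free product needs.      */
typedef struct {
  Mat P;                        /* e.g. an assembled approximation     */
} ShellCtx;

/* MATOP_MULT callback: y must be completely defined for every x.      */
static PetscErrorCode MyShellMult(Mat A, Vec x, Vec y)
{
  ShellCtx       *ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(A, (void **)&ctx);CHKERRQ(ierr);
  ierr = MatMult(ctx->P, x, y);CHKERRQ(ierr);  /* placeholder action   */
  PetscFunctionReturn(0);
}

/* Wiring it up, in outline:
     MatCreateShell(comm, nlocal, nlocal, N, N, &ctx, &Ashell);
     MatShellSetOperation(Ashell, MATOP_MULT, (void (*)(void))MyShellMult);
     KSPSetOperators(ksp, Ashell, P, DIFFERENT_NONZERO_PATTERN);          */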
> > >> > > >> Barry > > >> > > >>> On May 2, 2014, at 4:41 PM, Song Gao > wrote: > > >>> > > >>> Dear PETSc users, > > >>> > > >>> I'm solving a linear system in KSP and trying to setup the solver in > codes. But I feel strange because my codes don't converge unless I call > KSPGMRESSetRestart twice. > > >>> > > >>> My codes looks like > > >>> > > >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, > DIFFERENT_NONZERO_PATTERN, ierpetsc ) > > >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > > >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > > >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > > >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) > > >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > > >>> call KSPSetUp ( pet_solv, ierpetsc ) > > >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, > pet_solv_sub, ierpetsc ) ! n_local is one > > >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > > >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > > >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > > >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) > > >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! > adding this line, the codes converge > > >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > > >>> > > >>> runing with 1 CPU WITHOUT the line with red color and the codes > don't converge > > >>> > > >>> runtime options: -ksp_monitor_true_residual -ksp_view > > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm > 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > > >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm > 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > > >>> ....... 
> > >>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm > 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > > >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm > 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > > >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm > 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > > >>> > > >>> KSP Object: 1 MPI processes > > >>> type: gmres > > >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > >>> GMRES: happy breakdown tolerance 1e-30 > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using PRECONDITIONED norm type for convergence test > > >>> PC Object: 1 MPI processes > > >>> type: asm > > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > > >>> [0] number of local blocks = 1 > > >>> Local solve info for each block is in the following KSP and PC > objects: > > >>> - - - - - - - - - - - - - - - - - - > > >>> [0] local block number 0, size = 22905 > > >>> KSP Object: (sub_) 1 MPI processes > > >>> type: preonly > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using NONE norm type for convergence test > > >>> PC Object: (sub_) 1 MPI processes > > >>> type: jacobi > > >>> linear system matrix = precond matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> - - - - - - - - - - - - - - - - - - > > >>> linear system matrix followed by preconditioner matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: shell > > >>> rows=22905, cols=22905 > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> WARNING: zero iteration in iterative solver > > >>> > > >>> runing with 1 CPU WITH the line with red color and the codes > converge > > >>> > > >>> runtime options: -ksp_monitor_true_residual -ksp_view > > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm > 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > > >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm > 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > > >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm > 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > > >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm > 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > > >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm > 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > > >>> ............ 
> > >>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm > 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > > >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm > 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > > >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm > 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > > >>> KSP Object: 1 MPI processes > > >>> type: gmres > > >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > >>> GMRES: happy breakdown tolerance 1e-30 > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using PRECONDITIONED norm type for convergence test > > >>> PC Object: 1 MPI processes > > >>> type: asm > > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > > >>> [0] number of local blocks = 1 > > >>> Local solve info for each block is in the following KSP and PC > objects: > > >>> - - - - - - - - - - - - - - - - - - > > >>> [0] local block number 0, size = 22905 > > >>> KSP Object: (sub_) 1 MPI processes > > >>> type: preonly > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using NONE norm type for convergence test > > >>> PC Object: (sub_) 1 MPI processes > > >>> type: jacobi > > >>> linear system matrix = precond matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> - - - - - - - - - - - - - - - - - - > > >>> linear system matrix followed by preconditioner matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: shell > > >>> rows=22905, cols=22905 > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> WARNING: zero iteration in iterative solver > > >>> > > >>> > > >>> What would be my error here? Thank you. > > >> > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 5 11:27:47 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 5 May 2014 11:27:47 -0500 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> Message-ID: <44FAE78D-1C12-45FD-A4CF-0AFF50F7352F@mcs.anl.gov> Please email configure.log and make.log for this build. Barry On May 5, 2014, at 10:00 AM, Song Gao wrote: > Thank you. > Runing with > mpirun -np 1 valgrind --track-origins=yes ~/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG -ksp_monitor_true_residual -ksp_view > > gives the following information. 
> > ==8222== Conditional jump or move depends on uninitialised value(s) > ==8222== at 0x216E9A7: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216E1EC: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216AC14: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) > ==8222== by 0x126F431: VecNorm (rvector.c:166) > ==8222== by 0x127039D: VecNormalize (rvector.c:261) > ==8222== by 0x1405507: KSPGMRESCycle (gmres.c:127) > ==8222== by 0x140695F: KSPSolve_GMRES (gmres.c:231) > ==8222== by 0x1BEEF66: KSPSolve (itfunc.c:446) > ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) > ==8222== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) > ==8222== Uninitialised value was created by a stack allocation > ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== > ==8222== Conditional jump or move depends on uninitialised value(s) > ==8222== at 0x216E9D1: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216E1EC: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216AC14: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) > ==8222== by 0x126F431: VecNorm (rvector.c:166) > ==8222== by 0x127039D: VecNormalize (rvector.c:261) > ==8222== by 0x1405507: KSPGMRESCycle (gmres.c:127) > ==8222== by 0x140695F: KSPSolve_GMRES (gmres.c:231) > ==8222== by 0x1BEEF66: KSPSolve (itfunc.c:446) > ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) > ==8222== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) > ==8222== Uninitialised value was created by a stack allocation > ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== > > > > On Mon, May 5, 2014 at 10:03 AM, Barry Smith wrote: > > If you run valgrind with the debug version of the libraries it will provide more information about the line numbers where the problem occurred, etc. recommend doing that. > > Either your initial solution or right hand side has garbage in it or the wrong blas may be being linked in. But there is definitely a problem > > Barry > > On May 5, 2014, at 8:28 AM, Song Gao wrote: > > > Thanks for reply. What do you mean by a ?happier? state? I check the converged solution (the one which call kspgmressetrestart twice), the solution should be correct. > > > > I run with valgrind both codes (one call kspgmressetrestart once and another call kspgmressetrestart twice) > > Both of them have the errors: what does this mean? Thank you in advance. 
> > ==7858== Conditional jump or move depends on uninitialised value(s) > > ==7858== at 0xE71DFB: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== > > ==7858== Conditional jump or move depends on uninitialised value(s) > > ==7858== at 0xE71E25: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== > > > > > > On Fri, May 2, 2014 at 7:25 PM, Barry Smith wrote: > > > > On May 2, 2014, at 5:29 PM, Song Gao wrote: > > > > > Thanks for your quick reply. What confused me is that why would the code works fine if I reset the gmres restart number by recalling kspgmressetrestart just before kspsolve? > > > > It isn?t really working. Something is going wrong (run with valgrind) and setting that restart number and starting the solver just puts it in a ?happier? state so it seems to make more progress. > > > > Barry > > > > > > > > Sent from my iPhone > > > > > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > > >> > > >> > > >> Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. > > >> > > >> Barry > > >> > > >>> On May 2, 2014, at 4:41 PM, Song Gao wrote: > > >>> > > >>> Dear PETSc users, > > >>> > > >>> I'm solving a linear system in KSP and trying to setup the solver in codes. 
But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. > > >>> > > >>> My codes looks like > > >>> > > >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) > > >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > > >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > > >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > > >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) > > >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > > >>> call KSPSetUp ( pet_solv, ierpetsc ) > > >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one > > >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > > >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > > >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > > >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) > > >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge > > >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > > >>> > > >>> runing with 1 CPU WITHOUT the line with red color and the codes don't converge > > >>> > > >>> runtime options: -ksp_monitor_true_residual -ksp_view > > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > > >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > > >>> ....... 
> > >>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > > >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > > >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > > >>> > > >>> KSP Object: 1 MPI processes > > >>> type: gmres > > >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > >>> GMRES: happy breakdown tolerance 1e-30 > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using PRECONDITIONED norm type for convergence test > > >>> PC Object: 1 MPI processes > > >>> type: asm > > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > > >>> [0] number of local blocks = 1 > > >>> Local solve info for each block is in the following KSP and PC objects: > > >>> - - - - - - - - - - - - - - - - - - > > >>> [0] local block number 0, size = 22905 > > >>> KSP Object: (sub_) 1 MPI processes > > >>> type: preonly > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using NONE norm type for convergence test > > >>> PC Object: (sub_) 1 MPI processes > > >>> type: jacobi > > >>> linear system matrix = precond matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> - - - - - - - - - - - - - - - - - - > > >>> linear system matrix followed by preconditioner matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: shell > > >>> rows=22905, cols=22905 > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> WARNING: zero iteration in iterative solver > > >>> > > >>> runing with 1 CPU WITH the line with red color and the codes converge > > >>> > > >>> runtime options: -ksp_monitor_true_residual -ksp_view > > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > > >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > > >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > > >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > > >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > > >>> ............ 
> > >>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > > >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > > >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > > >>> KSP Object: 1 MPI processes > > >>> type: gmres > > >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > >>> GMRES: happy breakdown tolerance 1e-30 > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using PRECONDITIONED norm type for convergence test > > >>> PC Object: 1 MPI processes > > >>> type: asm > > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > > >>> [0] number of local blocks = 1 > > >>> Local solve info for each block is in the following KSP and PC objects: > > >>> - - - - - - - - - - - - - - - - - - > > >>> [0] local block number 0, size = 22905 > > >>> KSP Object: (sub_) 1 MPI processes > > >>> type: preonly > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using NONE norm type for convergence test > > >>> PC Object: (sub_) 1 MPI processes > > >>> type: jacobi > > >>> linear system matrix = precond matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> - - - - - - - - - - - - - - - - - - > > >>> linear system matrix followed by preconditioner matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: shell > > >>> rows=22905, cols=22905 > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> WARNING: zero iteration in iterative solver > > >>> > > >>> > > >>> What would be my error here? Thank you. > > >> > > > > > > From asmund.ervik at ntnu.no Mon May 5 12:33:52 2014 From: asmund.ervik at ntnu.no (=?windows-1252?Q?=C5smund_Ervik?=) Date: Mon, 05 May 2014 19:33:52 +0200 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> Message-ID: <5367CB80.1040203@ntnu.no> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 You should also compile your own code "fensapngnew" with debug flags, specifically "-g" for gcc/gfortran or icc/ifort. This tells the compiler to generate the information necessary for gdb or valgrind to do their job. Then you would get more detailed information than just ''' ==8222== Uninitialised value was created by a stack allocation ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ''' E.g. 
when I have an error, I get a full backtrace with line numbers in my source code, like: ''' ==5277== Uninitialised value was created by a heap allocation ==5277== at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==5277== by 0x6BE3FE: __navier_stokes_MOD_rhs_ns (navier_stokes.f90:59) ==5277== by 0x6C712A: __rhs_MOD_dfdt_1phase (rhs.f90:109) ==5277== by 0x4EF52C: __rk_MOD_forward_euler (rk.f90:2168) ==5277== by 0x642764: __rk_wrapper_MOD_rk_step (rk_wrapper.f90:313) ==5277== by 0x7FA8B8: MAIN__ (meph.F90:179) ==5277== by 0x7FC5B9: main (meph.F90:2) ==5277== ''' On 05. mai 2014 17:00, Song Gao wrote: > Thank you. Runing with mpirun -np 1 valgrind --track-origins=yes > ~/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG > -ksp_monitor_true_residual -ksp_view > > gives the following information. > > ==8222== Conditional jump or move depends on uninitialised > value(s) ==8222== at 0x216E9A7: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216E1EC: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216AC14: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by > 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: > VecNormalize (rvector.c:261) ==8222== by 0x1405507: > KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: > KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve > (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by > 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised > value was created by a stack allocation ==8222== at 0x216E97F: > SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== ==8222== Conditional jump or move depends on uninitialised > value(s) ==8222== at 0x216E9D1: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216E1EC: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216AC14: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by > 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: > VecNormalize (rvector.c:261) ==8222== by 0x1405507: > KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: > KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve > (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by > 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised > value was created by a stack allocation ==8222== at 0x216E97F: > SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== > > > > On Mon, May 5, 2014 at 10:03 AM, Barry Smith > wrote: > >> >> If you run valgrind with the debug version of the libraries it >> will provide more information about the line numbers where the >> problem occurred, etc. recommend doing that. >> >> Either your initial solution or right hand side has garbage in it >> or the wrong blas may be being linked in. But there is definitely >> a problem >> >> Barry >> >> On May 5, 2014, at 8:28 AM, Song Gao >> wrote: >> >>> Thanks for reply. What do you mean by a ?happier? state? I >>> check the >> converged solution (the one which call kspgmressetrestart twice), >> the solution should be correct. 
>>> >>> I run with valgrind both codes (one call kspgmressetrestart >>> once and >> another call kspgmressetrestart twice) >>> Both of them have the errors: what >>> does this >> mean? Thank you in advance. >>> ==7858== Conditional jump or move depends on uninitialised >>> value(s) ==7858== at 0xE71DFB: SearchPath (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xE71640: mkl_cfg_file (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xE6E068: DDOT (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x73281A: VecNorm_Seq (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x730BF4: VecNormalize (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x7BC5A8: KSPSolve_GMRES (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xB8A06E: KSPSolve (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x7B659F: kspsolve_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x5EAE84: petsolv_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x4ECD46: flowsol_ng_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x507E4E: iterprc_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x51D1B4: solnalg_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== ==7858== Conditional jump or move depends on >>> uninitialised value(s) ==7858== at 0xE71E25: SearchPath (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xE71640: mkl_cfg_file (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xE6E068: DDOT (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x73281A: VecNorm_Seq (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x730BF4: VecNormalize (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x7BC5A8: KSPSolve_GMRES (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xB8A06E: KSPSolve (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x7B659F: kspsolve_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x5EAE84: petsolv_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x4ECD46: flowsol_ng_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x507E4E: iterprc_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x51D1B4: solnalg_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== >>> >>> >>> On Fri, May 2, 2014 at 7:25 PM, Barry Smith >>> wrote: >>> >>> On May 2, 2014, at 5:29 PM, Song Gao >>> wrote: >>> >>>> Thanks for your quick reply. What confused me is that why >>>> would the >> code works fine if I reset the gmres restart number by recalling >> kspgmressetrestart just before kspsolve? >>> >>> It isn?t really working. Something is going wrong (run with >>> valgrind) >> and setting that restart number and starting the solver just puts >> it in a ?happier? state so it seems to make more progress. >>> >>> Barry >>> >>>> >>>> Sent from my iPhone >>>> >>>>> On May 2, 2014, at 6:03 PM, "Barry Smith" >>>>> wrote: >>>>> >>>>> >>>>> Your shell matrix is buggy in some way. 
Whenever the >>>>> residual norm >> jumps like crazy at a restart it means that something is wrong >> with the operator. >>>>> >>>>> Barry >>>>> >>>>>> On May 2, 2014, at 4:41 PM, Song Gao >>>>>> >> wrote: >>>>>> >>>>>> Dear PETSc users, >>>>>> >>>>>> I'm solving a linear system in KSP and trying to setup >>>>>> the solver in >> codes. But I feel strange because my codes don't converge unless >> I call KSPGMRESSetRestart twice. >>>>>> >>>>>> My codes looks like >>>>>> >>>>>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, >>>>>> pet_matp, >> DIFFERENT_NONZERO_PATTERN, ierpetsc ) >>>>>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) call >>>>>> KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) call >>>>>> KSPGetPC ( pet_solv, pet_precon, ierpetsc ) call >>>>>> PCSetType ( pet_precon, 'asm', ierpetsc ) call >>>>>> PCASMSetOverlap ( pet_precon, 1, ierpetsc ) call KSPSetUp >>>>>> ( pet_solv, ierpetsc ) call PCASMGetSubKSP ( pet_precon, >>>>>> n_local, first_local, >> pet_solv_sub, ierpetsc ) ! n_local is one >>>>>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc >>>>>> ) call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) >>>>>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) >>>>>> call KSPSetFromOptions ( pet_solv, ierpetsc ) call >>>>>> KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) >>>>>> ! >> adding this line, the codes converge >>>>>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc >>>>>> ) >>>>>> >>>>>> runing with 1 CPU WITHOUT the line with red color and >>>>>> the codes >> don't converge >>>>>> >>>>>> runtime options: -ksp_monitor_true_residual -ksp_view 0 >>>>>> KSP preconditioned resid norm 6.585278940829e+00 true >>>>>> resid norm >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 1 KSP preconditioned resid norm 6.585278219510e+00 true >>>>>> resid norm >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 2 KSP preconditioned resid norm 2.198638170622e+00 true >>>>>> resid norm >> 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 >>>>>> 3 KSP preconditioned resid norm 1.599896387215e+00 true >>>>>> resid norm >> 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 >>>>>> ....... 
28 KSP preconditioned resid norm >>>>>> 4.478466011191e-01 true resid norm >> 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 >>>>>> 29 KSP preconditioned resid norm 4.398129572260e-01 true >>>>>> resid norm >> 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 >>>>>> 30 KSP preconditioned resid norm 2.783227613716e+12 true >>>>>> resid norm >> 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 >>>>>> >>>>>> KSP Object: 1 MPI processes type: gmres GMRES: >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >>>>>> GMRES: happy breakdown tolerance 1e-30 maximum >>>>>> iterations=10000, initial guess is zero tolerances: >>>>>> relative=1e-05, absolute=1e-50, divergence=10000 left >>>>>> preconditioning using PRECONDITIONED norm type for >>>>>> convergence test PC Object: 1 MPI processes type: asm >>>>>> Additive Schwarz: total subdomain blocks = 1, amount of >>>>>> overlap = 1 Additive Schwarz: restriction/interpolation >>>>>> type - RESTRICT [0] number of local blocks = 1 Local >>>>>> solve info for each block is in the following KSP and PC >> objects: >>>>>> - - - - - - - - - - - - - - - - - - [0] local block >>>>>> number 0, size = 22905 KSP Object: (sub_) 1 MPI >>>>>> processes type: preonly maximum iterations=10000, initial >>>>>> guess is zero tolerances: relative=1e-05, >>>>>> absolute=1e-50, divergence=10000 left preconditioning >>>>>> using NONE norm type for convergence test PC Object: >>>>>> (sub_) 1 MPI processes type: jacobi linear system >>>>>> matrix = precond matrix: Matrix Object: 1 MPI >>>>>> processes type: seqbaij rows=22905, cols=22905, bs=5 >>>>>> total: nonzeros=785525, allocated nonzeros=785525 total >>>>>> number of mallocs used during MatSetValues calls =0 block >>>>>> size is 5 - - - - - - - - - - - - - - - - - - linear >>>>>> system matrix followed by preconditioner matrix: Matrix >>>>>> Object: 1 MPI processes type: shell rows=22905, >>>>>> cols=22905 Matrix Object: 1 MPI processes type: >>>>>> seqbaij rows=22905, cols=22905, bs=5 total: >>>>>> nonzeros=785525, allocated nonzeros=785525 total number >>>>>> of mallocs used during MatSetValues calls =0 block size >>>>>> is 5 WARNING: zero iteration in iterative solver >>>>>> >>>>>> runing with 1 CPU WITH the line with red color and the >>>>>> codes >> converge >>>>>> >>>>>> runtime options: -ksp_monitor_true_residual -ksp_view 0 >>>>>> KSP preconditioned resid norm 6.585278940829e+00 true >>>>>> resid norm >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 1 KSP preconditioned resid norm 2.566248171026e+00 true >>>>>> resid norm >> 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 >>>>>> 2 KSP preconditioned resid norm 1.410418402651e+00 true >>>>>> resid norm >> 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 >>>>>> 3 KSP preconditioned resid norm 9.665409287757e-01 true >>>>>> resid norm >> 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 >>>>>> 4 KSP preconditioned resid norm 4.469486152454e-01 true >>>>>> resid norm >> 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 >>>>>> 5 KSP preconditioned resid norm 2.474889829653e-01 true >>>>>> resid norm >> 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 >>>>>> ............ 
24 KSP preconditioned resid norm >>>>>> 9.518780877620e-05 true resid norm >> 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 >>>>>> 25 KSP preconditioned resid norm 6.837876679998e-05 true >>>>>> resid norm >> 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 >>>>>> 26 KSP preconditioned resid norm 4.864361942316e-05 true >>>>>> resid norm >> 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 >>>>>> KSP Object: 1 MPI processes type: gmres GMRES: >>>>>> restart=29, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >>>>>> GMRES: happy breakdown tolerance 1e-30 maximum >>>>>> iterations=10000, initial guess is zero tolerances: >>>>>> relative=1e-05, absolute=1e-50, divergence=10000 left >>>>>> preconditioning using PRECONDITIONED norm type for >>>>>> convergence test PC Object: 1 MPI processes type: asm >>>>>> Additive Schwarz: total subdomain blocks = 1, amount of >>>>>> overlap = 1 Additive Schwarz: restriction/interpolation >>>>>> type - RESTRICT [0] number of local blocks = 1 Local >>>>>> solve info for each block is in the following KSP and PC >> objects: >>>>>> - - - - - - - - - - - - - - - - - - [0] local block >>>>>> number 0, size = 22905 KSP Object: (sub_) 1 MPI >>>>>> processes type: preonly maximum iterations=10000, initial >>>>>> guess is zero tolerances: relative=1e-05, >>>>>> absolute=1e-50, divergence=10000 left preconditioning >>>>>> using NONE norm type for convergence test PC Object: >>>>>> (sub_) 1 MPI processes type: jacobi linear system >>>>>> matrix = precond matrix: Matrix Object: 1 MPI >>>>>> processes type: seqbaij rows=22905, cols=22905, bs=5 >>>>>> total: nonzeros=785525, allocated nonzeros=785525 total >>>>>> number of mallocs used during MatSetValues calls =0 block >>>>>> size is 5 - - - - - - - - - - - - - - - - - - linear >>>>>> system matrix followed by preconditioner matrix: Matrix >>>>>> Object: 1 MPI processes type: shell rows=22905, >>>>>> cols=22905 Matrix Object: 1 MPI processes type: >>>>>> seqbaij rows=22905, cols=22905, bs=5 total: >>>>>> nonzeros=785525, allocated nonzeros=785525 total number >>>>>> of mallocs used during MatSetValues calls =0 block size >>>>>> is 5 WARNING: zero iteration in iterative solver >>>>>> >>>>>> >>>>>> What would be my error here? Thank you. >>>>> >>> >>> >> >> > -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTZ8t/AAoJED+FDAHgGz19xocH/i2A2Ccw3BTypkyicy6dqAQE wgVqukXnBI//adXHSe60uQBtL4OmjMiGOSt/Egye6N2QF/29yMzNdwTmHw6DZSRC C8yyPpVMEOPwB2WED0ui+IGSYq6JglOVplT5lCf2T99Y/gZNiqugCNz0ydnA5KnP 9W0O1yO2/2xgE4bMEibVhFIPsaXKGyTLv1ZjZLgdnbnTYFbCZqJk+9lVOOpQlqBZ mrzE+9GjO+0+BucEwI4Ekw4b9PI/Yctl0JW7zx+ZmviRsXRF4L3aO2SeFm1fBSnh XPIreXBNB6vyAmPFBx9TJZHQFucJIsFLHrlrea6onePKBx4Eg3JcpOlX8GdJr5w= =MkKg -----END PGP SIGNATURE----- From song.gao2 at mail.mcgill.ca Mon May 5 14:06:02 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Mon, 5 May 2014 15:06:02 -0400 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: <5367CB80.1040203@ntnu.no> References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> <5367CB80.1040203@ntnu.no> Message-ID: Thank you. Barry, Please see the attached log files. Asmund, my apologize, I forget to make clean and recompile the code. But I still don't see the full backtrace. 
I checked the compilation log and all source files are compiled with -g flag. ==9475== Conditional jump or move depends on uninitialised value(s) ==9475== at 0x216F00F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x216E854: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x216B27C: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x1291768: VecNorm_Seq (bvec2.c:239) ==9475== by 0x126FA99: VecNorm (rvector.c:166) ==9475== by 0x1270A05: VecNormalize (rvector.c:261) ==9475== by 0x1405B6F: KSPGMRESCycle (gmres.c:127) ==9475== by 0x1406FC7: KSPSolve_GMRES (gmres.c:231) ==9475== by 0x1BEF5CE: KSPSolve (itfunc.c:446) ==9475== by 0x13F9C50: kspsolve_ (itfuncf.c:219) ==9475== by 0xC5EB87: petsolv_ (PETSOLV.F:375) ==9475== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==9475== Uninitialised value was created by a stack allocation ==9475== at 0x216EFE7: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== ==9475== Conditional jump or move depends on uninitialised value(s) ==9475== at 0x216F039: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x216E854: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x216B27C: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x1291768: VecNorm_Seq (bvec2.c:239) ==9475== by 0x126FA99: VecNorm (rvector.c:166) ==9475== by 0x1270A05: VecNormalize (rvector.c:261) ==9475== by 0x1405B6F: KSPGMRESCycle (gmres.c:127) ==9475== by 0x1406FC7: KSPSolve_GMRES (gmres.c:231) ==9475== by 0x1BEF5CE: KSPSolve (itfunc.c:446) ==9475== by 0x13F9C50: kspsolve_ (itfuncf.c:219) ==9475== by 0xC5EB87: petsolv_ (PETSOLV.F:375) ==9475== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==9475== Uninitialised value was created by a stack allocation ==9475== at 0x216EFE7: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 2.198671238042e+00 true resid norm 1.365127786174e-01 ||r(i)||/||b|| 1.419158195200e+01 3 KSP preconditioned resid norm 1.599921867950e+00 true resid norm 1.445986203309e-01 ||r(i)||/||b|| 1.503216908596e+01 ................... On Mon, May 5, 2014 at 1:33 PM, ?smund Ervik wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > You should also compile your own code "fensapngnew" with debug flags, > specifically "-g" for gcc/gfortran or icc/ifort. This tells the > compiler to generate the information necessary for gdb or valgrind to > do their job. Then you would get more detailed information than just > > ''' > ==8222== Uninitialised value was created by a stack allocation > ==8222== at 0x216E97F: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ''' > > E.g. 
when I have an error, I get a full backtrace with line numbers in > my source code, like: > ''' > ==5277== Uninitialised value was created by a heap allocation > ==5277== at 0x4C277AB: malloc (in > /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) > ==5277== by 0x6BE3FE: __navier_stokes_MOD_rhs_ns (navier_stokes.f90:59) > ==5277== by 0x6C712A: __rhs_MOD_dfdt_1phase (rhs.f90:109) > ==5277== by 0x4EF52C: __rk_MOD_forward_euler (rk.f90:2168) > ==5277== by 0x642764: __rk_wrapper_MOD_rk_step (rk_wrapper.f90:313) > ==5277== by 0x7FA8B8: MAIN__ (meph.F90:179) > ==5277== by 0x7FC5B9: main (meph.F90:2) > ==5277== > ''' > > > On 05. mai 2014 17:00, Song Gao wrote: > > Thank you. Runing with mpirun -np 1 valgrind --track-origins=yes > > ~/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG > > -ksp_monitor_true_residual -ksp_view > > > > gives the following information. > > > > ==8222== Conditional jump or move depends on uninitialised > > value(s) ==8222== at 0x216E9A7: SearchPath (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x216E1EC: mkl_cfg_file (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x216AC14: DDOT (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by > > 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: > > VecNormalize (rvector.c:261) ==8222== by 0x1405507: > > KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: > > KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve > > (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by > > 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised > > value was created by a stack allocation ==8222== at 0x216E97F: > > SearchPath (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== ==8222== Conditional jump or move depends on uninitialised > > value(s) ==8222== at 0x216E9D1: SearchPath (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x216E1EC: mkl_cfg_file (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x216AC14: DDOT (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by > > 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: > > VecNormalize (rvector.c:261) ==8222== by 0x1405507: > > KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: > > KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve > > (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by > > 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised > > value was created by a stack allocation ==8222== at 0x216E97F: > > SearchPath (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== > > > > > > > > On Mon, May 5, 2014 at 10:03 AM, Barry Smith > > wrote: > > > >> > >> If you run valgrind with the debug version of the libraries it > >> will provide more information about the line numbers where the > >> problem occurred, etc. recommend doing that. > >> > >> Either your initial solution or right hand side has garbage in it > >> or the wrong blas may be being linked in. But there is definitely > >> a problem > >> > >> Barry > >> > >> On May 5, 2014, at 8:28 AM, Song Gao > >> wrote: > >> > >>> Thanks for reply. 
What do you mean by a ?happier? state? I > >>> check the > >> converged solution (the one which call kspgmressetrestart twice), > >> the solution should be correct. > >>> > >>> I run with valgrind both codes (one call kspgmressetrestart > >>> once and > >> another call kspgmressetrestart twice) > >>> Both of them have the errors: what > >>> does this > >> mean? Thank you in advance. > >>> ==7858== Conditional jump or move depends on uninitialised > >>> value(s) ==7858== at 0xE71DFB: SearchPath (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xE71640: mkl_cfg_file (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xE6E068: DDOT (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x73281A: VecNorm_Seq (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x730BF4: VecNormalize (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x7BC5A8: KSPSolve_GMRES (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xB8A06E: KSPSolve (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x7B659F: kspsolve_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x5EAE84: petsolv_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x4ECD46: flowsol_ng_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x507E4E: iterprc_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x51D1B4: solnalg_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== ==7858== Conditional jump or move depends on > >>> uninitialised value(s) ==7858== at 0xE71E25: SearchPath (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xE71640: mkl_cfg_file (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xE6E068: DDOT (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x73281A: VecNorm_Seq (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x730BF4: VecNormalize (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x7BC5A8: KSPSolve_GMRES (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xB8A06E: KSPSolve (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x7B659F: kspsolve_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x5EAE84: petsolv_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x4ECD46: flowsol_ng_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x507E4E: iterprc_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x51D1B4: solnalg_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== > >>> > >>> > >>> On Fri, May 2, 2014 at 7:25 PM, Barry Smith > >>> wrote: > >>> > >>> On May 2, 2014, at 5:29 PM, Song Gao > >>> wrote: > >>> > >>>> Thanks for your quick reply. What confused me is that why > >>>> would the > >> code works fine if I reset the gmres restart number by recalling > >> kspgmressetrestart just before kspsolve? > >>> > >>> It isn?t really working. 
Something is going wrong (run with > >>> valgrind) > >> and setting that restart number and starting the solver just puts > >> it in a ?happier? state so it seems to make more progress. > >>> > >>> Barry > >>> > >>>> > >>>> Sent from my iPhone > >>>> > >>>>> On May 2, 2014, at 6:03 PM, "Barry Smith" > >>>>> wrote: > >>>>> > >>>>> > >>>>> Your shell matrix is buggy in some way. Whenever the > >>>>> residual norm > >> jumps like crazy at a restart it means that something is wrong > >> with the operator. > >>>>> > >>>>> Barry > >>>>> > >>>>>> On May 2, 2014, at 4:41 PM, Song Gao > >>>>>> > >> wrote: > >>>>>> > >>>>>> Dear PETSc users, > >>>>>> > >>>>>> I'm solving a linear system in KSP and trying to setup > >>>>>> the solver in > >> codes. But I feel strange because my codes don't converge unless > >> I call KSPGMRESSetRestart twice. > >>>>>> > >>>>>> My codes looks like > >>>>>> > >>>>>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, > >>>>>> pet_matp, > >> DIFFERENT_NONZERO_PATTERN, ierpetsc ) > >>>>>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) call > >>>>>> KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) call > >>>>>> KSPGetPC ( pet_solv, pet_precon, ierpetsc ) call > >>>>>> PCSetType ( pet_precon, 'asm', ierpetsc ) call > >>>>>> PCASMSetOverlap ( pet_precon, 1, ierpetsc ) call KSPSetUp > >>>>>> ( pet_solv, ierpetsc ) call PCASMGetSubKSP ( pet_precon, > >>>>>> n_local, first_local, > >> pet_solv_sub, ierpetsc ) ! n_local is one > >>>>>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc > >>>>>> ) call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > >>>>>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > >>>>>> call KSPSetFromOptions ( pet_solv, ierpetsc ) call > >>>>>> KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) > >>>>>> ! > >> adding this line, the codes converge > >>>>>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc > >>>>>> ) > >>>>>> > >>>>>> runing with 1 CPU WITHOUT the line with red color and > >>>>>> the codes > >> don't converge > >>>>>> > >>>>>> runtime options: -ksp_monitor_true_residual -ksp_view 0 > >>>>>> KSP preconditioned resid norm 6.585278940829e+00 true > >>>>>> resid norm > >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>> 1 KSP preconditioned resid norm 6.585278219510e+00 true > >>>>>> resid norm > >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>> 2 KSP preconditioned resid norm 2.198638170622e+00 true > >>>>>> resid norm > >> 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > >>>>>> 3 KSP preconditioned resid norm 1.599896387215e+00 true > >>>>>> resid norm > >> 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > >>>>>> ....... 
28 KSP preconditioned resid norm > >>>>>> 4.478466011191e-01 true resid norm > >> 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > >>>>>> 29 KSP preconditioned resid norm 4.398129572260e-01 true > >>>>>> resid norm > >> 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > >>>>>> 30 KSP preconditioned resid norm 2.783227613716e+12 true > >>>>>> resid norm > >> 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > >>>>>> > >>>>>> KSP Object: 1 MPI processes type: gmres GMRES: > >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt > >> Orthogonalization with no iterative refinement > >>>>>> GMRES: happy breakdown tolerance 1e-30 maximum > >>>>>> iterations=10000, initial guess is zero tolerances: > >>>>>> relative=1e-05, absolute=1e-50, divergence=10000 left > >>>>>> preconditioning using PRECONDITIONED norm type for > >>>>>> convergence test PC Object: 1 MPI processes type: asm > >>>>>> Additive Schwarz: total subdomain blocks = 1, amount of > >>>>>> overlap = 1 Additive Schwarz: restriction/interpolation > >>>>>> type - RESTRICT [0] number of local blocks = 1 Local > >>>>>> solve info for each block is in the following KSP and PC > >> objects: > >>>>>> - - - - - - - - - - - - - - - - - - [0] local block > >>>>>> number 0, size = 22905 KSP Object: (sub_) 1 MPI > >>>>>> processes type: preonly maximum iterations=10000, initial > >>>>>> guess is zero tolerances: relative=1e-05, > >>>>>> absolute=1e-50, divergence=10000 left preconditioning > >>>>>> using NONE norm type for convergence test PC Object: > >>>>>> (sub_) 1 MPI processes type: jacobi linear system > >>>>>> matrix = precond matrix: Matrix Object: 1 MPI > >>>>>> processes type: seqbaij rows=22905, cols=22905, bs=5 > >>>>>> total: nonzeros=785525, allocated nonzeros=785525 total > >>>>>> number of mallocs used during MatSetValues calls =0 block > >>>>>> size is 5 - - - - - - - - - - - - - - - - - - linear > >>>>>> system matrix followed by preconditioner matrix: Matrix > >>>>>> Object: 1 MPI processes type: shell rows=22905, > >>>>>> cols=22905 Matrix Object: 1 MPI processes type: > >>>>>> seqbaij rows=22905, cols=22905, bs=5 total: > >>>>>> nonzeros=785525, allocated nonzeros=785525 total number > >>>>>> of mallocs used during MatSetValues calls =0 block size > >>>>>> is 5 WARNING: zero iteration in iterative solver > >>>>>> > >>>>>> runing with 1 CPU WITH the line with red color and the > >>>>>> codes > >> converge > >>>>>> > >>>>>> runtime options: -ksp_monitor_true_residual -ksp_view 0 > >>>>>> KSP preconditioned resid norm 6.585278940829e+00 true > >>>>>> resid norm > >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>> 1 KSP preconditioned resid norm 2.566248171026e+00 true > >>>>>> resid norm > >> 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > >>>>>> 2 KSP preconditioned resid norm 1.410418402651e+00 true > >>>>>> resid norm > >> 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > >>>>>> 3 KSP preconditioned resid norm 9.665409287757e-01 true > >>>>>> resid norm > >> 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > >>>>>> 4 KSP preconditioned resid norm 4.469486152454e-01 true > >>>>>> resid norm > >> 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > >>>>>> 5 KSP preconditioned resid norm 2.474889829653e-01 true > >>>>>> resid norm > >> 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > >>>>>> ............ 
24 KSP preconditioned resid norm > >>>>>> 9.518780877620e-05 true resid norm > >> 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > >>>>>> 25 KSP preconditioned resid norm 6.837876679998e-05 true > >>>>>> resid norm > >> 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > >>>>>> 26 KSP preconditioned resid norm 4.864361942316e-05 true > >>>>>> resid norm > >> 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > >>>>>> KSP Object: 1 MPI processes type: gmres GMRES: > >>>>>> restart=29, using Classical (unmodified) Gram-Schmidt > >> Orthogonalization with no iterative refinement > >>>>>> GMRES: happy breakdown tolerance 1e-30 maximum > >>>>>> iterations=10000, initial guess is zero tolerances: > >>>>>> relative=1e-05, absolute=1e-50, divergence=10000 left > >>>>>> preconditioning using PRECONDITIONED norm type for > >>>>>> convergence test PC Object: 1 MPI processes type: asm > >>>>>> Additive Schwarz: total subdomain blocks = 1, amount of > >>>>>> overlap = 1 Additive Schwarz: restriction/interpolation > >>>>>> type - RESTRICT [0] number of local blocks = 1 Local > >>>>>> solve info for each block is in the following KSP and PC > >> objects: > >>>>>> - - - - - - - - - - - - - - - - - - [0] local block > >>>>>> number 0, size = 22905 KSP Object: (sub_) 1 MPI > >>>>>> processes type: preonly maximum iterations=10000, initial > >>>>>> guess is zero tolerances: relative=1e-05, > >>>>>> absolute=1e-50, divergence=10000 left preconditioning > >>>>>> using NONE norm type for convergence test PC Object: > >>>>>> (sub_) 1 MPI processes type: jacobi linear system > >>>>>> matrix = precond matrix: Matrix Object: 1 MPI > >>>>>> processes type: seqbaij rows=22905, cols=22905, bs=5 > >>>>>> total: nonzeros=785525, allocated nonzeros=785525 total > >>>>>> number of mallocs used during MatSetValues calls =0 block > >>>>>> size is 5 - - - - - - - - - - - - - - - - - - linear > >>>>>> system matrix followed by preconditioner matrix: Matrix > >>>>>> Object: 1 MPI processes type: shell rows=22905, > >>>>>> cols=22905 Matrix Object: 1 MPI processes type: > >>>>>> seqbaij rows=22905, cols=22905, bs=5 total: > >>>>>> nonzeros=785525, allocated nonzeros=785525 total number > >>>>>> of mallocs used during MatSetValues calls =0 block size > >>>>>> is 5 WARNING: zero iteration in iterative solver > >>>>>> > >>>>>> > >>>>>> What would be my error here? Thank you. > >>>>> > >>> > >>> > >> > >> > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.22 (GNU/Linux) > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ > > iQEcBAEBAgAGBQJTZ8t/AAoJED+FDAHgGz19xocH/i2A2Ccw3BTypkyicy6dqAQE > wgVqukXnBI//adXHSe60uQBtL4OmjMiGOSt/Egye6N2QF/29yMzNdwTmHw6DZSRC > C8yyPpVMEOPwB2WED0ui+IGSYq6JglOVplT5lCf2T99Y/gZNiqugCNz0ydnA5KnP > 9W0O1yO2/2xgE4bMEibVhFIPsaXKGyTLv1ZjZLgdnbnTYFbCZqJk+9lVOOpQlqBZ > mrzE+9GjO+0+BucEwI4Ekw4b9PI/Yctl0JW7zx+ZmviRsXRF4L3aO2SeFm1fBSnh > XPIreXBNB6vyAmPFBx9TJZHQFucJIsFLHrlrea6onePKBx4Eg3JcpOlX8GdJr5w= > =MkKg > -----END PGP SIGNATURE----- > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 2520564 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: make.log Type: text/x-log Size: 46399 bytes Desc: not available URL: From paulhuaizhang at gmail.com Mon May 5 20:14:44 2014 From: paulhuaizhang at gmail.com (huaibao zhang) Date: Mon, 5 May 2014 21:14:44 -0400 Subject: [petsc-users] a naive question about assembly Message-ID: <35D8669B-AF79-48EA-AF1D-CC8A21EEC4A0@gmail.com> Hello, I looked up the manual, but still felt quite confused about why have to do assembly. Does it have to do with parallelization? Since all of the processors are loading the data at the same time, they need to a pause before one can use the whole vector? See a piece of code: for (int c=0;c From knepley at gmail.com Mon May 5 20:18:27 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 May 2014 20:18:27 -0500 Subject: [petsc-users] a naive question about assembly In-Reply-To: <35D8669B-AF79-48EA-AF1D-CC8A21EEC4A0@gmail.com> References: <35D8669B-AF79-48EA-AF1D-CC8A21EEC4A0@gmail.com> Message-ID: On Mon, May 5, 2014 at 8:14 PM, huaibao zhang wrote: > > Hello, > > I looked up the manual, but still felt quite confused about why have to > do assembly. Does it have to do with parallelization? Since all of the > processors are loading the data at the same time, they need to a pause > before one can use the whole vector? > If one process sets a value owned by another process, it has to tell it. Matt > See a piece of code: > > for (int c=0;c row=grid[gid].myOffset+c; > value=p; > > VecSetValues(soln_n,1,&row,&value,INSERT_VALUES); > } > VecAssemblyBegin(soln_n); VecAssemblyEnd(soln_n); > > > Thanks, > Paul > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 5 20:24:50 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 May 2014 20:24:50 -0500 Subject: [petsc-users] a naive question about assembly In-Reply-To: References: <35D8669B-AF79-48EA-AF1D-CC8A21EEC4A0@gmail.com> Message-ID: On Mon, May 5, 2014 at 8:22 PM, huaibao zhang wrote: > > Matt, > > THanks for the answer. > I think my question is why have to do assembly? > In the piece of my code, 2 processors are inserting the dada to a public > vector soon_n. > This is explained very well in the book Using MPI: http://www.mcs.anl.gov/research/projects/mpi/usingmpi/ If process 0 inserts a value for process 1, then somehow process 1 must be told. That happens in VecAssembly(). Matt > Paul > > > > On May 5, 2014, at 9:18 PM, Matthew Knepley wrote: > > On Mon, May 5, 2014 at 8:14 PM, huaibao zhang wrote: > >> >> Hello, >> >> I looked up the manual, but still felt quite confused about why have to >> do assembly. Does it have to do with parallelization? Since all of the >> processors are loading the data at the same time, they need to a pause >> before one can use the whole vector? >> > > If one process sets a value owned by another process, it has to tell it. > > Matt > > >> See a piece of code: >> >> for (int c=0;c> row=grid[gid].myOffset+c; >> value=p; >> >> VecSetValues(soln_n,1,&row,&value,INSERT_VALUES); >> } >> VecAssemblyBegin(soln_n); VecAssemblyEnd(soln_n); >> >> >> Thanks, >> Paul >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From puesoek at uni-mainz.de Tue May 6 07:23:40 2014 From: puesoek at uni-mainz.de (=?iso-8859-1?Q?P=FCs=F6k=2C_Adina-Erika?=) Date: Tue, 6 May 2014 12:23:40 +0000 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() Message-ID: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> Hello! I was trying to implement some internal Dirichlet boundary conditions into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to create an internal block (let's say Dirichlet block) that moves with constant velocity within the domain (i.e. check all the dofs within the block and set the values accordingly to the desired motion). Ideally, this means to zero the rows and columns in VV, VP, PV corresponding to the dirichlet dofs and modify the corresponding rhs values. However, since we have submatrices and not a monolithic matrix A, we can choose to modify only VV and PV matrices. The global indices of the velocity points within the Dirichlet block are contained in the arrays rowid_array. What I want to point out is that the function MatZeroRowsColumnsIS() seems to create parallel artefacts, compared to MatZeroRowsIS() when run on more than 1 processor. Moreover, the results on 1 cpu are identical. See below the results of the test (the Dirichlet block is outlined in white) and the piece of the code involved where the 1) - 2) parts are the only difference. Thanks, Adina Pusok // Create an IS required by MatZeroRows() ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array,PETSC_COPY_VALUES,&isx); CHKERRQ(ierr); ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array,PETSC_COPY_VALUES,&isy); CHKERRQ(ierr); ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array,PETSC_COPY_VALUES,&isz); CHKERRQ(ierr); 1) /* ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr);*/ 2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); ierr = ISDestroy(&isx); CHKERRQ(ierr); ierr = ISDestroy(&isy); CHKERRQ(ierr); ierr = ISDestroy(&isz); CHKERRQ(ierr); Results (velocity) with MatZeroRowsColumnsIS(). 1cpu[cid:779A0024-2BBB-4F8D-AB25-114C5B3D111C at Geo.Uni-Mainz.DE] 4cpu[cid:9FAA7278-A3FE-4A5D-B7A1-05AAFCA43181 at Geo.Uni-Mainz.DE] Results (velocity) with MatZeroRowsIS(): 1cpu[cid:C0C73566-0D52-484C-A858-01A184C23597 at Geo.Uni-Mainz.DE] 4cpu[cid:7A9FD8A2-C2FC-41B3-88BF-11F77628E874 at Geo.Uni-Mainz.DE] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_1cpu_rows_columns.png Type: image/png Size: 28089 bytes Desc: r01_1cpu_rows_columns.png URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: r01_rows_columns.png Type: image/png Size: 28325 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_1cpu_rows.png Type: image/png Size: 28089 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_rows.png Type: image/png Size: 28045 bytes Desc: not available URL: From knepley at gmail.com Tue May 6 09:22:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 May 2014 09:22:52 -0500 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> Message-ID: On Tue, May 6, 2014 at 7:23 AM, Püsök, Adina-Erika wrote: > Hello! > > I was trying to implement some internal Dirichlet boundary conditions > into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to > create an internal block (let's say Dirichlet block) that moves with > constant velocity within the domain (i.e. check all the dofs within the > block and set the values accordingly to the desired motion). > > Ideally, this means to zero the rows and columns in VV, VP, PV > corresponding to the dirichlet dofs and modify the corresponding rhs > values. However, since we have submatrices and not a monolithic matrix A, > we can choose to modify only VV and PV matrices. > The global indices of the velocity points within the Dirichlet block are > contained in the arrays rowid_array. > > What I want to point out is that the function MatZeroRowsColumnsIS() > seems to create parallel artefacts, compared to MatZeroRowsIS() when run on > more than 1 processor. Moreover, the results on 1 cpu are identical. > See below the results of the test (the Dirichlet block is outlined in > white) and the piece of the code involved where the 1) - 2) parts are the > only difference. > I am assuming that you are showing the result of solving the equations. It would be more useful, and presumably just as easy, to say: a) Are the correct rows zeroed out? b) Is the diagonal element correct? c) Is the rhs value correct? d) Are the columns zeroed correctly? If we know where the problem is, it's easier to fix. For example, if the rhs values are correct and the rows are zeroed, then something is wrong with the solution procedure. Since ZeroRows() works and ZeroRowsColumns() does not, this is a distinct possibility.
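For example, a rough check of (a), (b), and (c) could look like the sketch below. It is untested and only meant as an illustration: it reuses VV_MAT, rhs, v_vv, x_push, and rowidx_array from your snippet, and it assumes the row r is owned by the calling process:

  PetscInt          r = rowidx_array[0], j, ncols;
  const PetscInt    *cols;
  const PetscScalar *vals;
  PetscScalar       rhsval;

  ierr = MatGetRow(VV_MAT, r, &ncols, &cols, &vals); CHKERRQ(ierr);
  for (j = 0; j < ncols; j++) {
    if (cols[j] == r) {                          /* (b) the diagonal should be the v_vv you passed in */
      if (PetscAbsScalar(vals[j] - v_vv) > 0.0) PetscPrintf(PETSC_COMM_SELF, "bad diagonal in row %D\n", r);
    } else if (PetscAbsScalar(vals[j]) > 0.0) {  /* (a) every other entry in the row should be zero */
      PetscPrintf(PETSC_COMM_SELF, "row %D not zeroed at column %D\n", r, cols[j]);
    }
  }
  ierr = MatRestoreRow(VV_MAT, r, &ncols, &cols, &vals); CHKERRQ(ierr);
  ierr = VecGetValues(rhs, 1, &r, &rhsval); CHKERRQ(ierr);   /* (c) rhs here should be v_vv times the corresponding x_push entry */

For (d), the simplest thing on a small test case is to dump the matrix with MatView() before and after the call and look at the column directly.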
Thanks, Matt > Thanks, > Adina Pusok > > // Create an IS required by MatZeroRows() > ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array, > PETSC_COPY_VALUES,&isx); CHKERRQ(ierr); > ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array, > PETSC_COPY_VALUES,&isy); CHKERRQ(ierr); > ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array, > PETSC_COPY_VALUES,&isz); CHKERRQ(ierr); > > 1) /* ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ( > ierr); > ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr > );*/ > > 2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr); > > ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > > ierr = ISDestroy(&isx); CHKERRQ(ierr); > ierr = ISDestroy(&isy); CHKERRQ(ierr); > ierr = ISDestroy(&isz); CHKERRQ(ierr); > > > Results (velocity) with MatZeroRowsColumnsIS(). > 1cpu 4cpu > > Results (velocity) with MatZeroRowsIS(): > 1cpu 4cpu > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_1cpu_rows_columns.png Type: image/png Size: 28089 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_1cpu_rows.png Type: image/png Size: 28089 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_rows_columns.png Type: image/png Size: 28325 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_rows.png Type: image/png Size: 28045 bytes Desc: not available URL: From francium87 at hotmail.com Tue May 6 22:53:46 2014 From: francium87 at hotmail.com (linjing bo) Date: Wed, 7 May 2014 03:53:46 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , , Message-ID: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... 
==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 OK, I will try it. Thanks for your advice. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI; the message shown is with JACOBI. The weird thing is that the backtrace shows the location is before the PC is actually applied, so I guess the rhs vec is not changed at this point. Another weird thing: because the original code is too complex, I wrote out the A matrix in Ax=b and wrote a small test code that reads in this matrix and solves it, and no error showed. The KSP and PC are all set to be the same. When I try using ILU, an even weirder error appears; the backtrace shows it died in a flops-logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages.
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Zafer.Leylek at student.adfa.edu.au Tue May 6 23:39:38 2014 From: Zafer.Leylek at student.adfa.edu.au (Zafer Leylek) Date: Wed, 7 May 2014 04:39:38 +0000 Subject: [petsc-users] MatGetMumpsRINFOG() Message-ID: <996B4E35EA834745A31426395DB0BBA01DFD960C@ADFAPWEXMBX02.ad.adfa.edu.au> Hi, I am trying to get mumps to return the matrix determinant. I have set the ICNTL option using: MatMumpsSetIcntl(A,33,1); and can view the determinant using PCView(pc, PETSC_VIEWER_STDOUT_WORLD); I need to use the determinant in my code. Is there a way I can get petsc to return this parameter. If not, is it possible to implement the MatGetMumpsRINFOG() as suggested in: http://lists.mcs.anl.gov/pipermail/petsc-users/2011-September/010225.html King Regards ZL -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Wed May 7 04:01:53 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Wed, 7 May 2014 19:01:53 +1000 Subject: [petsc-users] Inserting -nan+iG at matrix entry problem Message-ID: Hi,all I got a "Inserting -nan+iG error" at function MatSetValues. My code like this: I first use code below to change a double into PETScScalar (I am using Complex version). *for(i=0;i From rupp at iue.tuwien.ac.at Wed May 7 05:08:00 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Wed, 7 May 2014 12:08:00 +0200 Subject: [petsc-users] Inserting -nan+iG at matrix entry problem In-Reply-To: References: Message-ID: <536A0600.6020005@iue.tuwien.ac.at> Hi, from your description it sounds like this might be a memory corruption issue. Please run your code through valgrind first. If this doesn't show any errors, please send more context (sources, Makefile, etc.). We can only guess what e.g. the type of 'temp' is or whether the correct header files get picked up. 
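For concreteness, a minimal sketch of the kind of conversion-and-insert helper being described. The original loop is truncated in the post, so the helper name fill_block, the PetscMalloc staging buffer, and the n*n sizing of temp are assumptions rather than the poster's actual code; the point is only that, in a complex-scalar build, the value array handed to MatSetValues() must hold n*n PetscScalar entries laid out row by row, and an undersized or wrongly typed buffer is exactly the kind of corruption valgrind would flag.

--------------------------- sketch (assumed names) ---------------------------
#include <petscmat.h>

/* Stage n*n real values into a PetscScalar buffer before MatSetValues(). */
static PetscErrorCode fill_block(Mat A,PetscInt n,const PetscInt rows[],
                                 const PetscInt cols[],const double val[])
{
  PetscErrorCode ierr;
  PetscScalar    *temp;
  PetscInt       i;

  ierr = PetscMalloc(n*n*sizeof(PetscScalar),&temp);CHKERRQ(ierr);
  for (i=0; i<n*n; i++) temp[i] = val[i];   /* real -> complex, imaginary part 0 */
  /* MatSetValues reads n*n entries from temp, one per (row,col) pair */
  ierr = MatSetValues(A,n,rows,n,cols,temp,INSERT_VALUES);CHKERRQ(ierr);
  ierr = PetscFree(temp);CHKERRQ(ierr);
  return 0;
}
-------------------------------------------------------------------------------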
Best regards, Karli On 05/07/2014 11:01 AM, ??? wrote: > Hi,all > > I got a "Inserting -nan+iG error" at function MatSetValues. > > My code like this: > > I first use code below to change a double into PETScScalar (I am using > Complex version). > *for(i=0;i Then I use code below to insert values into matrix. > *ierr = MatSetValues(A,n,Conlumn_ptr,n,Ai,temp,INSERT_VALUES);* > > Here is how problem happens: > > I compile my PETSc code into a .so lib and test it with a simple matrix > and*it passed*. So I link it with the other part of my program. > > However, it keeps telling me > * > Inserting -nan+iG at matrix entry (2,3)!* > > The (2,3) is zero actually, and I could print it with std::cerr which > tells me it is zero. The other part of my program,with which generates > actual matrix I will deal, is correct.(I could use ARPACK with it.) > > I was confused about why PETSc recognize a zero into -nan. In my simple > test, there is also zero entry, at (0,0) however. For the other part is > compiled itself, I guess there might be some problem with compiling > options. But I have no idea about it. Could anybody help me? > > > Guoxi From knepley at gmail.com Wed May 7 05:52:13 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 May 2014 05:52:13 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: > The Valgrind shows memory leak in memalign() called by KSPSetup and > PCSetup. Is that normal? > Did you call KSPDestroy()? Matt > > > --------------------------------------------------------------------------------------------- > ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) > ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > ... 
> > ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) > ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 > (aijfact.c:1655) > ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) > ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) > ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) > ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) > ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > > ------------------------------------------------------------------------------------------------- > > > ------------------------------ > From: francium87 at hotmail.com > To: knepley at gmail.com > CC: petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] VecValidValues() reports NaN found > Date: Mon, 5 May 2014 13:15:38 +0000 > > Ok, I will try it . Thanks for your advise. > > ------------------------------ > Date: Mon, 5 May 2014 08:12:05 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: > > I use JACOBI. The message showed is with JACOBI. > > > Wired situation is that the backtrack information shows the location is > before actually apply PC, so I guess the rhs vec is not changed at this > point. > > Another wired thing is : Because the original code is to complex. I write > out the A matrix in Ax=b, and write a small test code to read in this > matrix and solve it, no error showed. The KSP, PC are all set to be the > same. > > When I try to using ILU, more wired error happens, the backtrack info > shows it died in a Flops logging function: > > > 1) Run in serial until it works > > 2) It looks like you have memory overwriting problems. Run with valgrind > > Matt > > > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of > range! > [2]PETSC ERROR: Cannot log negative > flops! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [2]PETSC ERROR: See docs/changes/index.html for recent > updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [2]PETSC ERROR: See docs/index.html for manual > pages. 
> [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:51:27 > 2014 > > [2]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [2]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > > [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [2]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > > > ------------------------------ > Date: Mon, 5 May 2014 07:27:52 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! > [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > > > It looks like the vector after preconditioner application is bad. What is > the preconditioner? 
> > Matt > > > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Wed May 7 08:03:07 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Wed, 7 May 2014 23:03:07 +1000 Subject: [petsc-users] Is there any method to call multiple mpi run inside program Message-ID: Hi, all I use SLEPc as part of my program, which means I compile it as .so library. I want my program work like this: execute serially, reach Point A, parallel solving, then serially again. I know I could use mpirun -np 4 in terminal to call the whole program, but this will let the serial part be executed 4 times. What I want is only call mpi at the eigensolving part. Is there any function that could achieve something like that? Thanks a lot. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 7 08:16:52 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 07 May 2014 07:16:52 -0600 Subject: [petsc-users] Is there any method to call multiple mpi run inside program In-Reply-To: References: Message-ID: <87eh05pw3f.fsf@jedbrown.org> ??? writes: > Hi, all > > I use SLEPc as part of my program, which means I compile it as .so library. > > I want my program work like this: execute serially, reach Point A, parallel > solving, then serially again. > > I know I could use mpirun -np 4 in terminal to call the whole program, > but this will let the serial part be executed 4 times. This may not be bad, but you can MPI_Comm_rank(MPI_COMM_WORLD,&rank), if (rank == 0) { ... do the serial stuff ...}. 
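A minimal self-contained sketch of that pattern. The routine do_serial_work() is a placeholder for the poster's serial phases, and the eigensolve itself is omitted; only the rank test and barriers are the point here.

--------------------------- sketch (assumed names) ---------------------------
#include <stdio.h>
#include <petscsys.h>

/* Placeholder for the serial pre-/post-processing in the real application. */
static void do_serial_work(const char *phase)
{
  printf("serial phase '%s' running on rank 0 only\n",phase);
}

int main(int argc,char **argv)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);

  if (!rank) do_serial_work("setup");          /* serial section, rank 0 only */
  ierr = MPI_Barrier(PETSC_COMM_WORLD);CHKERRQ(ierr);

  /* ... parallel eigensolve on all ranks would go here ... */

  ierr = MPI_Barrier(PETSC_COMM_WORLD);CHKERRQ(ierr);
  if (!rank) do_serial_work("postprocessing"); /* serial again */

  ierr = PetscFinalize();
  return ierr;
}
-------------------------------------------------------------------------------

Launched with mpiexec -n 4, every rank runs the binary, but only rank 0 executes the guarded sections while the other ranks wait at the barriers.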
> What I want is only call mpi at the eigensolving part. Is there any
> function that could achieve something like that?

There is MPI_Comm_spawn, but it's sort of a mess for resource
management/portability problems, so I would recommend just using mpiexec.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL:

From hzhang at mcs.anl.gov  Wed May  7 09:53:24 2014
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Wed, 7 May 2014 09:53:24 -0500
Subject: [petsc-users] MatGetMumpsRINFOG()
In-Reply-To: <16ccafbcb0c941b09a063946350a6688@GEORGE.anl.gov>
References: <16ccafbcb0c941b09a063946350a6688@GEORGE.anl.gov>
Message-ID:

Zafer:
Sure, I can add MatGetMumpsxxx(). I'll let you know after I'm done (1-2 days).
Hong

> Hi,
>
> I am trying to get mumps to return the matrix determinant. I have set the
> ICNTL option using:
>
> MatMumpsSetIcntl(A,33,1);
>
> and can view the determinant using
>
> PCView(pc, PETSC_VIEWER_STDOUT_WORLD);
>
> I need to use the determinant in my code. Is there a way I can get petsc to
> return this parameter. If not, is it possible to implement the
> MatGetMumpsRINFOG() as suggested in:
>
> http://lists.mcs.anl.gov/pipermail/petsc-users/2011-September/010225.html
>
> Kind Regards
>
> ZL

From zonexo at gmail.com  Wed May  7 20:35:39 2014
From: zonexo at gmail.com (TAY wee-beng)
Date: Thu, 08 May 2014 09:35:39 +0800
Subject: [petsc-users] Override PETSc compile options
Message-ID: <536ADF6B.9020602@gmail.com>

Hi,

I want to override PETSc compile options. During compile, PETSc
automatically uses -Wall etc

How can I change that?

--
Thank you

Yours sincerely,

TAY wee-beng

From knepley at gmail.com  Wed May  7 20:46:33 2014
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 7 May 2014 20:46:33 -0500
Subject: [petsc-users] Override PETSc compile options
In-Reply-To: <536ADF6B.9020602@gmail.com>
References: <536ADF6B.9020602@gmail.com>
Message-ID:

On Wed, May 7, 2014 at 8:35 PM, TAY wee-beng wrote:

> Hi,
>
> I want to override PETSc compile options.
During compile, PETSc > automatically uses -Wall etc > > How can I change that? > > > All the options are given in -help for configure. That compiler > options can be overridden using --COPTFLAGS > > Matt > > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 7 21:09:33 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 May 2014 21:09:33 -0500 Subject: [petsc-users] Override PETSc compile options In-Reply-To: <536AE396.7020102@gmail.com> References: <536ADF6B.9020602@gmail.com> <536AE396.7020102@gmail.com> Message-ID: On Wed, May 7, 2014 at 8:53 PM, TAY wee-beng wrote: > Hi Matt, > > Sorry, I mean during the compilation of my own codes using my own makefile. > I know, this is how you change the default PETSc compile flags. Matt > Thank you > > Yours sincerely, > > TAY wee-beng > > On 8/5/2014 9:46 AM, Matthew Knepley wrote: > > On Wed, May 7, 2014 at 8:35 PM, TAY wee-beng wrote: > >> Hi, >> >> I want to override PETSc compile options. During compile, PETSc >> automatically uses -Wall etc >> >> How can I change that? > > > All the options are given in -help for configure. That compiler options > can be overridden using --COPTFLAGS > > Matt > > >> >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed May 7 21:13:54 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 7 May 2014 21:13:54 -0500 Subject: [petsc-users] Override PETSc compile options In-Reply-To: References: <536ADF6B.9020602@gmail.com> Message-ID: On Wed, 7 May 2014, Matthew Knepley wrote: > On Wed, May 7, 2014 at 8:35 PM, TAY wee-beng wrote: > > > Hi, > > > > I want to override PETSc compile options. During compile, PETSc > > automatically uses -Wall etc > > > > How can I change that? > > > All the options are given in -help for configure. That compiler options can > be overridden using --COPTFLAGS Actually CFLAGS should be used to ovewride -Wall type options Satish From balay at mcs.anl.gov Wed May 7 21:15:39 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 7 May 2014 21:15:39 -0500 Subject: [petsc-users] Override PETSc compile options In-Reply-To: <536AE396.7020102@gmail.com> References: <536ADF6B.9020602@gmail.com> <536AE396.7020102@gmail.com> Message-ID: On Wed, 7 May 2014, TAY wee-beng wrote: > Hi Matt, > > > Sorry, I mean during the compilation of my own codes using my own makefile. 
> $ grep Wall arch-linux2-c-debug/conf/petscvariables FC_LINKER_FLAGS = -fPIC -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g -O0 CC_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 FC_FLAGS = -fPIC -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g -O0 CXX_FLAGS = -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC PCC_LINKER_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 PCC_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 balay at asterix /home/balay/petsc (master) You can redefine the corresponding variables in your makefile as needed. [after the include directive] Satish From zonexo at gmail.com Wed May 7 21:16:59 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 08 May 2014 10:16:59 +0800 Subject: [petsc-users] Override PETSc compile options In-Reply-To: References: <536ADF6B.9020602@gmail.com> <536AE396.7020102@gmail.com> Message-ID: <536AE91B.3030201@gmail.com> On 8/5/2014 10:15 AM, Satish Balay wrote: > On Wed, 7 May 2014, TAY wee-beng wrote: > >> Hi Matt, >> >> >> Sorry, I mean during the compilation of my own codes using my own makefile. >> > $ grep Wall arch-linux2-c-debug/conf/petscvariables > FC_LINKER_FLAGS = -fPIC -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g -O0 > CC_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > FC_FLAGS = -fPIC -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g -O0 > CXX_FLAGS = -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC > PCC_LINKER_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > PCC_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > balay at asterix /home/balay/petsc (master) > > > You can redefine the corresponding variables in your makefile as needed. [after the include directive] Ya, that was what I was looking for! Thanks! > > Satish From likunt at caltech.edu Wed May 7 22:11:33 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Wed, 7 May 2014 20:11:33 -0700 (PDT) Subject: [petsc-users] question on ksp Message-ID: <59019.131.215.220.164.1399518693.squirrel@webmail.caltech.edu> Dear Petsc developers, I am solving a linear system Ax=b. The rhs vector b and the matrix A are defined as follows, DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,M,3,1,NULL,&da); DMCreateGlobalVector(da, &b); MatCreate(PETSC_COMM_WORLD, &A); MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, M*3, M*3); MatMPIAIJSetPreallocation(A, 7, NULL, 7, NULL); MatSetUp(A); There is a Memory corruption problem when calling KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); KSPSolve(ksp, x, b); since the partition of A and b are not consistent. Should I use KSPSetDM and KSPSetComputeOperators for sovling this problem? Thanks, From bsmith at mcs.anl.gov Wed May 7 22:27:05 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 7 May 2014 22:27:05 -0500 Subject: [petsc-users] question on ksp In-Reply-To: <59019.131.215.220.164.1399518693.squirrel@webmail.caltech.edu> References: <59019.131.215.220.164.1399518693.squirrel@webmail.caltech.edu> Message-ID: Use DMCreateMatrix() and it will return the correctly sized matrix, with the correct parallel layout and the the correct nonzero preallocation for the given DM. After these changes let us know if you have any problems. 
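A sketch of that change against the 3.4-series API used elsewhere in this thread; the helper name build_and_solve and the assembly placeholder are illustrative rather than the original code, and in PETSc 3.5 and later DMCreateMatrix() drops the MatType argument while KSPSetOperators() drops the pattern flag.

--------------------------- sketch (assumed names) ---------------------------
#include <petscdmda.h>
#include <petscksp.h>

/* dof=3, stencil width 1, as in the DMDACreate1d() call from the question. */
PetscErrorCode build_and_solve(PetscInt M,Vec *x_out)
{
  PetscErrorCode ierr;
  DM             da;
  Mat            A;
  Vec            b,x;
  KSP            ksp;

  ierr = DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,M,3,1,NULL,&da);CHKERRQ(ierr);
  ierr = DMCreateGlobalVector(da,&b);CHKERRQ(ierr);
  ierr = VecDuplicate(b,&x);CHKERRQ(ierr);
  ierr = DMCreateMatrix(da,MATAIJ,&A);CHKERRQ(ierr); /* size, layout and preallocation from the DMDA */

  /* ... fill A (e.g. with MatSetValuesStencil) and b here, exactly as before ... */
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = DMDestroy(&da);CHKERRQ(ierr);
  *x_out = x;
  return 0;
}
-------------------------------------------------------------------------------

Because the matrix comes from the same DMDA as the vectors, its parallel row layout matches DMCreateGlobalVector(), which is what the MatCreate()/MatSetSizes() version failed to guarantee.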
Barry On May 7, 2014, at 10:11 PM, wrote: > Dear Petsc developers, > > I am solving a linear system Ax=b. The rhs vector b and the matrix A are > defined as follows, > > DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,M,3,1,NULL,&da); > DMCreateGlobalVector(da, &b); > > MatCreate(PETSC_COMM_WORLD, &A); > MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, M*3, M*3); > MatMPIAIJSetPreallocation(A, 7, NULL, 7, NULL); > MatSetUp(A); > > There is a Memory corruption problem when calling > KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); > KSPSolve(ksp, x, b); > > since the partition of A and b are not consistent. Should I use > > KSPSetDM and KSPSetComputeOperators > > for sovling this problem? > > Thanks, > > > > > > From francium87 at hotmail.com Thu May 8 02:49:03 2014 From: francium87 at hotmail.com (linjing bo) Date: Thu, 8 May 2014 07:49:03 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , , , , Message-ID: Yes, I called KSPDestroy(). I have reproduce the problem using a small C code, this code with default ilu preconditioner will show an error [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Cannot log negative flops! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.info by jlin Thu May 8 15:43:13 2014 [0]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [0]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c ===================================================Below is the code-------------------------------------------------static char help[] = "Solve"; #include int main(int argc, char **args){ Vec x,b,u; Mat A; KSP ksp; PC pc; PetscViewer fd; PetscErrorCode ierr; PetscReal tol=1.e-4; PetscScalar one = 1.0; PetscInt n=1023; PetscInitialize(&argc,&args,(char*)0,help); ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); ierr = VecSetFromOptions(x);CHKERRQ(ierr); ierr = 
VecDuplicate(x,&b);CHKERRQ(ierr); PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", FILE_MODE_READ, &fd); ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 11,PETSC_NULL,11,PETSC_NULL,&A); ierr = MatLoad(A, fd); PetscViewerDestroy(&fd); VecSet( b, one); VecSet( x, one); VecAssemblyBegin(b); VecAssemblyEnd(b); VecAssemblyBegin(x); VecAssemblyEnd(x); ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); PetscFinalize(); return 0; }----------------------------------------------------------- Date: Wed, 7 May 2014 05:52:13 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? Did you call KSPDestroy()? Matt --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. 
The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. 
[3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 8 05:20:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 05:20:52 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 2:49 AM, linjing bo wrote: > Yes, I called KSPDestroy(). I have reproduce the problem using a small C > code, this code with default ilu preconditioner will show an error > Can you also send your matrix so I can run it? Thanks, Matt > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Cannot log negative flops! 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.infoby jlin Thu May 8 15:43:13 2014 > [0]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 > [0]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c > =================================================== > Below is the code > ------------------------------------------------- > static char help[] = "Solve"; > #include > int main(int argc, char **args){ > Vec x,b,u; > Mat A; > KSP ksp; > PC pc; > PetscViewer fd; > PetscErrorCode ierr; > PetscReal tol=1.e-4; > PetscScalar one = 1.0; > PetscInt n=1023; > PetscInitialize(&argc,&args,(char*)0,help); > ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) x, > "Solution");CHKERRQ(ierr); > ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); > ierr = VecSetFromOptions(x);CHKERRQ(ierr); > ierr = VecDuplicate(x,&b);CHKERRQ(ierr); > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", > FILE_MODE_READ, &fd); > ierr = > MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ > 11,PETSC_NULL,11,PETSC_NULL,&A); > ierr = MatLoad(A, fd); > PetscViewerDestroy(&fd); > VecSet( b, one); > VecSet( x, one); > VecAssemblyBegin(b); > VecAssemblyEnd(b); > VecAssemblyBegin(x); > VecAssemblyEnd(x); > ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); > ierr = > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); > ierr = > KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); > ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); > ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); > ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); > > KSPDestroy(&ksp); > VecDestroy(&x); > VecDestroy(&b); > PetscFinalize(); > return 0; > } > ----------------------------------------------------------- > > ------------------------------ > Date: Wed, 7 May 2014 05:52:13 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: 
francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: > > The Valgrind shows memory leak in memalign() called by KSPSetup and > PCSetup. Is that normal? > > > > Did you call KSPDestroy()? > > Matt > > > > --------------------------------------------------------------------------------------------- > ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) > ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > ... > > ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) > ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 > (aijfact.c:1655) > ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) > ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) > ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) > ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) > ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > > ------------------------------------------------------------------------------------------------- > > > ------------------------------ > From: francium87 at hotmail.com > To: knepley at gmail.com > CC: petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] VecValidValues() reports NaN found > Date: Mon, 5 May 2014 13:15:38 +0000 > > Ok, I will try it . Thanks for your advise. > > ------------------------------ > Date: Mon, 5 May 2014 08:12:05 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: > > I use JACOBI. The message showed is with JACOBI. > > > Wired situation is that the backtrack information shows the location is > before actually apply PC, so I guess the rhs vec is not changed at this > point. > > Another wired thing is : Because the original code is to complex. I write > out the A matrix in Ax=b, and write a small test code to read in this > matrix and solve it, no error showed. The KSP, PC are all set to be the > same. > > When I try to using ILU, more wired error happens, the backtrack info > shows it died in a Flops logging function: > > > 1) Run in serial until it works > > 2) It looks like you have memory overwriting problems. Run with valgrind > > Matt > > > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of > range! > [2]PETSC ERROR: Cannot log negative > flops! 
> [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [2]PETSC ERROR: See docs/changes/index.html for recent > updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [2]PETSC ERROR: See docs/index.html for manual > pages. > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:51:27 > 2014 > > [2]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [2]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > > [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [2]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > > > ------------------------------ > Date: Mon, 5 May 2014 07:27:52 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! > [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. 
> [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > > > It looks like the vector after preconditioner application is bad. What is > the preconditioner? > > Matt > > > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From francium87 at hotmail.com Thu May 8 05:44:00 2014 From: francium87 at hotmail.com (linjing bo) Date: Thu, 8 May 2014 10:44:00 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , , , , , , Message-ID: Sorry , forgot to attach file. Thanks in advance. But does the elements in matrix really matters a lot ? 
Date: Thu, 8 May 2014 05:20:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Thu, May 8, 2014 at 2:49 AM, linjing bo wrote: Yes, I called KSPDestroy(). I have reproduce the problem using a small C code, this code with default ilu preconditioner will show an error Can you also send your matrix so I can run it? Thanks, Matt [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Cannot log negative flops! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.info by jlin Thu May 8 15:43:13 2014 [0]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [0]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c ===================================================Below is the code-------------------------------------------------static char help[] = "Solve"; #include int main(int argc, char **args){ Vec x,b,u; Mat A; KSP ksp; PC pc; PetscViewer fd; PetscErrorCode ierr; PetscReal tol=1.e-4; PetscScalar one = 1.0; PetscInt n=1023; PetscInitialize(&argc,&args,(char*)0,help); ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); ierr = VecSetFromOptions(x);CHKERRQ(ierr); ierr = VecDuplicate(x,&b);CHKERRQ(ierr); PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", FILE_MODE_READ, &fd); ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 11,PETSC_NULL,11,PETSC_NULL,&A); ierr = MatLoad(A, fd); PetscViewerDestroy(&fd); VecSet( b, one); VecSet( x, one); VecAssemblyBegin(b); VecAssemblyEnd(b); VecAssemblyBegin(x); VecAssemblyEnd(x); ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); ierr = 
KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); PetscFinalize(); return 0; }----------------------------------------------------------- Date: Wed, 7 May 2014 05:52:13 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? Did you call KSPDestroy()? Matt --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. 
Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. 
[3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tor0bp.bin Type: application/octet-stream Size: 135976 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tor0bp.bin.info Type: application/octet-stream Size: 22 bytes Desc: not available URL: From scanmail at anl.gov Thu May 8 05:44:31 2014 From: scanmail at anl.gov (Administrator) Date: Thu, 8 May 2014 05:44:31 -0500 Subject: [petsc-users] [MailServer Notification]Argonne Antivirus Quarantine Notification - DO NOT REPLY Message-ID: <6FAB217316DE4B008E79C04251562526@anl.gov> Do not reply to this message. 
The reply address is not monitored. The message below has been quarantined by the Argonne National Laboratory Antivirus filtering system. The message was filtered for having been detected of having malicious content or an attachment that matches the laboratory?s filtering criteria. From: francium87 at hotmail.com; To: knepley at gmail.com;petsc-users at mcs.anl.gov; Subject: Re: [petsc-users] VecValidValues() reports NaN found Attachment: tor0bp.bin Date: 5/8/2014 5:44:08 AM If you have any questions regarding the Argonne's antivirus filtering product, or feel that the attachment was incorrectly identified, please contact the CIS Service Desk at help at anl.gov or x-9999 option 2. From francium87 at hotmail.com Thu May 8 05:48:05 2014 From: francium87 at hotmail.com (linjing bo) Date: Thu, 8 May 2014 10:48:05 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , , , , , , Message-ID: Sorry, forgot to attatch the matrix file Date: Thu, 8 May 2014 05:20:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Thu, May 8, 2014 at 2:49 AM, linjing bo wrote: Yes, I called KSPDestroy(). I have reproduce the problem using a small C code, this code with default ilu preconditioner will show an error Can you also send your matrix so I can run it? Thanks, Matt [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Cannot log negative flops! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.info by jlin Thu May 8 15:43:13 2014 [0]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [0]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c ===================================================Below is the code-------------------------------------------------static char help[] = "Solve"; #include int main(int argc, char **args){ Vec x,b,u; Mat A; KSP ksp; PC pc; PetscViewer fd; PetscErrorCode ierr; PetscReal tol=1.e-4; PetscScalar one = 1.0; PetscInt n=1023; PetscInitialize(&argc,&args,(char*)0,help); ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); ierr = VecSetFromOptions(x);CHKERRQ(ierr); ierr = VecDuplicate(x,&b);CHKERRQ(ierr); PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", FILE_MODE_READ, &fd); ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 11,PETSC_NULL,11,PETSC_NULL,&A); ierr = MatLoad(A, fd); PetscViewerDestroy(&fd); VecSet( b, one); VecSet( x, one); VecAssemblyBegin(b); VecAssemblyEnd(b); VecAssemblyBegin(x); VecAssemblyEnd(x); ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); PetscFinalize(); return 0; }----------------------------------------------------------- Date: Wed, 7 May 2014 05:52:13 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? Did you call KSPDestroy()? 
Matt --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix.zip Type: application/zip Size: 50130 bytes Desc: not available URL: From scanmail at anl.gov Thu May 8 05:48:37 2014 From: scanmail at anl.gov (Administrator) Date: Thu, 8 May 2014 05:48:37 -0500 Subject: [petsc-users] [MailServer Notification]Argonne Antivirus Quarantine Notification - DO NOT REPLY Message-ID: <31D15166009D4985AD204832B9E3F228@anl.gov> Do not reply to this message. The reply address is not monitored. The message below has been quarantined by the Argonne National Laboratory Antivirus filtering system. The message was filtered for having been detected of having malicious content or an attachment that matches the laboratory?s filtering criteria. From: francium87 at hotmail.com; To: knepley at gmail.com;petsc-users at mcs.anl.gov; Subject: Re: [petsc-users] VecValidValues() reports NaN found Attachment: matrix.zip Date: 5/8/2014 5:48:12 AM If you have any questions regarding the Argonne's antivirus filtering product, or feel that the attachment was incorrectly identified, please contact the CIS Service Desk at help at anl.gov or x-9999 option 2. From knepley at gmail.com Thu May 8 06:27:26 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 06:27:26 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 5:44 AM, linjing bo wrote: > Sorry , forgot to attach file. 
Thanks in advance. > But does the elements in matrix really matters a lot ? > Yes, unfortunately. The problem is that you have no diagonal element in row 0. I do not think our factorization routine can handle this, but I will check with Hong. If you put a 0 there, it should work fine. Thanks, Matt > > ------------------------------ > Date: Thu, 8 May 2014 05:20:52 -0500 > > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Thu, May 8, 2014 at 2:49 AM, linjing bo wrote: > > Yes, I called KSPDestroy(). I have reproduce the problem using a small C > code, this code with default ilu preconditioner will show an error > > > Can you also send your matrix so I can run it? > > Thanks, > > Matt > > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Cannot log negative flops! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.infoby jlin Thu May 8 15:43:13 2014 > [0]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 > [0]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c > =================================================== > Below is the code > ------------------------------------------------- > static char help[] = "Solve"; > #include > int main(int argc, char **args){ > Vec x,b,u; > Mat A; > KSP ksp; > PC pc; > PetscViewer fd; > PetscErrorCode ierr; > PetscReal tol=1.e-4; > PetscScalar one = 1.0; > PetscInt n=1023; > PetscInitialize(&argc,&args,(char*)0,help); > ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) x, > "Solution");CHKERRQ(ierr); > ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); > ierr = VecSetFromOptions(x);CHKERRQ(ierr); > ierr = VecDuplicate(x,&b);CHKERRQ(ierr); > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", > FILE_MODE_READ, &fd); > ierr = > MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 
> 11,PETSC_NULL,11,PETSC_NULL,&A); > ierr = MatLoad(A, fd); > PetscViewerDestroy(&fd); > VecSet( b, one); > VecSet( x, one); > VecAssemblyBegin(b); > VecAssemblyEnd(b); > VecAssemblyBegin(x); > VecAssemblyEnd(x); > ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); > ierr = > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); > ierr = > KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); > ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); > ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); > ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); > > KSPDestroy(&ksp); > VecDestroy(&x); > VecDestroy(&b); > PetscFinalize(); > return 0; > } > ----------------------------------------------------------- > > ------------------------------ > Date: Wed, 7 May 2014 05:52:13 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: > > The Valgrind shows memory leak in memalign() called by KSPSetup and > PCSetup. Is that normal? > > > > Did you call KSPDestroy()? > > Matt > > > > --------------------------------------------------------------------------------------------- > ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) > ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > ... > > ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) > ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 > (aijfact.c:1655) > ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) > ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) > ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) > ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) > ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > > ------------------------------------------------------------------------------------------------- > > > ------------------------------ > From: francium87 at hotmail.com > To: knepley at gmail.com > CC: petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] VecValidValues() reports NaN found > Date: Mon, 5 May 2014 13:15:38 +0000 > > Ok, I will try it . Thanks for your advise. 
> > ------------------------------ > Date: Mon, 5 May 2014 08:12:05 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: > > I use JACOBI. The message showed is with JACOBI. > > > Wired situation is that the backtrack information shows the location is > before actually apply PC, so I guess the rhs vec is not changed at this > point. > > Another wired thing is : Because the original code is to complex. I write > out the A matrix in Ax=b, and write a small test code to read in this > matrix and solve it, no error showed. The KSP, PC are all set to be the > same. > > When I try to using ILU, more wired error happens, the backtrack info > shows it died in a Flops logging function: > > > 1) Run in serial until it works > > 2) It looks like you have memory overwriting problems. Run with valgrind > > Matt > > > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of > range! > [2]PETSC ERROR: Cannot log negative > flops! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [2]PETSC ERROR: See docs/changes/index.html for recent > updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [2]PETSC ERROR: See docs/index.html for manual > pages. > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:51:27 > 2014 > > [2]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [2]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > > [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [2]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > > > ------------------------------ > Date: Mon, 5 May 2014 07:27:52 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! 
> [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > > > It looks like the vector after preconditioner application is bad. What is > the preconditioner? > > Matt > > > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From francium87 at hotmail.com Thu May 8 06:46:44 2014
From: francium87 at hotmail.com (linjing bo)
Date: Thu, 8 May 2014 11:46:44 +0000
Subject: [petsc-users] VecValidValues() reports NaN found
In-Reply-To: References: , , , , , , , , , ,
Message-ID:

OK, thanks for your attention. I will check the matrix generation part; there should be a diagonal element.
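(Editorial aside, not from the thread.) To make the suggested fix concrete: one way to guarantee that every locally owned row carries an explicit diagonal entry, so that an ILU/LU factorization has a pivot location even when the assembled operator puts nothing there, is to add a zero to each diagonal position during assembly. The helper below is a hypothetical sketch (the name MatForceDiagonal is made up); it assumes the matrix is assembled with ADD_VALUES and that the diagonal positions are covered by the preallocation, as they are in the 11-entries-per-row test code above.

-------------------------------------------------------
#include <petscmat.h>
/* Add an explicit 0.0 to the diagonal of every locally owned row.  This
   does not change the operator; it only forces the diagonal entries into
   the nonzero pattern so a later ILU(0) factorization has pivots. */
PetscErrorCode MatForceDiagonal(Mat A)
{
  PetscInt       i, rstart, rend;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    ierr = MatSetValue(A, i, i, 0.0, ADD_VALUES);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}
-------------------------------------------------------

Call it during the same ADD_VALUES insertion phase as the rest of the matrix generation (mixing ADD_VALUES and INSERT_VALUES without an intervening MatAssemblyBegin/End with MAT_FLUSH_ASSEMBLY is an error), and let the existing code perform the final MAT_FINAL_ASSEMBLY as before.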
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.info by jlin Thu May 8 15:43:13 2014 [0]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [0]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c ===================================================Below is the code-------------------------------------------------static char help[] = "Solve"; #include int main(int argc, char **args){ Vec x,b,u; Mat A; KSP ksp; PC pc; PetscViewer fd; PetscErrorCode ierr; PetscReal tol=1.e-4; PetscScalar one = 1.0; PetscInt n=1023; PetscInitialize(&argc,&args,(char*)0,help); ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); ierr = VecSetFromOptions(x);CHKERRQ(ierr); ierr = VecDuplicate(x,&b);CHKERRQ(ierr); PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", FILE_MODE_READ, &fd); ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 11,PETSC_NULL,11,PETSC_NULL,&A); ierr = MatLoad(A, fd); PetscViewerDestroy(&fd); VecSet( b, one); VecSet( x, one); VecAssemblyBegin(b); VecAssemblyEnd(b); VecAssemblyBegin(x); VecAssemblyEnd(x); ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); PetscFinalize(); return 0; }----------------------------------------------------------- Date: Wed, 7 May 2014 05:52:13 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? Did you call KSPDestroy()? 
Matt --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Zafer.Leylek at student.adfa.edu.au Thu May 8 08:45:05 2014 From: Zafer.Leylek at student.adfa.edu.au (Zafer Leylek) Date: Thu, 8 May 2014 13:45:05 +0000 Subject: [petsc-users] VecDot usage in parallel Message-ID: <996B4E35EA834745A31426395DB0BBA01DFD9754@ADFAPWEXMBX02.ad.adfa.edu.au> I have recently started using petsc and have little experience with parallel programming. I am having problem with the following section of my code: KSPGetPC(ksp,&pc); PCSetType(pc,PCCHOLESKY); PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); PCFactorSetUpMatSolverPackage(pc); PCFactorGetMatrix(pc,&L); MatMumpsSetIcntl(L,7,2); MatMumpsSetCntl(L,1,0.0); MatMumpsSetIcntl(L,33,1); KSPSetUp(ksp); KSPSolve(ksp, y, alpha); VecDot(y, alpha, &sigma); when I run it using a single processor (mpiexec -np 1 ....) I get the correct answer, when I run using 2 processors I get sigma = 4*sigma and so on. How can I solve this problem?? ZL -------------- next part -------------- An HTML attachment was scrubbed... 
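(Editorial aside, not from the thread.) VecDot() already computes the global dot product: each rank contributes only its locally owned entries and the result is reduced across the communicator, so a correct parallel setup returns the same sigma for any number of processes and no extra scaling is needed. Results that grow with the process count usually mean the problem is being duplicated on every rank rather than distributed. Below is a minimal sketch of the intended usage, with hypothetical sizes and values; none of it comes from Zafer's code.

-------------------------------------------------------
#include <petscksp.h>
/* Each rank sets only the entries it owns; VecDot() then returns the same
   global value regardless of how many MPI ranks are used. */
int main(int argc, char **args)
{
  Vec            y, alpha;
  PetscInt       i, rstart, rend, n = 100;   /* hypothetical global size */
  PetscScalar    sigma;
  PetscErrorCode ierr;

  PetscInitialize(&argc, &args, (char*)0, NULL);
  ierr = VecCreate(PETSC_COMM_WORLD, &y);CHKERRQ(ierr);
  ierr = VecSetSizes(y, PETSC_DECIDE, n);CHKERRQ(ierr);   /* one global vector, split across ranks */
  ierr = VecSetFromOptions(y);CHKERRQ(ierr);
  ierr = VecDuplicate(y, &alpha);CHKERRQ(ierr);

  ierr = VecGetOwnershipRange(y, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {                       /* locally owned entries only */
    ierr = VecSetValue(y, i, 1.0, INSERT_VALUES);CHKERRQ(ierr);
    ierr = VecSetValue(alpha, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = VecAssemblyBegin(y);CHKERRQ(ierr);     ierr = VecAssemblyEnd(y);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(alpha);CHKERRQ(ierr); ierr = VecAssemblyEnd(alpha);CHKERRQ(ierr);

  ierr = VecDot(y, alpha, &sigma);CHKERRQ(ierr);          /* global result, independent of -np */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "sigma = %g\n", (double)PetscRealPart(sigma));CHKERRQ(ierr);

  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = VecDestroy(&alpha);CHKERRQ(ierr);
  PetscFinalize();
  return 0;
}
-------------------------------------------------------

Run with, e.g., mpiexec -n 1 and mpiexec -n 4; the printed sigma should be identical (200 for the values above).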
URL: From jed at jedbrown.org Thu May 8 09:11:11 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 08 May 2014 08:11:11 -0600 Subject: [petsc-users] VecDot usage in parallel In-Reply-To: <996B4E35EA834745A31426395DB0BBA01DFD9754@ADFAPWEXMBX02.ad.adfa.edu.au> References: <996B4E35EA834745A31426395DB0BBA01DFD9754@ADFAPWEXMBX02.ad.adfa.edu.au> Message-ID: <87iopgl5s0.fsf@jedbrown.org> Zafer Leylek writes: > I have recently started using petsc and have little experience with parallel programming. > > I am having problem with the following section of my code: > > KSPGetPC(ksp,&pc); > PCSetType(pc,PCCHOLESKY); > PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverPackage(pc); > PCFactorGetMatrix(pc,&L); > MatMumpsSetIcntl(L,7,2); > MatMumpsSetCntl(L,1,0.0); > MatMumpsSetIcntl(L,33,1); > KSPSetUp(ksp); > KSPSolve(ksp, y, alpha); > > VecDot(y, alpha, &sigma); > > when I run it using a single processor (mpiexec -np 1 ....) I get the > correct answer, when I run using 2 processors I get sigma = 4*sigma > and so on. View y and alpha. I suspect you are solving a different problem in these cases. VecDot computes the parallel dot product. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From likunt at caltech.edu Thu May 8 11:08:05 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Thu, 8 May 2014 09:08:05 -0700 (PDT) Subject: [petsc-users] normalization of a vector Message-ID: <59909.131.215.220.161.1399565285.squirrel@webmail.caltech.edu> Dear Petsc developers, I have a vector V={u1, u2, u3, v1, v2, v3}. I need to normalize each 3d vector and reset V, i.e. V={u1/|u|, u2/|u|, u3/|u|, v1/|v|, v2/|v|, v3/|v|}, with |u| and |v| denotes the magnitudes of {u1,u2,u3} and {v1,v2,v3}. I tried VecGetValues(V, 3, col, val); normalization of val; VecSetValues(V, 3, col, val, INSERT_VALUES); but I got the error message PETSC ERROR: Object is in wrong state! PETSC ERROR: You have already added values; you cannot now insert! Is there any fast way to do that? Thanks. From mrestelli at gmail.com Thu May 8 11:25:11 2014 From: mrestelli at gmail.com (marco restelli) Date: Thu, 8 May 2014 18:25:11 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices Message-ID: Hi, I have a Cartesian communicator and some matrices distributed along the "x" direction. I would like to compute an all_reduce operation for these matrices in the y direction, and I wander whether there is a PETSc function for this. More precisely: a matrix A is distributed among processors 0 , 1 , 2 another A is distributed among processors 3 , 4 , 5 another A is distributed among processors 6 , 7 , 8 ... The x direction is 0,1,2; while the y direction is 0,3,6,... I would like to compute a matrix B = "sum of the matrices A" and a copy of B should be distributed among processors 0,1,2, another copy among 3,4,5 and so on. A way of doing this is getting the matrix coefficients, broadcasting them along the y direction and summing them in the matrix B; maybe however there is already a PETSc function doing this. 
Thank you, regards Marco From knepley at gmail.com Thu May 8 11:29:21 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 11:29:21 -0500 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 11:25 AM, marco restelli wrote: > Hi, > I have a Cartesian communicator and some matrices distributed along > the "x" direction. I would like to compute an all_reduce operation for > these matrices in the y direction, and I wander whether there is a > PETSc function for this. > > > More precisely: > > a matrix A is distributed among processors 0 , 1 , 2 > another A is distributed among processors 3 , 4 , 5 > another A is distributed among processors 6 , 7 , 8 > ... > > The x direction is 0,1,2; while the y direction is 0,3,6,... > > I would like to compute a matrix B = "sum of the matrices A" and a > copy of B should be distributed among processors 0,1,2, another copy > among 3,4,5 and so on. > > A way of doing this is getting the matrix coefficients, broadcasting > them along the y direction and summing them in the matrix B; maybe > however there is already a PETSc function doing this. > There is nothing like this in PETSc. There are many tools for this using dense matrices in Elemental, but I have not seen anything for sparse matrices. Matt > Thank you, regards > Marco > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 8 12:44:33 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 8 May 2014 12:44:33 -0500 Subject: [petsc-users] normalization of a vector In-Reply-To: <59909.131.215.220.161.1399565285.squirrel@webmail.caltech.edu> References: <59909.131.215.220.161.1399565285.squirrel@webmail.caltech.edu> Message-ID: Use VecGetArray(). Loop over each the 3 tuples doing the normalization and then use VecRestoreArray(). VecGetArray/RestoreArray() do not copy values so are much faster than VecGetValues(). Barry Each process will just loop over its local part of the vector. On May 8, 2014, at 11:08 AM, likunt at caltech.edu wrote: > Dear Petsc developers, > > I have a vector V={u1, u2, u3, v1, v2, v3}. I need to normalize each 3d > vector and reset V, i.e. > > V={u1/|u|, u2/|u|, u3/|u|, v1/|v|, v2/|v|, v3/|v|}, > with |u| and |v| denotes the magnitudes of {u1,u2,u3} and {v1,v2,v3}. > > I tried > > VecGetValues(V, 3, col, val); > normalization of val; > VecSetValues(V, 3, col, val, INSERT_VALUES); > > but I got the error message > > PETSC ERROR: Object is in wrong state! > PETSC ERROR: You have already added values; you cannot now insert! > > Is there any fast way to do that? Thanks. > > From mrestelli at gmail.com Thu May 8 14:06:32 2014 From: mrestelli at gmail.com (marco restelli) Date: Thu, 8 May 2014 21:06:32 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: 2014-05-08 18:29 GMT+0200, Matthew Knepley : > On Thu, May 8, 2014 at 11:25 AM, marco restelli > wrote: > >> Hi, >> I have a Cartesian communicator and some matrices distributed along >> the "x" direction. I would like to compute an all_reduce operation for >> these matrices in the y direction, and I wander whether there is a >> PETSc function for this. 
>> >> >> More precisely: >> >> a matrix A is distributed among processors 0 , 1 , 2 >> another A is distributed among processors 3 , 4 , 5 >> another A is distributed among processors 6 , 7 , 8 >> ... >> >> The x direction is 0,1,2; while the y direction is 0,3,6,... >> >> I would like to compute a matrix B = "sum of the matrices A" and a >> copy of B should be distributed among processors 0,1,2, another copy >> among 3,4,5 and so on. >> >> A way of doing this is getting the matrix coefficients, broadcasting >> them along the y direction and summing them in the matrix B; maybe >> however there is already a PETSc function doing this. >> > > There is nothing like this in PETSc. There are many tools for this using > dense > matrices in Elemental, but I have not seen anything for sparse matrices. > > Matt > OK, thank you. Now, to do it myself, is MatGetRow the best way to get all the local nonzero entries of a matrix? Marco From knepley at gmail.com Thu May 8 14:13:18 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 14:13:18 -0500 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 2:06 PM, marco restelli wrote: > 2014-05-08 18:29 GMT+0200, Matthew Knepley : > > On Thu, May 8, 2014 at 11:25 AM, marco restelli > > wrote: > > > >> Hi, > >> I have a Cartesian communicator and some matrices distributed along > >> the "x" direction. I would like to compute an all_reduce operation for > >> these matrices in the y direction, and I wander whether there is a > >> PETSc function for this. > >> > >> > >> More precisely: > >> > >> a matrix A is distributed among processors 0 , 1 , 2 > >> another A is distributed among processors 3 , 4 , 5 > >> another A is distributed among processors 6 , 7 , 8 > >> ... > >> > >> The x direction is 0,1,2; while the y direction is 0,3,6,... > >> > >> I would like to compute a matrix B = "sum of the matrices A" and a > >> copy of B should be distributed among processors 0,1,2, another copy > >> among 3,4,5 and so on. > >> > >> A way of doing this is getting the matrix coefficients, broadcasting > >> them along the y direction and summing them in the matrix B; maybe > >> however there is already a PETSc function doing this. > >> > > > > There is nothing like this in PETSc. There are many tools for this using > > dense > > matrices in Elemental, but I have not seen anything for sparse matrices. > > > > Matt > > > > OK, thank you. > > Now, to do it myself, is MatGetRow the best way to get all the local > nonzero entries of a matrix? I think MatGetSubmatrices() is probably better. Matt > > Marco > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrestelli at gmail.com Thu May 8 14:45:03 2014 From: mrestelli at gmail.com (marco restelli) Date: Thu, 8 May 2014 21:45:03 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: 2014-05-08 21:13 GMT+0200, Matthew Knepley : > On Thu, May 8, 2014 at 2:06 PM, marco restelli wrote: > >> 2014-05-08 18:29 GMT+0200, Matthew Knepley : >> > On Thu, May 8, 2014 at 11:25 AM, marco restelli >> > wrote: >> > >> >> Hi, >> >> I have a Cartesian communicator and some matrices distributed along >> >> the "x" direction. 
I would like to compute an all_reduce operation for >> >> these matrices in the y direction, and I wander whether there is a >> >> PETSc function for this. >> >> >> >> >> >> More precisely: >> >> >> >> a matrix A is distributed among processors 0 , 1 , 2 >> >> another A is distributed among processors 3 , 4 , 5 >> >> another A is distributed among processors 6 , 7 , 8 >> >> ... >> >> >> >> The x direction is 0,1,2; while the y direction is 0,3,6,... >> >> >> >> I would like to compute a matrix B = "sum of the matrices A" and a >> >> copy of B should be distributed among processors 0,1,2, another copy >> >> among 3,4,5 and so on. >> >> >> >> A way of doing this is getting the matrix coefficients, broadcasting >> >> them along the y direction and summing them in the matrix B; maybe >> >> however there is already a PETSc function doing this. >> >> >> > >> > There is nothing like this in PETSc. There are many tools for this >> > using >> > dense >> > matrices in Elemental, but I have not seen anything for sparse >> > matrices. >> > >> > Matt >> > >> >> OK, thank you. >> >> Now, to do it myself, is MatGetRow the best way to get all the local >> nonzero entries of a matrix? > > > I think MatGetSubmatrices() is probably better. > > Matt Matt, thanks but this I don't understand. What I want is getting three arrays (i,j,coeff) with all the nonzero local coefficients, so that I can send them around with MPI. MatGetSubmatrices would give me some PETSc objects, which I can not pass to MPI, right? Marco From tlk0812 at hotmail.com Thu May 8 17:14:29 2014 From: tlk0812 at hotmail.com (LikunTan) Date: Fri, 9 May 2014 06:14:29 +0800 Subject: [petsc-users] question on VecView Message-ID: Dear Petsc Developers, Instead of outputting a vector vertically, is there an option to output it horizontally, i.e. v[1] v[2] v[3] .......... Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 8 17:32:32 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 17:32:32 -0500 Subject: [petsc-users] question on VecView In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 5:14 PM, LikunTan wrote: > Dear Petsc Developers, > > Instead of outputting a vector vertically, is there an option to output it > horizontally, i.e. > > v[1] v[2] v[3] .......... > No, you would have to write it. Matt > Thanks, > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedmud at gmail.com Thu May 8 18:36:42 2014 From: friedmud at gmail.com (Derek Gaston) Date: Thu, 8 May 2014 17:36:42 -0600 Subject: [petsc-users] Hiring for the MOOSE Team! Message-ID: We're hiring on the MOOSE Framework team! Come join a high-intensity computational science team that is devoted to open source and innovative development methods! The MOOSE Framework ( http://www.mooseframework.org ) is a high-level, parallel, multiscale, multiphysics, PDE solution framework built on libMesh and PETSc. Working on the MOOSE Framework provides ample opportunity for anyone with a computational science background. You can work on massively parallel algorithms, innovative graphical user interfaces, numerical methods, software development methodologies and much more. 
Most importantly: the work you do every day will have a direct impact on our hundreds of users and the multiple science programs that depend on MOOSE. This position includes opportunities to travel to conferences. In addition, publishing papers is highly encouraged. Here is a direct link to the job posting: http://1.usa.gov/1hAX8zX? Let me know if you have any questions! And please forward this on to anyone that you think may be interested! Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 8 22:47:32 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 08 May 2014 21:47:32 -0600 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: <87lhubk3zf.fsf@jedbrown.org> marco restelli writes: > Matt, thanks but this I don't understand. What I want is getting three > arrays (i,j,coeff) with all the nonzero local coefficients, so that I > can send them around with MPI. > > MatGetSubmatrices would give me some PETSc objects, which I can not > pass to MPI, right? I'm not sure you want this, but you can use MatGetRowIJ and similar to access the representation you're asking for if you are dead set on depending on a specific data format rather than using generic interfaces. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From mrestelli at gmail.com Fri May 9 03:15:28 2014 From: mrestelli at gmail.com (marco restelli) Date: Fri, 9 May 2014 10:15:28 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: <87lhubk3zf.fsf@jedbrown.org> References: <87lhubk3zf.fsf@jedbrown.org> Message-ID: 2014-05-09 5:47 GMT+0200, Jed Brown : > marco restelli writes: >> Matt, thanks but this I don't understand. What I want is getting three >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I >> can send them around with MPI. >> >> MatGetSubmatrices would give me some PETSc objects, which I can not >> pass to MPI, right? > > I'm not sure you want this, but you can use MatGetRowIJ and similar to > access the representation you're asking for if you are dead set on > depending on a specific data format rather than using generic > interfaces. > Jed, thank you. This is probably not the PETSc solution, but still it might a solution! I have found this example for MatGetRowIJ: http://www.stce.rwth-aachen.de/trac/petsc/browser/src/mat/examples/tests/ex79f.F?rev=a52934f9a5da430fdd891fa538a66c376435ec4c My understanding is that I need to: 1) get the sequential part of the matrix, i.e. those rows stored on this processor call MatMPIAIJGetSeqAIJ(A,Ad,Ao,icol,iicol,ierr) 2) get the indexes of these rows call MatGetOwnershipRange(A,rstart,rend,ierr) 3) get the indexes i,j of the local portion of the matrix (compressed form) call MatGetRowIJ(Ad,one,zero,zero,n,ia,iia,ja,jja,done,ierr) 4) get the corresponding elements call MatGetArray(Ad,aa,aaa,ierr) 5) WARNING: the row indexes obtained with MatGetRowIJ are local to this processor, so they must be corrected with rstart to obtain the corresponding global indexes 6) clean-up Does this make sense? 
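In C, the same sequence would look roughly like the sketch below (illustrative names, error checking omitted; MatSeqAIJGetArray is used here in place of the older MatGetArray, and only the diagonal block Ad is shown; the off-diagonal block Ao would need the colmap returned by MatMPIAIJGetSeqAIJ to translate its column indices):

Mat               Ad, Ao;
const PetscInt    *colmap, *ia, *ja;
PetscScalar       *aa, v;
PetscInt          rstart, rend, cstart, nloc, i, k, gi, gj;
PetscBool         done;

/* 1) the sequential (diagonal) block holding this process's rows */
MatMPIAIJGetSeqAIJ(A, &Ad, &Ao, &colmap);
/* 2) global index range of the locally owned rows and columns */
MatGetOwnershipRange(A, &rstart, &rend);
MatGetOwnershipRangeColumn(A, &cstart, NULL);
/* 3) CSR row pointers and column indices of the local block (0-based, no symmetrization) */
MatGetRowIJ(Ad, 0, PETSC_FALSE, PETSC_FALSE, &nloc, &ia, &ja, &done);
/* 4) the matching numerical values */
MatSeqAIJGetArray(Ad, &aa);
/* 5) shift the local indices to global ones before packing the (i,j,coeff) triples */
for (i = 0; i < nloc; i++) {
  for (k = ia[i]; k < ia[i+1]; k++) {
    gi = rstart + i;
    gj = cstart + ja[k];
    v  = aa[k];
    /* ... pack (gi, gj, v) into the buffers sent along the y direction ... */
  }
}
/* 6) clean up */
MatSeqAIJRestoreArray(Ad, &aa);
MatRestoreRowIJ(Ad, 0, PETSC_FALSE, PETSC_FALSE, &nloc, &ia, &ja, &done);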
Marco From knepley at gmail.com Fri May 9 06:14:53 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 May 2014 06:14:53 -0500 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: <87lhubk3zf.fsf@jedbrown.org> Message-ID: On Fri, May 9, 2014 at 3:15 AM, marco restelli wrote: > 2014-05-09 5:47 GMT+0200, Jed Brown : > > marco restelli writes: > >> Matt, thanks but this I don't understand. What I want is getting three > >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I > >> can send them around with MPI. > >> > >> MatGetSubmatrices would give me some PETSc objects, which I can not > >> pass to MPI, right? > > > > I'm not sure you want this, but you can use MatGetRowIJ and similar to > > access the representation you're asking for if you are dead set on > > depending on a specific data format rather than using generic > > interfaces. > > > > Jed, thank you. This is probably not the PETSc solution, but still it > might a solution! > > I have found this example for MatGetRowIJ: > I really do not think you want to do this. It is complex, fragile and I believe the performance improvement to be non-existent. You can get the effect you want JUST by using one function. For example, suppose you want 2 procs to get rows [0,5] and two procs to get rows [1,3], then procs A.B MatGetSubmatrices(A, 2, [0,5], 2, [0,5], ..., &submat) procs C, D MatGetSubmatrices(A, 2, [1,3], 2, [1,3], ..., &submat) and its done. No MPI, no extraction which depends on the Mat data structure. Matt > > http://www.stce.rwth-aachen.de/trac/petsc/browser/src/mat/examples/tests/ex79f.F?rev=a52934f9a5da430fdd891fa538a66c376435ec4c > > My understanding is that I need to: > > 1) get the sequential part of the matrix, i.e. those rows stored on > this processor > call MatMPIAIJGetSeqAIJ(A,Ad,Ao,icol,iicol,ierr) > > 2) get the indexes of these rows > call MatGetOwnershipRange(A,rstart,rend,ierr) > > 3) get the indexes i,j of the local portion of the matrix (compressed > form) > call MatGetRowIJ(Ad,one,zero,zero,n,ia,iia,ja,jja,done,ierr) > > 4) get the corresponding elements > call MatGetArray(Ad,aa,aaa,ierr) > > 5) WARNING: the row indexes obtained with MatGetRowIJ are local to > this processor, so they must be corrected with rstart to obtain the > corresponding global indexes > > 6) clean-up > > > Does this make sense? > > > Marco > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From spk at ldeo.columbia.edu Fri May 9 06:47:58 2014 From: spk at ldeo.columbia.edu (Samar Khatiwala) Date: Fri, 9 May 2014 07:47:58 -0400 Subject: [petsc-users] possible performance issues with PETSc on Cray In-Reply-To: <8761mgw0ol.fsf@jedbrown.org> References: <283192B3-CD15-4118-A688-EB6E500831A6@ldeo.columbia.edu> <8761mgw0ol.fsf@jedbrown.org> Message-ID: Hi Jed et al., Just wanted to report back on the resolution of this issue. The computing support people at HLRN in Germany submitted a test case to CRAY re. performance on their XC30. CRAY has finally gotten back with a solution, which is to use the run-time option -vecscatter_alltoall. Apparently this is a known issue and according to the HLRN folks passing this command line option to PETSc seems to work nicely. Thanks again for your help. 
Samar On Apr 11, 2014, at 7:44 AM, Jed Brown wrote: > Samar Khatiwala writes: > >> Hello, >> >> This is a somewhat vague query but I and a colleague have been running PETSc (3.4.3.0) on a Cray >> XC30 in Germany (https://www.hlrn.de/home/view/System3/WebHome) and the system administrators >> alerted us to some anomalies with our jobs that may or may not be related to PETSc but I thought I'd ask >> here in case others have noticed something similar. >> >> First, there was a large variation in run-time for identical jobs, sometimes as much as 50%. We didn't >> really pick up on this but other users complained to the IT people that their jobs were taking a performance >> hit with a similar variation in run-time. At that point we're told the IT folks started monitoring jobs and >> carrying out tests to see what was going on. They discovered that (1) this always happened when we were >> running our jobs and (2) the problem got worse with physical proximity to the nodes on which our jobs were >> running (what they described as a "strong interaction" between our jobs and others presumably through the >> communication network). > > It sounds like you are strong scaling (smallish subdomains) so that your > application is sensitive to network latency. I see significant > performance variability on XC-30 with this Full Multigrid solver that is > not using PETSc. > > http://59a2.org/files/hopper-vs-edison.3semilogx.png > > See the factor of 2 performance variability for the samples of the ~15M > element case. This operation is limited by instruction issue rather > than bandwidth (indeed, it is several times faster than doing the same > operations with assembled matrices). Here the variability is within the > same application performing repeated solves. If you get a different > partition on a different run, you can see larger variation. > > If your matrices are large enough, your performance will be limited by > memory bandwidth. (This is the typical case, but sufficiently small > matrices can fit in cache.) I once encountered a batch system that did > not properly reset nodes between runs, leaving a partially-filled > ramdisk distributed asymmetrically across the memory busses. This led > to 3x performance reduction on 4-socket nodes because much of the memory > demanded by the application would be faulted onto one memory bus. > Presumably your machine has a resource manager that would not allow such > things to happen. From knepley at gmail.com Fri May 9 06:50:01 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 May 2014 06:50:01 -0500 Subject: [petsc-users] possible performance issues with PETSc on Cray In-Reply-To: References: <283192B3-CD15-4118-A688-EB6E500831A6@ldeo.columbia.edu> <8761mgw0ol.fsf@jedbrown.org> Message-ID: On Fri, May 9, 2014 at 6:47 AM, Samar Khatiwala wrote: > Hi Jed et al., > > Just wanted to report back on the resolution of this issue. The computing > support people at HLRN in Germany > submitted a test case to CRAY re. performance on their XC30. CRAY has > finally gotten back with a solution, > which is to use the run-time option -vecscatter_alltoall. Apparently this > is a known issue and according to the > HLRN folks passing this command line option to PETSc seems to work nicely. > What this does is replace point-to-point communication (MPI_Send/Recv) with collective communication (MI_Alltoall). Thanks, Matt > Thanks again for your help. 
> > Samar > > On Apr 11, 2014, at 7:44 AM, Jed Brown wrote: > > > Samar Khatiwala writes: > > > >> Hello, > >> > >> This is a somewhat vague query but I and a colleague have been running > PETSc (3.4.3.0) on a Cray > >> XC30 in Germany (https://www.hlrn.de/home/view/System3/WebHome) and > the system administrators > >> alerted us to some anomalies with our jobs that may or may not be > related to PETSc but I thought I'd ask > >> here in case others have noticed something similar. > >> > >> First, there was a large variation in run-time for identical jobs, > sometimes as much as 50%. We didn't > >> really pick up on this but other users complained to the IT people that > their jobs were taking a performance > >> hit with a similar variation in run-time. At that point we're told the > IT folks started monitoring jobs and > >> carrying out tests to see what was going on. They discovered that (1) > this always happened when we were > >> running our jobs and (2) the problem got worse with physical proximity > to the nodes on which our jobs were > >> running (what they described as a "strong interaction" between our jobs > and others presumably through the > >> communication network). > > > > It sounds like you are strong scaling (smallish subdomains) so that your > > application is sensitive to network latency. I see significant > > performance variability on XC-30 with this Full Multigrid solver that is > > not using PETSc. > > > > http://59a2.org/files/hopper-vs-edison.3semilogx.png > > > > See the factor of 2 performance variability for the samples of the ~15M > > element case. This operation is limited by instruction issue rather > > than bandwidth (indeed, it is several times faster than doing the same > > operations with assembled matrices). Here the variability is within the > > same application performing repeated solves. If you get a different > > partition on a different run, you can see larger variation. > > > > If your matrices are large enough, your performance will be limited by > > memory bandwidth. (This is the typical case, but sufficiently small > > matrices can fit in cache.) I once encountered a batch system that did > > not properly reset nodes between runs, leaving a partially-filled > > ramdisk distributed asymmetrically across the memory busses. This led > > to 3x performance reduction on 4-socket nodes because much of the memory > > demanded by the application would be faulted onto one memory bus. > > Presumably your machine has a resource manager that would not allow such > > things to happen. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrestelli at gmail.com Fri May 9 07:19:24 2014 From: mrestelli at gmail.com (marco restelli) Date: Fri, 9 May 2014 14:19:24 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: <87lhubk3zf.fsf@jedbrown.org> Message-ID: 2014-05-09 13:14 GMT+0200, Matthew Knepley : > On Fri, May 9, 2014 at 3:15 AM, marco restelli wrote: > >> 2014-05-09 5:47 GMT+0200, Jed Brown : >> > marco restelli writes: >> >> Matt, thanks but this I don't understand. What I want is getting three >> >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I >> >> can send them around with MPI. 
>> >> >> >> MatGetSubmatrices would give me some PETSc objects, which I can not >> >> pass to MPI, right? >> > >> > I'm not sure you want this, but you can use MatGetRowIJ and similar to >> > access the representation you're asking for if you are dead set on >> > depending on a specific data format rather than using generic >> > interfaces. >> > >> >> Jed, thank you. This is probably not the PETSc solution, but still it >> might a solution! >> >> I have found this example for MatGetRowIJ: >> > > I really do not think you want to do this. It is complex, fragile and I > believe the performance > improvement to be non-existent. You can get the effect you want JUST by > using one function. > For example, suppose you want 2 procs to get rows [0,5] and two procs to > get rows [1,3], then > > procs A.B > > MatGetSubmatrices(A, 2, [0,5], 2, [0,5], ..., &submat) > > procs C, D > > MatGetSubmatrices(A, 2, [1,3], 2, [1,3], ..., &submat) > > and its done. No MPI, no extraction which depends on the Mat data > structure. Matt, I understand that the idea is to avoid using MPI, but I don't see how getting a submatrix is related to my problem. Probably a simpler version of my problem is the following: one matrix is distributed on procs. 0,1 another matrix is distributed on procs. 2,3 The two matrices have the same size and I want to add them. For the resulting matrix, I want two copies, one is distributed among 0,1 and the second one among 2,3. A possibility that I see now is creating a third matrix, with the same size, distributed among all the four processors: 0,1,2,3, setting it to zero and then letting processors 0,1 add their matrix, and also 2,3 add their own. Then I could convert the result into two matrices, making the two copies that I need. This works provided that in MatAXPY I can uses matrices distributed on different processors: given that the function computes Y = a*X + Y in my case it would be Y -> procs. 0,1,2,3 X -> procs. 0,1 Would this work? Marco From ant_mil at hotmail.com Fri May 9 07:29:39 2014 From: ant_mil at hotmail.com (Antonios Mylonakis) Date: Fri, 9 May 2014 15:29:39 +0300 Subject: [petsc-users] Errors in running Message-ID: Dear Sir or Madam I am a new PETSc user. I am using PETSc library with fortran. I have the following problem. I want to use the matrix-free form of krylov solvers. So I am starting by using the example ex14f.F.In this example, within subroutine mymult() I try call another subroutine which calculates the vector I need as the result of the matrix-vector multiplication.In this second subroutine the vector is defined as a simple array. (Is this the problem?) The problem is that I receive errors when I'm attempting to run the program. The problem seems to be related with memory, but I am not sure. The first line of errors can be seen below:"Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range" Could you help me? Thanks in advance -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri May 9 07:30:10 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 May 2014 07:30:10 -0500 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: <87lhubk3zf.fsf@jedbrown.org> Message-ID: On Fri, May 9, 2014 at 7:19 AM, marco restelli wrote: > 2014-05-09 13:14 GMT+0200, Matthew Knepley : > > On Fri, May 9, 2014 at 3:15 AM, marco restelli > wrote: > > > >> 2014-05-09 5:47 GMT+0200, Jed Brown : > >> > marco restelli writes: > >> >> Matt, thanks but this I don't understand. What I want is getting > three > >> >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I > >> >> can send them around with MPI. > >> >> > >> >> MatGetSubmatrices would give me some PETSc objects, which I can not > >> >> pass to MPI, right? > >> > > >> > I'm not sure you want this, but you can use MatGetRowIJ and similar to > >> > access the representation you're asking for if you are dead set on > >> > depending on a specific data format rather than using generic > >> > interfaces. > >> > > >> > >> Jed, thank you. This is probably not the PETSc solution, but still it > >> might a solution! > >> > >> I have found this example for MatGetRowIJ: > >> > > > > I really do not think you want to do this. It is complex, fragile and I > > believe the performance > > improvement to be non-existent. You can get the effect you want JUST by > > using one function. > > For example, suppose you want 2 procs to get rows [0,5] and two procs to > > get rows [1,3], then > > > > procs A.B > > > > MatGetSubmatrices(A, 2, [0,5], 2, [0,5], ..., &submat) > > > > procs C, D > > > > MatGetSubmatrices(A, 2, [1,3], 2, [1,3], ..., &submat) > > > > and its done. No MPI, no extraction which depends on the Mat data > > structure. > > Matt, I understand that the idea is to avoid using MPI, but I don't > see how getting a submatrix is related to my problem. > > Probably a simpler version of my problem is the following: > > one matrix is distributed on procs. 0,1 > another matrix is distributed on procs. 2,3 > > The two matrices have the same size and I want to add them. For the > resulting matrix, I want two copies, one is distributed among 0,1 and > the second one among 2,3. > If you want distributed matrices to come out you could make one call to MatGetSubmatrix() for each group, but that is unattractive for a large number of groups. I am not seeing the value you get by distributing these matrices if you are just going to make copies later. Matt A possibility that I see now is creating a third matrix, with the same > size, distributed among all the four processors: 0,1,2,3, setting it > to zero and then letting processors 0,1 add their matrix, and also 2,3 > add their own. Then I could convert the result into two matrices, > making the two copies that I need. > > This works provided that in MatAXPY I can uses matrices distributed on > different processors: given that the function computes > > Y = a*X + Y > > in my case it would be > Y -> procs. 0,1,2,3 > X -> procs. 0,1 > > Would this work? > > Marco > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Fri May 9 07:30:08 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 09 May 2014 07:30:08 -0500 Subject: [petsc-users] possible performance issues with PETSc on Cray In-Reply-To: References: <283192B3-CD15-4118-A688-EB6E500831A6@ldeo.columbia.edu> <8761mgw0ol.fsf@jedbrown.org> Message-ID: <87fvkjjfsf.fsf@jedbrown.org> Samar Khatiwala writes: > CRAY has finally gotten back with a solution, > which is to use the run-time option -vecscatter_alltoall. Apparently > this is a known issue and according to the HLRN folks passing this > command line option to PETSc seems to work nicely. This option is good when you have nearly-dense rows or columns (in terms of processors depended on). For problems with actual dense rows or columns, it is good to formulate as a sparse matrix plus a low-rank correction. The other cases are usually poor dof layout, and reordering will make the graph sparser. Sparse problems with good layout usually run faster with the default VecScatter, though there are exceptions (mostly non-PDE problems). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From knepley at gmail.com Fri May 9 07:31:23 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 May 2014 07:31:23 -0500 Subject: [petsc-users] Errors in running In-Reply-To: References: Message-ID: On Fri, May 9, 2014 at 7:29 AM, Antonios Mylonakis wrote: > Dear Sir or Madam > > I am a new PETSc user. I am using PETSc library with fortran. > I have the following problem. I want to use the matrix-free form of krylov > solvers. So I am starting by using the example ex14f.F. > In this example, within subroutine mymult() I try call another subroutine > which calculates the vector I need as the result of the matrix-vector > multiplication.In this second subroutine the vector is defined as a simple > array. (Is this the problem?) > The problem is that I receive errors when I'm attempting to run the > program. The problem seems to be related with memory, but I am not sure. > > The first line of errors can be seen below: > "Caught signal number 11 SEGV: Segmentation Violation, probably memory > access out of range > Always send the entire error meesage. The rest of the message tells you to run valgrind. Matt > Could you help me? > > Thanks in advance > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From spk at ldeo.columbia.edu Fri May 9 07:40:37 2014 From: spk at ldeo.columbia.edu (Samar Khatiwala) Date: Fri, 9 May 2014 08:40:37 -0400 Subject: [petsc-users] possible performance issues with PETSc on Cray In-Reply-To: <87fvkjjfsf.fsf@jedbrown.org> References: <283192B3-CD15-4118-A688-EB6E500831A6@ldeo.columbia.edu> <8761mgw0ol.fsf@jedbrown.org> <87fvkjjfsf.fsf@jedbrown.org> Message-ID: Hi Jed, This is useful to know. My matrices are all very sparse but just may not be ordered optimally (there's a problem-specific reason why I order them in a certain way). That said, this is the first time in many years of similar computations with similar matrices that I've encountered this problem. It may just be peculiar to the XC30's. 
Thanks, Samar On May 9, 2014, at 8:30 AM, Jed Brown wrote: > Samar Khatiwala writes: >> CRAY has finally gotten back with a solution, >> which is to use the run-time option -vecscatter_alltoall. Apparently >> this is a known issue and according to the HLRN folks passing this >> command line option to PETSc seems to work nicely. > > This option is good when you have nearly-dense rows or columns (in terms > of processors depended on). For problems with actual dense rows or > columns, it is good to formulate as a sparse matrix plus a low-rank > correction. The other cases are usually poor dof layout, and reordering > will make the graph sparser. Sparse problems with good layout usually > run faster with the default VecScatter, though there are exceptions > (mostly non-PDE problems). From bsmith at mcs.anl.gov Fri May 9 08:00:29 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 9 May 2014 08:00:29 -0500 Subject: [petsc-users] Errors in running In-Reply-To: References: Message-ID: <85533106-1C5E-498F-8BB5-F078CD218E05@mcs.anl.gov> On May 9, 2014, at 7:29 AM, Antonios Mylonakis wrote: > Dear Sir or Madam > > I am a new PETSc user. I am using PETSc library with fortran. > I have the following problem. I want to use the matrix-free form of krylov solvers. So I am starting by using the example ex14f.F. > In this example, within subroutine mymult() I try call another subroutine which calculates the vector I need as the result of the matrix-vector multiplication.In this second subroutine the vector is defined as a simple array. (Is this the problem?) A PETSc Vec is NOT a simple array you cannot do something like myroutine( x) double x(*) ?. anotherroutine(y) Vec y call myroutine(y) to access local entries in PETSc Vec directly you need to call VecGetArray() or VecGetArrayF90() Barry > The problem is that I receive errors when I'm attempting to run the program. The problem seems to be related with memory, but I am not sure. > > The first line of errors can be seen below: > "Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range" > > > Could you help me? > > Thanks in advance From puesoek at uni-mainz.de Fri May 9 09:50:36 2014 From: puesoek at uni-mainz.de (=?iso-8859-1?Q?P=FCs=F6k=2C_Adina-Erika?=) Date: Fri, 9 May 2014 14:50:36 +0000 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> Message-ID: <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> Yes, I tested the implementation with both MatZeroRowsIS() and MatZeroRowsColumnsIS(). But first, I will be more explicit about the problem I was set to solve: We have a Dirichlet block of size (L,W,H) and centered (xc,yc,zc), which is much smaller than the model domain, and we set Vx = Vpush, Vy=0 within the block (Vz is let free for easier convergence). As I said before, since the code does not have a monolithic matrix, but 4 submatrices (VV VP; PV PP), and the rhs has 2 sub vectors rhs=(f; g), my approach is to modify only (VV, VP, f) for the Dirichlet BC. 
The way I tested the implementation: 1) Output (VV, VP, f, Dirichlet dofs) - unmodified (no Dirichlet BC) 2) Output (VV, VP, f, Dirichlet dofs) - a) modified with MatZeroRowsIS(), - b) modified with MatZeroRowsColumnsIS() -> S_PETSc Again, the only difference between a) and b) is: // ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); // ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); 3) Read them in Matlab and perform the exact same operations on the unmodified matrices and f vector. -> S_Matlab 4) Compare S_PETSc with S_Matlab. If the implementation is correct, they should be equal (VV, VP, f). 5) Check for 1 cpu and 4 cpus. Now to answer your questions: a,b,d) Yes, matrix modification is done correctly (check the spy diagrams below) in all cases: MatZeroRowsIS() and MatZeroRowsColumnsIS() on 1 and 4 cpus. I should have said that in the piece of code above: v_vv = 1.0; v_vp = 0.0; The vector x_push is a duplicate of rhs, with zero elements except the values for the Dirichlet dofs. c) The rhs is a different matter. With MatZeroRows() there is no problem. The rhs is equivalent with the one in Matlab, sequential and parallel. However, with MatZeroRowsColumns(), the residual contains nonzero elements, and in parallel the nonzero pattern is even bigger (1 cpu - 63, 4 cpu - 554). But if you look carefully, the values of the nonzero residuals are very small < +/- 1e-10. So, I did a tolerance filter: tol = 1e-10; res = f_petsc - f_mod_matlab; for i=1:length(res) if abs(res(i))>0 & abs(res(i))> wrote: Hello! I was trying to implement some internal Dirichlet boundary conditions into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to create an internal block (let's say Dirichlet block) that moves with constant velocity within the domain (i.e. check all the dofs within the block and set the values accordingly to the desired motion). Ideally, this means to zero the rows and columns in VV, VP, PV corresponding to the dirichlet dofs and modify the corresponding rhs values. However, since we have submatrices and not a monolithic matrix A, we can choose to modify only VV and PV matrices. The global indices of the velocity points within the Dirichlet block are contained in the arrays rowid_array. What I want to point out is that the function MatZeroRowsColumnsIS() seems to create parallel artefacts, compared to MatZeroRowsIS() when run on more than 1 processor. Moreover, the results on 1 cpu are identical. See below the results of the test (the Dirichlet block is outlined in white) and the piece of the code involved where the 1) - 2) parts are the only difference. I am assuming that you are showing the result of solving the equations. It would be more useful, and presumably just as easy to say: a) Are the correct rows zeroed out? b) Is the diagonal element correct? c) Is the rhs value correct? d) Are the columns zeroed correctly? If we know where the problem is, its easier to fix. For example, if the rhs values are correct and the rows are zeroed, then something is wrong with the solution procedure. Since ZeroRows() works and ZeroRowsColumns() does not, this is a distinct possibility. 
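As an illustration of those checks, a rough debugging fragment in the style of the thread's code (it assumes row is the global index of one Dirichlet dof owned by this process and reuses the thread's names VV_MAT, rhs, v_vv, x_push; it is only meant as a sanity check, not part of the original code):

/* 'row' is assumed to be the global index of a Dirichlet dof owned by this process */
PetscInt          ncols, j;
const PetscInt    *cols;
const PetscScalar *vals;
PetscScalar       bval;

ierr = MatGetRow(VV_MAT, row, &ncols, &cols, &vals); CHKERRQ(ierr);
for (j = 0; j < ncols; j++) {
  if (cols[j] == row) {
    /* (b) the diagonal entry should equal v_vv */
    PetscPrintf(PETSC_COMM_SELF, "diagonal (%D,%D) = %g\n", row, row, (double)PetscRealPart(vals[j]));
  } else if (vals[j] != 0.0) {
    /* (a),(d) a properly zeroed row has no other nonzeros */
    PetscPrintf(PETSC_COMM_SELF, "row %D still has a nonzero in column %D: %g\n", row, cols[j], (double)PetscRealPart(vals[j]));
  }
}
ierr = MatRestoreRow(VV_MAT, row, &ncols, &cols, &vals); CHKERRQ(ierr);

/* (c) the rhs entry of a zeroed row should be v_vv * x_push(row) */
ierr = VecGetValues(rhs, 1, &row, &bval); CHKERRQ(ierr);
PetscPrintf(PETSC_COMM_SELF, "rhs(%D) = %g\n", row, (double)PetscRealPart(bval));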
Thanks, Matt

Thanks, Adina Pusok

// Create an IS required by MatZeroRows()
ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array,PETSC_COPY_VALUES,&isx); CHKERRQ(ierr);
ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array,PETSC_COPY_VALUES,&isy); CHKERRQ(ierr);
ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array,PETSC_COPY_VALUES,&isz); CHKERRQ(ierr);

1) /* ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr);
ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr);
ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr);*/

2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr);
ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr);
ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr);

ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);
ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);
ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);

ierr = ISDestroy(&isx); CHKERRQ(ierr);
ierr = ISDestroy(&isy); CHKERRQ(ierr);
ierr = ISDestroy(&isz); CHKERRQ(ierr);

Results (velocity) with MatZeroRowsColumnsIS(): 1cpu 4cpu
Results (velocity) with MatZeroRowsIS(): 1cpu 4cpu
-- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
-------------- next part -------------- An HTML attachment was scrubbed... URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: spy_zerorows_1cpu.png Type: image/png Size: 15916 bytes Desc: spy_zerorows_1cpu.png URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: spy_zerorows_4cpu.png Type: image/png Size: 17690 bytes Desc: spy_zerorows_4cpu.png URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: spy_zerorowscol_1cpu.png Type: image/png Size: 16300 bytes Desc: spy_zerorowscol_1cpu.png URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: spy_zerorowscol_4cpu.png Type: image/png Size: 18174 bytes Desc: spy_zerorowscol_4cpu.png URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: residual_tol.png Type: image/png Size: 3577 bytes Desc: residual_tol.png URL:
From bsmith at mcs.anl.gov Fri May 9 14:31:01 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 9 May 2014 14:31:01 -0500 Subject: [petsc-users] Errors in running In-Reply-To: References: , <85533106-1C5E-498F-8BB5-F078CD218E05@mcs.anl.gov> Message-ID:

Always respond to ALL on mailing lists, otherwise you may never get an answer.

Since xx_v() is a PetscScalar, pointer :: xx_v(:), I believe it is best if your myroutine() takes a PetscScalar, pointer :: xx_v(:) argument, not a real::.

Also, by default PetscScalar is a double precision number. If you wish PETSc to use single precision numbers then you must ./configure it with --with-precision=single

Barry

On May 9, 2014, at 9:23 AM, Antonios Mylonakis wrote:
> Thanks for your help.
>
> So if I understand well I should do sth like this (?):
>
> myroutine(x)
> real:: x(2)
>
> ....
>
> end
>
>
> anotherroutine(y)
> vec y
> PetscScalar, pointer :: xx_v(:)
>
> call myroutine(y)
> VecGetArrayF90(y,xx_v,ierr)
> edit xx_v
> VecRestoreArray(y,xx_v,ierr)
> ...
> end > > > >> Subject: Re: [petsc-users] Errors in running >> From: bsmith at mcs.anl.gov >> Date: Fri, 9 May 2014 08:00:29 -0500 >> CC: petsc-users at mcs.anl.gov >> To: ant_mil at hotmail.com >> >> >> On May 9, 2014, at 7:29 AM, Antonios Mylonakis wrote: >> >>> Dear Sir or Madam >>> >>> I am a new PETSc user. I am using PETSc library with fortran. >>> I have the following problem. I want to use the matrix-free form of krylov solvers. So I am starting by using the example ex14f.F. >>> In this example, within subroutine mymult() I try call another subroutine which calculates the vector I need as the result of the matrix-vector multiplication.In this second subroutine the vector is defined as a simple array. (Is this the problem?) >> >> A PETSc Vec is NOT a simple array you cannot do something like >> >> myroutine( x) >> double x(*) >> ?. >> >> >> anotherroutine(y) >> Vec y >> call myroutine(y) >> >> to access local entries in PETSc Vec directly you need to call VecGetArray() or VecGetArrayF90() >> >> Barry >> >> >>> The problem is that I receive errors when I'm attempting to run the program. The problem seems to be related with memory, but I am not sure. >>> >>> The first line of errors can be seen below: >>> "Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range" >>> >>> >>> Could you help me? >>> >>> Thanks in advance >> From info at jubileedvds.com Sat May 10 21:40:26 2014 From: info at jubileedvds.com (Jubilee DVDs) Date: Sun, 11 May 2014 04:40:26 +0200 (SAST) Subject: [petsc-users] Jubilee DVDs Newsletter Message-ID: <1195896-1399775882773-133838-250313049-1-0@b.ss51.mailboxesmore.com> An HTML attachment was scrubbed... URL: From likunt at caltech.edu Sat May 10 23:11:22 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Sat, 10 May 2014 21:11:22 -0700 (PDT) Subject: [petsc-users] about VecScatter Message-ID: <51984.131.215.220.161.1399781482.squirrel@webmail.caltech.edu> Dear Petsc developers, I have a vector object M, I need all the elements of it in all the processors. Here is a part of my code ////////////////////////////////////////////////////////////// Vec M; VecScatterCreateToAll(M,&scatter_ctx,&N); VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); VecGetArray(N, &aM); for(i=xs; i References: <51984.131.215.220.161.1399781482.squirrel@webmail.caltech.edu> Message-ID: <87oaz5c5cj.fsf@jedbrown.org> likunt at caltech.edu writes: > Dear Petsc developers, > > I have a vector object M, I need all the elements of it in all the > processors. > > Here is a part of my code > > ////////////////////////////////////////////////////////////// > Vec M; > VecScatterCreateToAll(M,&scatter_ctx,&N); > VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > > VecGetArray(N, &aM); > > for(i=xs; i { > //within the loop, requires all the elements of aM > } > //////////////////////////////////////////////////////////// > > but this seems not working well. The phrase "not working" should never appear unqualified in polite conversation. Send steps to reproduce, what you expect, and what you observe. > Would you please suggest a more efficient way? Thank you. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From likunt at caltech.edu Sat May 10 23:40:55 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Sat, 10 May 2014 21:40:55 -0700 (PDT) Subject: [petsc-users] about VecScatter In-Reply-To: <87oaz5c5cj.fsf@jedbrown.org> References: <51984.131.215.220.161.1399781482.squirrel@webmail.caltech.edu> <87oaz5c5cj.fsf@jedbrown.org> Message-ID: <52104.131.215.220.161.1399783255.squirrel@webmail.caltech.edu> Dear Jed, Thanks for your reply. Below is a more complete version of the code. I need to loop over all the elements of aM to compute a new Vector called result. But this process is very slow, I would appreciate if you can give advice on speeding it up. Many thanks. ////////////////////////////////////////////////////////////// Vec M,N,result; DM da; DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,NODE,1,1,NULL,&da); DMCreateGlobalVector(da, &M); //set values of M .. VecScatterCreateToAll(M,&scatter_ctx,&N); VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); VecGetArray(N, &aM); DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); for(i=xs; i likunt at caltech.edu writes: > >> Dear Petsc developers, >> >> I have a vector object M, I need all the elements of it in all the >> processors. >> >> Here is a part of my code >> >> ////////////////////////////////////////////////////////////// >> Vec M; >> VecScatterCreateToAll(M,&scatter_ctx,&N); >> VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); >> VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); >> >> VecGetArray(N, &aM); >> >> for(i=xs; i > What are xs and xm in this setting. What do you intend? > >> { >> //within the loop, requires all the elements of aM >> } >> //////////////////////////////////////////////////////////// >> >> but this seems not working well. > > The phrase "not working" should never appear unqualified in polite > conversation. Send steps to reproduce, what you expect, and what you > observe. > >> Would you please suggest a more efficient way? Thank you. > From knepley at gmail.com Sun May 11 06:27:25 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 11 May 2014 06:27:25 -0500 Subject: [petsc-users] about VecScatter In-Reply-To: <52104.131.215.220.161.1399783255.squirrel@webmail.caltech.edu> References: <51984.131.215.220.161.1399781482.squirrel@webmail.caltech.edu> <87oaz5c5cj.fsf@jedbrown.org> <52104.131.215.220.161.1399783255.squirrel@webmail.caltech.edu> Message-ID: On Sat, May 10, 2014 at 11:40 PM, wrote: > Dear Jed, > > Thanks for your reply. Below is a more complete version of the code. I > need to loop over all the elements of aM to compute a new Vector called > result. But this process is very slow, I would appreciate if you can give > advice on speeding it up. Many thanks. > > ////////////////////////////////////////////////////////////// > Vec M,N,result; > DM da; > DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,NODE,1,1,NULL,&da); > DMCreateGlobalVector(da, &M); > //set values of M .. > VecScatterCreateToAll(M,&scatter_ctx,&N); > VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > VecGetArray(N, &aM); > 1) Everything below is nonsensical. The values of M are already in N. 2) Sending all values to a single process is inherently slow. 
It should not be done in parallel computing Matt > DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); > for(i=xs; i { > val=0.0; > for(j=0; j { > val=val+aM[j]; > > } > VecSetValues(result, 1, i, val, INSERT_VALUES); > } > > VecRestoreArray(N, &aM); > VecAssemblyBegin(result); > VecAssemblyEnd(result); > //////////////////////////////////////////////////////////// > > > > likunt at caltech.edu writes: > > > >> Dear Petsc developers, > >> > >> I have a vector object M, I need all the elements of it in all the > >> processors. > >> > >> Here is a part of my code > >> > >> ////////////////////////////////////////////////////////////// > >> Vec M; > >> VecScatterCreateToAll(M,&scatter_ctx,&N); > >> VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > >> VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > >> > >> VecGetArray(N, &aM); > >> > >> for(i=xs; i > > > What are xs and xm in this setting. What do you intend? > > > >> { > >> //within the loop, requires all the elements of aM > >> } > >> //////////////////////////////////////////////////////////// > >> > >> but this seems not working well. > > > > The phrase "not working" should never appear unqualified in polite > > conversation. Send steps to reproduce, what you expect, and what you > > observe. > > > >> Would you please suggest a more efficient way? Thank you. > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlk0812 at hotmail.com Sun May 11 20:07:17 2014 From: tlk0812 at hotmail.com (LikunTan) Date: Mon, 12 May 2014 09:07:17 +0800 Subject: [petsc-users] Normalize vectors Message-ID: Dear Petsc developers, I have a vector M which consists of a series of 3d vectors, and I want to reset M by normalizing each 3d vector. Here is my code: /**********************************************************************VecGetArray(M, &aM); DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); for(node=xs; nodeNDOF; l++) { aM[node*3+l]=val[l]/mag; }}VecRestoreArray(M, &aM); VecAssemblyBegin(M);VecAssemblyEnd(M); VecView(M, PETSC_VIEWER_STDOUT_WORLD);**********************************************************************/ but I got the error at the last step:--------------------------------------------------------------------------mpiexec noticed that process rank 3 with PID 17156 on node compute-21-8.local exited on signal 6 (Aborted).--------------------------------------------------------------------------and if I commented out VecView, and used vector M for other operations, e.g.KSPSolve(ksp, M, b), I got "memory corruption" message. Your comment on this issue is well appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 11 20:09:40 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 11 May 2014 20:09:40 -0500 Subject: [petsc-users] Normalize vectors In-Reply-To: References: Message-ID: On Sun, May 11, 2014 at 8:07 PM, LikunTan wrote: > Dear Petsc developers, > > I have a vector M which consists of a series of 3d vectors, and I want to > reset M by normalizing each 3d vector. 
Here is my code:
>
> /**********************************************************************
> VecGetArray(M, &aM);
>
It's easier not to make a mistake if you use DMDAVecGetArrayDOF()

   Matt

> DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0);
> for(node=xs; node<xs+xm; node++)
> {
> mag=0.0;
> for(l=0; l<3; l++)
> {
> val[l]=aM[node*3+l];
> mag=mag+val[l]*val[l];
> }
> mag=sqrt(mag);
> for(l=0; l<NDOF; l++)
> {
> aM[node*3+l]=val[l]/mag;
> }
> }
> VecRestoreArray(M, &aM);
> VecAssemblyBegin(M);
> VecAssemblyEnd(M);
> VecView(M, PETSC_VIEWER_STDOUT_WORLD);
> **********************************************************************/
>
> but I got the error at the last step:
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 3 with PID 17156 on node
> compute-21-8.local exited on signal 6 (Aborted).
> --------------------------------------------------------------------------
> and if I commented out VecView, and used vector M for other operations,
> e.g.
> KSPSolve(ksp, M, b), I got "memory corruption" message. Your comment on
> this issue is well appreciated.
>
-- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From bsmith at mcs.anl.gov Sun May 11 21:08:11 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 11 May 2014 21:08:11 -0500 Subject: [petsc-users] Normalize vectors In-Reply-To: References: Message-ID: <8FBEE805-C520-4700-A9D7-97D4DB4EDC79@mcs.anl.gov>

On May 11, 2014, at 8:07 PM, LikunTan wrote:
> Dear Petsc developers,
>
> I have a vector M which consists of a series of 3d vectors, and I want to reset M by normalizing each 3d vector. Here is my code:

This code is wrong in several ways

> /**********************************************************************
> VecGetArray(M, &aM);

This always returns an array that is indexed starting at 0, so the code below when you access with aM[node*3+l] is like totally reading from the wrong place. Instead use

typedef struct { PetscScalar x,y,z; } Field;
Field *aM;
DMDAVecGetArray(da,M,&aM);

This routine returns an array whose index starts at xs

> DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0);
>
> for(node=xs; node<xs+xm; node++)
> {
> mag=PetscSqrtScalar(aM[node].x*aM[node].x + aM[node].y*aM[node].y + aM[node].z*aM[node].z);
> if (mag != 0.0) {
> aM[node].x /= mag;
> aM[node].y /= mag;
> aM[node].z /= mag; }
> }
> DMDAVecRestoreArray(da,M, &aM);
>
When accessing the vector arrays directly you do not need VecAssemblyBegin/End(); they are only for use with VecSetValues(). Also use valgrind to find memory access problems: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind

> VecView(M, PETSC_VIEWER_STDOUT_WORLD);
> **********************************************************************/
>
> but I got the error at the last step:
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 3 with PID 17156 on node compute-21-8.local exited on signal 6 (Aborted).
> --------------------------------------------------------------------------
> and if I commented out VecView, and used vector M for other operations, e.g.
> KSPSolve(ksp, M, b), I got "memory corruption" message. Your comment on this issue is well appreciated.
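Putting Barry's suggestions together, a self-contained sketch of the normalization (it assumes the 1d DMDA was created with dof = 3 and real scalars; the routine name NormalizeNodes and the Field layout are illustrative):

#include <petscdmda.h>

typedef struct { PetscScalar x, y, z; } Field;

/* Normalize each 3-component node of a global vector attached to a 1d DMDA, in place. */
PetscErrorCode NormalizeNodes(DM da, Vec M)
{
  Field          *aM;
  PetscInt       xs, xm, node;
  PetscReal      mag;
  PetscErrorCode ierr;

  ierr = DMDAVecGetArray(da, M, &aM);CHKERRQ(ierr);
  ierr = DMDAGetCorners(da, &xs, NULL, NULL, &xm, NULL, NULL);CHKERRQ(ierr);
  for (node = xs; node < xs + xm; node++) {
    mag = PetscSqrtReal(PetscRealPart(aM[node].x*aM[node].x
                                      + aM[node].y*aM[node].y
                                      + aM[node].z*aM[node].z));
    if (mag != 0.0) {            /* leave zero vectors untouched */
      aM[node].x /= mag;
      aM[node].y /= mag;
      aM[node].z /= mag;
    }
  }
  ierr = DMDAVecRestoreArray(da, M, &aM);CHKERRQ(ierr);
  /* no VecAssemblyBegin/End needed: the array was modified in place */
  return 0;
}

It can be called right after the values of M are set, e.g. ierr = NormalizeNodes(da, M);CHKERRQ(ierr);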
From hzhang at mcs.anl.gov Mon May 12 10:28:29 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Mon, 12 May 2014 10:28:29 -0500 Subject: [petsc-users] MatGetMumpsRINFOG() In-Reply-To: <16ccafbcb0c941b09a063946350a6688@GEORGE.anl.gov> References: <16ccafbcb0c941b09a063946350a6688@GEORGE.anl.gov> Message-ID: Zafer, MatGetMumpsXXX() are added to petsc development https://bitbucket.org/petsc/petsc/commits/d28e04f7b1f5a73d3305399116f492322ea7448c Hong On Tue, May 6, 2014 at 11:39 PM, Zafer Leylek wrote: > Hi, > > I am trying to get mumps to return the matrix determinant. I have set the > ICNTL option using: > > MatMumpsSetIcntl(A,33,1); > > and can view the determinant using > > PCView(pc, PETSC_VIEWER_STDOUT_WORLD); > > I need to use the determinant in my code. Is there a way I can get petsc to > return this parameter. If not, is it possible to implement the > MatGetMumpsRINFOG() as suggested in: > > http://lists.mcs.anl.gov/pipermail/petsc-users/2011-September/010225.html > > King Regards > > ZL From likunt at caltech.edu Mon May 12 12:27:51 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Mon, 12 May 2014 10:27:51 -0700 (PDT) Subject: [petsc-users] solving Ax=b with constant A Message-ID: <58653.131.215.220.165.1399915671.squirrel@webmail.caltech.edu> Dear Petsc developers, I am solving a linear system Ax=b, while A is constant and b is changing in each time step. Here is the code I wrote: /**************************************************************** ...compute matrix A... KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr); KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER); CHKERRQ(ierr); KSPSetTolerances(ksp, 1.e-5, 1.E-50, PETSC_DEFAULT, PETSC_DEFAULT); KSPSetFromOptions(ksp); for(int step=0; step References: <58653.131.215.220.165.1399915671.squirrel@webmail.caltech.edu> Message-ID: On Mon, May 12, 2014 at 12:27 PM, wrote: > Dear Petsc developers, > > I am solving a linear system Ax=b, while A is constant and b is changing > in each time step. Here is the code I wrote: > > /**************************************************************** > ...compute matrix A... > KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr); > KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER); CHKERRQ(ierr); > KSPSetTolerances(ksp, 1.e-5, 1.E-50, PETSC_DEFAULT, PETSC_DEFAULT); > KSPSetFromOptions(ksp); > for(int step=0; step { > ... compute vector b ... > KSPSolve(ksp, b, x); > } > *****************************************************************/ > > I tested a system with size 1725*1725, on 4 processors, it takes 0.06s. > Would you please let me know if there is a way to improve its efficiency? > It would be amazing if we could do that given the description. First, we do not know exactly what solver is being used (-ksp_view), but lets assume its GMRES/ILU(0) which is the default. Second, we have no idea what the convergence was like (-ksp_monitor_true_residual -ksp_converged_reason), so we do not know what the bottleneck is, and have no performance monitoring (-log_summary). Lastly, even if we had that we have no idea what the operator is so that we could make intelligent suggestions for other preconditioners. Matt > Thanks. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
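For the constant-matrix loop above, the first step is the diagnostics Matt lists: run with -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_summary to see which solver is actually used and where the time goes. The same information can also be collected per step in the code; the fragment below is only a sketch that reuses the ksp, b and x from the post, and the idea of printing every step (and the message wording) is illustrative rather than something from the thread.

/* Inside the time loop, after computing b for this step: */
PetscInt           its;
KSPConvergedReason reason;
PetscErrorCode     ierr;

ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD, "step solve: %D iterations, converged reason %D\n", its, (PetscInt)reason);CHKERRQ(ierr);

If the iteration counts stay small and stable, a 1725-by-1725 system is small enough that setup and communication can dominate the 0.06 s on 4 processes; if they grow, a different preconditioner (chosen once the operator is known, as Matt says) is the place to look.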
URL: From lu_qin_2000 at yahoo.com Mon May 12 16:54:17 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Mon, 12 May 2014 14:54:17 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc In-Reply-To: References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> Message-ID: <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> Hello, I have built?PETSc with SuperLU,?but what are?PETSc's command line?options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) ? Do I need to do some programming in order to call SuperLU's preconditioner,?or the command line options would work??? ? Many thanks, Qin??? ?From: Xiaoye S. Li To: Barry Smith Cc: Qin Lu ; "petsc-users at mcs.anl.gov" Sent: Friday, May 2, 2014 3:40 PM Subject: Re: [petsc-users] ILUTP in PETSc The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. ? In SuperLU distribution: ? EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) ? SRC/zgsitrf.c : the actual ILUTP factorization routine Sherry Li On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: >At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html ?there are two listed. ./configure ?download-hypre > >mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid > >you can also add -help to see what options are available. > >? Both pretty much suck and I can?t image much reason for using them. > >? ?Barry > > > >On May 2, 2014, at 10:27 AM, Qin Lu wrote: > >> Hello, >> >> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >> >> Many thanks, >> Qin > >?? ?? From bsmith at mcs.anl.gov Mon May 12 17:11:12 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 12 May 2014 17:11:12 -0500 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> Message-ID: <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html On May 12, 2014, at 4:54 PM, Qin Lu wrote: > Hello, > > I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) > > Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work? > > Many thanks, > Qin > > > From: Xiaoye S. Li > To: Barry Smith > Cc: Qin Lu ; "petsc-users at mcs.anl.gov" > Sent: Friday, May 2, 2014 3:40 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > > > The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. 
> > In SuperLU distribution: > > EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) > > SRC/zgsitrf.c : the actual ILUTP factorization routine > > > Sherry Li > > > > On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: > > >> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre >> >> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid >> >> you can also add -help to see what options are available. >> >> Both pretty much suck and I can?t image much reason for using them. >> >> Barry >> >> >> >> On May 2, 2014, at 10:27 AM, Qin Lu wrote: >> >>> Hello, >>> >>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >>> >>> Many thanks, >>> Qin >> >> From zonexo at gmail.com Mon May 12 21:52:08 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Tue, 13 May 2014 10:52:08 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <534C9DB5.9070407@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> Message-ID: <537188D8.2030307@gmail.com> Hi, I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. Thank you. Yours sincerely, TAY wee-beng On 21/4/2014 8:58 AM, Barry Smith wrote: > Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. > > Barry > > On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: > >> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>>>> >>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>>>> >>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>>> Hmm, >>>>>>> >>>>>>> Interface DMDAVecGetArrayF90 >>>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>>> USE_DM_HIDE >>>>>>> DM_HIDE da1 >>>>>>> VEC_HIDE v >>>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>>> PetscErrorCode ierr >>>>>>> End Subroutine >>>>>>> >>>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? 
>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>> Hi, >>>>>>> >>>>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>>>> >>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>>>> >>>>>>> Also, supposed I call: >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> u_array .... >>>>>>> >>>>>>> v_array .... etc >>>>>>> >>>>>>> Now to restore the array, does it matter the sequence they are restored? >>>>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>>>> >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>>> >>>>>>> u_array = 0.d0 >>>>>>> >>>>>>> v_array = 0.d0 >>>>>>> >>>>>>> w_array = 0.d0 >>>>>>> >>>>>>> p_array = 0.d0 >>>>>>> >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>>> >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>>>> >>>>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>> Hi Matt, >>>>>> >>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>> >>>>>> It already has DMDAVecGetArray(). Just run it. >>>>> Hi, >>>>> >>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>> >>>>> No the global/local difference should not matter. >>>>> >>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>> >>>>> DMGetLocalVector() >>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>> >>>> If so, when should I call them? >>>> >>>> You just need a local vector from somewhere. >> Hi, >> >> Anyone can help with the questions below? Still trying to find why my code doesn't work. >> >> Thanks. 
>>> Hi, >>> >>> I insert part of my error region code into ex11f90: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>> >>> u_array = 0.d0 >>> >>> v_array = 0.d0 >>> >>> w_array = 0.d0 >>> >>> p_array = 0.d0 >>> >>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>> >>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>> >>> module solve >>> <- add include file? >>> subroutine RRK >>> <- add include file? >>> end subroutine RRK >>> >>> end module solve >>> >>> So where should the include files (#include ) be placed? >>> >>> After the module or inside the subroutine? >>> >>> Thanks. >>>> Matt >>>> >>>> Thanks. >>>>> Matt >>>>> >>>>> Thanks. >>>>>> Matt >>>>>> >>>>>> Thanks >>>>>> >>>>>> Regards. >>>>>>> Matt >>>>>>> >>>>>>> As in w, then v and u? >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> thanks >>>>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>>>> Hi, >>>>>>> >>>>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>>> >>>>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>>>> >>>>>>> >>>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> Thanks. >>>>>>> Barry >>>>>>> >>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>>>> >>>>>>> However, by re-writing my code, I found out a few things: >>>>>>> >>>>>>> 1. if I write my code this way: >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> u_array = .... >>>>>>> >>>>>>> v_array = .... 
>>>>>>> >>>>>>> w_array = .... >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> The code runs fine. >>>>>>> >>>>>>> 2. if I write my code this way: >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>>> >>>>>>> where the subroutine is: >>>>>>> >>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>> >>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>> >>>>>>> u ... >>>>>>> v... >>>>>>> w ... >>>>>>> >>>>>>> end subroutine uvw_array_change. >>>>>>> >>>>>>> The above will give an error at : >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> So they are now in reversed order. Now it works. >>>>>>> >>>>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>>>> >>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>> >>>>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>> >>>>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>> >>>>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>> >>>>>>> u ... >>>>>>> v... >>>>>>> w ... >>>>>>> >>>>>>> end subroutine uvw_array_change. >>>>>>> >>>>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>>>>> >>>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>>>> >>>>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>>>> >>>>>>> Thank you. >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> >>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>>>> >>>>>>> Hi Barry, >>>>>>> >>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>>>> >>>>>>> I have attached my code. 
>>>>>>> >>>>>>> Thank you >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>>> Please send the code that creates da_w and the declarations of w_array >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>>> >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Hi Barry, >>>>>>> >>>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>>> >>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>> >>>>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>>>> >>>>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>>>> >>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>> -------------------------------------------------------------------------- >>>>>>> An MPI process has executed an operation involving a call to the >>>>>>> "fork()" system call to create a child process. Open MPI is currently >>>>>>> operating in a condition that could result in memory corruption or >>>>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>>>> data corruption. The use of fork() (or system() or other calls that >>>>>>> create child processes) is strongly discouraged. >>>>>>> >>>>>>> The process that invoked fork was: >>>>>>> >>>>>>> Local host: n12-76 (PID 20235) >>>>>>> MPI_COMM_WORLD rank: 2 >>>>>>> >>>>>>> If you are *absolutely sure* that your application will successfully >>>>>>> and correctly survive a call to fork(), you may disable this warning >>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>>>> -------------------------------------------------------------------------- >>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>>>> >>>>>>> .... >>>>>>> >>>>>>> 1 >>>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [1]PETSC ERROR: or see >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>> [1]PETSC ERROR: to get more information on the crash. 
>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [3]PETSC ERROR: or see >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>> >>>>>>> ... >>>>>>> Thank you. >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>>> >>>>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>>>> >>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>>> >>>>>>> >>>>>>> >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> -- >>>>>>> Thank you. >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener From altriaex86 at gmail.com Tue May 13 00:02:08 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 13 May 2014 15:02:08 +1000 Subject: [petsc-users] Configured with superlu but cannot find a package Message-ID: Hi, there >From the error message below I am sure I configured PETSc with superLU and superLU-DIST. However, it told me there's no such package. Or, is mpiaij not compatible with superlu? According to the manual, I think it should be compatible with it. Thanks a lot. [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Matrix format mpiaij does not have a solver package superlu for LU. Perhaps you must ./configure with --download-superlu! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named altria-Aspire-5830TG by root Tue May 13 14:53:33 2014 [0]PETSC ERROR: Libraries linked from /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib [0]PETSC ERROR: Configure run at Tue May 13 14:43:13 2014 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --with-cxx=g++ --download-mpich --download-scalapack --download-metis --download-parmetis --download-mumps --download-PASTIX --download-superLU --download-superLU-dist --with-scalar-type=complex --with-clanguage=cxx My code input(Ap,Ai,Ax,Az,size,nz); //Process input MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); EPSCreate( PETSC_COMM_WORLD, &eps ); //Setup Solver EPSSetOperators(eps,A,NULL); EPSSetProblemType(eps,EPS_NHEP); EPSSetDimensions(eps,1,6,0); EPSSetType(eps,type); EPSSetTarget(eps,offset); EPSSetWhichEigenpairs(eps,EPS_TARGET_REAL); //Set Target //EPSSetExtraction(eps,EPS_HARMONIC); EPSGetST(eps,&st); //shift-and-invert STSetType(st,STSINVERT); STSetShift(st,offset); STGetKSP(st,&ksp); KSPSetType(ksp,KSPPREONLY); KSPGetPC(ksp,&pc); PCSetType(pc,PCLU); PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); EPSSolve(eps); EPSGetConverged(eps,&nconv); Function input MatCreate(PETSC_COMM_WORLD,&A); MatSetType(A,MATMPIAIJ); MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); Guoxi -------------- next part -------------- An HTML attachment was scrubbed... 
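Two things go wrong in the configure/solve combination above, and both come up in the replies that follow: the configure option names are misspelled (Satish's point), and SuperLU itself is sequential, so an MPIAIJ matrix needs SuperLU_DIST (Hong's point). A hedged sketch of the corresponding one-line change in the posted code, assuming PETSc was configured with --download-superlu_dist; error checking is omitted to mirror the post:

/* Select the parallel SuperLU_DIST LU factorization for the MPIAIJ matrix. */
KSPSetType(ksp, KSPPREONLY);
KSPGetPC(ksp, &pc);
PCSetType(pc, PCLU);
PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);   /* instead of MATSOLVERSUPERLU */

The same choice can be made from the command line with -st_pc_factor_mat_solver_package superlu_dist, since this KSP lives inside the SLEPc spectral transformation.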
URL: From balay at mcs.anl.gov Tue May 13 00:10:35 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 May 2014 00:10:35 -0500 Subject: [petsc-users] Configured with superlu but cannot find a package In-Reply-To: References: Message-ID: Hm - none of these options should have any 'capitalized' letters [and superlu_dist has an '_' - not a '-' --download-PASTIX --download-superLU--download-superLU-dist They should be: --download-pastix --download-superlu --download-superlu_dist Satish On Tue, 13 May 2014, ??? wrote: > Hi, there > > From the error message below I am sure I configured PETSc with superLU and > superLU-DIST. However, it told me there's no such package. > Or, is mpiaij not compatible with superlu? According to the manual, I think > it should be compatible with it. Thanks a lot. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package superlu > for LU. Perhaps you must ./configure with --download-superlu! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named > altria-Aspire-5830TG by root Tue May 13 14:53:33 2014 > [0]PETSC ERROR: Libraries linked from > /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Tue May 13 14:43:13 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --with-cxx=g++ --download-mpich --download-scalapack --download-metis > --download-parmetis --download-mumps --download-PASTIX --download-superLU > --download-superLU-dist --with-scalar-type=complex --with-clanguage=cxx > > My code > > input(Ap,Ai,Ax,Az,size,nz); //Process input > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); > EPSCreate( PETSC_COMM_WORLD, &eps ); //Setup Solver > EPSSetOperators(eps,A,NULL); > EPSSetProblemType(eps,EPS_NHEP); > EPSSetDimensions(eps,1,6,0); > EPSSetType(eps,type); > EPSSetTarget(eps,offset); > EPSSetWhichEigenpairs(eps,EPS_TARGET_REAL); //Set Target > > > //EPSSetExtraction(eps,EPS_HARMONIC); > EPSGetST(eps,&st); //shift-and-invert > STSetType(st,STSINVERT); > STSetShift(st,offset); > STGetKSP(st,&ksp); > KSPSetType(ksp,KSPPREONLY); > KSPGetPC(ksp,&pc); > PCSetType(pc,PCLU); > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); > EPSSolve(eps); > EPSGetConverged(eps,&nconv); > > > Function input > > MatCreate(PETSC_COMM_WORLD,&A); > MatSetType(A,MATMPIAIJ); > MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); > MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); > MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); > > > > Guoxi > From altriaex86 at gmail.com Tue May 13 01:07:15 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 13 May 2014 16:07:15 +1000 Subject: [petsc-users] Configured with superlu but cannot find a package In-Reply-To: References: Message-ID: Oh, I fixed the spelling and reconfigured it but I got error again. 
[0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Matrix format mpiaij does not have a solver package superlu for LU. Perhaps you must ./configure with --download-superlu! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named altria-Aspire-5830TG by root Tue May 13 16:06:00 2014 [0]PETSC ERROR: Libraries linked from /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib [0]PETSC ERROR: Configure run at Tue May 13 15:58:29 2014 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --with-cxx=g++ --download-mpich --download-scalapack --download-metis --download-parmetis --download-mumps --download-superlu --download-superlu_dist --with-scalar-type=complex --with-clanguage=cxx 2014-05-13 15:10 GMT+10:00 Satish Balay : > Hm - none of these options should have any 'capitalized' letters > > [and superlu_dist has an '_' - not a '-' > > --download-PASTIX --download-superLU--download-superLU-dist > > They should be: > > --download-pastix --download-superlu --download-superlu_dist > > Satish > > On Tue, 13 May 2014, ??? wrote: > > > Hi, there > > > > From the error message below I am sure I configured PETSc with superLU > and > > superLU-DIST. However, it told me there's no such package. > > Or, is mpiaij not compatible with superlu? According to the manual, I > think > > it should be compatible with it. Thanks a lot. > > > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > [0]PETSC ERROR: No support for this operation for this object type! > > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package > superlu > > for LU. Perhaps you must ./configure with --download-superlu! > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named > > altria-Aspire-5830TG by root Tue May 13 14:53:33 2014 > > [0]PETSC ERROR: Libraries linked from > > /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib > > [0]PETSC ERROR: Configure run at Tue May 13 14:43:13 2014 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > > --with-cxx=g++ --download-mpich --download-scalapack --download-metis > > --download-parmetis --download-mumps --download-PASTIX --download-superLU > > --download-superLU-dist --with-scalar-type=complex --with-clanguage=cxx > > > > My code > > > > input(Ap,Ai,Ax,Az,size,nz); //Process input > > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); > > EPSCreate( PETSC_COMM_WORLD, &eps ); //Setup Solver > > EPSSetOperators(eps,A,NULL); > > EPSSetProblemType(eps,EPS_NHEP); > > EPSSetDimensions(eps,1,6,0); > > EPSSetType(eps,type); > > EPSSetTarget(eps,offset); > > EPSSetWhichEigenpairs(eps,EPS_TARGET_REAL); //Set Target > > > > > > //EPSSetExtraction(eps,EPS_HARMONIC); > > EPSGetST(eps,&st); //shift-and-invert > > STSetType(st,STSINVERT); > > STSetShift(st,offset); > > STGetKSP(st,&ksp); > > KSPSetType(ksp,KSPPREONLY); > > KSPGetPC(ksp,&pc); > > PCSetType(pc,PCLU); > > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); > > EPSSolve(eps); > > EPSGetConverged(eps,&nconv); > > > > > > Function input > > > > MatCreate(PETSC_COMM_WORLD,&A); > > MatSetType(A,MATMPIAIJ); > > MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); > > MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); > > MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); > > MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); > > > > > > > > Guoxi > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Tue May 13 02:28:08 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 13 May 2014 17:28:08 +1000 Subject: [petsc-users] Get wrong answer when use multi-process Message-ID: Hi, all I am confused about my code, for it could return right answer when I use 1 process, but return totally wrong answer when more than 1 process. This is how I feed data to it. I have a CSR matrix, represented by Ap(pointer),Ai(index),and temp(data). First determine local matrix for each process. Then feed data to them. int temprank,localsize,line_pos; line_pos = 0; if(rank == 0) { localsize = size/pro + ((size % pro) > rank); } else { for (temprank = 0;temprank temprank); line_pos += localsize; } } Lin_index = new int [localsize+1]; for(i=0;i From romain.veltz at inria.fr Tue May 13 03:22:34 2014 From: romain.veltz at inria.fr (Veltz Romain) Date: Tue, 13 May 2014 10:22:34 +0200 Subject: [petsc-users] Continuation Message-ID: <96748DAF-78E0-4068-AF66-D70E91D2F6A1@inria.fr> Dear Petsc users, I would like to perform numerical continuation with Petsc but it lacks this functionality. Hence, I am wondering if anybody has a class for Moore-Penrose continuation or Pseudo-arclength continuation done in Petsc before I start doing it myself? Thank you for your help, Veltz Romain -------------- next part -------------- An HTML attachment was scrubbed... 
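Returning to the "wrong answer when use multi-process" message above: one common cause of that symptom, when a matrix is filled from a hand-split CSR with MatMPIAIJSetPreallocationCSR(), is that the hand-computed row partition does not match the partition PETSc chooses for PETSC_DECIDE. The sketch below shows two ways to keep them consistent; it is only a guess at the problem (the thread does not confirm it), the names local_Aj and local_vals for the per-rank column-index and value arrays are made up, localsize, size and Lin_index are from the post, and error checking is omitted.

/* Option 1: tell PETSc the local row count your own split produced. */
MatCreate(PETSC_COMM_WORLD, &A);
MatSetSizes(A, localsize, localsize, PETSC_DETERMINE, PETSC_DETERMINE);
MatSetType(A, MATMPIAIJ);
MatMPIAIJSetPreallocationCSR(A, Lin_index, local_Aj, local_vals);  /* local row pointers, global column indices */

/* Option 2: compute the split PETSc itself would choose, then build the local CSR to match it. */
PetscInt nlocal = PETSC_DECIDE, nglobal = size;
PetscSplitOwnership(PETSC_COMM_WORLD, &nlocal, &nglobal);   /* nlocal = rows this rank will own */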
URL: From altriaex86 at gmail.com Tue May 13 07:33:47 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 13 May 2014 22:33:47 +1000 Subject: [petsc-users] Cannot open graphic monitor Message-ID: Hi, I tried to open the graphic monitor by char common_options[] = "-st_ksp_type preonly \ -st_pc_type lu \ -st_pc_factor_mat_solver_package mumps \ -eps_tol 1e-9 \ -eps_monitor_lg_all \ -draw_pause .2"; ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); Then ./program But nothing comes out. Should I install any other package first to get it? Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue May 13 09:16:04 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Tue, 13 May 2014 09:16:04 -0500 Subject: [petsc-users] Configured with superlu but cannot find a package In-Reply-To: <3bc6d9c38c644588b1ca5e0e7bb04721@LUCKMAN.anl.gov> References: <3bc6d9c38c644588b1ca5e0e7bb04721@LUCKMAN.anl.gov> Message-ID: ?? : > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package superlu ^^^^^^ ^^^^^^^^ Superlu is a sequential package. For parallel, you must use superlu_dist. Suggest install both Superlu and superlu_dist ( --download-superlu_dist). Hong > for LU. Perhaps you must ./configure with --download-superlu! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named > altria-Aspire-5830TG by root Tue May 13 16:06:00 2014 > > [0]PETSC ERROR: Libraries linked from > /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Tue May 13 15:58:29 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --with-cxx=g++ --download-mpich --download-scalapack --download-metis > --download-parmetis --download-mumps --download-superlu > --download-superlu_dist --with-scalar-type=complex --with-clanguage=cxx > > > > 2014-05-13 15:10 GMT+10:00 Satish Balay : > >> Hm - none of these options should have any 'capitalized' letters >> >> [and superlu_dist has an '_' - not a '-' >> >> --download-PASTIX --download-superLU--download-superLU-dist >> >> They should be: >> >> --download-pastix --download-superlu --download-superlu_dist >> >> Satish >> >> On Tue, 13 May 2014, ??? wrote: >> >> > Hi, there >> > >> > From the error message below I am sure I configured PETSc with superLU >> > and >> > superLU-DIST. However, it told me there's no such package. >> > Or, is mpiaij not compatible with superlu? According to the manual, I >> > think >> > it should be compatible with it. Thanks a lot. >> > >> > [0]PETSC ERROR: --------------------- Error Message >> > ------------------------------------ >> > [0]PETSC ERROR: No support for this operation for this object type! >> > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package >> > superlu >> > for LU. Perhaps you must ./configure with --download-superlu! 
>> > [0]PETSC ERROR: >> > ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 >> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> > [0]PETSC ERROR: See docs/index.html for manual pages. >> > [0]PETSC ERROR: >> > ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named >> > altria-Aspire-5830TG by root Tue May 13 14:53:33 2014 >> > [0]PETSC ERROR: Libraries linked from >> > /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib >> > [0]PETSC ERROR: Configure run at Tue May 13 14:43:13 2014 >> > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran >> > --with-cxx=g++ --download-mpich --download-scalapack --download-metis >> > --download-parmetis --download-mumps --download-PASTIX >> > --download-superLU >> > --download-superLU-dist --with-scalar-type=complex --with-clanguage=cxx >> > >> > My code >> > >> > input(Ap,Ai,Ax,Az,size,nz); //Process input >> > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); >> > EPSCreate( PETSC_COMM_WORLD, &eps ); //Setup Solver >> > EPSSetOperators(eps,A,NULL); >> > EPSSetProblemType(eps,EPS_NHEP); >> > EPSSetDimensions(eps,1,6,0); >> > EPSSetType(eps,type); >> > EPSSetTarget(eps,offset); >> > EPSSetWhichEigenpairs(eps,EPS_TARGET_REAL); //Set Target >> > >> > >> > //EPSSetExtraction(eps,EPS_HARMONIC); >> > EPSGetST(eps,&st); //shift-and-invert >> > STSetType(st,STSINVERT); >> > STSetShift(st,offset); >> > STGetKSP(st,&ksp); >> > KSPSetType(ksp,KSPPREONLY); >> > KSPGetPC(ksp,&pc); >> > PCSetType(pc,PCLU); >> > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); >> > EPSSolve(eps); >> > EPSGetConverged(eps,&nconv); >> > >> > >> > Function input >> > >> > MatCreate(PETSC_COMM_WORLD,&A); >> > MatSetType(A,MATMPIAIJ); >> > MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); >> > MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); >> > MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); >> > MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); >> > >> > >> > >> > Guoxi >> > > > From bsmith at mcs.anl.gov Tue May 13 11:03:21 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 11:03:21 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <537188D8.2030307@gmail.com> References: <534C9A2C.5060404@gmail.com> <534C9DB5.9070407@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> Message-ID: Please send you current code. So we may compile and run it. Barry On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: > Hi, > > I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. > > Thank you. > > Yours sincerely, > > TAY wee-beng > > On 21/4/2014 8:58 AM, Barry Smith wrote: >> Please send the entire code. 
If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >> >> Barry >> >> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >> >>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>>>> Hmm, >>>>>>>> >>>>>>>> Interface DMDAVecGetArrayF90 >>>>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>>>> USE_DM_HIDE >>>>>>>> DM_HIDE da1 >>>>>>>> VEC_HIDE v >>>>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>>>> PetscErrorCode ierr >>>>>>>> End Subroutine >>>>>>>> >>>>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>> Hi, >>>>>>>> >>>>>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>>>>> >>>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>>>>> >>>>>>>> Also, supposed I call: >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> u_array .... >>>>>>>> >>>>>>>> v_array .... etc >>>>>>>> >>>>>>>> Now to restore the array, does it matter the sequence they are restored? >>>>>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>>>>> >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>>>> >>>>>>>> u_array = 0.d0 >>>>>>>> >>>>>>>> v_array = 0.d0 >>>>>>>> >>>>>>>> w_array = 0.d0 >>>>>>>> >>>>>>>> p_array = 0.d0 >>>>>>>> >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>>>> >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. 
But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>>>>> >>>>>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>>> Hi Matt, >>>>>>> >>>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>>> >>>>>>> It already has DMDAVecGetArray(). Just run it. >>>>>> Hi, >>>>>> >>>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>>> >>>>>> No the global/local difference should not matter. >>>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>>> >>>>>> DMGetLocalVector() >>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>>> >>>>> If so, when should I call them? >>>>> >>>>> You just need a local vector from somewhere. >>> Hi, >>> >>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>> >>> Thanks. >>>> Hi, >>>> >>>> I insert part of my error region code into ex11f90: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> v_array = 0.d0 >>>> w_array = 0.d0 >>>> p_array = 0.d0 >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>> >>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>> >>>> module solve >>>> <- add include file? >>>> subroutine RRK >>>> <- add include file? >>>> end subroutine RRK >>>> >>>> end module solve >>>> >>>> So where should the include files (#include ) be placed? >>>> >>>> After the module or inside the subroutine? >>>> >>>> Thanks. >>>>> Matt >>>>> Thanks. >>>>>> Matt >>>>>> Thanks. >>>>>>> Matt >>>>>>> Thanks >>>>>>> >>>>>>> Regards. >>>>>>>> Matt >>>>>>>> As in w, then v and u? >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> thanks >>>>>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>>>>> Hi, >>>>>>>> >>>>>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>>>> >>>>>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>>>>> Not really. 
It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>>>>> >>>>>>>> >>>>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> Thanks. >>>>>>>> Barry >>>>>>>> >>>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>>>>> >>>>>>>> However, by re-writing my code, I found out a few things: >>>>>>>> >>>>>>>> 1. if I write my code this way: >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> u_array = .... >>>>>>>> >>>>>>>> v_array = .... >>>>>>>> >>>>>>>> w_array = .... >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> The code runs fine. >>>>>>>> >>>>>>>> 2. if I write my code this way: >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>>>> >>>>>>>> where the subroutine is: >>>>>>>> >>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>> >>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>> >>>>>>>> u ... >>>>>>>> v... >>>>>>>> w ... >>>>>>>> >>>>>>>> end subroutine uvw_array_change. >>>>>>>> >>>>>>>> The above will give an error at : >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> So they are now in reversed order. Now it works. >>>>>>>> >>>>>>>> 4. 
Same as 2 or 3, except the subroutine is changed to : >>>>>>>> >>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>> >>>>>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>> >>>>>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>> >>>>>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>> >>>>>>>> u ... >>>>>>>> v... >>>>>>>> w ... >>>>>>>> >>>>>>>> end subroutine uvw_array_change. >>>>>>>> >>>>>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>>>>>> >>>>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>>>>> >>>>>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>>>>> >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>>> >>>>>>>> >>>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> Hi Barry, >>>>>>>> >>>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>>>>> >>>>>>>> I have attached my code. >>>>>>>> >>>>>>>> Thank you >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>>>> Please send the code that creates da_w and the declarations of w_array >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> Hi Barry, >>>>>>>> >>>>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>>>> >>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>> >>>>>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>>>>> >>>>>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>>>>> >>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>> -------------------------------------------------------------------------- >>>>>>>> An MPI process has executed an operation involving a call to the >>>>>>>> "fork()" system call to create a child process. Open MPI is currently >>>>>>>> operating in a condition that could result in memory corruption or >>>>>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>>>>> data corruption. The use of fork() (or system() or other calls that >>>>>>>> create child processes) is strongly discouraged. >>>>>>>> >>>>>>>> The process that invoked fork was: >>>>>>>> >>>>>>>> Local host: n12-76 (PID 20235) >>>>>>>> MPI_COMM_WORLD rank: 2 >>>>>>>> >>>>>>>> If you are *absolutely sure* that your application will successfully >>>>>>>> and correctly survive a call to fork(), you may disable this warning >>>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. 
>>>>>>>> -------------------------------------------------------------------------- >>>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>>>>> >>>>>>>> .... >>>>>>>> >>>>>>>> 1 >>>>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>>> [1]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>>> [1]PETSC ERROR: to get more information on the crash. >>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>>> [3]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>>> >>>>>>>> ... >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>>>> >>>>>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>>>>> >>>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. 
>>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> -- >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener > From talebi.hossein at gmail.com Tue May 13 11:07:58 2014 From: talebi.hossein at gmail.com (Hossein Talebi) Date: Tue, 13 May 2014 18:07:58 +0200 Subject: [petsc-users] PetscLayoutCreate for Fortran Message-ID: Hi All, I am using PETSC from Fortran. I would like to define my own layout i.e. which row belongs to which CPU since I have already done the domain decomposition. It appears that "PetscLayoutCreate" and the other routine do this. But in the manual it says it is not provided in Fortran. Is there any way that I can do this using Fortran? Anyone has an example? Cheers Hossein -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 13 11:36:38 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 May 2014 11:36:38 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi wrote: > Hi All, > > > I am using PETSC from Fortran. I would like to define my own layout i.e. > which row belongs to which CPU since I have already done the domain > decomposition. It appears that "PetscLayoutCreate" and the other > routine do this. But in the manual it says it is not provided in Fortran. > > Is there any way that I can do this using Fortran? Anyone has an example? > You can do this for Vec and Mat directly. Do you want it for something else? 
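For the common case this just means giving PETSc the local row count when the Vec and Mat are created. A minimal Fortran sketch (declarations and includes omitted; nlocal_rows, d_nz and o_nz are placeholders for whatever your Metis-based partition and preallocation provide):

    ! each rank passes the number of rows it is to own
    call VecCreateMPI(PETSC_COMM_WORLD,nlocal_rows,PETSC_DETERMINE,x,ierr)
    call MatCreateAIJ(PETSC_COMM_WORLD,nlocal_rows,nlocal_rows, &
                      PETSC_DETERMINE,PETSC_DETERMINE, &
                      d_nz,PETSC_NULL_INTEGER,o_nz,PETSC_NULL_INTEGER,A,ierr)
    ! PETSc then assigns this rank the contiguous global rows rstart..rend-1
    call MatGetOwnershipRange(A,rstart,rend,ierr)

Note that each rank always owns a contiguous block of the global numbering, so the application's own numbering has to be permuted to match, or the data redistributed with a VecScatter, as discussed further down in this thread.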
Thanks, Matt > Cheers > Hossein > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From talebi.hossein at gmail.com Tue May 13 11:42:51 2014 From: talebi.hossein at gmail.com (Hossein Talebi) Date: Tue, 13 May 2014 18:42:51 +0200 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: I have already decomposed the Finite Element system using Metis. I just need to have the global rows exactly like how I define and I like to have the answer in the same layout so I don't have to move things around the processes again. No, I don't need it for something else. Cheers Hossein On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: > On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi > wrote: > >> Hi All, >> >> >> I am using PETSC from Fortran. I would like to define my own layout i.e. >> which row belongs to which CPU since I have already done the domain >> decomposition. It appears that "PetscLayoutCreate" and the other >> routine do this. But in the manual it says it is not provided in Fortran. >> >> Is there any way that I can do this using Fortran? Anyone has an example? >> > > You can do this for Vec and Mat directly. Do you want it for something > else? > > Thanks, > > Matt > > >> Cheers >> Hossein >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- www.permix.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 13 11:45:23 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 May 2014 11:45:23 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: On Tue, May 13, 2014 at 11:42 AM, Hossein Talebi wrote: > > I have already decomposed the Finite Element system using Metis. I just > need to have the global rows exactly like how I define and I like to have > the answer in the same layout so I don't have to move things around the > processes again. > > No, I don't need it for something else. > PetscLayout is only for contiguous sets of indices. If you want to distribute them, you need to use VecScatter. Thanks, Matt > Cheers > Hossein > > > > > On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: > >> On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi < >> talebi.hossein at gmail.com> wrote: >> >>> Hi All, >>> >>> >>> I am using PETSC from Fortran. I would like to define my own layout i.e. >>> which row belongs to which CPU since I have already done the domain >>> decomposition. It appears that "PetscLayoutCreate" and the other >>> routine do this. But in the manual it says it is not provided in Fortran. >>> >>> Is there any way that I can do this using Fortran? Anyone has an example? >>> >> >> You can do this for Vec and Mat directly. Do you want it for something >> else? >> >> Thanks, >> >> Matt >> >> >>> Cheers >>> Hossein >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> > > > > -- > www.permix.org > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Vincent.De-Groof at uibk.ac.at Tue May 13 12:13:29 2014 From: Vincent.De-Groof at uibk.ac.at (De Groof, Vincent Frans Maria) Date: Tue, 13 May 2014 17:13:29 +0000 Subject: [petsc-users] Memory usage during matrix factorization Message-ID: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> Hi, I'm investigating the performance of a few different direct solvers and I'd like to compare the memory requirements of the different solvers and orderings. I am especially interested in the memory usage necessary to store the factored matrix. I experimented with the PetscMemoryGetCurrentUsage and PetscMemoryGetMaximumUsage before and after KSPSolve. But these seem to return the memory usage on 1 process and not the total memory usage. Is this correct? I also noticed that the difference in maximum memory usage is very small before and after KSPSolve. Does it register the memory usage in external packages? thanks, Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From lu_qin_2000 at yahoo.com Tue May 13 12:17:26 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Tue, 13 May 2014 10:17:26 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> Message-ID: <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> I?tried to use command line options as the example suggested ('-ksp_type preonly -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8') without changing my source code, but then the call to KSPSetUp returned error number 56. ? Does this mean I still need to change the source code (such as adding calls to PCFactorSetMatSolverPackage, PCFactorGetMatrix, etc.)in addition to the command line options? ? I ask this since?the use of SuperLU seems to be different from using Hypre, which can?be invoked with command line options?without changing source code. ? Thanks a lot, Qin? ----- Original Message ----- From: Barry Smith To: Qin Lu Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" Sent: Monday, May 12, 2014 5:11 PM Subject: Re: [petsc-users] ILUTP in PETSc ? See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html On May 12, 2014, at 4:54 PM, Qin Lu wrote: > Hello, > > I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) >? > Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work?? >? > Many thanks, > Qin? > > >? From: Xiaoye S. Li > To: Barry Smith > Cc: Qin Lu ; "petsc-users at mcs.anl.gov" > Sent: Friday, May 2, 2014 3:40 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > > > The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily.? > > In SuperLU distribution: > >? 
EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) > >? SRC/zgsitrf.c : the actual ILUTP factorization routine > > > Sherry Li > > > > On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: > > >> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre >> >> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid >> >> you can also add -help to see what options are available. >> >>? Both pretty much suck and I can?t image much reason for using them. >> >>? ? Barry >> >> >> >> On May 2, 2014, at 10:27 AM, Qin Lu wrote: >> >>> Hello, >>> >>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthat mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >>> >>> Many thanks, >>> Qin >> >>? ? ? From talebi.hossein at gmail.com Tue May 13 12:47:49 2014 From: talebi.hossein at gmail.com (Hossein Talebi) Date: Tue, 13 May 2014 19:47:49 +0200 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: Thank you. If I understand correctly, before inserting the values into the Mat and Vec, I should call VecScatter as in the ''ex30f.F" example to set Vec and Mat with the new indexes, right? On Tue, May 13, 2014 at 6:45 PM, Matthew Knepley wrote: > On Tue, May 13, 2014 at 11:42 AM, Hossein Talebi > wrote: > >> >> I have already decomposed the Finite Element system using Metis. I just >> need to have the global rows exactly like how I define and I like to have >> the answer in the same layout so I don't have to move things around the >> processes again. >> >> No, I don't need it for something else. >> > > PetscLayout is only for contiguous sets of indices. If you want to > distribute them, you need to use VecScatter. > > Thanks, > > Matt > > >> Cheers >> Hossein >> >> >> >> >> On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: >> >>> On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi < >>> talebi.hossein at gmail.com> wrote: >>> >>>> Hi All, >>>> >>>> >>>> I am using PETSC from Fortran. I would like to define my own layout >>>> i.e. which row belongs to which CPU since I have already done the domain >>>> decomposition. It appears that "PetscLayoutCreate" and the other >>>> routine do this. But in the manual it says it is not provided in Fortran. >>>> >>>> Is there any way that I can do this using Fortran? Anyone has an >>>> example? >>>> >>> >>> You can do this for Vec and Mat directly. Do you want it for something >>> else? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Cheers >>>> Hossein >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> www.permix.org >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- www.permix.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From atmmachado at gmail.com Tue May 13 12:54:15 2014 From: atmmachado at gmail.com (=?UTF-8?B?QW5kcsOpIFRpbcOzdGhlbw==?=) Date: Tue, 13 May 2014 14:54:15 -0300 Subject: [petsc-users] help: petsc-dev + petsc4py acessing the tao optimizations solvers ? Message-ID: I read about the merger of the TAO solvers on the PETSC-DEV. How can I use the TAO's constrainded optimization solver on the PETSC-DEV (via petsc4py)? Can you show me some simple python script to deal with classical linear constrainded optimization problems like: minimize sum(x) subject to x >= 0 and Ax = b Thanks for your time. Andre -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 13 13:11:58 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 May 2014 13:11:58 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: On Tue, May 13, 2014 at 12:47 PM, Hossein Talebi wrote: > Thank you. > > If I understand correctly, before inserting the values into the Mat and > Vec, I should call VecScatter as in the ''ex30f.F" example to set Vec and > Mat with the new indexes, right? > VecScatter is a way to send information among processes, so if you need to reorganize your information before inserting into the Vec, then yes you would use it. Thanks, Matt > > On Tue, May 13, 2014 at 6:45 PM, Matthew Knepley wrote: > >> On Tue, May 13, 2014 at 11:42 AM, Hossein Talebi < >> talebi.hossein at gmail.com> wrote: >> >>> >>> I have already decomposed the Finite Element system using Metis. I just >>> need to have the global rows exactly like how I define and I like to have >>> the answer in the same layout so I don't have to move things around the >>> processes again. >>> >>> No, I don't need it for something else. >>> >> >> PetscLayout is only for contiguous sets of indices. If you want to >> distribute them, you need to use VecScatter. >> >> Thanks, >> >> Matt >> >> >>> Cheers >>> Hossein >>> >>> >>> >>> >>> On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: >>> >>>> On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi < >>>> talebi.hossein at gmail.com> wrote: >>>> >>>>> Hi All, >>>>> >>>>> >>>>> I am using PETSC from Fortran. I would like to define my own layout >>>>> i.e. which row belongs to which CPU since I have already done the domain >>>>> decomposition. It appears that "PetscLayoutCreate" and the other >>>>> routine do this. But in the manual it says it is not provided in Fortran. >>>>> >>>>> Is there any way that I can do this using Fortran? Anyone has an >>>>> example? >>>>> >>>> >>>> You can do this for Vec and Mat directly. Do you want it for something >>>> else? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Cheers >>>>> Hossein >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> www.permix.org >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > www.permix.org > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Tue May 13 12:45:12 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 13 May 2014 11:45:12 -0600 Subject: [petsc-users] Get wrong answer when use multi-process In-Reply-To: References: Message-ID: <87bnv1a7yv.fsf@jedbrown.org> ??? writes: > Hi, all > > I am confused about my code, for it could return right answer when I use 1 > process, but return totally wrong answer when more than 1 process. > > This is how I feed data to it. > > I have a CSR matrix, represented by Ap(pointer),Ai(index),and temp(data). This matrix is stored redundantly on each process? You should run with valgrind and confirm that you assemble the same matrix in parallel before worrying about solvers. > First determine local matrix for each process. Then feed data to them. > > int temprank,localsize,line_pos; > line_pos = 0; > if(rank == 0) > { > localsize = size/pro + ((size % pro) > rank); > } > else > { > for (temprank = 0;temprank { > localsize = size/pro + ((size % pro) > temprank); > line_pos += localsize; > } > } > > Lin_index = new int [localsize+1]; > for(i=0;i { > Lin_index [i] = Ap[line_pos+i]-Ap[line_pos]; > } > std::cerr<<"line_pos "< MatMPIAIJSetPreallocationCSR(A,Lin_index,Ai+line_pos,temp+line_pos); > > I use spectral transform with MATSOLVERMUMPS to calculate eigenvalue. > > > The strange thing is, when I run it with one process, the eigenvalue is > what I want, typically, > (8.39485e+13,5.3263) (3.93842e+13,-82.6948) first two. > But for 2 process: > eigenvalue (2.76523e+13,7.62222e+12) > eigenvalue (2.76523e+13,-7.62222e+12) > > 3 process: > eigenvalue (6.81292e+13,-3071.82) > eigenvalue (3.49533e+13,2.48858e+13) > > 4 > eigenvalue (9.7562e+13,5012.4) > eigenvalue (7.2019e+13,8.28561e+13) > > However, it could pass simple test like > int n = 12; > int nz = 12; > int Ap[13] = {0,1,2,3,4,5,6,7,8,9,10,11,12}; > int Ai[12] = { 0,1,2,3,4,5,6,7,8,9,10,11}; > double Ax[12] = {-1,-2,-3,-4,-5,6,7,8,9,10,11}; > double Az[12] = {0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0}; > > > Do you have any idea about it? > > Thanks a lot!! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jaolive at MIT.EDU Tue May 13 13:20:07 2014 From: jaolive at MIT.EDU (Jean-Arthur Louis Olive) Date: Tue, 13 May 2014 18:20:07 +0000 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve References: <53725A86.3070804@uidaho.edu> Message-ID: <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> Hi all, we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
RESIDUAL 1 (NO COUPLING): for (j=info->ys; jys+info->ym; j++) { for (i=info->xs; ixs+info->xm; i++) { f[j][i].P = x[j][i].P - 3000000; f[j][i].vx= 2*x[j][i].vx; f[j][i].vy= 3*x[j][i].vy - 2; f[j][i].T = x[j][i].T; } RESIDUAL 2 (ONE COUPLING TERM): for (j=info->ys; jys+info->ym; j++) { for (i=info->xs; ixs+info->xm; i++) { f[j][i].P = x[j][i].P - 3; f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; f[j][i].vy= x[j][i].vy - 2; f[j][i].T = x[j][i].T; } } and our default set of options is: OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: Result from Solve - RESIDUAL 1 0 SNES Function norm 8.485281374240e+07 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 1 SNES Function norm 1.131370849896e+02 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 2 SNES Function norm 1.131370849896e+02 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 With the coupled residual (Residual 2), the norms do not match, see below Result from Solve - RESIDUAL 2: 0 SNES Function norm 1.019803902719e+02 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 1 SNES Function norm 1.697056274848e+02 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 2 SNES Function norm 3.236770473841e-07 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. Result from Solve with -snes_fd - RESIDUAL 2 0 SNES Function norm 8.485281374240e+07 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 1 SNES Function norm 2.039607805429e+02 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 3 SNES Function norm 2.549509757105e+01 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? Thanks a lot, Arthur and Eric -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1855 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue May 13 13:20:23 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 13:20:23 -0500 Subject: [petsc-users] Get wrong answer when use multi-process In-Reply-To: References: Message-ID: 1) make sure the matrix is the same with 1 and 2 processes. Once the matrix is built you can call MatView(mat,NULL) to display it. 2) once the matrices are the same make sure the eigensolver converges in both cases. Barry On May 13, 2014, at 2:28 AM, ??? wrote: > Hi, all > > I am confused about my code, for it could return right answer when I use 1 process, but return totally wrong answer when more than 1 process. > > This is how I feed data to it. > > I have a CSR matrix, represented by Ap(pointer),Ai(index),and temp(data). > > First determine local matrix for each process. Then feed data to them. > > int temprank,localsize,line_pos; > line_pos = 0; > if(rank == 0) > { > localsize = size/pro + ((size % pro) > rank); > } > else > { > for (temprank = 0;temprank { > localsize = size/pro + ((size % pro) > temprank); > line_pos += localsize; > } > } > > Lin_index = new int [localsize+1]; > for(i=0;i { > Lin_index [i] = Ap[line_pos+i]-Ap[line_pos]; > } > std::cerr<<"line_pos "< MatMPIAIJSetPreallocationCSR(A,Lin_index,Ai+line_pos,temp+line_pos); > > I use spectral transform with MATSOLVERMUMPS to calculate eigenvalue. > > > The strange thing is, when I run it with one process, the eigenvalue is what I want, typically, > (8.39485e+13,5.3263) (3.93842e+13,-82.6948) first two. > But for 2 process: > eigenvalue (2.76523e+13,7.62222e+12) > eigenvalue (2.76523e+13,-7.62222e+12) > > 3 process: > eigenvalue (6.81292e+13,-3071.82) > eigenvalue (3.49533e+13,2.48858e+13) > > 4 > eigenvalue (9.7562e+13,5012.4) > eigenvalue (7.2019e+13,8.28561e+13) > > However, it could pass simple test like > int n = 12; > int nz = 12; > int Ap[13] = {0,1,2,3,4,5,6,7,8,9,10,11,12}; > int Ai[12] = { 0,1,2,3,4,5,6,7,8,9,10,11}; > double Ax[12] = {-1,-2,-3,-4,-5,6,7,8,9,10,11}; > double Az[12] = {0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0}; > > > Do you have any idea about it? > > Thanks a lot!! > > From bsmith at mcs.anl.gov Tue May 13 13:28:22 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 13:28:22 -0500 Subject: [petsc-users] Memory usage during matrix factorization In-Reply-To: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> References: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> Message-ID: These return what the operating system reports is being used by the process so it includes any external packages. It is for a single process if you want the value over a set of processes then use MP_Allreduce() to sum them up. Barry Here is the code: it is only as reliable as the OS is at reporting the values. 
#if defined(PETSC_USE_PROCFS_FOR_SIZE) sprintf(proc,"/proc/%d",(int)getpid()); if ((fd = open(proc,O_RDONLY)) == -1) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_FILE_OPEN,"Unable to access system file %s to get memory usage data",file); if (ioctl(fd,PIOCPSINFO,&prusage) == -1) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_FILE_READ,"Unable to access system file %s to get memory usage data",file); *mem = (PetscLogDouble)prusage.pr_byrssize; close(fd); #elif defined(PETSC_USE_SBREAK_FOR_SIZE) *mem = (PetscLogDouble)(8*fd - 4294967296); /* 2^32 - upper bits */ #elif defined(PETSC_USE_PROC_FOR_SIZE) && defined(PETSC_HAVE_GETPAGESIZE) sprintf(proc,"/proc/%d/statm",(int)getpid()); if (!(file = fopen(proc,"r"))) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_FILE_OPEN,"Unable to access system file %s to get memory usage data",proc); if (fscanf(file,"%d %d",&mm,&rss) != 2) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_SYS,"Failed to read two integers (mm and rss) from %s",proc); *mem = ((PetscLogDouble)rss) * ((PetscLogDouble)getpagesize()); err = fclose(file); if (err) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_SYS,"fclose() failed on file"); #elif defined(PETSC_HAVE_GETRUSAGE) getrusage(RUSAGE_SELF,&temp); #if defined(PETSC_USE_KBYTES_FOR_SIZE) *mem = 1024.0 * ((PetscLogDouble)temp.ru_maxrss); #elif defined(PETSC_USE_PAGES_FOR_SIZE) && defined(PETSC_HAVE_GETPAGESIZE) *mem = ((PetscLogDouble)getpagesize())*((PetscLogDouble)temp.ru_maxrss); #else *mem = temp.ru_maxrss; #endif On May 13, 2014, at 12:13 PM, De Groof, Vincent Frans Maria wrote: > Hi, > > > I'm investigating the performance of a few different direct solvers and I'd like to compare the memory requirements of the different solvers and orderings. I am especially interested in the memory usage necessary to store the factored matrix. > > I experimented with the PetscMemoryGetCurrentUsage and PetscMemoryGetMaximumUsage before and after KSPSolve. But these seem to return the memory usage on 1 process and not the total memory usage. Is this correct? I also noticed that the difference in maximum memory usage is very small before and after KSPSolve. Does it register the memory usage in external packages? > > > > thanks, > Vincent From jed at jedbrown.org Tue May 13 17:59:11 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 13 May 2014 16:59:11 -0600 Subject: [petsc-users] Memory usage during matrix factorization In-Reply-To: References: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> Message-ID: <87vbt98ev4.fsf@jedbrown.org> Barry Smith writes: > Here is the code: it is only as reliable as the OS is at reporting the values. HPC vendors have a habit of implementing these functions to return nonsense. Sometimes they provide non-standard functions to return useful information. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From gideon.simpson at gmail.com Tue May 13 19:16:08 2014 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 13 May 2014 20:16:08 -0400 Subject: [petsc-users] configuration on cluster with intel compilers/mkl Message-ID: I?m trying to set up petsc on an intel cluster that has the intel compilers and the MKL, but when I try to configure, I get the error, TESTING: checkLib from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:113) ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- You set a value for --with-blas-lapack-lib=, but ['/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a'] cannot be used -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 13 19:55:28 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 19:55:28 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> Message-ID: <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> What do you mean by ?''the default ?coloring? method??? If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. Barry On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: > Hi all, > we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. > > So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
> > RESIDUAL 1 (NO COUPLING): > for (j=info->ys; jys+info->ym; j++) { > for (i=info->xs; ixs+info->xm; i++) { > f[j][i].P = x[j][i].P - 3000000; > f[j][i].vx= 2*x[j][i].vx; > f[j][i].vy= 3*x[j][i].vy - 2; > f[j][i].T = x[j][i].T; > } > > RESIDUAL 2 (ONE COUPLING TERM): > for (j=info->ys; jys+info->ym; j++) { > for (i=info->xs; ixs+info->xm; i++) { > f[j][i].P = x[j][i].P - 3; > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > f[j][i].vy= x[j][i].vy - 2; > f[j][i].T = x[j][i].T; > } > } > > > and our default set of options is: > > > OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > > > With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: > > > Result from Solve - RESIDUAL 1 > 0 SNES Function norm 8.485281374240e+07 > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > 1 SNES Function norm 1.131370849896e+02 > 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > 2 SNES Function norm 1.131370849896e+02 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > > > With the coupled residual (Residual 2), the norms do not match, see below > > > Result from Solve - RESIDUAL 2: > 0 SNES Function norm 1.019803902719e+02 > 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > 1 SNES Function norm 1.697056274848e+02 > 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > 2 SNES Function norm 3.236770473841e-07 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > > > Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. > > > Result from Solve with -snes_fd - RESIDUAL 2 > 0 SNES Function norm 8.485281374240e+07 > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > 1 SNES Function norm 2.039607805429e+02 > 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > 3 SNES Function norm 2.549509757105e+01 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > > > Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? 
> > Thanks a lot, > Arthur and Eric From bsmith at mcs.anl.gov Tue May 13 19:56:27 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 19:56:27 -0500 Subject: [petsc-users] configuration on cluster with intel compilers/mkl In-Reply-To: References: Message-ID: You always need to send configure.log so we can see why the library was unacceptable. On May 13, 2014, at 7:16 PM, Gideon Simpson wrote: > I?m trying to set up petsc on an intel cluster that has the intel compilers and the MKL, but when I try to configure, I get the error, > > TESTING: checkLib from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:113) ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > ------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-lib=, but ['/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a'] cannot be used > > > > -gideon > From gideon.simpson at gmail.com Tue May 13 19:59:42 2014 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 13 May 2014 20:59:42 -0400 Subject: [petsc-users] configuration on cluster with intel compilers/mkl In-Reply-To: References: Message-ID: Log attached, -gideon On May 13, 2014, at 8:56 PM, Barry Smith wrote: > > You always need to send configure.log so we can see why the library was unacceptable. > > On May 13, 2014, at 7:16 PM, Gideon Simpson wrote: > >> I?m trying to set up petsc on an intel cluster that has the intel compilers and the MKL, but when I try to configure, I get the error, >> >> TESTING: checkLib from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:113) ******************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >> ------------------------------------------------------------------------------- >> You set a value for --with-blas-lapack-lib=, but ['/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a'] cannot be used >> >> >> >> -gideon >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2137370 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 13 20:06:39 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 20:06:39 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: <44B5B5E9-5F44-4649-A01E-003189B05AD4@mcs.anl.gov> On May 13, 2014, at 11:42 AM, Hossein Talebi wrote: > > I have already decomposed the Finite Element system using Metis. I just need to have the global rows exactly like how I define and I like to have the answer in the same layout so I don't have to move things around the processes again. Metis tells you a good partitioning IT DOES NOT MOVE the elements to form a good partitioning. 
Do you move the elements around based on what metis told you and similarly do you renumber the elements (and vertices) to be contiquously numbered on each process with the first process getting the first set of numbers, the second process the second set of numbers etc? If you do all that then when you create Vec and Mat you should simply set the local size (based on the number of local vertices on each process). You never need to use PetscLayoutCreate and in fact if your code was in C you would never use PetscLayoutCreate() If you do not do all that then you need to do that first before you start calling PETSc. Barry > > No, I don't need it for something else. > > Cheers > Hossein > > > > > On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: > On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi wrote: > Hi All, > > > I am using PETSC from Fortran. I would like to define my own layout i.e. which row belongs to which CPU since I have already done the domain decomposition. It appears that "PetscLayoutCreate" and the other routine do this. But in the manual it says it is not provided in Fortran. > > Is there any way that I can do this using Fortran? Anyone has an example? > > You can do this for Vec and Mat directly. Do you want it for something else? > > Thanks, > > Matt > > Cheers > Hossein > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > -- > www.permix.org From bsmith at mcs.anl.gov Tue May 13 20:14:16 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 20:14:16 -0500 Subject: [petsc-users] configuration on cluster with intel compilers/mkl In-Reply-To: References: Message-ID: Your MPI compiler is using 32 bit pointers (why?) TEST configureCompilerFlags from config.compilerFlags(/home/simpson/software/petsc-intel/config/BuildSystem/config/compilerFlags.py:65) TESTING: configureCompilerFlags from config.compilerFlags(config/BuildSystem/config/compilerFlags.py:65) Get the default compiler flags Pushing language C sh: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc --version Executing: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc --version sh: gcc (GCC) 4.8.1 Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. getCompilerVersion: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc gcc (GCC) 4.8.1 sh: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc -show Executing: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc -show sh: gcc -m32 -I/cm/shared/apps/intel/mpi/4.1.1.036/ia32/include -L/cm/shared/apps/intel/mpi/4.1.1.036/ia32/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /cm/shared/apps/intel/mpi/4.1.1.036/ia32/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/4.1 -lmpigf -lmpi -lmpigi -ldl -lrt -lpthread But you ask to use 64 bit pointer MKL libraries with /cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a Cannot be done. Either use a 64 bit pointer mpicc or a 32 bit pointer mkl library. Barry On May 13, 2014, at 7:59 PM, Gideon Simpson wrote: > Log attached, > -gideon > > On May 13, 2014, at 8:56 PM, Barry Smith wrote: > >> >> You always need to send configure.log so we can see why the library was unacceptable. 
>> >> On May 13, 2014, at 7:16 PM, Gideon Simpson wrote: >> >>> I?m trying to set up petsc on an intel cluster that has the intel compilers and the MKL, but when I try to configure, I get the error, >>> >>> TESTING: checkLib from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:113) ******************************************************************************* >>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >>> ------------------------------------------------------------------------------- >>> You set a value for --with-blas-lapack-lib=, but ['/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a'] cannot be used >>> >>> >>> >>> -gideon >>> >> > > From knepley at gmail.com Tue May 13 20:27:38 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 May 2014 20:27:38 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> Message-ID: On Tue, May 13, 2014 at 7:55 PM, Barry Smith wrote: > > What do you mean by ?''the default ?coloring? method??? > > If you are using DMDA and either DMGetColoring or the SNESSetDM > approach and dof is 4 then we color each of the 4 variables per grid point > with a different color so coupling between variables within a grid point is > not a problem. This would not explain the problem you are seeing below. > > Run your code with -snes_type test and read the results and follow the > directions to debug your Jacobian. I think there may actually be a bug with the coloring for unstructured grids. I am distilling it down to a nice test case. Matt > > Barry > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive > wrote: > > > Hi all, > > we are using PETSc to solve the steady state Stokes equations with > non-linear viscosities using finite difference. Recently we have realized > that our true residual norm after the last KSP solve did not match next > SNES function norm when solving the linear Stokes equations. > > > > So to understand this better, we set up two extremely simple linear > residuals, one with no coupling between variables (vx, vy, P and T), the > other with one coupling term (shown below). 
> > > > RESIDUAL 1 (NO COUPLING): > > for (j=info->ys; jys+info->ym; j++) { > > for (i=info->xs; ixs+info->xm; i++) { > > f[j][i].P = x[j][i].P - 3000000; > > f[j][i].vx= 2*x[j][i].vx; > > f[j][i].vy= 3*x[j][i].vy - 2; > > f[j][i].T = x[j][i].T; > > } > > > > RESIDUAL 2 (ONE COUPLING TERM): > > for (j=info->ys; jys+info->ym; j++) { > > for (i=info->xs; ixs+info->xm; i++) { > > f[j][i].P = x[j][i].P - 3; > > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > > f[j][i].vy= x[j][i].vy - 2; > > f[j][i].T = x[j][i].T; > > } > > } > > > > > > and our default set of options is: > > > > > > OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 > -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor > -snes_converged_reason -snes_view -log_summary -options_left 1 > -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > > > > > > With the uncoupled residual (Residual 1), we get matching KSP and SNES > norm, highlighted below: > > > > > > Result from Solve - RESIDUAL 1 > > 0 SNES Function norm 8.485281374240e+07 > > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm > 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm > 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > > 1 SNES Function norm 1.131370849896e+02 > > 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm > 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 2 SNES Function norm 1.131370849896e+02 > > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > > > > > > With the coupled residual (Residual 2), the norms do not match, see below > > > > > > Result from Solve - RESIDUAL 2: > > 0 SNES Function norm 1.019803902719e+02 > > 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm > 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm > 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > > 1 SNES Function norm 1.697056274848e+02 > > 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm > 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm > 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > > 2 SNES Function norm 3.236770473841e-07 > > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > > > > > > Lastly, if we add -snes_fd to our options, the norms for residual 2 get > better - they match after the first iteration but not after the second. 
> > > > > > Result from Solve with -snes_fd - RESIDUAL 2 > > 0 SNES Function norm 8.485281374240e+07 > > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm > 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm > 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > > 1 SNES Function norm 2.039607805429e+02 > > 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm > 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm > 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > > 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > > 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm > 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > > 3 SNES Function norm 2.549509757105e+01 > > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > > > > > > Does this mean that our Jacobian is not approximated properly by the > default ?coloring? method when it has off-diagonal terms? > > > > Thanks a lot, > > Arthur and Eric > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 13 20:31:49 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 20:31:49 -0500 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> Message-ID: Works fine for me. Please please please ALWAYS cut and paste the entire error message that is printed. We print the information for a reason, because it provides clues as to what went wrong. 
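The command-line route in the run below needs no source changes at all. If the solver choice is to be hard-wired in the code instead (the PCFactorSetMatSolverPackage route raised in the quoted question below), a rough Fortran sketch, assuming a KSP object ksp already exists and with declarations omitted, is:

    call KSPSetType(ksp,KSPPREONLY,ierr)
    call KSPGetPC(ksp,pc,ierr)
    call PCSetType(pc,PCILU,ierr)
    call PCFactorSetMatSolverPackage(pc,'superlu',ierr)   ! i.e. MATSOLVERSUPERLU
    ! keep this call so -mat_superlu_ilu_droptol etc. can still be given at run time
    call KSPSetFromOptions(ksp,ierr)
    call KSPSetUp(ksp,ierr)

Error code 56 (PETSC_ERR_SUP, "no support for this operation") typically means the requested factorization is not available for that matrix type or build: the SuperLU ILUTP interface only handles sequential AIJ matrices, and PETSc must have been configured with SuperLU (e.g. --download-superlu).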
./ex10 -f0 ~/Datafiles/Matrices/arco1 -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8 -ksp_monitor_true_residual -ksp_rtol 1.e-12 -ksp_view 0 KSP preconditioned resid norm 2.544968580491e+03 true resid norm 7.410897708964e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.467110329809e-06 true resid norm 1.439993537311e-07 ||r(i)||/||b|| 1.943075716143e-08 2 KSP preconditioned resid norm 1.522204461523e-12 true resid norm 2.699724724531e-11 ||r(i)||/||b|| 3.642911871885e-12 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=1501, cols=1501 package used to perform factorization: superlu total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU run parameters: Equil: YES ColPerm: 3 IterRefine: 0 SymmetricMode: NO DiagPivotThresh: 0.1 PivotGrowth: NO ConditionNumber: NO RowPerm: 1 ReplaceTinyPivot: NO PrintStat: NO lwork: 0 ILU_DropTol: 1e-08 ILU_FillTol: 0.01 ILU_FillFactor: 10 ILU_DropRule: 9 ILU_Norm: 2 ILU_MILU: 0 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1501, cols=1501 total: nonzeros=26131, allocated nonzeros=26131 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 501 nodes, limit used is 5 Number of iterations = 2 Residual norm 2.69972e-11 ~/Src/petsc/src/ksp/ksp/examples/tutorials master On May 13, 2014, at 12:17 PM, Qin Lu wrote: > I tried to use command line options as the example suggested ('-ksp_type preonly -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8') without changing my source code, but then the call to KSPSetUp returned error number 56. > > Does this mean I still need to change the source code (such as adding calls to PCFactorSetMatSolverPackage, PCFactorGetMatrix, etc.)in addition to the command line options? > > I ask this since the use of SuperLU seems to be different from using Hypre, which can be invoked with command line options without changing source code. > > Thanks a lot, > Qin > > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu > Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" > Sent: Monday, May 12, 2014 5:11 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > > See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html > > > > > > On May 12, 2014, at 4:54 PM, Qin Lu wrote: > >> Hello, >> >> I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) >> >> Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work? >> >> Many thanks, >> Qin >> >> >> From: Xiaoye S. 
Li >> To: Barry Smith >> Cc: Qin Lu ; "petsc-users at mcs.anl.gov" >> Sent: Friday, May 2, 2014 3:40 PM >> Subject: Re: [petsc-users] ILUTP in PETSc >> >> >> >> The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. >> >> In SuperLU distribution: >> >> EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) >> >> SRC/zgsitrf.c : the actual ILUTP factorization routine >> >> >> Sherry Li >> >> >> >> On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: >> >> >>> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre >>> >>> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid >>> >>> you can also add -help to see what options are available. >>> >>> Both pretty much suck and I can?t image much reason for using them. >>> >>> Barry >>> >>> >>> >>> On May 2, 2014, at 10:27 AM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthat mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >>>> >>>> Many thanks, >>>> Qin >>> >>> From bsmith at mcs.anl.gov Tue May 13 20:38:51 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 20:38:51 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> Message-ID: <91410EE5-8E83-4BEC-95C1-D3BA72BDABA7@mcs.anl.gov> Matt, The code fragments they sent sure look like they are using DMDA 2d and they talk about finite differences. Barry I am sure there are bugs in the unstructured grids code also :-) On May 13, 2014, at 8:27 PM, Matthew Knepley wrote: > On Tue, May 13, 2014 at 7:55 PM, Barry Smith wrote: > > What do you mean by ?''the default ?coloring? method??? > > If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. > > Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. > > I think there may actually be a bug with the coloring for unstructured grids. I am distilling it down to a nice test case. > > Matt > > > Barry > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: > > > Hi all, > > we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. > > > > So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
> > > > RESIDUAL 1 (NO COUPLING): > > for (j=info->ys; j<info->ys+info->ym; j++) { > > for (i=info->xs; i<info->xs+info->xm; i++) { > > f[j][i].P = x[j][i].P - 3000000; > > f[j][i].vx= 2*x[j][i].vx; > > f[j][i].vy= 3*x[j][i].vy - 2; > > f[j][i].T = x[j][i].T; > > } > > } > > > > RESIDUAL 2 (ONE COUPLING TERM): > > for (j=info->ys; j<info->ys+info->ym; j++) { > > for (i=info->xs; i<info->xs+info->xm; i++) { > > f[j][i].P = x[j][i].P - 3; > > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > > f[j][i].vy= x[j][i].vy - 2; > > f[j][i].T = x[j][i].T; > > } > > } > > > > > > and our default set of options is: > > > > > > OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > > > > > > With the uncoupled residual (Residual 1), we get matching KSP and SNES norms, highlighted below: > > > > > > Result from Solve - RESIDUAL 1 > > 0 SNES Function norm 8.485281374240e+07 > > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > > 1 SNES Function norm 1.131370849896e+02 > > 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 2 SNES Function norm 1.131370849896e+02 > > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > > > > > > With the coupled residual (Residual 2), the norms do not match, see below > > > > > > Result from Solve - RESIDUAL 2: > > 0 SNES Function norm 1.019803902719e+02 > > 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > > 1 SNES Function norm 1.697056274848e+02 > > 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > > 2 SNES Function norm 3.236770473841e-07 > > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > > > > > > Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second.
> > > > > > Result from Solve with -snes_fd - RESIDUAL 2 > > 0 SNES Function norm 8.485281374240e+07 > > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > > 1 SNES Function norm 2.039607805429e+02 > > 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > > 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > > 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > > 3 SNES Function norm 2.549509757105e+01 > > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > > > > > > Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? > > > > Thanks a lot, > > Arthur and Eric > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Tue May 13 21:00:50 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 21:00:50 -0500 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> Message-ID: <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> Given that Jed wrote MatZeroRowsColumns_MPIAIJ() it is unlikely to be wrong. You wrote Calculating the norm2 of the residuals defined above in each case gives: MatZeroRowsIS() 1cpu: norm(res,2) = 0 MatZeroRowsIS() 4cpu: norm(res,2) = 0 MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 why do you conclude this is wrong? MatZeroRowsColumnsIS() IS suppose to change the right hand side in a way different than MatZeroRowsIS(). Explanation. For simplicity reorder the matrix rows/columns so that zeroed ones come last and the matrix is symmetric. Then you have ( A B ) (x_A) = (b_A) ( B D ) (x_B) (b_B) with MatZeroRows the new system is ( A B ) (x_A) = (b_A) ( 0 I ) (x_B) (x_B) it has the same solution as the original problem with the give x_B with MatZeroRowsColumns the new system is ( A 0 ) (x_A) = (b_A) - B*x_B ( 0 I ) (x_B) (x_B) note the right hand side needs to be changed so that the new problem has the same solution. Barry On May 9, 2014, at 9:50 AM, P?s?k, Adina-Erika wrote: > Yes, I tested the implementation with both MatZeroRowsIS() and MatZeroRowsColumnsIS(). But first, I will be more explicit about the problem I was set to solve: > > We have a Dirichlet block of size (L,W,H) and centered (xc,yc,zc), which is much smaller than the model domain, and we set Vx = Vpush, Vy=0 within the block (Vz is let free for easier convergence). > As I said before, since the code does not have a monolithic matrix, but 4 submatrices (VV VP; PV PP), and the rhs has 2 sub vectors rhs=(f; g), my approach is to modify only (VV, VP, f) for the Dirichlet BC. 
> > The way I tested the implementation: > 1) Output (VV, VP, f, Dirichlet dofs) - unmodified (no Dirichlet BC) > 2) Output (VV, VP, f, Dirichlet dofs) - a) modified with MatZeroRowsIS(), > - b) modified with MatZeroRowsColumnsIS() -> S_PETSc > Again, the only difference between a) and b) is: > // > ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > // ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > > ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > 3) Read them in Matlab and perform the exact same operations on the unmodified matrices and f vector. -> S_Matlab > 4) Compare S_PETSc with S_Matlab. If the implementation is correct, they should be equal (VV, VP, f). > 5) Check for 1 cpu and 4 cpus. > > Now to answer your questions: > > a,b,d) Yes, matrix modification is done correctly (check the spy diagrams below) in all cases: MatZeroRowsIS() and MatZeroRowsColumnsIS() on 1 and 4 cpus. > > I should have said that in the piece of code above: > v_vv = 1.0; > v_vp = 0.0; > The vector x_push is a duplicate of rhs, with zero elements except the values for the Dirichlet dofs. > > c) The rhs is a different matter. With MatZeroRows() there is no problem. The rhs is equivalent with the one in Matlab, sequential and parallel. > However, with MatZeroRowsColumns(), the residual contains nonzero elements, and in parallel the nonzero pattern is even bigger (1 cpu - 63, 4 cpu - 554). But if you look carefully, the values of the nonzero residuals are very small < +/- 1e-10. > So, I did a tolerance filter: > > tol = 1e-10; > res = f_petsc - f_mod_matlab; > for i=1:length(res) > if abs(res(i))>0 & abs(res(i)) res(i)=0; > end > end > > and then the f_petsc and f_mod_matlab are equivalent on 1 and 4 cpus (figure 5). So it seems that MatZeroRowsColumnsIS() might give some nonzero residuals. > > Calculating the norm2 of the residuals defined above in each case gives: > MatZeroRowsIS() 1cpu: norm(res,2) = 0 > MatZeroRowsIS() 4cpu: norm(res,2) = 0 > MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 > MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 > > Since this is purely a problem of matrix and vector assembly/manipulation, I think the nonzero residuals of the rhs with MatZeroRowsColumnsIS() give the parallel artefacts that I showed last time. > If you need the raw data and the matlab scripts that I used for testing for your consideration, please let me know. > > Thanks, > Adina > > When performing the manual operations on the unmodified matrices and rhs vector in Matlab, I took into account: > - matlab indexing = petsc indexing +1; > - the vectors written to file for matlab (PETSC_VIEWER_BINARY_MATLAB) have the natural ordering, rather than the petsc ordering. On 1 cpu, they are equivalent, but on 4 cpus, the Dirichlet BC indices had to be converted to natural indices in order to perform the correct operations on the rhs. > > > > > > > > On May 6, 2014, at 4:22 PM, Matthew Knepley wrote: > >> On Tue, May 6, 2014 at 7:23 AM, P?s?k, Adina-Erika wrote: >> Hello! >> >> I was trying to implement some internal Dirichlet boundary conditions into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to create an internal block (let's say Dirichlet block) that moves with constant velocity within the domain (i.e. check all the dofs within the block and set the values accordingly to the desired motion). 
>> >> Ideally, this means to zero the rows and columns in VV, VP, PV corresponding to the dirichlet dofs and modify the corresponding rhs values. However, since we have submatrices and not a monolithic matrix A, we can choose to modify only VV and PV matrices. >> The global indices of the velocity points within the Dirichlet block are contained in the arrays rowid_array. >> >> What I want to point out is that the function MatZeroRowsColumnsIS() seems to create parallel artefacts, compared to MatZeroRowsIS() when run on more than 1 processor. Moreover, the results on 1 cpu are identical. >> See below the results of the test (the Dirichlet block is outlined in white) and the piece of the code involved where the 1) - 2) parts are the only difference. >> >> I am assuming that you are showing the result of solving the equations. It would be more useful, and presumably just as easy >> to say: >> >> a) Are the correct rows zeroed out? >> >> b) Is the diagonal element correct? >> >> c) Is the rhs value correct? >> >> d) Are the columns zeroed correctly? >> >> If we know where the problem is, its easier to fix. For example, if the rhs values are >> correct and the rows are zeroed, then something is wrong with the solution procedure. >> Since ZeroRows() works and ZeroRowsColumns() does not, this is a distinct possibility. >> >> Thanks, >> >> Matt >> >> Thanks, >> Adina Pusok >> >> // Create an IS required by MatZeroRows() >> ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array,PETSC_COPY_VALUES,&isx); CHKERRQ(ierr); >> ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array,PETSC_COPY_VALUES,&isy); CHKERRQ(ierr); >> ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array,PETSC_COPY_VALUES,&isz); CHKERRQ(ierr); >> >> 1) /* >> ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); >> ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); >> ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr);*/ >> >> 2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); >> ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); >> ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr); >> >> ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> >> ierr = ISDestroy(&isx); CHKERRQ(ierr); >> ierr = ISDestroy(&isy); CHKERRQ(ierr); >> ierr = ISDestroy(&isz); CHKERRQ(ierr); >> >> >> Results (velocity) with MatZeroRowsColumnsIS(). >> 1cpu 4cpu >> >> Results (velocity) with MatZeroRowsIS(): >> 1cpu 4cpu >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From jed at jedbrown.org Tue May 13 23:28:19 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 13 May 2014 22:28:19 -0600 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> Message-ID: <87k39p7zmk.fsf@jedbrown.org> Barry Smith writes: > Given that Jed wrote MatZeroRowsColumns_MPIAIJ() it is unlikely to be wrong. Haha. Though MatZeroRowsColumns_MPIAIJ uses PetscSF, the implementation was written by Matt. 
I think it's correct, however, at least as of Matt's January changes in 'master'. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From talebi.hossein at gmail.com Wed May 14 00:43:21 2014 From: talebi.hossein at gmail.com (Hossein Talebi) Date: Wed, 14 May 2014 07:43:21 +0200 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: <44B5B5E9-5F44-4649-A01E-003189B05AD4@mcs.anl.gov> References: <44B5B5E9-5F44-4649-A01E-003189B05AD4@mcs.anl.gov> Message-ID: Thank you. Well, only the first part. I move around the elements and identify the Halo nodes etc. However, I do not renumber the vertices to be contiguous on the CPUs like what you said. BUT, I just noticed: I partition the domain based on the computational wight of the elements which is different to that of Mat-Vec calculation. This means my portioning may not be efficient for the solution process. I think I will then go with the copy-in, solve, copy-out option. On Wed, May 14, 2014 at 3:06 AM, Barry Smith wrote: > > On May 13, 2014, at 11:42 AM, Hossein Talebi > wrote: > > > > > I have already decomposed the Finite Element system using Metis. I just > need to have the global rows exactly like how I define and I like to have > the answer in the same layout so I don't have to move things around the > processes again. > > Metis tells you a good partitioning IT DOES NOT MOVE the elements to > form a good partitioning. Do you move the elements around based on what > metis told you and similarly do you renumber the elements (and vertices) to > be contiquously numbered on each process with the first process getting the > first set of numbers, the second process the second set of numbers etc? > > If you do all that then when you create Vec and Mat you should simply > set the local size (based on the number of local vertices on each process). > You never need to use PetscLayoutCreate and in fact if your code was in C > you would never use PetscLayoutCreate() > > If you do not do all that then you need to do that first before you > start calling PETSc. > > Barry > > > > > No, I don't need it for something else. > > > > Cheers > > Hossein > > > > > > > > > > On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley > wrote: > > On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi < > talebi.hossein at gmail.com> wrote: > > Hi All, > > > > > > I am using PETSC from Fortran. I would like to define my own layout i.e. > which row belongs to which CPU since I have already done the domain > decomposition. It appears that "PetscLayoutCreate" and the other routine > do this. But in the manual it says it is not provided in Fortran. > > > > Is there any way that I can do this using Fortran? Anyone has an example? > > > > You can do this for Vec and Mat directly. Do you want it for something > else? > > > > Thanks, > > > > Matt > > > > Cheers > > Hossein > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > -- > > www.permix.org > > -- www.permix.org -------------- next part -------------- An HTML attachment was scrubbed... 
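Barry's point above - renumber so that each process owns a contiguous block of rows, then simply give the local sizes when creating the Vec and Mat - can be sketched in a few lines of C. This is not code from the thread: nlocal is a stand-in for the number of locally owned rows that the application computes from its partition, and the Mat is left with a default setup where a real code would add preallocation.

  PetscInt nlocal;   /* number of locally owned rows after renumbering (assumed known) */
  PetscInt offset;   /* global index of the first row owned by this process */
  Vec      x;
  Mat      A;

  ierr = MPI_Scan(&nlocal,&offset,1,MPIU_INT,MPI_SUM,PETSC_COMM_WORLD);CHKERRQ(ierr);
  offset -= nlocal;  /* MPI_Scan is inclusive, so remove this process's own count */

  ierr = VecCreateMPI(PETSC_COMM_WORLD,nlocal,PETSC_DETERMINE,&x);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);  /* a real code would preallocate here instead */
  /* global rows offset .. offset+nlocal-1 now belong to this process */

The same calls have Fortran interfaces, which is why PetscLayoutCreate() is not needed from Fortran either.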
URL: From Vincent.De-Groof at uibk.ac.at Wed May 14 03:29:47 2014 From: Vincent.De-Groof at uibk.ac.at (De Groof, Vincent Frans Maria) Date: Wed, 14 May 2014 08:29:47 +0000 Subject: [petsc-users] Memory usage during matrix factorization In-Reply-To: <87vbt98ev4.fsf@jedbrown.org> References: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> , <87vbt98ev4.fsf@jedbrown.org> Message-ID: <17A78B9D13564547AC894B88C159674720382707@XMBX4.uibk.ac.at> Thanks. I made a new function based on the PetscGetCurrentUsage which does what I want. It seems like I am being lucky as the numbers returned by the OS seem to be reasonable. thanks again, Vincent ________________________________________ Von: Jed Brown [jed at jedbrown.org] Gesendet: Mittwoch, 14. Mai 2014 00:59 An: Barry Smith; De Groof, Vincent Frans Maria Cc: petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Memory usage during matrix factorization Barry Smith writes: > Here is the code: it is only as reliable as the OS is at reporting the values. HPC vendors have a habit of implementing these functions to return nonsense. Sometimes they provide non-standard functions to return useful information. From C.Klaij at marin.nl Wed May 14 04:02:39 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 14 May 2014 09:02:39 +0000 Subject: [petsc-users] petsc 3.4, mat_view and prefix problem Message-ID: I'm having problems using mat_view in petsc 3.4.3 in combination with a prefix. For example in ../snes/examples/tutorials/ex70: mpiexec -n 2 ./ex70 -nx 16 -ny 24 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type lower -user_ksp -a00_mat_view does not print the matrix a00 to screen. This used to work in 3.3 versions before the single consistent -xxx_view scheme. Similarly, if I add this at line 105 of ../ksp/ksp/examples/tutorials/ex1f.F: call MatSetOptionsPrefix(A,"a_",ierr) then running with -mat_view still prints the matrix to screen but running with -a_mat_view doesn't. I expected the opposite. The problem only occurs with mat, not with ksp. For example, if I add this at line 184 of ../ksp/ksp/examples/tutorials/ex1f.F: call KSPSetOptionsPrefix(ksp,"a_",ierr) then running with -a_ksp_monitor does print the residuals to screen and -ksp_monitor doesn't, as expected. dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From christophe.ortiz at ciemat.es Wed May 14 06:29:55 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 14 May 2014 13:29:55 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero Message-ID: Hi all, I am experiencing some problems of memory corruption with PetscMemzero(). I set the values of the Jacobian by blocks using MatSetValuesBlocked(). To do so, I use some temporary two-dimensional arrays[dof][dof] that I must reset at each loop. Inside FormIJacobian, for instance, I declare the following two-dimensional array: PetscScalar diag[dof][dof]; and then, to zero the array diag[][] I do ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); So far no problem. It works fine. Now, what I want is to have diag[][] as a global array so all functions can have access to it. Therefore, I declare it outside main(). 
Since outside the main() I still do not know dof, which is determined later inside main(), I declare the two-dimensional array diag as follows: PetscScalar **diag; Then, inside main(), once dof is determined, I allocate memory for diag as follows: diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); for (k = 0; k < dof; k++){ diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); } That is, the classical way to allocate memory using the pointer notation. Then, when it comes to zero the two-dimensional array diag[][] inside FormIJacobian, I do as before: ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); Compilation goes well but when I launch the executable, after few timesteps I get the following memory corruption message: TSAdapt 'basic': step 0 accepted t=0 + 1.000e-16 wlte=8.5e-05 family='arkimex' scheme=0:'1bee' dt=1.100e-16 TSAdapt 'basic': step 1 accepted t=1e-16 + 1.100e-16 wlte=4.07e-13 family='arkimex' scheme=0:'3' dt=1.210e-16 TSAdapt 'basic': step 2 accepted t=2.1e-16 + 1.210e-16 wlte=1.15e-13 family='arkimex' scheme=0:'3' dt=1.331e-16 TSAdapt 'basic': step 3 accepted t=3.31e-16 + 1.331e-16 wlte=1.14e-13 family='arkimex' scheme=0:'3' dt=1.464e-16 [0]PETSC ERROR: PetscMallocValidate: error detected at TSComputeIJacobian() line 719 in src/ts/interface/ts.c [0]PETSC ERROR: Memory [id=0(0)] at address 0x243c260 is corrupted (probably write past end of array) [0]PETSC ERROR: Memory originally allocated in (null)() line 0 in src/mat/impls/aij/seq/(null) [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Memory corruption! [0]PETSC ERROR: ! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.1, Jun, 10, 2013 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./diffusion on a icc-nompi-double-blas-debug named mazinger.ciemat.es by u5751 Wed May 14 13:23:26 2014 [0]PETSC ERROR: Libraries linked from /home/u5751/petsc-3.4.1/icc-nompi-double-blas-debug/lib [0]PETSC ERROR: Configure run at Wed Apr 2 14:01:51 2014 [0]PETSC ERROR: Configure options --with-mpi=0 --with-cc=icc --with-cxx=icc --with-clanguage=cxx --with-debugging=1 --with-scalar-type=real --with-precision=double --download-f-blas-lapack [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocValidate() line 149 in src/sys/memory/mtr.c [0]PETSC ERROR: TSComputeIJacobian() line 719 in src/ts/interface/ts.c [0]PETSC ERROR: SNESTSFormJacobian_ARKIMEX() line 995 in src/ts/impls/arkimex/arkimex.c [0]PETSC ERROR: SNESTSFormJacobian() line 3397 in src/ts/interface/ts.c [0]PETSC ERROR: SNESComputeJacobian() line 2152 in src/snes/interface/snes.c [0]PETSC ERROR: SNESSolve_NEWTONLS() line 218 in src/snes/impls/ls/ls.c [0]PETSC ERROR: SNESSolve() line 3636 in src/snes/interface/snes.c [0]PETSC ERROR: TSStep_ARKIMEX() line 765 in src/ts/impls/arkimex/arkimex.c [0]PETSC ERROR: TSStep() line 2458 in src/ts/interface/ts.c [0]PETSC ERROR: TSSolve() line 2583 in src/ts/interface/ts.c [0]PETSC ERROR: main() line 2690 in src/ts/examples/tutorials/diffusion.cxx ./compile_diffusion: line 25: 17061 Aborted ./diffusion -ts_adapt_monitor -ts_adapt_basic_clip 0.01,1.10 -draw_pause -2 -ts_arkimex_type 3 -ts_max_snes_failures -1 -snes_type newtonls -snes_linesearch_type basic -ksp_type gmres -pc_type ilu Did I do something wrong ? Or is it due to the pointer notation to declare the two-dimensional array that conflicts with PetscMemzero ? Many thanks in advance for your help. Christophe -- Q Por favor, piense en el medio ambiente antes de imprimir este mensaje. Please consider the environment before printing this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Wed May 14 06:53:04 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 14 May 2014 13:53:04 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero Message-ID: Ok, I just found the answer regarding the memory corruption with the two-dimensional array and PetscMemzero. Instead of ierr = PetscMemzero(diagbl,dof*dofsizeof(PetscScalar));CHKERRQ(ierr); One must do the following: for (k = 0; k < dof; k++) { ierr = PetscMemzero(diagbl[k],dof*sizeof(PetscScalar));CHKERRQ(ierr); } Indeed, due to the ** notation, the two-dimensional array is made of dof rows of dof columns. You cannot set dof*dof values to just one row but you must iterate through the rows and set dof values each time. Now it works fine. Christophe -- Q Por favor, piense en el medio ambiente antes de imprimir este mensaje. Please consider the environment before printing this email. On Wed, May 14, 2014 at 1:29 PM, wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." 
> > > Today's Topics: > > 1. Re: Problem with MatZeroRowsColumnsIS() (Barry Smith) > 2. Re: Problem with MatZeroRowsColumnsIS() (Jed Brown) > 3. Re: PetscLayoutCreate for Fortran (Hossein Talebi) > 4. Re: Memory usage during matrix factorization > (De Groof, Vincent Frans Maria) > 5. petsc 3.4, mat_view and prefix problem (Klaij, Christiaan) > 6. Memory corruption with two-dimensional array and > PetscMemzero (Christophe Ortiz) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 13 May 2014 21:00:50 -0500 > From: Barry Smith > To: "P?s?k, Adina-Erika" > Cc: "" > Subject: Re: [petsc-users] Problem with MatZeroRowsColumnsIS() > Message-ID: <1C839F22-8ADF-4904-B136-D1461AE38187 at mcs.anl.gov> > Content-Type: text/plain; charset="iso-8859-1" > > > Given that Jed wrote MatZeroRowsColumns_MPIAIJ() it is unlikely to be > wrong. > > You wrote > > Calculating the norm2 of the residuals defined above in each case gives: > MatZeroRowsIS() 1cpu: norm(res,2) = 0 > MatZeroRowsIS() 4cpu: norm(res,2) = 0 > MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 > MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 > > why do you conclude this is wrong? MatZeroRowsColumnsIS() IS suppose to > change the right hand side in a way different than MatZeroRowsIS(). > > Explanation. For simplicity reorder the matrix rows/columns so that > zeroed ones come last and the matrix is symmetric. Then you have > > ( A B ) (x_A) = (b_A) > ( B D ) (x_B) (b_B) > > with MatZeroRows the new system is > > ( A B ) (x_A) = (b_A) > ( 0 I ) (x_B) (x_B) > > it has the same solution as the original problem with the give x_B > > with MatZeroRowsColumns the new system is > > ( A 0 ) (x_A) = (b_A) - B*x_B > ( 0 I ) (x_B) (x_B) > > note the right hand side needs to be changed so that the new problem has > the same solution. > > Barry > > > On May 9, 2014, at 9:50 AM, P?s?k, Adina-Erika > wrote: > > > Yes, I tested the implementation with both MatZeroRowsIS() and > MatZeroRowsColumnsIS(). But first, I will be more explicit about the > problem I was set to solve: > > > > We have a Dirichlet block of size (L,W,H) and centered (xc,yc,zc), which > is much smaller than the model domain, and we set Vx = Vpush, Vy=0 within > the block (Vz is let free for easier convergence). > > As I said before, since the code does not have a monolithic matrix, but > 4 submatrices (VV VP; PV PP), and the rhs has 2 sub vectors rhs=(f; g), my > approach is to modify only (VV, VP, f) for the Dirichlet BC. > > > > The way I tested the implementation: > > 1) Output (VV, VP, f, Dirichlet dofs) - unmodified (no Dirichlet BC) > > 2) Output (VV, VP, f, Dirichlet dofs) - a) modified with MatZeroRowsIS(), > > - b) modified with MatZeroRowsColumnsIS() -> S_PETSc > > Again, the only difference between a) and b) is: > > // > > ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > > // ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); > CHKERRQ(ierr); > > > > ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > > ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > > 3) Read them in Matlab and perform the exact same operations on the > unmodified matrices and f vector. -> S_Matlab > > 4) Compare S_PETSc with S_Matlab. If the implementation is correct, they > should be equal (VV, VP, f). > > 5) Check for 1 cpu and 4 cpus. 
> ---------------------------- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 14 07:24:49 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 May 2014 07:24:49 -0500 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> Message-ID: On Tue, May 13, 2014 at 9:00 PM, Barry Smith wrote: > > Given that Jed wrote MatZeroRowsColumns_MPIAIJ() it is unlikely to be > wrong. > > You wrote > > Calculating the norm2 of the residuals defined above in each case gives: > MatZeroRowsIS() 1cpu: norm(res,2) = 0 > MatZeroRowsIS() 4cpu: norm(res,2) = 0 > MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 > MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 > I think the issue is that the RHS changes between 1 and 4 processes. This could be a bug, but I have gone over the code and our test which look correct. I think it could also be a misintepretation of how it works since you are using a composite matrix. Does it work if you put everything in a single matrix? Thanks, Matt > why do you conclude this is wrong? MatZeroRowsColumnsIS() IS suppose to > change the right hand side in a way different than MatZeroRowsIS(). > > Explanation. For simplicity reorder the matrix rows/columns so that > zeroed ones come last and the matrix is symmetric. Then you have > > ( A B ) (x_A) = (b_A) > ( B D ) (x_B) (b_B) > > with MatZeroRows the new system is > > ( A B ) (x_A) = (b_A) > ( 0 I ) (x_B) (x_B) > > it has the same solution as the original problem with the give x_B > > with MatZeroRowsColumns the new system is > > ( A 0 ) (x_A) = (b_A) - B*x_B > ( 0 I ) (x_B) (x_B) > > note the right hand side needs to be changed so that the new problem has > the same solution. > > Barry > > > On May 9, 2014, at 9:50 AM, P?s?k, Adina-Erika > wrote: > > > Yes, I tested the implementation with both MatZeroRowsIS() and > MatZeroRowsColumnsIS(). But first, I will be more explicit about the > problem I was set to solve: > > > > We have a Dirichlet block of size (L,W,H) and centered (xc,yc,zc), which > is much smaller than the model domain, and we set Vx = Vpush, Vy=0 within > the block (Vz is let free for easier convergence). > > As I said before, since the code does not have a monolithic matrix, but > 4 submatrices (VV VP; PV PP), and the rhs has 2 sub vectors rhs=(f; g), my > approach is to modify only (VV, VP, f) for the Dirichlet BC. > > > > The way I tested the implementation: > > 1) Output (VV, VP, f, Dirichlet dofs) - unmodified (no Dirichlet BC) > > 2) Output (VV, VP, f, Dirichlet dofs) - a) modified with MatZeroRowsIS(), > > - b) modified with MatZeroRowsColumnsIS() -> S_PETSc > > Again, the only difference between a) and b) is: > > // > > ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > > // ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); > CHKERRQ(ierr); > > > > ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > > ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > > 3) Read them in Matlab and perform the exact same operations on the > unmodified matrices and f vector. -> S_Matlab > > 4) Compare S_PETSc with S_Matlab. If the implementation is correct, they > should be equal (VV, VP, f). > > 5) Check for 1 cpu and 4 cpus. 
> > > > Now to answer your questions: > > > > a,b,d) Yes, matrix modification is done correctly (check the spy > diagrams below) in all cases: MatZeroRowsIS() and MatZeroRowsColumnsIS() > on 1 and 4 cpus. > > > > I should have said that in the piece of code above: > > v_vv = 1.0; > > v_vp = 0.0; > > The vector x_push is a duplicate of rhs, with zero elements except the > values for the Dirichlet dofs. > > > > c) The rhs is a different matter. With MatZeroRows() there is no > problem. The rhs is equivalent with the one in Matlab, sequential and > parallel. > > However, with MatZeroRowsColumns(), the residual contains nonzero > elements, and in parallel the nonzero pattern is even bigger (1 cpu - 63, 4 > cpu - 554). But if you look carefully, the values of the nonzero residuals > are very small < +/- 1e-10. > > So, I did a tolerance filter: > > > > tol = 1e-10; > > res = f_petsc - f_mod_matlab; > > for i=1:length(res) > > if abs(res(i))>0 & abs(res(i)) > res(i)=0; > > end > > end > > > > and then the f_petsc and f_mod_matlab are equivalent on 1 and 4 cpus > (figure 5). So it seems that MatZeroRowsColumnsIS() might give some nonzero > residuals. > > > > Calculating the norm2 of the residuals defined above in each case gives: > > MatZeroRowsIS() 1cpu: norm(res,2) = 0 > > MatZeroRowsIS() 4cpu: norm(res,2) = 0 > > MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 > > MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 > > > > Since this is purely a problem of matrix and vector > assembly/manipulation, I think the nonzero residuals of the rhs with > MatZeroRowsColumnsIS() give the parallel artefacts that I showed last time. > > If you need the raw data and the matlab scripts that I used for testing > for your consideration, please let me know. > > > > Thanks, > > Adina > > > > When performing the manual operations on the unmodified matrices and rhs > vector in Matlab, I took into account: > > - matlab indexing = petsc indexing +1; > > - the vectors written to file for matlab (PETSC_VIEWER_BINARY_MATLAB) > have the natural ordering, rather than the petsc ordering. On 1 cpu, they > are equivalent, but on 4 cpus, the Dirichlet BC indices had to be converted > to natural indices in order to perform the correct operations on the rhs. > > > > > > > > > > > > > > > > On May 6, 2014, at 4:22 PM, Matthew Knepley wrote: > > > >> On Tue, May 6, 2014 at 7:23 AM, P?s?k, Adina-Erika < > puesoek at uni-mainz.de> wrote: > >> Hello! > >> > >> I was trying to implement some internal Dirichlet boundary conditions > into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to > create an internal block (let's say Dirichlet block) that moves with > constant velocity within the domain (i.e. check all the dofs within the > block and set the values accordingly to the desired motion). > >> > >> Ideally, this means to zero the rows and columns in VV, VP, PV > corresponding to the dirichlet dofs and modify the corresponding rhs > values. However, since we have submatrices and not a monolithic matrix A, > we can choose to modify only VV and PV matrices. > >> The global indices of the velocity points within the Dirichlet block > are contained in the arrays rowid_array. > >> > >> What I want to point out is that the function MatZeroRowsColumnsIS() > seems to create parallel artefacts, compared to MatZeroRowsIS() when run on > more than 1 processor. Moreover, the results on 1 cpu are identical. 
> >> See below the results of the test (the Dirichlet block is outlined in > white) and the piece of the code involved where the 1) - 2) parts are the > only difference. > >> > >> I am assuming that you are showing the result of solving the equations. > It would be more useful, and presumably just as easy > >> to say: > >> > >> a) Are the correct rows zeroed out? > >> > >> b) Is the diagonal element correct? > >> > >> c) Is the rhs value correct? > >> > >> d) Are the columns zeroed correctly? > >> > >> If we know where the problem is, its easier to fix. For example, if the > rhs values are > >> correct and the rows are zeroed, then something is wrong with the > solution procedure. > >> Since ZeroRows() works and ZeroRowsColumns() does not, this is a > distinct possibility. > >> > >> Thanks, > >> > >> Matt > >> > >> Thanks, > >> Adina Pusok > >> > >> // Create an IS required by MatZeroRows() > >> ierr = > ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array,PETSC_COPY_VALUES,&isx); > CHKERRQ(ierr); > >> ierr = > ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array,PETSC_COPY_VALUES,&isy); > CHKERRQ(ierr); > >> ierr = > ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array,PETSC_COPY_VALUES,&isz); > CHKERRQ(ierr); > >> > >> 1) /* > >> ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > >> ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > >> ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); > CHKERRQ(ierr);*/ > >> > >> 2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > >> ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > >> ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr); > >> > >> ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > >> ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > >> ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > >> > >> ierr = ISDestroy(&isx); CHKERRQ(ierr); > >> ierr = ISDestroy(&isy); CHKERRQ(ierr); > >> ierr = ISDestroy(&isz); CHKERRQ(ierr); > >> > >> > >> Results (velocity) with MatZeroRowsColumnsIS(). > >> 1cpu 4cpu > >> > >> Results (velocity) with MatZeroRowsIS(): > >> 1cpu 4cpu > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >> -- Norbert Wiener > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliver.browne at upm.es Wed May 14 07:42:58 2014 From: oliver.browne at upm.es (Oliver Browne) Date: Wed, 14 May 2014 14:42:58 +0200 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR Message-ID: Hi, I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. 
If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR Thanks in advance, Ollie From atmmachado at gmail.com Wed May 14 08:07:16 2014 From: atmmachado at gmail.com (=?UTF-8?B?QW5kcsOpIFRpbcOzdGhlbw==?=) Date: Wed, 14 May 2014 10:07:16 -0300 Subject: [petsc-users] help: (petsc-dev) + petsc4py acessing the tao optimizations solvers ? Message-ID: I read about the merger of the TAO solvers on the PETSC-DEV. How can I use the TAO's constrainded optimization solver on the PETSC-DEV (via petsc4py)? Can you show me some simple python script to deal with classical linear constrainded optimization problems like: minimize sum(x) subject to x >= 0 and Ax = b Thanks for your time. Andre -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 14 08:16:42 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 08:16:42 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: <44B5B5E9-5F44-4649-A01E-003189B05AD4@mcs.anl.gov> Message-ID: <1670730F-7EB3-4CCF-91FE-452A7446215C@mcs.anl.gov> On May 14, 2014, at 12:43 AM, Hossein Talebi wrote: > > Thank you. > > Well, only the first part. I move around the elements and identify the Halo nodes etc. However, I do not renumber the vertices to be contiguous on the CPUs like what you said. You need to do this! Once this is done then using the PETSc solvers is easy. Note you can do this by simply counting the number of local vertices on each process and using an MPI_Scan to get the first number on each process from the previous process. > > BUT, I just noticed: I partition the domain based on the computational wight of the elements which is different to that of Mat-Vec calculation. This means my portioning may not be efficient for the solution process. That is fine, it is what we do to. > > I think I will then go with the copy-in, solve, copy-out option. I do not know what you mean here but it sounds bad. > > > > > On Wed, May 14, 2014 at 3:06 AM, Barry Smith wrote: > > On May 13, 2014, at 11:42 AM, Hossein Talebi wrote: > > > > > I have already decomposed the Finite Element system using Metis. I just need to have the global rows exactly like how I define and I like to have the answer in the same layout so I don't have to move things around the processes again. > > Metis tells you a good partitioning IT DOES NOT MOVE the elements to form a good partitioning. Do you move the elements around based on what metis told you and similarly do you renumber the elements (and vertices) to be contiquously numbered on each process with the first process getting the first set of numbers, the second process the second set of numbers etc? > > If you do all that then when you create Vec and Mat you should simply set the local size (based on the number of local vertices on each process). You never need to use PetscLayoutCreate and in fact if your code was in C you would never use PetscLayoutCreate() > > If you do not do all that then you need to do that first before you start calling PETSc. > > Barry > > > > > No, I don't need it for something else. > > > > Cheers > > Hossein > > > > > > > > > > On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: > > On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi wrote: > > Hi All, > > > > > > I am using PETSC from Fortran. 
I would like to define my own layout i.e. which row belongs to which CPU since I have already done the domain decomposition. It appears that "PetscLayoutCreate" and the other routine do this. But in the manual it says it is not provided in Fortran. > > > > Is there any way that I can do this using Fortran? Anyone has an example? > > > > You can do this for Vec and Mat directly. Do you want it for something else? > > > > Thanks, > > > > Matt > > > > Cheers > > Hossein > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > -- > > www.permix.org > > > > > -- > www.permix.org From bsmith at mcs.anl.gov Wed May 14 08:27:59 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 08:27:59 -0500 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: References: Message-ID: On May 14, 2014, at 7:42 AM, Oliver Browne wrote: > Hi, > > I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. > > If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; What do you mean by ?separate? the vectors? Each processor needs to provide ITS rows to the function call. You cannot have processor zero deliver all the rows. Barry > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR > > Thanks in advance, > > Ollie > From atmmachado at gmail.com Wed May 14 09:29:52 2014 From: atmmachado at gmail.com (=?UTF-8?B?QW5kcsOpIFRpbcOzdGhlbw==?=) Date: Wed, 14 May 2014 11:29:52 -0300 Subject: [petsc-users] help: (petsc-dev) + petsc4py acessing the tao optimizations solvers ? In-Reply-To: References: Message-ID: p.s. I am not quite sure of the need of petsc-dev. I Just read about the inclusion of tao solvers in it and installed petsc-dev, cython and petsc4py on my ubuntu 12.04. Unfortunately petsc4py (and the old tao4py) documentation does not have this type of python documented examples. After some correspondence in petsc4py mailing list they suggested that I might find some help on petsc-users mailing list. 2014-05-14 10:07 GMT-03:00 Andr? Tim?theo : > I read about the merger of the TAO solvers on the PETSC-DEV. > > How can I use the TAO's constrainded optimization solver on > the PETSC-DEV (via petsc4py)? > > Can you show me some simple python script to deal with classical > linear > constrainded optimization problems like: > > minimize sum(x) subject to x >= 0 and Ax = b > > Thanks for your time. > > Andre > -------------- next part -------------- An HTML attachment was scrubbed... 
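[Picking up the PetscLayoutCreate / Fortran layout thread above: a hedged C sketch of the MPI_Scan renumbering Barry describes (the Fortran calls are analogous). The variable names are made up; each process only needs the count of vertices it owns after the Metis redistribution, and preallocation is omitted for brevity.]

  PetscInt nlocal;        /* number of vertices this process owns after redistribution */
  PetscInt end, first;
  Vec      x;
  Mat      A;

  ierr  = MPI_Scan(&nlocal,&end,1,MPIU_INT,MPI_SUM,PETSC_COMM_WORLD); CHKERRQ(ierr);
  first = end - nlocal;   /* global number of this rank's first vertex; renumber local vertices as first..end-1 */

  /* Vec and Mat are then created with the same local sizes, so PetscLayoutCreate is never needed */
  ierr = VecCreateMPI(PETSC_COMM_WORLD,nlocal,PETSC_DETERMINE,&x); CHKERRQ(ierr);
  ierr = MatCreateAIJ(PETSC_COMM_WORLD,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE,
                      0,NULL,0,NULL,&A); CHKERRQ(ierr);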
URL: From lu_qin_2000 at yahoo.com Wed May 14 09:48:32 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Wed, 14 May 2014 07:48:32 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc In-Reply-To: References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> Message-ID: <1400078912.39037.YahooMailNeo@web160203.mail.bf1.yahoo.com> It turns out that I can not set PC side as right when KSP type?is set to pre_only. After I fixed that, it works fine. ? This brings me a question: why do I have to set KSP type?to pre_only when SuperLU's ILUTP is used as preconditioner? Can I still?set KSP type?as KSPBCGS (which?seems to be?the fastest with PETSc's ILU for my cases)? ? Thanks, Qin ________________________________ From: Barry Smith To: Qin Lu Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" Sent: Tuesday, May 13, 2014 8:31 PM Subject: Re: [petsc-users] ILUTP in PETSc ? Works fine for me.? Please please please ALWAYS cut and paste the entire error message that is printed. We print the information for a reason, because it provides clues as to what went wrong. ? ./ex10 -f0 ~/Datafiles/Matrices/arco1 -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8 -ksp_monitor_true_residual -ksp_rtol 1.e-12 -ksp_view ? 0 KSP preconditioned resid norm 2.544968580491e+03 true resid norm 7.410897708964e+00 ||r(i)||/||b|| 1.000000000000e+00 ? 1 KSP preconditioned resid norm 2.467110329809e-06 true resid norm 1.439993537311e-07 ||r(i)||/||b|| 1.943075716143e-08 ? 2 KSP preconditioned resid norm 1.522204461523e-12 true resid norm 2.699724724531e-11 ||r(i)||/||b|| 3.642911871885e-12 KSP Object: 1 MPI processes ? type: gmres ? ? GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement ? ? GMRES: happy breakdown tolerance 1e-30 ? maximum iterations=10000, initial guess is zero ? tolerances:? relative=1e-12, absolute=1e-50, divergence=10000 ? left preconditioning ? using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes ? type: ilu ? ? ILU: out-of-place factorization ? ? 0 levels of fill ? ? tolerance for zero pivot 2.22045e-14 ? ? using diagonal shift on blocks to prevent zero pivot [INBLOCKS] ? ? matrix ordering: natural ? ? factor fill ratio given 0, needed 0 ? ? ? Factored matrix follows: ? ? ? ? Mat Object:? ? ? ? 1 MPI processes ? ? ? ? ? type: seqaij ? ? ? ? ? rows=1501, cols=1501 ? ? ? ? ? package used to perform factorization: superlu ? ? ? ? ? total: nonzeros=0, allocated nonzeros=0 ? ? ? ? ? total number of mallocs used during MatSetValues calls =0 ? ? ? ? ? ? SuperLU run parameters: ? ? ? ? ? ? ? Equil: YES ? ? ? ? ? ? ? ColPerm: 3 ? ? ? ? ? ? ? IterRefine: 0 ? ? ? ? ? ? ? SymmetricMode: NO ? ? ? ? ? ? ? DiagPivotThresh: 0.1 ? ? ? ? ? ? ? PivotGrowth: NO ? ? ? ? ? ? ? ConditionNumber: NO ? ? ? ? ? ? ? RowPerm: 1 ? ? ? ? ? ? ? ReplaceTinyPivot: NO ? ? ? ? ? ? ? PrintStat: NO ? ? ? ? ? ? ? lwork: 0 ? ? ? ? ? ? ? ILU_DropTol: 1e-08 ? ? ? ? ? ? ? ILU_FillTol: 0.01 ? ? ? ? ? ? ? ILU_FillFactor: 10 ? ? ? ? ? ? ? ILU_DropRule: 9 ? ? ? ? ? ? ? ILU_Norm: 2 ? ? ? ? ? ? ? ILU_MILU: 0 ? linear system matrix = precond matrix: ? Mat Object:? 1 MPI processes ? ? type: seqaij ? ? rows=1501, cols=1501 ? ? total: nonzeros=26131, allocated nonzeros=26131 ? ? 
total number of mallocs used during MatSetValues calls =0 ? ? ? using I-node routines: found 501 nodes, limit used is 5 Number of iterations =? 2 Residual norm 2.69972e-11 ~/Src/petsc/src/ksp/ksp/examples/tutorials? master On May 13, 2014, at 12:17 PM, Qin Lu wrote: > I tried to use command line options as the example suggested ('-ksp_type preonly -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8') without changing my source code, but then the call to KSPSetUp returned error number 56. >? > Does this mean I still need to change the source code (such as adding calls to PCFactorSetMatSolverPackage, PCFactorGetMatrix, etc.)in addition to the command line options? >? > I ask this since the use of SuperLU seems to be different from using Hypre, which can be invoked with command line options without changing source code. >? > Thanks a lot, > Qin > > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu > Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" > Sent: Monday, May 12, 2014 5:11 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > >? ? See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html > > > > > > On May 12, 2014, at 4:54 PM, Qin Lu wrote: > >> Hello, >> >> I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) >>? >> Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work?? >>? >> Many thanks, >> Qin? >> >> >>? From: Xiaoye S. Li >> To: Barry Smith >> Cc: Qin Lu ; "petsc-users at mcs.anl.gov" >> Sent: Friday, May 2, 2014 3:40 PM >> Subject: Re: [petsc-users] ILUTP in PETSc >> >> >> >> The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily.? >> >> In SuperLU distribution: >> >>? ? EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) >> >>? ? SRC/zgsitrf.c : the actual ILUTP factorization routine >> >> >> Sherry Li >> >> >> >> On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: >> >> >>> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.htmlthere are two listed. ./configure ?download-hypre >>> >>> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid >>> >>> you can also add -help to see what options are available. >>> >>>? ? Both pretty much suck and I can?t image much reason for using them. >>> >>>? ? Barry >>> >>> >>> >>> On May 2, 2014, at 10:27 AM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthatmentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >>>> >>>> Many thanks, >>>> Qin >>> >>>? ? ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.sarich at gmail.com Wed May 14 09:57:19 2014 From: jason.sarich at gmail.com (Jason Sarich) Date: Wed, 14 May 2014 09:57:19 -0500 Subject: [petsc-users] help: (petsc-dev) + petsc4py acessing the tao optimizations solvers ? In-Reply-To: References: Message-ID: Hi Andre, TAO specializes in unconstrained and bound-constrained optimization, there is not a lot of support for linear constrained optimization. 
There is an interior point solver (ipm) that can accept general constraints, there are new functions for setting up these constraints, they aren't quite solid yet, and I've not very experienced in using the petsc4py package, but I can give you some general help. There is a simple C example using nonlinear constraints in src/tao/examples/tutorials/toy.c, you should be able to easily modify this for your example, where the equality constraint function will evaluate Ax-b and the equality jacobian will be the A matrix (you won't need to set the inequality constraint or jacobian). Because the ipm method builds a KKT system and solves it, it doesn't work well with iterative methods, a direct solver like superlu may be necessary. I don't know enough about the actual python bindings to give you an example program in python, but it should follow pretty directly from the C example. Please let me know if you have any specific questions. Jason Sarich On Wed, May 14, 2014 at 9:29 AM, Andr? Tim?theo wrote: > p.s. I am not quite sure of the need of petsc-dev. I Just read about the > inclusion of tao solvers in it and installed petsc-dev, cython and petsc4py > on my ubuntu 12.04. > > Unfortunately petsc4py (and the old tao4py) documentation does not have > this type of python documented examples. After some correspondence in > petsc4py mailing list they suggested that I might find some help on > petsc-users mailing list. > > > 2014-05-14 10:07 GMT-03:00 Andr? Tim?theo : > > I read about the merger of the TAO solvers on the PETSC-DEV. >> >> How can I use the TAO's constrainded optimization solver on >> the PETSC-DEV (via petsc4py)? >> >> Can you show me some simple python script to deal with classical >> linear >> constrainded optimization problems like: >> >> minimize sum(x) subject to x >= 0 and Ax = b >> >> Thanks for your time. >> >> Andre >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 14 11:34:10 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 11:34:10 -0500 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> References: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> Message-ID: <373F07AB-A56A-4A9A-BD2E-F854FEC16D50@mcs.anl.gov> See the manual page for MatMPIAIJSetPreallocationCSR() it gives an explicit simple example The format which is used for the sparse matrix input, is equivalent to a row-major ordering.. i.e for the following matrix, the input data expected is as shown: 1 0 0 2 0 3 P0 ------- 4 5 6 P1 Process0 [P0]: rows_owned=[0,1] i = {0,1,3} [size = nrow+1 = 2+1] j = {0,0,2} [size = nz = 6] v = {1,2,3} [size = nz = 6] Process1 [P1]: rows_owned=[2] i = {0,3} [size = nrow+1 = 1+1] j = {0,1,2} [size = nz = 6] v = {4,5,6} [size = nz = 6] The column indices are global, the numerical values are just numerical values and do not need to be adjusted. On each process the i indices start with 0 because they just point into the local part of the j indices. Are you saying each process of yours HAS the entire matrix? If so you just need to adjust the local portion of the i vales and pass that plus the appropriate location in j and v to the routine as in the example above. Barry On May 14, 2014, at 8:36 AM, Oliver Browne wrote: > > > > On 14-05-2014 15:27, Barry Smith wrote: >> On May 14, 2014, at 7:42 AM, Oliver Browne wrote: >>> Hi, >>> I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). 
I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. >>> If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; >> What do you mean by ?separate? the vectors? Each processor >> needs to provide ITS rows to the function call. You cannot have >> processor zero deliver all the rows. > > I mean split them so they change from global numbering to local numbering. > > At the moment I just have > > CALL MatMPIAIJSetPreallocationCSR(A,NVPN,NNVI,CONT,ierr) - 3 vectors have global numbering > > How can submit this to a specific processor? > > Ollie > > >> Barry >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR >>> Thanks in advance, >>> Ollie From bsmith at mcs.anl.gov Wed May 14 11:37:57 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 11:37:57 -0500 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <1400078912.39037.YahooMailNeo@web160203.mail.bf1.yahoo.com> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> <1400078912.39037.YahooMailNeo@web160203.mail.bf1.yahoo.com> Message-ID: On May 14, 2014, at 9:48 AM, Qin Lu wrote: > It turns out that I can not set PC side as right when KSP type is set to pre_only. After I fixed that, it works fine. > > This brings me a question: why do I have to set KSP type to pre_only when SuperLU's ILUTP is used as preconditioner? You don?t and you shouldn?t. Since you are using SuperLU ILU as a preconditioner, not a direct solver, using preonly means that it will apply the ILU triangular solves only once which means it will not return the correct solution. For LU you can use preonly since LU is a direct (?exact?) solver. > Can I still set KSP type as KSPBCGS (which seems to be the fastest with PETSc's ILU for my cases)? Yes. Barry > > Thanks, > Qin > > From: Barry Smith > To: Qin Lu > Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" > Sent: Tuesday, May 13, 2014 8:31 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > > Works fine for me. Please please please ALWAYS cut and paste the entire error message that is printed. We print the information for a reason, because it provides clues as to what went wrong. 
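[As a concrete, hedged illustration of the point above -- not code from the thread: selecting SuperLU's ILU as the preconditioner for a BiCGStab solve in source rather than on the command line, using the petsc 3.4 names. ksp is assumed to already exist with its operators set.]

  PC pc;

  ierr = KSPSetType(ksp,KSPBCGS); CHKERRQ(ierr);                 /* a real Krylov method, not preonly */
  ierr = KSPGetPC(ksp,&pc); CHKERRQ(ierr);
  ierr = PCSetType(pc,PCILU); CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);                  /* still honors -mat_superlu_ilu_droptol etc. */

The equivalent command line is -ksp_type bcgs -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol <tol>.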
> > > ./ex10 -f0 ~/Datafiles/Matrices/arco1 -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8 -ksp_monitor_true_residual -ksp_rtol 1.e-12 -ksp_view > > 0 KSP preconditioned resid norm 2.544968580491e+03 true resid norm 7.410897708964e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.467110329809e-06 true resid norm 1.439993537311e-07 ||r(i)||/||b|| 1.943075716143e-08 > 2 KSP preconditioned resid norm 1.522204461523e-12 true resid norm 2.699724724531e-11 ||r(i)||/||b|| 3.642911871885e-12 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: natural > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=1501, cols=1501 > package used to perform factorization: superlu > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU run parameters: > Equil: YES > ColPerm: 3 > IterRefine: 0 > SymmetricMode: NO > DiagPivotThresh: 0.1 > PivotGrowth: NO > ConditionNumber: NO > RowPerm: 1 > ReplaceTinyPivot: NO > PrintStat: NO > lwork: 0 > ILU_DropTol: 1e-08 > ILU_FillTol: 0.01 > ILU_FillFactor: 10 > ILU_DropRule: 9 > ILU_Norm: 2 > ILU_MILU: 0 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1501, cols=1501 > total: nonzeros=26131, allocated nonzeros=26131 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 501 nodes, limit used is 5 > Number of iterations = 2 > Residual norm 2.69972e-11 > ~/Src/petsc/src/ksp/ksp/examples/tutorials master > > On May 13, 2014, at 12:17 PM, Qin Lu wrote: > > > I tried to use command line options as the example suggested ('-ksp_type preonly -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8') without changing my source code, but then the call to KSPSetUp returned error number 56. > > > > Does this mean I still need to change the source code (such as adding calls to PCFactorSetMatSolverPackage, PCFactorGetMatrix, etc.)in addition to the command line options? > > > > I ask this since the use of SuperLU seems to be different from using Hypre, which can be invoked with command line options without changing source code. > > > > Thanks a lot, > > Qin > > > > > > ----- Original Message ----- > > From: Barry Smith > > To: Qin Lu > > Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" > > Sent: Monday, May 12, 2014 5:11 PM > > Subject: Re: [petsc-users] ILUTP in PETSc > > > > > > See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html > > > > > > > > > > > > On May 12, 2014, at 4:54 PM, Qin Lu wrote: > > > >> Hello, > >> > >> I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) 
> >> > >> Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work? > >> > >> Many thanks, > >> Qin > >> > >> > >> From: Xiaoye S. Li > >> To: Barry Smith > >> Cc: Qin Lu ; "petsc-users at mcs.anl.gov" > >> Sent: Friday, May 2, 2014 3:40 PM > >> Subject: Re: [petsc-users] ILUTP in PETSc > >> > >> > >> > >> The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. > >> > >> In SuperLU distribution: > >> > >> EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) > >> > >> SRC/zgsitrf.c : the actual ILUTP factorization routine > >> > >> > >> Sherry Li > >> > >> > >> > >> On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: > >> > >> > >>> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.htmlthere are two listed. ./configure ?download-hypre > >>> > >>> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid > >>> > >>> you can also add -help to see what options are available. > >>> > >>> Both pretty much suck and I can?t image much reason for using them. > >>> > >>> Barry > >>> > >>> > >>> > >>> On May 2, 2014, at 10:27 AM, Qin Lu wrote: > >>> > >>>> Hello, > >>>> > >>>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthatmentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? > >>>> > >>>> Many thanks, > >>>> Qin > >>> > >>> > > From bsmith at mcs.anl.gov Wed May 14 13:03:04 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 13:03:04 -0500 Subject: [petsc-users] petsc 3.4, mat_view and prefix problem In-Reply-To: References: Message-ID: Yes, some of this handling of prefixes and viewing is wonky in 3.4 it is all fixed and coherent in master of the development version and will be correct in the next release. Barry On May 14, 2014, at 4:02 AM, Klaij, Christiaan wrote: > I'm having problems using mat_view in petsc 3.4.3 in combination > with a prefix. For example in ../snes/examples/tutorials/ex70: > > mpiexec -n 2 ./ex70 -nx 16 -ny 24 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type lower -user_ksp -a00_mat_view > > does not print the matrix a00 to screen. This used to work in 3.3 > versions before the single consistent -xxx_view scheme. > > Similarly, if I add this at line 105 of > ../ksp/ksp/examples/tutorials/ex1f.F: > > call MatSetOptionsPrefix(A,"a_",ierr) > > then running with -mat_view still prints the matrix to screen but > running with -a_mat_view doesn't. I expected the opposite. > > The problem only occurs with mat, not with ksp. For example, if I > add this at line 184 of ../ksp/ksp/examples/tutorials/ex1f.F: > > call KSPSetOptionsPrefix(ksp,"a_",ierr) > > then running with -a_ksp_monitor does print the residuals to > screen and -ksp_monitor doesn't, as expected. > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > > MARIN > 2, Haagsteeg, P.O. 
Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > From bsmith at mcs.anl.gov Wed May 14 13:05:12 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 13:05:12 -0500 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: <4ede9bde820986a689a2ba2fcb6291db@upm.es> References: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> <373F07AB-A56A-4A9A-BD2E-F854FEC16D50@mcs.anl.gov> <4ede9bde820986a689a2ba2fcb6291db@upm.es> Message-ID: <0CC4B85D-12C2-4C4B-BEFE-6E42437101E3@mcs.anl.gov> On May 14, 2014, at 12:27 PM, Oliver Browne wrote: > > > On 14-05-2014 17:34, Barry Smith wrote: >> See the manual page for MatMPIAIJSetPreallocationCSR() it gives an >> explicit simple example >> The format which is used for the sparse matrix input, is equivalent to a >> row-major ordering.. i.e for the following matrix, the input data >> expected is >> as shown: >> 1 0 0 >> 2 0 3 P0 >> ------- >> 4 5 6 P1 >> Process0 [P0]: rows_owned=[0,1] >> i = {0,1,3} [size = nrow+1 = 2+1] >> j = {0,0,2} [size = nz = 6] >> v = {1,2,3} [size = nz = 6] >> Process1 [P1]: rows_owned=[2] >> i = {0,3} [size = nrow+1 = 1+1] >> j = {0,1,2} [size = nz = 6] >> v = {4,5,6} [size = nz = 6] >> The column indices are global, the numerical values are just >> numerical values and do not need to be adjusted. On each process the i >> indices start with 0 because they just point into the local part of >> the j indices. >> Are you saying each process of yours HAS the entire matrix? > > I am not entirely sure about this and what it means. Each processor has a portion of the matrix. > > > If so >> you just need to adjust the local portion of the i vales and pass that >> plus the appropriate location in j and v to the routine as in the >> example above. > > So this MatMPIAIJSetPreallocationCSR call should be in some sort of loop; > > Do counter = 1, No of Processors > > calculate local numbering for i and isolate parts of j and v needed > > Call MatMPIAIJSetPreallocationCSR(A,i,j,v) > > END DO > > Is this correct? Oh boy, oh boy. No absolutely not. Each process is calling MatMPIAIJSetPreallocationCSR() once with its part of the data. Barry > > Ollie > >> Barry >> On May 14, 2014, at 8:36 AM, Oliver Browne wrote: >>> On 14-05-2014 15:27, Barry Smith wrote: >>>> On May 14, 2014, at 7:42 AM, Oliver Browne wrote: >>>>> Hi, >>>>> I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. >>>>> If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; >>>> What do you mean by ?separate? the vectors? Each processor >>>> needs to provide ITS rows to the function call. You cannot have >>>> processor zero deliver all the rows. >>> I mean split them so they change from global numbering to local numbering. >>> At the moment I just have >>> CALL MatMPIAIJSetPreallocationCSR(A,NVPN,NNVI,CONT,ierr) - 3 vectors have global numbering >>> How can submit this to a specific processor? 
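[To make the "each process passes only its part" point concrete, a hedged, self-contained sketch -- not Ollie's code -- of the 3x3 example from the manual page, written for exactly two MPI ranks. Note that each rank's i array starts at 0, its j indices are global, and its j/v arrays hold just that rank's nonzeros (three apiece here).]

  #include <petscmat.h>

  int main(int argc, char **argv)
  {
    Mat            A;
    PetscMPIInt    rank;
    PetscInt       nlocal;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc,&argv,NULL,NULL); if (ierr) return ierr;
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr);

    nlocal = (rank == 0) ? 2 : 1;                  /* rank 0 owns rows 0-1, rank 1 owns row 2 */
    ierr = MatCreate(PETSC_COMM_WORLD,&A); CHKERRQ(ierr);
    ierr = MatSetSizes(A,nlocal,PETSC_DECIDE,3,3); CHKERRQ(ierr);
    ierr = MatSetType(A,MATMPIAIJ); CHKERRQ(ierr);

    if (rank == 0) {
      PetscInt    i[] = {0,1,3};                   /* local row pointers, always starting at 0 */
      PetscInt    j[] = {0,0,2};                   /* GLOBAL column indices of this rank's rows */
      PetscScalar v[] = {1,2,3};
      ierr = MatMPIAIJSetPreallocationCSR(A,i,j,v); CHKERRQ(ierr);
    } else {
      PetscInt    i[] = {0,3};
      PetscInt    j[] = {0,1,2};
      PetscScalar v[] = {4,5,6};
      ierr = MatMPIAIJSetPreallocationCSR(A,i,j,v); CHKERRQ(ierr);
    }

    /* harmless if the preallocation call already left the matrix assembled */
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

    ierr = MatView(A,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
    ierr = MatDestroy(&A); CHKERRQ(ierr);
    ierr = PetscFinalize();
    return 0;
  }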
>>> Ollie >>>> Barry >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR >>>>> Thanks in advance, >>>>> Ollie From jed at jedbrown.org Wed May 14 19:08:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 14 May 2014 18:08:21 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: Message-ID: <87zjij6gzu.fsf@jedbrown.org> Christophe Ortiz writes: > Hi all, > > I am experiencing some problems of memory corruption with PetscMemzero(). > > I set the values of the Jacobian by blocks using MatSetValuesBlocked(). To > do so, I use some temporary two-dimensional arrays[dof][dof] that I must > reset at each loop. > > Inside FormIJacobian, for instance, I declare the following two-dimensional > array: > > PetscScalar diag[dof][dof]; > > and then, to zero the array diag[][] I do > > ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); Note that this can also be spelled PetscMemzero(diag,sizeof diag); > Then, inside main(), once dof is determined, I allocate memory for diag as > follows: > > diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > > for (k = 0; k < dof; k++){ > diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > } > That is, the classical way to allocate memory using the pointer notation. Note that you can do a contiguous allocation by creating a Vec, then use VecGetArray2D to get 2D indexing of it. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jed at jedbrown.org Wed May 14 19:58:23 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 14 May 2014 18:58:23 -0600 Subject: [petsc-users] Memory usage during matrix factorization In-Reply-To: <17A78B9D13564547AC894B88C159674720382707@XMBX4.uibk.ac.at> References: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> <87vbt98ev4.fsf@jedbrown.org> <17A78B9D13564547AC894B88C159674720382707@XMBX4.uibk.ac.at> Message-ID: <87r43v6eog.fsf@jedbrown.org> "De Groof, Vincent Frans Maria" writes: > Thanks. I made a new function based on the PetscGetCurrentUsage which > does what I want. It seems like I am being lucky as the numbers > returned by the OS seem to be reasonable. For example, Blue Gene/Q wants us to use Kernel_GetMemorySize, which resides in a header with inline assembly. So we need compiler flags that support the inline assembly just to access the function. getrusage() is useless on BG/Q and some other HPC systems. If it works on your system, you should think the vendor for doing something reasonable with the useful POSIX function. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From mairhofer at itt.uni-stuttgart.de Thu May 15 09:39:00 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Thu, 15 May 2014 16:39:00 +0200 Subject: [petsc-users] PetscMalloc with Fortran Message-ID: <5374D184.5070207@itt.uni-stuttgart.de> Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. Therefore I need an array 'colors' which in C can be creates as (from example ex5s.c) int *colors PetscMalloc(...,&colors) colors(i) = .... ISColoringCreate(...) How do I have to define the array colors in Fortran? 
I tried: Integer, allocatable :: colors(:) and allocate() instead of PetscMalloc and Integer, pointer :: colors but neither worked. Thanks, Jonas From jed at jedbrown.org Thu May 15 09:45:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 15 May 2014 08:45:21 -0600 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: <5374D184.5070207@itt.uni-stuttgart.de> References: <5374D184.5070207@itt.uni-stuttgart.de> Message-ID: <87ha4r3xtq.fsf@jedbrown.org> Jonas Mairhofer writes: > Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. > Therefore I need an array 'colors' which in C can be creates as (from > example ex5s.c) > > int *colors > PetscMalloc(...,&colors) There is no PetscMalloc in Fortran, due to language "deficiencies". > colors(i) = .... > > ISColoringCreate(...) > > How do I have to define the array colors in Fortran? > > I tried: > > Integer, allocatable :: colors(:) and allocate() instead of > PetscMalloc > > and > > Integer, pointer :: colors > > but neither worked. The ISColoringCreate Fortran binding copies from the array you pass into one allocated using PetscMalloc. You should pass a normal Fortran array (statically or dynamically allocated). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From mairhofer at itt.uni-stuttgart.de Thu May 15 11:56:44 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Thu, 15 May 2014 18:56:44 +0200 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: <87ha4r3xtq.fsf@jedbrown.org> References: <5374D184.5070207@itt.uni-stuttgart.de> <87ha4r3xtq.fsf@jedbrown.org> Message-ID: <5374F1CC.3080906@itt.uni-stuttgart.de> If 'colors' can be a dynamically allocated array then I dont know where the mistake is in this code: ISColoring iscoloring Integer, allocatable :: colors(:) PetscInt maxc ... !calculate max. number of colors maxc = 2*irc+1 !irc is the number of ghost nodes needed to calculate the function I want to solve allocate(colors(user%xm)) !where user%xm is the number of locally owned nodes of a global array !Set colors DO i=1,user%xm colors(i) = mod(i,maxc) END DO call ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) ... deallocate(colors) call ISColoringDestroy(iscoloring,ierr) On execution I get the following error message (running the DO Loop from 0 to user%xm-1 does not change anything): [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Arguments are incompatible! [0]PETSC ERROR: Number of colors passed in 291 is less then the actual number of colors in array 61665! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named aries.itt.uni-stuttgart.de by mhofer Thu May 15 18:01:41 2014 [0]PETSC ERROR: Libraries linked from /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack --download-mpich [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ISColoringCreate() line 276 in /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c But when I print out colors, it only has entries from 0 to 218, so no entry is larger then 291 as stated in the error message. Am 15.05.2014 16:45, schrieb Jed Brown: > Jonas Mairhofer writes: > >> Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. >> Therefore I need an array 'colors' which in C can be creates as (from >> example ex5s.c) >> >> int *colors >> PetscMalloc(...,&colors) > There is no PetscMalloc in Fortran, due to language "deficiencies". > >> colors(i) = .... >> >> ISColoringCreate(...) >> >> How do I have to define the array colors in Fortran? >> >> I tried: >> >> Integer, allocatable :: colors(:) and allocate() instead of >> PetscMalloc >> >> and >> >> Integer, pointer :: colors >> >> but neither worked. > The ISColoringCreate Fortran binding copies from the array you pass into > one allocated using PetscMalloc. You should pass a normal Fortran array > (statically or dynamically allocated). From prbrune at gmail.com Thu May 15 12:16:27 2014 From: prbrune at gmail.com (Peter Brune) Date: Thu, 15 May 2014 12:16:27 -0500 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: <5374F1CC.3080906@itt.uni-stuttgart.de> References: <5374D184.5070207@itt.uni-stuttgart.de> <87ha4r3xtq.fsf@jedbrown.org> <5374F1CC.3080906@itt.uni-stuttgart.de> Message-ID: You should be using an array of type ISColoringValue. ISColoringValue is by default a short, not an int, so you're getting nonsense entries. We should either maintain or remove ex5s if it does something like this. - Peter On Thu, May 15, 2014 at 11:56 AM, Jonas Mairhofer < mairhofer at itt.uni-stuttgart.de> wrote: > > If 'colors' can be a dynamically allocated array then I dont know where > the mistake is in this code: > > > > > > ISColoring iscoloring > Integer, allocatable :: colors(:) > PetscInt maxc > > ... > > > !calculate max. number of colors > maxc = 2*irc+1 !irc is the number of ghost nodes needed to > calculate the function I want to solve > > allocate(colors(user%xm)) !where user%xm is the number of locally > owned nodes of a global array > > !Set colors > DO i=1,user%xm > colors(i) = mod(i,maxc) > END DO > > call > ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) > > ... > > deallocate(colors) > call ISColoringDestroy(iscoloring,ierr) > > > > > On execution I get the following error message (running the DO Loop from > 0 to user%xm-1 does not change anything): > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Arguments are incompatible! > [0]PETSC ERROR: Number of colors passed in 291 is less then the actual > number of colors in array 61665! 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named > aries.itt.uni-stuttgart.de by mhofer Thu May 15 18:01:41 2014 > [0]PETSC ERROR: Libraries linked from > /usr/ITT/mhofer/Documents/Diss/NumericalMethods/ > Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack --download-mpich > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ISColoringCreate() line 276 in > /usr/ITT/mhofer/Documents/Diss/NumericalMethods/ > Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c > > > > > > But when I print out colors, it only has entries from 0 to 218, so no > entry is larger then 291 as stated in the error message. > > > > > > > > > > > Am 15.05.2014 16:45, schrieb Jed Brown: > > Jonas Mairhofer writes: >> >> Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. >>> Therefore I need an array 'colors' which in C can be creates as (from >>> example ex5s.c) >>> >>> int *colors >>> PetscMalloc(...,&colors) >>> >> There is no PetscMalloc in Fortran, due to language "deficiencies". >> >> colors(i) = .... >>> >>> ISColoringCreate(...) >>> >>> How do I have to define the array colors in Fortran? >>> >>> I tried: >>> >>> Integer, allocatable :: colors(:) and allocate() instead of >>> PetscMalloc >>> >>> and >>> >>> Integer, pointer :: colors >>> >>> but neither worked. >>> >> The ISColoringCreate Fortran binding copies from the array you pass into >> one allocated using PetscMalloc. You should pass a normal Fortran array >> (statically or dynamically allocated). >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrosso at uci.edu Thu May 15 17:15:23 2014 From: mrosso at uci.edu (Michele Rosso) Date: Thu, 15 May 2014 15:15:23 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid Message-ID: <53753C7B.8010201@uci.edu> Hi, I am solving an inhomogeneous Laplacian in 3D (basically a slightly modified version of example ex34). The laplacian is discretized by using a cell-center finite difference 7-point stencil with periodic BCs. I am solving a time-dependent problem so the solution of the laplacian is repeated at each time step with a different matrix (always SPD though) and rhs. Also, the laplacian features large magnitude variations in the coefficients. I solve by means of CG + GAMG as preconditioner. Everything works fine for a while until I receive a DIVERGED_INDEFINITE_PC message. Before checking my model is incorrect I would like to rule out the possibility of improper use of the linear solver. I attached the full output of a serial run with -log-summary -ksp_view -ksp_converged_reason ksp_monitor_true_residual. I would appreciate if you could help me in locating the issue. Thanks. 
Michele -------------- next part -------------- 0 KSP unpreconditioned resid norm 1.436519531784e-03 true resid norm 1.436519531784e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.442655469101e-04 true resid norm 5.442655469101e-04 ||r(i)||/||b|| 3.788779302111e-01 2 KSP unpreconditioned resid norm 9.032635039164e-05 true resid norm 9.032635039164e-05 ||r(i)||/||b|| 6.287860930055e-02 3 KSP unpreconditioned resid norm 2.083274324922e-05 true resid norm 2.083274324922e-05 ||r(i)||/||b|| 1.450223459430e-02 4 KSP unpreconditioned resid norm 3.472803766647e-06 true resid norm 3.472803766647e-06 ||r(i)||/||b|| 2.417512390058e-03 5 KSP unpreconditioned resid norm 5.985774054401e-07 true resid norm 5.985774054401e-07 ||r(i)||/||b|| 4.166858801403e-04 6 KSP unpreconditioned resid norm 1.076601847506e-07 true resid norm 1.076601847507e-07 ||r(i)||/||b|| 7.494515902407e-05 7 KSP unpreconditioned resid norm 1.899550823170e-08 true resid norm 1.899550823182e-08 ||r(i)||/||b|| 1.322328573440e-05 8 KSP unpreconditioned resid norm 3.253138971746e-09 true resid norm 3.253138971800e-09 ||r(i)||/||b|| 2.264597800324e-06 9 KSP unpreconditioned resid norm 5.542615532199e-10 true resid norm 5.542615531507e-10 ||r(i)||/||b|| 3.858364198240e-07 Linear solve converged due to CONVERGED_RTOL iterations 9 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 
------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 1 **************************************** Simulation time = 0.0391 sec Time/time step = 28.3848 sec Average time/time step = 28.3848 sec U MAX = 0.000000000000000E+00 V MAX = 0.000000000000000E+00 W MAX = 0.000000000000000E+00 U MIN = 0.000000000000000E+00 V MIN = 0.000000000000000E+00 W MIN = -2.393531935714555E-04 U MAX = 0.000000000000000E+00 V MAX = 0.000000000000000E+00 W MAX = 2.393531935714555E-04 max(|divU|) = 1.519695631721842E-03 sum(divU*dV) = -5.185694521768437E-19 sum(|divU|*dV) = 7.171332934751148E-04 Convective cfl = 1.531860438857315E-03 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.222653406554768E-01 Iterations to convergence = 9 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.387273402621e-03 true resid norm 1.387273402621e-03 ||r(i)||/||b|| 4.913739999200e-01 1 KSP unpreconditioned resid norm 5.438934302605e-04 true resid norm 5.438934302605e-04 ||r(i)||/||b|| 1.926477433016e-01 2 KSP unpreconditioned resid norm 8.987734737919e-05 true resid norm 8.987734737919e-05 ||r(i)||/||b|| 3.183467051302e-02 3 KSP unpreconditioned resid norm 2.082485764685e-05 true resid norm 2.082485764685e-05 ||r(i)||/||b|| 7.376191009187e-03 4 KSP unpreconditioned resid norm 3.456700428092e-06 true resid norm 3.456700428091e-06 ||r(i)||/||b|| 1.224367678835e-03 5 KSP unpreconditioned resid norm 5.978250288791e-07 true resid norm 5.978250288791e-07 ||r(i)||/||b|| 2.117503839817e-04 6 KSP unpreconditioned resid norm 1.072323731238e-07 true resid norm 1.072323731239e-07 ||r(i)||/||b|| 3.798184265859e-05 7 KSP unpreconditioned resid norm 1.896537313626e-08 true resid norm 1.896537313628e-08 ||r(i)||/||b|| 6.717559235506e-06 8 KSP unpreconditioned resid norm 3.238581391500e-09 true resid norm 3.238581391409e-09 
||r(i)||/||b|| 1.147109639208e-06 9 KSP unpreconditioned resid norm 5.535175902890e-10 true resid norm 5.535175902904e-10 ||r(i)||/||b|| 1.960566329991e-07 Linear solve converged due to CONVERGED_RTOL iterations 9 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear 
**************************************** TIME STEP = 2 ****************************************
 Simulation time = 0.0781 sec   Time/time step = 5.6450 sec   Average time/time step = 17.0149 sec
 U MAX = 2.461642558262655E-06   V MAX = 2.461642558262655E-06   W MAX = 0.000000000000000E+00
 U MIN = -2.461642558262655E-06   V MIN = -2.461642558262655E-06   W MIN = -4.787063871429022E-04
 U MAX = 2.461642558262655E-06   V MAX = 2.461642558262655E-06   W MAX = 4.787063871429022E-04
 max(|divU|) = 2.991458565354511E-03   sum(divU*dV) = -3.567187440313492E-19   sum(|divU|*dV) = 1.437560546432204E-03
 Convective cfl = 3.095229902460337E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.215998719111754E-01
 Iterations to convergence = 9
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.358918535350e-03 true resid norm 1.358918535350e-03 ||r(i)||/||b|| 3.249904384726e-01
  1 KSP unpreconditioned resid norm 5.438197237016e-04 true resid norm 5.438197237016e-04 ||r(i)||/||b|| 1.300565161621e-01
  2 KSP unpreconditioned resid norm 8.964805696977e-05 true resid norm 8.964805696977e-05 ||r(i)||/||b|| 2.143966734202e-02
  3 KSP unpreconditioned resid norm 2.082401686586e-05 true resid norm 2.082401686586e-05 ||r(i)||/||b|| 4.980141337354e-03
  4 KSP unpreconditioned resid norm 3.448023436879e-06 true resid norm 3.448023436880e-06 ||r(i)||/||b|| 8.246076710745e-04
  5 KSP unpreconditioned resid norm 5.975866292463e-07 true resid norm 5.975866292461e-07 ||r(i)||/||b|| 1.429150722519e-04
  6 KSP unpreconditioned resid norm 1.070083763305e-07 true resid norm 1.070083763303e-07 ||r(i)||/||b|| 2.559145249634e-05
  7 KSP unpreconditioned resid norm 1.895097623737e-08 true resid norm 1.895097623731e-08 ||r(i)||/||b|| 4.532196681870e-06
  8 KSP unpreconditioned resid norm 3.230731653657e-09 true resid norm 3.230731653621e-09 ||r(i)||/||b|| 7.726415302937e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 3 ****************************************
 Simulation time = 0.1172 sec   Time/time step = 5.2815 sec   Average time/time step = 13.1038 sec
 U MAX = 6.252436586750240E-06   V MAX = 6.252436586750240E-06   W MAX = 0.000000000000000E+00
 U MIN = -6.252436586750240E-06   V MIN = -6.252436586750240E-06   W MIN = -7.180595806250817E-04
 U MAX = 6.252436586750240E-06   V MAX = 6.252436586750240E-06   W MAX = 7.180595806250817E-04
 max(|divU|) = 4.445068216431675E-03   sum(divU*dV) = 8.811539788548160E-18   sum(|divU|*dV) = 2.160108093875001E-03
 Convective cfl = 4.675612504310926E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.209274045579312E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.332134890432e-03 true resid norm 1.332134890432e-03 ||r(i)||/||b|| 2.416626272755e-01
  1 KSP unpreconditioned resid norm 5.438184509314e-04 true resid norm 5.438184509314e-04 ||r(i)||/||b|| 9.865412020722e-02
  2 KSP unpreconditioned resid norm 8.944495318173e-05 true resid norm 8.944495318173e-05 ||r(i)||/||b|| 1.622621142774e-02
  3 KSP unpreconditioned resid norm 2.082458866812e-05 true resid norm 2.082458866812e-05 ||r(i)||/||b|| 3.777789205592e-03
  4 KSP unpreconditioned resid norm 3.440115169336e-06 true resid norm 3.440115169336e-06 ||r(i)||/||b|| 6.240713879072e-04
  5 KSP unpreconditioned resid norm 5.974692409002e-07 true resid norm 5.974692409003e-07 ||r(i)||/||b|| 1.083869114977e-04
  6 KSP unpreconditioned resid norm 1.068060495192e-07 true resid norm 1.068060495197e-07 ||r(i)||/||b|| 1.937568839403e-05
  7 KSP unpreconditioned resid norm 1.893864251792e-08 true resid norm 1.893864251800e-08 ||r(i)||/||b|| 3.435659662397e-06
  8 KSP unpreconditioned resid norm 3.223578380294e-09 true resid norm 3.223578380319e-09 ||r(i)||/||b|| 5.847894430296e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 4 ****************************************
 Simulation time = 0.1562 sec   Time/time step = 5.2840 sec   Average time/time step = 11.1488 sec
 U MAX = 1.130662420195935E-05   V MAX = 1.130662420195935E-05   W MAX = 0.000000000000000E+00
 U MIN = -1.130662420195935E-05   V MIN = -1.130662420195935E-05   W MIN = -9.574127740050222E-04
 U MAX = 1.130662420195935E-05   V MAX = 1.130662420195935E-05   W MAX = 9.574127740050222E-04
 max(|divU|) = 5.875355255447298E-03   sum(divU*dV) = -6.773730321252588E-19   sum(|divU|*dV) = 2.884755678217677E-03
 Convective cfl = 6.272166543417223E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.202483055484767E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.307223362909e-03 true resid norm 1.307223362909e-03 ||r(i)||/||b|| 1.917339325704e-01
  1 KSP unpreconditioned resid norm 5.438982564847e-04 true resid norm 5.438982564847e-04 ||r(i)||/||b|| 7.977500601115e-02
  2 KSP unpreconditioned resid norm 8.927299097308e-05 true resid norm 8.927299097308e-05 ||r(i)||/||b|| 1.309390737441e-02
  3 KSP unpreconditioned resid norm 2.082704964639e-05 true resid norm 2.082704964639e-05 ||r(i)||/||b|| 3.054758846763e-03
  4 KSP unpreconditioned resid norm 3.433123136852e-06 true resid norm 3.433123136852e-06 ||r(i)||/||b|| 5.035453149813e-04
  5 KSP unpreconditioned resid norm 5.974837111114e-07 true resid norm 5.974837111118e-07 ||r(i)||/||b|| 8.763452737203e-05
  6 KSP unpreconditioned resid norm 1.066300405697e-07 true resid norm 1.066300405697e-07 ||r(i)||/||b|| 1.563971207115e-05
  7 KSP unpreconditioned resid norm 1.892883581926e-08 true resid norm 1.892883581947e-08 ||r(i)||/||b|| 2.776342768669e-06
  8 KSP unpreconditioned resid norm 3.217263431375e-09 true resid norm 3.217263430635e-09 ||r(i)||/||b|| 4.718845969046e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
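The per-step lines "Iterations to convergence = N" and "Linear solve converged due to CONVERGED_RTOL iterations N" suggest the application queries the KSP after each solve. A hedged sketch of how that is typically done with the standard PETSc query calls follows; the function and variable names are illustrative and not taken from the code that produced this log.

/* Sketch only: report iteration count and convergence reason after a solve. */
#include <petscksp.h>

static PetscErrorCode ReportSolve(KSP ksp, Vec b, Vec x)
{
  PetscErrorCode     ierr;
  KSPConvergedReason reason;
  PetscInt           its;
  PetscReal          rnorm;

  PetscFunctionBegin;
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPGetConvergedReason(ksp,&reason);CHKERRQ(ierr);  /* e.g. KSP_CONVERGED_RTOL */
  ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);     /* the "iterations 8" seen above */
  ierr = KSPGetResidualNorm(ksp,&rnorm);CHKERRQ(ierr);      /* final residual norm used by the convergence test */
  ierr = PetscPrintf(PETSC_COMM_WORLD," Iterations to convergence = %D\n",its);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}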
**************************************** TIME STEP = 5 ****************************************
 Simulation time = 0.1953 sec   Time/time step = 5.2903 sec   Average time/time step = 9.9771 sec
 U MAX = 1.751966015931765E-05   V MAX = 1.751966015931765E-05   W MAX = 0.000000000000000E+00
 U MIN = -1.751966015931765E-05   V MIN = -1.751966015931765E-05   W MIN = -1.196765967264871E-03
 U MAX = 1.751966015931765E-05   V MAX = 1.751966015931765E-05   W MAX = 1.196765967264871E-03
 max(|divU|) = 7.282771149617234E-03   sum(divU*dV) = 1.540113958245672E-18   sum(|divU|*dV) = 3.611478531740617E-03
 Convective cfl = 7.883553840534440E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.195631525122878E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.284025180478e-03 true resid norm 1.284025180478e-03 ||r(i)||/||b|| 1.585293618291e-01
  1 KSP unpreconditioned resid norm 5.440529475024e-04 true resid norm 5.440529475024e-04 ||r(i)||/||b|| 6.717030777910e-02
  2 KSP unpreconditioned resid norm 8.912993213531e-05 true resid norm 8.912993213531e-05 ||r(i)||/||b|| 1.100423221921e-02
  3 KSP unpreconditioned resid norm 2.083128820889e-05 true resid norm 2.083128820889e-05 ||r(i)||/||b|| 2.571889458278e-03
  4 KSP unpreconditioned resid norm 3.426985329255e-06 true resid norm 3.426985329255e-06 ||r(i)||/||b|| 4.231052517541e-04
  5 KSP unpreconditioned resid norm 5.976212412688e-07 true resid norm 5.976212412689e-07 ||r(i)||/||b|| 7.378400006037e-05
  6 KSP unpreconditioned resid norm 1.064784783066e-07 true resid norm 1.064784783067e-07 ||r(i)||/||b|| 1.314613254563e-05
  7 KSP unpreconditioned resid norm 1.892141014094e-08 true resid norm 1.892141014064e-08 ||r(i)||/||b|| 2.336090537872e-06
  8 KSP unpreconditioned resid norm 3.211733810102e-09 true resid norm 3.211733810296e-09 ||r(i)||/||b|| 3.965296935390e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 6 ****************************************
 Simulation time = 0.2344 sec   Time/time step = 5.2861 sec   Average time/time step = 9.1953 sec
 U MAX = 2.479691939489266E-05   V MAX = 2.479691939489266E-05   W MAX = 0.000000000000000E+00
 U MIN = -2.479691939489266E-05   V MIN = -2.479691939489266E-05   W MIN = -1.436119160392139E-03
 U MAX = 2.479691939489266E-05   V MAX = 2.479691939489266E-05   W MAX = 1.436119160392139E-03
 max(|divU|) = 8.667778929815951E-03   sum(divU*dV) = 1.929412357010522E-17   sum(|divU|*dV) = 4.340254834708106E-03
 Convective cfl = 9.508563194764315E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.188724683969847E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.262404152977e-03 true resid norm 1.262404152977e-03 ||r(i)||/||b|| 1.348871940068e-01
  1 KSP unpreconditioned resid norm 5.442761720331e-04 true resid norm 5.442761720331e-04 ||r(i)||/||b|| 5.815561160596e-02
  2 KSP unpreconditioned resid norm 8.901362921075e-05 true resid norm 8.901362921075e-05 ||r(i)||/||b|| 9.511057646855e-03
  3 KSP unpreconditioned resid norm 2.083716594647e-05 true resid norm 2.083716594647e-05 ||r(i)||/||b|| 2.226439796593e-03
  4 KSP unpreconditioned resid norm 3.421634895428e-06 true resid norm 3.421634895429e-06 ||r(i)||/||b|| 3.655998191003e-04
  5 KSP unpreconditioned resid norm 5.978718503849e-07 true resid norm 5.978718503845e-07 ||r(i)||/||b|| 6.388228055477e-05
  6 KSP unpreconditioned resid norm 1.063494715136e-07 true resid norm 1.063494715136e-07 ||r(i)||/||b|| 1.136338292514e-05
  7 KSP unpreconditioned resid norm 1.891618348142e-08 true resid norm 1.891618348117e-08 ||r(i)||/||b|| 2.021183869741e-06
  8 KSP unpreconditioned resid norm 3.206931177866e-09 true resid norm 3.206931177846e-09 ||r(i)||/||b|| 3.426588441841e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 7 ****************************************
 Simulation time = 0.2734 sec   Time/time step = 5.2908 sec   Average time/time step = 8.6375 sec
 U MAX = 3.305085009380369E-05   V MAX = 3.305085009380369E-05   W MAX = 0.000000000000000E+00
 U MIN = -3.305085009380369E-05   V MIN = -3.305085009380369E-05   W MIN = -1.675472353378616E-03
 U MAX = 3.305085009380369E-05   V MAX = 3.305085009380369E-05   W MAX = 1.675472353378616E-03
 max(|divU|) = 1.003084665173450E-02   sum(divU*dV) = 1.650583198534465E-17   sum(|divU|*dV) = 5.071065394490515E-03
 Convective cfl = 1.114607394282383E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.181767370247083E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.242229955292e-03 true resid norm 1.242229955292e-03 ||r(i)||/||b|| 1.172208952731e-01
  1 KSP unpreconditioned resid norm 5.445606391168e-04 true resid norm 5.445606391168e-04 ||r(i)||/||b|| 5.138652902053e-02
  2 KSP unpreconditioned resid norm 8.892190380460e-05 true resid norm 8.892190380460e-05 ||r(i)||/||b|| 8.390962662720e-03
  3 KSP unpreconditioned resid norm 2.084447741222e-05 true resid norm 2.084447741222e-05 ||r(i)||/||b|| 1.966953295042e-03
  4 KSP unpreconditioned resid norm 3.416997089687e-06 true resid norm 3.416997089688e-06 ||r(i)||/||b|| 3.224390591231e-04
  5 KSP unpreconditioned resid norm 5.982254535369e-07 true resid norm 5.982254535377e-07 ||r(i)||/||b|| 5.645051702395e-05
  6 KSP unpreconditioned resid norm 1.062409410453e-07 true resid norm 1.062409410446e-07 ||r(i)||/||b|| 1.002524385348e-05
  7 KSP unpreconditioned resid norm 1.891292046742e-08 true resid norm 1.891292046669e-08 ||r(i)||/||b|| 1.784685242769e-06
  8 KSP unpreconditioned resid norm 3.202788322478e-09 true resid norm 3.202788322675e-09 ||r(i)||/||b|| 3.022256168875e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 8 ****************************************
 Simulation time = 0.3125 sec   Time/time step = 5.2868 sec   Average time/time step = 8.2187 sec
 U MAX = 4.220050557427550E-05   V MAX = 4.220050557427550E-05   W MAX = 0.000000000000000E+00
 U MIN = -4.220050557427550E-05   V MIN = -4.220050557427550E-05   W MIN = -1.914825546219209E-03
 U MAX = 4.220050557427550E-05   V MAX = 4.220050557427550E-05   W MAX = 1.914825546219209E-03
 max(|divU|) = 1.137244546422055E-02   sum(divU*dV) = 2.000003067929342E-18   sum(|divU|*dV) = 5.803893153988580E-03
 Convective cfl = 1.279504996715366E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.174764056878764E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.223379873944e-03 true resid norm 1.223379873944e-03 ||r(i)||/||b|| 1.035358121775e-01
  1 KSP unpreconditioned resid norm 5.448986744158e-04 true resid norm 5.448986744158e-04 ||r(i)||/||b|| 4.611529747356e-02
  2 KSP unpreconditioned resid norm 8.885276850463e-05 true resid norm 8.885276850464e-05 ||r(i)||/||b|| 7.519695024645e-03
  3 KSP unpreconditioned resid norm 2.085297091140e-05 true resid norm 2.085297091140e-05 ||r(i)||/||b|| 1.764806930055e-03
  4 KSP unpreconditioned resid norm 3.412995973201e-06 true resid norm 3.412995973200e-06 ||r(i)||/||b|| 2.888451229009e-04
  5 KSP unpreconditioned resid norm 5.986734251487e-07 true resid norm 5.986734251486e-07 ||r(i)||/||b|| 5.066630620792e-05
  6 KSP unpreconditioned resid norm 1.061508214252e-07 true resid norm 1.061508214263e-07 ||r(i)||/||b|| 8.983645835408e-06
  7 KSP unpreconditioned resid norm 1.891136234479e-08 true resid norm 1.891136234533e-08 ||r(i)||/||b|| 1.600486734750e-06
  8 KSP unpreconditioned resid norm 3.199236033375e-09 true resid norm 3.199236033557e-09 ||r(i)||/||b|| 2.707544141740e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 9 ****************************************
 Simulation time = 0.3516 sec   Time/time step = 5.2901 sec   Average time/time step = 7.8933 sec
 U MAX = 5.217110505333921E-05   V MAX = 5.217110505333921E-05   W MAX = 0.000000000000000E+00
 U MIN = -5.217110505333921E-05   V MIN = -5.217110505333921E-05   W MIN = -2.154178738910741E-03
 U MAX = 5.217110505333921E-05   V MAX = 5.217110505333921E-05   W MAX = 2.154178738910741E-03
 max(|divU|) = 1.269304839536958E-02   sum(divU*dV) = -2.694000336582297E-17   sum(|divU|*dV) = 6.538722584835714E-03
 Convective cfl = 1.445453407371149E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.167718875855106E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.205741071227e-03 true resid norm 1.205741071227e-03 ||r(i)||/||b|| 9.263443341308e-02
  1 KSP unpreconditioned resid norm 5.452830457760e-04 true resid norm 5.452830457760e-04 ||r(i)||/||b|| 4.189289657671e-02
  2 KSP unpreconditioned resid norm 8.880461896504e-05 true resid norm 8.880461896504e-05 ||r(i)||/||b|| 6.822663471120e-03
  3 KSP unpreconditioned resid norm 2.086238471379e-05 true resid norm 2.086238471379e-05 ||r(i)||/||b|| 1.602811112373e-03
  4 KSP unpreconditioned resid norm 3.409561410338e-06 true resid norm 3.409561410337e-06 ||r(i)||/||b|| 2.619491008233e-04
  5 KSP unpreconditioned resid norm 5.992097002504e-07 true resid norm 5.992097002501e-07 ||r(i)||/||b|| 4.603596278080e-05
  6 KSP unpreconditioned resid norm 1.060772787794e-07 true resid norm 1.060772787798e-07 ||r(i)||/||b|| 8.149683918264e-06
  7 KSP unpreconditioned resid norm 1.891126501641e-08 true resid norm 1.891126501618e-08 ||r(i)||/||b|| 1.452910879212e-06
  8 KSP unpreconditioned resid norm 3.196215804033e-09 true resid norm 3.196215804331e-09 ||r(i)||/||b|| 2.455582273554e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 10 ****************************************
 Simulation time = 0.3906 sec   Time/time step = 5.2867 sec   Average time/time step = 7.6326 sec
 U MAX = 6.289364203444345E-05   V MAX = 6.289364203444345E-05   W MAX = 0.000000000000000E+00
 U MIN = -6.289364203444345E-05   V MIN = -6.289364203444345E-05   W MIN = -2.393531931450745E-03
 U MAX = 6.289364203444345E-05   V MAX = 6.289364203444345E-05   W MAX = 2.393531931450745E-03
 max(|divU|) = 1.399312989607586E-02   sum(divU*dV) = 1.953489400469769E-17   sum(|divU|*dV) = 7.275539118175428E-03
 Convective cfl = 1.612364297932565E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.160635639983364E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.189212180809e-03 true resid norm 1.189212180809e-03 ||r(i)||/||b|| 8.375462057382e-02
  1 KSP unpreconditioned resid norm 5.457076531056e-04 true resid norm 5.457076531056e-04 ||r(i)||/||b|| 3.843345886265e-02
  2 KSP unpreconditioned resid norm 8.877633152293e-05 true resid norm 8.877633152293e-05 ||r(i)||/||b|| 6.252398085579e-03
  3 KSP unpreconditioned resid norm 2.087248091046e-05 true resid norm 2.087248091046e-05 ||r(i)||/||b|| 1.470020865327e-03
  4 KSP unpreconditioned resid norm 3.406633345812e-06 true resid norm 3.406633345813e-06 ||r(i)||/||b|| 2.399246222979e-04
  5 KSP unpreconditioned resid norm 5.998310926621e-07 true resid norm 5.998310926614e-07 ||r(i)||/||b|| 4.224530019533e-05
  6 KSP unpreconditioned resid norm 1.060188550981e-07 true resid norm 1.060188550980e-07 ||r(i)||/||b|| 7.466765919236e-06
  7 KSP unpreconditioned resid norm 1.891242496403e-08 true resid norm 1.891242496489e-08 ||r(i)||/||b|| 1.331976751186e-06
  8 KSP unpreconditioned resid norm 3.193674607405e-09 true resid norm 3.193674607819e-09 ||r(i)||/||b|| 2.249262237056e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
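The finest-level operator in the solver view above reports "has attached null space", which is consistent with a singular pressure Poisson operator whose null space is the constant vector. A hedged sketch of attaching such a constant null space follows; it is not taken from the code that produced this log, the function name is illustrative, and the exact call to hand the null space to the solver depends on the PETSc version.

/* Sketch only: attach the constant null space of a singular (all-Neumann)
   Poisson operator so that the Krylov solve is kept orthogonal to it. */
#include <petscksp.h>

static PetscErrorCode AttachConstantNullSpace(Mat A)
{
  PetscErrorCode ierr;
  MatNullSpace   nullsp;

  PetscFunctionBegin;
  ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,NULL,&nullsp);CHKERRQ(ierr);
  ierr = MatSetNullSpace(A,nullsp);CHKERRQ(ierr);
  /* Newer PETSc releases pick the Mat-attached null space up in KSP automatically;
     releases from around the time of this log would instead call
     KSPSetNullSpace(ksp,nullsp) on the KSP that owns this operator. */
  ierr = MatNullSpaceDestroy(&nullsp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}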
Average time/time step = 7.4199 sec U MAX = 7.430448596718565E-05 V MAX = 7.430448596718565E-05 W MAX = 2.709060511180018E-21 U MIN = -7.430448596718565E-05 V MIN = -7.430448596718565E-05 W MIN = -2.632885123836222E-03 U MAX = 7.430448596718565E-05 V MAX = 7.430448596718565E-05 W MAX = 2.632885123836222E-03 max(|divU|) = 1.527316561904287E-02 sum(divU*dV) = -1.972843168610879E-17 sum(|divU|*dV) = 8.014328726644185E-03 Convective cfl = 1.780156221293180E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.153517864980171E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.173703535188e-03 true resid norm 1.173703535188e-03 ||r(i)||/||b|| 7.638845186070e-02 1 KSP unpreconditioned resid norm 5.461678657327e-04 true resid norm 5.461678657327e-04 ||r(i)||/||b|| 3.554638498442e-02 2 KSP unpreconditioned resid norm 8.876725102317e-05 true resid norm 8.876725102317e-05 ||r(i)||/||b|| 5.777262041306e-03 3 KSP unpreconditioned resid norm 2.088306633812e-05 true resid norm 2.088306633812e-05 ||r(i)||/||b|| 1.359138027490e-03 4 KSP unpreconditioned resid norm 3.404162327555e-06 true resid norm 3.404162327556e-06 ||r(i)||/||b|| 2.215539804460e-04 5 KSP unpreconditioned resid norm 6.005368354461e-07 true resid norm 6.005368354463e-07 ||r(i)||/||b|| 3.908489475386e-05 6 KSP unpreconditioned resid norm 1.059745081597e-07 true resid norm 1.059745081589e-07 ||r(i)||/||b|| 6.897166424276e-06 7 KSP unpreconditioned resid norm 1.891469142575e-08 true resid norm 1.891469142698e-08 ||r(i)||/||b|| 1.231029772180e-06 8 KSP unpreconditioned resid norm 3.191570541883e-09 true resid norm 3.191570541947e-09 ||r(i)||/||b|| 2.077178140768e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: 
seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 12 **************************************** Simulation time = 0.4688 sec Time/time step = 5.2795 sec Average time/time step = 7.2416 sec U MAX = 8.634496512897778E-05 V MAX = 8.634496512897778E-05 W MAX = 8.489812655424283E-19 U MIN = -8.634496512897778E-05 V MIN = -8.634496512897778E-05 W MIN = -2.872238316062392E-03 U MAX = 8.634496512897778E-05 V MAX = 8.634496512897778E-05 W MAX = 2.872238316062392E-03 max(|divU|) = 1.653363196596731E-02 sum(divU*dV) = -2.462346780191219E-17 sum(|divU|*dV) = 8.755077706950073E-03 Convective cfl = 1.948754077645023E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.146368792554558E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.159136096728e-03 true resid norm 1.159136096728e-03 ||r(i)||/||b|| 7.018451665043e-02 1 KSP unpreconditioned resid norm 5.466604833622e-04 true resid norm 5.466604833622e-04 ||r(i)||/||b|| 3.309973859407e-02 2 KSP unpreconditioned resid norm 8.877709917005e-05 true resid norm 8.877709917005e-05 ||r(i)||/||b|| 5.375363438739e-03 3 KSP unpreconditioned resid norm 2.089399796449e-05 true resid norm 2.089399796449e-05 ||r(i)||/||b|| 1.265110414706e-03 4 KSP unpreconditioned resid norm 3.402106992486e-06 true resid norm 3.402106992486e-06 ||r(i)||/||b|| 
2.059941326429e-04
  5 KSP unpreconditioned resid norm 6.013276058776e-07 true resid norm 6.013276058795e-07 ||r(i)||/||b|| 3.640977749406e-05
  6 KSP unpreconditioned resid norm 1.059435670909e-07 true resid norm 1.059435670911e-07 ||r(i)||/||b|| 6.414775684667e-06
  7 KSP unpreconditioned resid norm 1.891796347385e-08 true resid norm 1.891796347347e-08 ||r(i)||/||b|| 1.145463527660e-06
  8 KSP unpreconditioned resid norm 3.189873620070e-09 true resid norm 3.189873619632e-09 ||r(i)||/||b|| 1.931436168727e-07
Linear solve converged due to CONVERGED_RTOL iterations 8
KSP Object: 1 MPI processes
  type: cg
  maximum iterations=10000
  tolerances: relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using nonzero initial guess
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: gamg
    MG: type is MULTIPLICATIVE, levels=3 cycles=v
      Cycles per PCApply=1
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object: (mg_coarse_) 1 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_coarse_) 1 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 1
        Local solve info for each block is in the following KSP and PC objects:
        [0] number of local blocks = 1, first local block number = 0
        [0] local block number 0
        KSP Object: (mg_coarse_sub_) 1 MPI processes
          type: preonly
          maximum iterations=1, initial guess is zero
          tolerances: relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using NONE norm type for convergence test
        PC Object: (mg_coarse_sub_) 1 MPI processes
          type: lu
            LU: out-of-place factorization
            tolerance for zero pivot 2.22045e-14
            using diagonal shift on blocks to prevent zero pivot
            matrix ordering: nd
            factor fill ratio given 5, needed 2.26888
              Factored matrix follows:
                Matrix Object: 1 MPI processes
                  type: seqaij
                  rows=155, cols=155
                  package used to perform factorization: petsc
                  total: nonzeros=16851, allocated nonzeros=16851
                  total number of mallocs used during MatSetValues calls =0
                    using I-node routines: found 110 nodes, limit used is 5
          linear system matrix = precond matrix:
          Matrix Object: 1 MPI processes
            type: seqaij
            rows=155, cols=155
            total: nonzeros=7427, allocated nonzeros=7427
            total number of mallocs used during MatSetValues calls =0
              not using I-node routines
        - - - - - - - - - - - - - - - - - -
      linear system matrix = precond matrix:
      Matrix Object: 1 MPI processes
        type: seqaij
        rows=155, cols=155
        total: nonzeros=7427, allocated nonzeros=7427
        total number of mallocs used during MatSetValues calls =0
          not using I-node routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object: (mg_levels_1_) 1 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033
      maximum iterations=2
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object: (mg_levels_1_) 1 MPI processes
      type: jacobi
      linear system matrix = precond matrix:
      Matrix Object: 1 MPI processes
        type: seqaij
        rows=11665, cols=11665
        total: nonzeros=393037, allocated nonzeros=393037
        total number of mallocs used during MatSetValues calls =0
          not using I-node routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object: (mg_levels_2_) 1 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449
      maximum iterations=2
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      has attached null space
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object: (mg_levels_2_) 1 MPI processes
      type: jacobi
      linear system matrix = precond matrix:
      Matrix Object: 1 MPI processes
        type: seqaij
        rows=131072, cols=131072
        total: nonzeros=917504, allocated nonzeros=917504
        total number of mallocs used during MatSetValues calls =0
          not using I-node routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object: 1 MPI processes
    type: seqaij
    rows=131072, cols=131072
    total: nonzeros=917504, allocated nonzeros=917504
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines

**************************************** TIME STEP = 13 ****************************************
Simulation time        = 0.5078 sec
Time/time step         = 5.2942 sec
Average time/time step = 7.0918 sec
U MAX =  9.896094562848124E-05   V MAX =  9.896094562848124E-05   W MAX =  9.909002805648082E-18
U MIN = -9.896094562848124E-05   V MIN = -9.896094562848124E-05   W MIN = -3.111591508121445E-03
U MAX =  9.896094562848124E-05   V MAX =  9.896094562848124E-05   W MAX =  3.111591508121445E-03
max(|divU|)    = 1.777500515817062E-02
sum(divU*dV)   = 1.234132652517441E-18
sum(|divU|*dV) = 9.497772642299201E-03
Convective cfl = 2.118088575602181E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.139191413667278E-01
Iterations to convergence = 8
Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.145439560085e-03 true resid norm 1.145439560085e-03 ||r(i)||/||b|| 6.489199472629e-02
  1 KSP unpreconditioned resid norm 5.471834130806e-04 true resid norm 5.471834130806e-04 ||r(i)||/||b|| 3.099929877864e-02
  2 KSP unpreconditioned resid norm 8.880585182682e-05 true resid norm 8.880585182682e-05 ||r(i)||/||b|| 5.031071973787e-03
  3 KSP unpreconditioned resid norm 2.090517530672e-05 true resid norm 2.090517530672e-05 ||r(i)||/||b|| 1.184330079935e-03
  4 KSP unpreconditioned resid norm 3.400429960731e-06 true resid norm 3.400429960732e-06 ||r(i)||/||b|| 1.926427991213e-04
  5 KSP unpreconditioned resid norm 6.022043913180e-07 true resid norm 6.022043913185e-07 ||r(i)||/||b|| 3.411637378990e-05
  6 KSP unpreconditioned resid norm 1.059256456909e-07 true resid norm 1.059256456905e-07 ||r(i)||/||b|| 6.000950797457e-06
  7 KSP unpreconditioned resid norm 1.892217526757e-08 true resid norm 1.892217526718e-08 ||r(i)||/||b|| 1.071988204735e-06
  8 KSP unpreconditioned resid norm 3.188547743643e-09 true resid norm 3.188547741446e-09 ||r(i)||/||b|| 1.806391453837e-07
Linear solve converged due to CONVERGED_RTOL iterations 8
[-ksp_view output for this and every following time step is identical to the KSP/PC description printed above]
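The solver description above corresponds to CG with an algebraic multigrid (GAMG) preconditioner. A sketch of the runtime options that typically produce this kind of monitoring and -ksp_view output is given below; these are not taken from the poster's command line, and the smoother options simply restate what the view above already reports:

    -ksp_type cg -ksp_initial_guess_nonzero \
    -ksp_monitor_true_residual -ksp_converged_reason -ksp_view \
    -pc_type gamg -mg_levels_ksp_type chebyshev -mg_levels_pc_type jacobi

GAMG determines the number of levels from its coarsening, so the three levels reported above follow from aggregating the 131072-row fine-grid operator down to 11665 and then 155 rows, rather than from an explicit option.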
[For time steps 14 through 24 the per-step output follows the same pattern as time step 13, and the -ksp_view block is repeated unchanged each time; the step-dependent quantities are collected below.]

TIME STEPS 14-24 (summary of the per-step output)

 Step  Sim. time  Time/step  max|U|=max|V|   max|W|          max(|divU|)     Convective cfl  KSP its  final ||r(i)||/||b||
  14   0.5469 s   5.2864 s   1.1210243E-04   3.3509447E-03   1.8997760E-02   2.2880957E-02      8     1.697909212779e-07
  15   0.5859 s   5.2800 s   1.2572320E-04   3.5902979E-03   2.0202368E-02   2.4587163E-02      7     9.522645508544e-07
  16   0.6250 s   4.9413 s   1.3978051E-04   3.8296511E-03   2.1389296E-02   2.6298957E-02      7     9.028239215219e-07
  17   0.6641 s   4.9385 s   1.5423479E-04   4.0690043E-03   2.2559007E-02   2.8015833E-02      7     8.587934174348e-07
  18   0.7031 s   4.9321 s   1.6904946E-04   4.3083575E-03   2.3711956E-02   2.9737321E-02      7     8.193256797143e-07
  19   0.7422 s   4.9272 s   1.8419068E-04   4.5477107E-03   2.4848593E-02   3.1462989E-02      7     7.837443197499e-07
  20   0.7812 s   4.9370 s   1.9962716E-04   4.7870638E-03   2.5969364E-02   3.3192436E-02      7     7.515052582182e-07
  21   0.8203 s   4.9311 s   2.1532994E-04   5.0264170E-03   2.7074704E-02   3.4925292E-02      7     7.221667954637e-07
  22   0.8594 s   4.9284 s   2.3127224E-04   5.2657704E-03   2.8165046E-02   3.6661215E-02      7     6.953676963341e-07
  23   0.8984 s   4.9308 s   2.4742925E-04   5.5051240E-03   2.9240814E-02   3.8399888E-02      7     6.708118520521e-07
  24   0.9375 s   4.9346 s   2.6377796E-04   5.7444776E-03   3.0302423E-02   4.0141015E-02      7     6.482533145433e-07

Quantities that are constant (or at round-off level) over these steps: DT = 3.906250000000000E-02, Viscous cfl = 7.281777777777779E-01, Gravity cfl = 6.205004906490450E-01; U MIN = -U MAX and V MIN = -V MAX at every step, |sum(divU*dV)| stays below 3.4E-17, and the W MAX value of the first velocity print stays below 5E-13. The average time/time step drops from 6.9628 s (step 14) to 6.1317 s (step 24), sum(|divU|*dV) grows from 1.024240047107154E-02 to 1.779233150950448E-02, and MAX DT ALLOWED decreases from 9.131988490632803E-01 to 9.059031763149156E-01. Each step also prints "Convergence in 1 iterations" and an "Iterations to convergence" count of 8 (steps 14-15) or 7 (steps 16-24). Every linear solve reports "Linear solve converged due to CONVERGED_RTOL", and each step's -ksp_view output again matches the KSP/PC description above (the log for step 24 is cut off partway through that block).
total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 25 **************************************** Simulation time = 0.9766 sec Time/time step = 4.9259 sec Average time/time step = 6.0835 sec U MAX = 2.802970539392997E-04 V MAX = 2.802970539392997E-04 W MAX = 7.118669076767622E-13 U MIN = -2.802970539392997E-04 V MIN = -2.802970539392997E-04 W MIN = -5.983831310938474E-03 U MAX = 2.802970539392997E-04 V MAX = 2.802970539392997E-04 W MAX = 5.983831310938474E-03 max(|divU|) = 3.135028298108410E-02 sum(divU*dV) = 2.828981006113311E-17 sum(|divU|*dV) = 1.855823305025077E-02 Convective cfl = 4.188432268042927E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.051680600073254E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.030060522589e-03 true resid norm 1.030060522589e-03 ||r(i)||/||b|| 3.390966630881e-02 1 KSP unpreconditioned resid norm 5.553212043042e-04 true resid norm 5.553212043042e-04 ||r(i)||/||b|| 1.828121388910e-02 2 KSP unpreconditioned resid norm 9.082756384795e-05 true resid norm 9.082756384795e-05 ||r(i)||/||b|| 2.990049918606e-03 3 KSP unpreconditioned resid norm 2.105184319773e-05 
true resid norm 2.105184319773e-05 ||r(i)||/||b|| 6.930281885051e-04 4 KSP unpreconditioned resid norm 3.388544492928e-06 true resid norm 3.388544492927e-06 ||r(i)||/||b|| 1.115511278298e-04 5 KSP unpreconditioned resid norm 6.174461021315e-07 true resid norm 6.174461021323e-07 ||r(i)||/||b|| 2.032636998296e-05 6 KSP unpreconditioned resid norm 1.068247480447e-07 true resid norm 1.068247480449e-07 ||r(i)||/||b|| 3.516678370141e-06 7 KSP unpreconditioned resid norm 1.906085595275e-08 true resid norm 1.906085595211e-08 ||r(i)||/||b|| 6.274847455291e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 
------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 26 **************************************** Simulation time = 1.0156 sec Time/time step = 4.9355 sec Average time/time step = 6.0393 sec U MAX = 2.969667062277622E-04 V MAX = 2.969667062277622E-04 W MAX = 1.041779941880119E-12 U MIN = -2.969667062277622E-04 V MIN = -2.969667062277622E-04 W MIN = -6.226019377274523E-03 U MAX = 2.969667062277622E-04 V MAX = 2.969667062277622E-04 W MAX = 6.226019377274523E-03 max(|divU|) = 3.238479376966569E-02 sum(divU*dV) = -6.519427921141458E-17 sum(|divU|*dV) = 1.932639723845376E-02 Convective cfl = 4.364769785427231E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.044248168636828E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.023395463874e-03 true resid norm 1.023395463874e-03 ||r(i)||/||b|| 3.261613207602e-02 1 KSP unpreconditioned resid norm 5.561278383905e-04 true resid norm 5.561278383905e-04 ||r(i)||/||b|| 1.772407604723e-02 2 KSP unpreconditioned resid norm 9.118747527464e-05 true resid norm 9.118747527464e-05 ||r(i)||/||b|| 2.906191049526e-03 3 KSP unpreconditioned resid norm 2.107273617243e-05 true resid norm 2.107273617243e-05 ||r(i)||/||b|| 6.715987811802e-04 4 KSP unpreconditioned resid norm 3.385312807046e-06 true resid norm 3.385312807042e-06 ||r(i)||/||b|| 1.078916347891e-04 5 KSP unpreconditioned resid norm 6.184362436807e-07 true resid norm 6.184362436783e-07 ||r(i)||/||b|| 1.970987650077e-05 6 KSP unpreconditioned resid norm 1.069966168774e-07 true resid norm 1.069966168782e-07 ||r(i)||/||b|| 3.410036404283e-06 7 KSP unpreconditioned resid norm 1.908732989312e-08 true resid norm 1.908732989601e-08 ||r(i)||/||b|| 6.083228769749e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local 
solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 27 **************************************** Simulation time = 1.0547 sec Time/time step = 4.9416 sec Average time/time step = 5.9987 sec U MAX = 3.137685038647847E-04 V MAX = 3.137685038647847E-04 W MAX = 1.490637308130234E-12 U MIN = -3.137685038647847E-04 V MIN = -3.137685038647847E-04 W MIN = -6.468958239699217E-03 U MAX = 3.137685038647847E-04 V MAX = 3.137685038647847E-04 
W MAX = 6.468958239699217E-03 max(|divU|) = 3.340634722407749E-02 sum(divU*dV) = 1.986223127301173E-17 sum(|divU|*dV) = 2.009669468666401E-02 Convective cfl = 4.541756958354424E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.036791757405083E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.017065243907e-03 true resid norm 1.017065243907e-03 ||r(i)||/||b|| 3.142006095747e-02 1 KSP unpreconditioned resid norm 5.569540235575e-04 true resid norm 5.569540235575e-04 ||r(i)||/||b|| 1.720590638164e-02 2 KSP unpreconditioned resid norm 9.159428781950e-05 true resid norm 9.159428781949e-05 ||r(i)||/||b|| 2.829610119789e-03 3 KSP unpreconditioned resid norm 2.109988559131e-05 true resid norm 2.109988559131e-05 ||r(i)||/||b|| 6.518359519669e-04 4 KSP unpreconditioned resid norm 3.381216180592e-06 true resid norm 3.381216180594e-06 ||r(i)||/||b|| 1.044554605922e-04 5 KSP unpreconditioned resid norm 6.190897167649e-07 true resid norm 6.190897167645e-07 ||r(i)||/||b|| 1.912545606628e-05 6 KSP unpreconditioned resid norm 1.071592799258e-07 true resid norm 1.071592799256e-07 ||r(i)||/||b|| 3.310457345377e-06 7 KSP unpreconditioned resid norm 1.911726630815e-08 true resid norm 1.911726630583e-08 ||r(i)||/||b|| 5.905871587568e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not 
using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 28 **************************************** Simulation time = 1.0938 sec Time/time step = 4.9359 sec Average time/time step = 5.9607 sec U MAX = 3.306853077415233E-04 V MAX = 3.306853077415233E-04 W MAX = 2.091014663450294E-12 U MIN = -3.306853077415233E-04 V MIN = -3.306853077415233E-04 W MIN = -6.712267098442212E-03 U MAX = 3.306853077415233E-04 V MAX = 3.306853077415233E-04 W MAX = 6.712267098442212E-03 max(|divU|) = 3.441532656931547E-02 sum(divU*dV) = -9.662507560945147E-17 sum(|divU|*dV) = 2.086896801303882E-02 Convective cfl = 4.719128136912165E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.029322602551064E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.011051571776e-03 true resid norm 1.011051571776e-03 ||r(i)||/||b|| 3.031112740975e-02 1 KSP unpreconditioned resid norm 5.577999890334e-04 true resid norm 5.577999890334e-04 ||r(i)||/||b|| 1.672273404120e-02 2 KSP unpreconditioned resid norm 9.205514482216e-05 true resid norm 9.205514482217e-05 ||r(i)||/||b|| 2.759795149248e-03 3 KSP unpreconditioned resid norm 2.113634676424e-05 true resid norm 2.113634676424e-05 ||r(i)||/||b|| 6.336635218537e-04 4 KSP unpreconditioned resid norm 3.376357337148e-06 true resid norm 3.376357337147e-06 ||r(i)||/||b|| 1.012225293783e-04 5 KSP unpreconditioned resid norm 6.192812171961e-07 true resid norm 6.192812171938e-07 ||r(i)||/||b|| 1.856592917792e-05 6 KSP unpreconditioned resid norm 1.072861280251e-07 true resid norm 1.072861280260e-07 ||r(i)||/||b|| 3.216417032202e-06 7 KSP unpreconditioned resid norm 1.914838022261e-08 true resid norm 1.914838022181e-08 ||r(i)||/||b|| 5.740646756266e-07 Linear solve converged due to 
CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during 
MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 29 **************************************** Simulation time = 1.1328 sec Time/time step = 4.9249 sec Average time/time step = 5.9250 sec U MAX = 3.477011451042520E-04 V MAX = 3.477011451042520E-04 W MAX = 2.881901841773920E-12 U MIN = -3.477011451042520E-04 V MIN = -3.477011451042520E-04 W MIN = -6.955954894808746E-03 U MAX = 3.477011451042520E-04 V MAX = 3.477011451042520E-04 W MAX = 6.955954894808746E-03 max(|divU|) = 3.541210624580458E-02 sum(divU*dV) = 3.454128104623594E-17 sum(|divU|*dV) = 2.164316982199105E-02 Convective cfl = 4.896868598411040E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.021841360604639E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.005337398416e-03 true resid norm 1.005337398416e-03 ||r(i)||/||b|| 2.928039669666e-02 1 KSP unpreconditioned resid norm 5.586660114961e-04 true resid norm 5.586660114961e-04 ||r(i)||/||b|| 1.627111700342e-02 2 KSP unpreconditioned resid norm 9.257879095466e-05 true resid norm 9.257879095465e-05 ||r(i)||/||b|| 2.696352218787e-03 3 KSP unpreconditioned resid norm 2.118646758439e-05 true resid norm 2.118646758439e-05 ||r(i)||/||b|| 6.170547086473e-04 4 KSP unpreconditioned resid norm 3.371181273599e-06 true resid norm 3.371181273596e-06 ||r(i)||/||b|| 9.818547005487e-05 5 KSP unpreconditioned resid norm 6.189056124530e-07 true resid norm 6.189056124541e-07 ||r(i)||/||b|| 1.802559208380e-05 6 KSP unpreconditioned resid norm 1.073375933067e-07 true resid norm 1.073375933085e-07 ||r(i)||/||b|| 3.126201529446e-06 7 KSP unpreconditioned resid norm 1.917441834395e-08 true resid norm 1.917441834284e-08 ||r(i)||/||b|| 5.584538846266e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, 
needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 30 **************************************** Simulation time = 1.1719 sec Time/time step = 4.9409 sec Average time/time step = 5.8922 sec U MAX = 3.648011082886927E-04 V MAX = 3.648011082886927E-04 W MAX = 3.909498888809396E-12 U MIN = -3.648011082886927E-04 V MIN = -3.648011082886927E-04 W MIN = -7.200030149701984E-03 U MAX = 3.648011082886927E-04 V MAX = 3.648011082886927E-04 W MAX = 7.200030149701984E-03 max(|divU|) = 3.639705168563705E-02 sum(divU*dV) = 1.507642367696106E-17 sum(|divU|*dV) = 2.241929289046668E-02 Convective cfl = 5.074964714418797E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.014348641117621E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.999070192231e-04 true resid norm 9.999070192231e-04 ||r(i)||/||b|| 2.832010531776e-02 1 KSP unpreconditioned resid norm 5.595524902876e-04 true resid norm 5.595524902876e-04 
||r(i)||/||b|| 1.584805902060e-02 2 KSP unpreconditioned resid norm 9.317595589083e-05 true resid norm 9.317595589083e-05 ||r(i)||/||b|| 2.638998260020e-03 3 KSP unpreconditioned resid norm 2.125644558449e-05 true resid norm 2.125644558449e-05 ||r(i)||/||b|| 6.020407558511e-04 4 KSP unpreconditioned resid norm 3.366807458257e-06 true resid norm 3.366807458254e-06 ||r(i)||/||b|| 9.535720818967e-05 5 KSP unpreconditioned resid norm 6.179712823500e-07 true resid norm 6.179712823509e-07 ||r(i)||/||b|| 1.750263920852e-05 6 KSP unpreconditioned resid norm 1.072668127310e-07 true resid norm 1.072668127304e-07 ||r(i)||/||b|| 3.038089917587e-06 7 KSP unpreconditioned resid norm 1.918241746540e-08 true resid norm 1.918241746529e-08 ||r(i)||/||b|| 5.432985991920e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated 
nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 31 **************************************** Simulation time = 1.2109 sec Time/time step = 4.9422 sec Average time/time step = 5.8615 sec U MAX = 3.819712638762183E-04 V MAX = 3.819712638762183E-04 W MAX = 5.228048290315647E-12 U MIN = -3.819712638762183E-04 V MIN = -3.819712638762183E-04 W MIN = -7.444500960418116E-03 U MAX = 3.819712638762183E-04 V MAX = 3.819712638762183E-04 W MAX = 7.444500960418116E-03 max(|divU|) = 3.737051905435520E-02 sum(divU*dV) = 3.823246617899632E-17 sum(|divU|*dV) = 2.319732986848551E-02 Convective cfl = 5.253403832429154E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.006845011728921E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.947462645860e-04 true resid norm 9.947462645860e-04 ||r(i)||/||b|| 2.742348500190e-02 1 KSP unpreconditioned resid norm 5.604599968758e-04 true resid norm 5.604599968758e-04 ||r(i)||/||b|| 1.545094147690e-02 2 KSP unpreconditioned resid norm 9.385983767545e-05 true resid norm 9.385983767545e-05 ||r(i)||/||b|| 2.587558196908e-03 3 KSP unpreconditioned resid norm 2.135513624454e-05 true resid norm 2.135513624454e-05 ||r(i)||/||b|| 5.887252652911e-04 4 KSP unpreconditioned resid norm 3.365634334577e-06 true resid norm 3.365634334576e-06 ||r(i)||/||b|| 9.278488995838e-05 5 KSP unpreconditioned resid norm 6.168289689638e-07 true resid norm 6.168289689632e-07 ||r(i)||/||b|| 1.700493943160e-05 6 KSP unpreconditioned resid norm 1.070545756011e-07 true resid norm 1.070545756046e-07 ||r(i)||/||b|| 2.951314976488e-06 7 KSP unpreconditioned resid norm 1.915262615660e-08 true resid norm 1.915262615561e-08 ||r(i)||/||b|| 5.280057586784e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: 
relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 32 **************************************** Simulation time = 1.2500 sec Time/time step = 4.9284 sec Average time/time step = 5.8324 sec U MAX = 3.991985725649520E-04 V MAX 
= 3.991985725649520E-04 W MAX = 6.900699263091350E-12 U MIN = -3.991985725649520E-04 V MIN = -3.991985725649520E-04 W MIN = -7.689374996953326E-03 U MAX = 3.991985725649520E-04 V MAX = 3.991985725649520E-04 W MAX = 7.689374996953326E-03 max(|divU|) = 3.833285500851496E-02 sum(divU*dV) = -3.832772970957615E-17 sum(|divU|*dV) = 2.397727321507414E-02 Convective cfl = 5.432174170933268E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.999331002654465E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.898426613669e-04 true resid norm 9.898426613669e-04 ||r(i)||/||b|| 2.658461796242e-02 1 KSP unpreconditioned resid norm 5.613892905236e-04 true resid norm 5.613892905236e-04 ||r(i)||/||b|| 1.507746675230e-02 2 KSP unpreconditioned resid norm 9.464672133814e-05 true resid norm 9.464672133815e-05 ||r(i)||/||b|| 2.541966543143e-03 3 KSP unpreconditioned resid norm 2.149522795860e-05 true resid norm 2.149522795859e-05 ||r(i)||/||b|| 5.773063190723e-04 4 KSP unpreconditioned resid norm 3.372418915567e-06 true resid norm 3.372418915564e-06 ||r(i)||/||b|| 9.057446398170e-05 5 KSP unpreconditioned resid norm 6.166967184143e-07 true resid norm 6.166967184150e-07 ||r(i)||/||b|| 1.656288145341e-05 6 KSP unpreconditioned resid norm 1.068366215632e-07 true resid norm 1.068366215608e-07 ||r(i)||/||b|| 2.869355786976e-06 7 KSP unpreconditioned resid norm 1.907817749690e-08 true resid norm 1.907817750218e-08 ||r(i)||/||b|| 5.123905849990e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - 
linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 33 **************************************** Simulation time = 1.2891 sec Time/time step = 4.9415 sec Average time/time step = 5.8054 sec U MAX = 4.164708188215090E-04 V MAX = 4.164708188215090E-04 W MAX = 9.000399662903064E-12 U MIN = -4.164708188215090E-04 V MIN = -4.164708188215090E-04 W MIN = -7.934659497922977E-03 U MAX = 4.164708188215090E-04 V MAX = 4.164708188215090E-04 W MAX = 7.934659497922977E-03 max(|divU|) = 3.928439652168028E-02 sum(divU*dV) = -5.266414533402403E-17 sum(|divU|*dV) = 2.475911515855458E-02 Convective cfl = 5.611264726762238E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.991807110650943E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.851854712036e-04 true resid norm 9.851854712036e-04 ||r(i)||/||b|| 2.579831449281e-02 1 KSP unpreconditioned resid norm 5.623413102085e-04 true resid norm 5.623413102085e-04 ||r(i)||/||b|| 1.472561096068e-02 2 KSP unpreconditioned resid norm 9.555677903554e-05 true resid norm 9.555677903554e-05 ||r(i)||/||b|| 2.502273845419e-03 3 KSP unpreconditioned resid norm 2.169494954217e-05 true resid norm 2.169494954217e-05 ||r(i)||/||b|| 5.681094043244e-04 4 KSP unpreconditioned resid norm 3.396150651318e-06 true resid norm 3.396150651318e-06 ||r(i)||/||b|| 8.893245498294e-05 5 KSP unpreconditioned resid norm 6.207853792369e-07 true resid norm 6.207853792359e-07 ||r(i)||/||b|| 1.625604204912e-05 6 KSP unpreconditioned resid norm 1.072872335213e-07 true resid 
norm 1.072872335227e-07 ||r(i)||/||b|| 2.809450476469e-06 7 KSP unpreconditioned resid norm 1.907040101434e-08 true resid norm 1.907040101209e-08 ||r(i)||/||b|| 4.993823165225e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI 
processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 34 **************************************** Simulation time = 1.3281 sec Time/time step = 4.9264 sec Average time/time step = 5.7795 sec U MAX = 4.337765483258318E-04 V MAX = 4.337765483258318E-04 W MAX = 1.161081079721469E-11 U MIN = -4.337765483258318E-04 V MIN = -4.337765483258318E-04 W MIN = -8.180361266170116E-03 U MAX = 4.337765483258318E-04 V MAX = 4.337765483258318E-04 W MAX = 8.180361266170116E-03 max(|divU|) = 4.022547080844885E-02 sum(divU*dV) = 1.233544127631819E-17 sum(|divU|*dV) = 2.554284768086973E-02 Convective cfl = 5.790665192205939E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.984273802559585E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.807655573968e-04 true resid norm 9.807655573968e-04 ||r(i)||/||b|| 2.506000711107e-02 1 KSP unpreconditioned resid norm 5.633171668328e-04 true resid norm 5.633171668328e-04 ||r(i)||/||b|| 1.439358478706e-02 2 KSP unpreconditioned resid norm 9.661510804421e-05 true resid norm 9.661510804421e-05 ||r(i)||/||b|| 2.468658566121e-03 3 KSP unpreconditioned resid norm 2.198053894015e-05 true resid norm 2.198053894015e-05 ||r(i)||/||b|| 5.616352022061e-04 4 KSP unpreconditioned resid norm 3.453141439878e-06 true resid norm 3.453141439878e-06 ||r(i)||/||b|| 8.823285889908e-05 5 KSP unpreconditioned resid norm 6.364463899108e-07 true resid norm 6.364463899125e-07 ||r(i)||/||b|| 1.626214433891e-05 6 KSP unpreconditioned resid norm 1.105673245591e-07 true resid norm 1.105673245571e-07 ||r(i)||/||b|| 2.825158284520e-06 7 KSP unpreconditioned resid norm 1.972920790450e-08 true resid norm 1.972920790619e-08 ||r(i)||/||b|| 5.041103724495e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC 
Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 35 **************************************** Simulation time = 1.3672 sec Time/time step = 4.9401 sec Average time/time step = 5.7555 sec U MAX = 4.511050108339042E-04 V MAX = 4.511050108339042E-04 W MAX = 1.482724021178660E-11 U MIN = -4.511050108339042E-04 V MIN = -4.511050108339042E-04 W MIN = -8.426486664096112E-03 U MAX = 4.511050108339042E-04 V MAX = 4.511050108339042E-04 W MAX = 8.426486664096112E-03 max(|divU|) = 4.115639534074935E-02 sum(divU*dV) = -1.145860583228735E-17 sum(|divU|*dV) = 2.632846251912788E-02 Convective cfl = 5.970365878888909E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.976731518557743E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 
KSP unpreconditioned resid norm 9.765751078588e-04 true resid norm 9.765751078588e-04 ||r(i)||/||b|| 2.436565860028e-02 1 KSP unpreconditioned resid norm 5.643181597471e-04 true resid norm 5.643181597471e-04 ||r(i)||/||b|| 1.407980145273e-02 2 KSP unpreconditioned resid norm 9.785307283726e-05 true resid norm 9.785307283727e-05 ||r(i)||/||b|| 2.441445155169e-03 3 KSP unpreconditioned resid norm 2.238976860976e-05 true resid norm 2.238976860975e-05 ||r(i)||/||b|| 5.586272409508e-04 4 KSP unpreconditioned resid norm 3.571652839321e-06 true resid norm 3.571652839320e-06 ||r(i)||/||b|| 8.911313940040e-05 5 KSP unpreconditioned resid norm 6.783193984635e-07 true resid norm 6.783193984689e-07 ||r(i)||/||b|| 1.692414515999e-05 6 KSP unpreconditioned resid norm 1.217975177427e-07 true resid norm 1.217975177415e-07 ||r(i)||/||b|| 3.038861744242e-06 7 KSP unpreconditioned resid norm 2.276836536065e-08 true resid norm 2.276836535724e-08 ||r(i)||/||b|| 5.680732723130e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for 
convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 36 **************************************** Simulation time = 1.4062 sec Time/time step = 4.9283 sec Average time/time step = 5.7326 sec U MAX = 4.684461064760571E-04 V MAX = 4.684461064760571E-04 W MAX = 1.875758734736681E-11 U MIN = -4.684461064760571E-04 V MIN = -4.684461064760571E-04 W MIN = -8.673041608687577E-03 U MAX = 4.684461064760571E-04 V MAX = 4.684461064760571E-04 W MAX = 8.673041608687577E-03 max(|divU|) = 4.207747792094865E-02 sum(divU*dV) = 4.599775939863605E-17 sum(|divU|*dV) = 2.711595117503776E-02 Convective cfl = 6.150357645849402E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.969180675226200E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.726073108163e-04 true resid norm 9.726073108163e-04 ||r(i)||/||b|| 2.371168358058e-02 1 KSP unpreconditioned resid norm 5.653458120211e-04 true resid norm 5.653458120211e-04 ||r(i)||/||b|| 1.378285034379e-02 2 KSP unpreconditioned resid norm 9.931002100599e-05 true resid norm 9.931002100599e-05 ||r(i)||/||b|| 2.421129029452e-03 3 KSP unpreconditioned resid norm 2.297687015357e-05 true resid norm 2.297687015357e-05 ||r(i)||/||b|| 5.601646920547e-04 4 KSP unpreconditioned resid norm 3.797658632871e-06 true resid norm 3.797658632874e-06 ||r(i)||/||b|| 9.258503287849e-05 5 KSP unpreconditioned resid norm 7.702859013373e-07 true resid norm 7.702859013399e-07 ||r(i)||/||b|| 1.877918801970e-05 6 KSP unpreconditioned resid norm 1.486225606933e-07 true resid norm 1.486225606894e-07 ||r(i)||/||b|| 3.623344275549e-06 7 KSP unpreconditioned resid norm 3.002239710354e-08 true resid norm 3.002239710279e-08 ||r(i)||/||b|| 7.319311427287e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin 
computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines 
**************************************** TIME STEP = 37 **************************************** Simulation time = 1.4453 sec Time/time step = 4.9175 sec Average time/time step = 5.7105 sec U MAX = 4.857903344348851E-04 V MAX = 4.857903344348851E-04 W MAX = 2.352329683379803E-11 U MIN = -4.857903344348851E-04 V MIN = -4.857903344348851E-04 W MIN = -8.920031566154332E-03 U MAX = 4.857903344348851E-04 V MAX = 4.857903344348851E-04 W MAX = 8.920031566154332E-03 max(|divU|) = 4.298901676450798E-02 sum(divU*dV) = 1.778694787993800E-17 sum(|divU|*dV) = 2.790530492402937E-02 Convective cfl = 6.330631830415426E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.961621668491502E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.688561023665e-04 true resid norm 9.688561023665e-04 ||r(i)||/||b|| 2.309488393758e-02 1 KSP unpreconditioned resid norm 5.664018426655e-04 true resid norm 5.664018426655e-04 ||r(i)||/||b|| 1.350147332142e-02 2 KSP unpreconditioned resid norm 1.010354191468e-04 true resid norm 1.010354191468e-04 ||r(i)||/||b|| 2.408408506775e-03 3 KSP unpreconditioned resid norm 2.381918152940e-05 true resid norm 2.381918152940e-05 ||r(i)||/||b|| 5.677842473883e-04 4 KSP unpreconditioned resid norm 4.199762467923e-06 true resid norm 4.199762467922e-06 ||r(i)||/||b|| 1.001108694317e-04 5 KSP unpreconditioned resid norm 9.414129391480e-07 true resid norm 9.414129391472e-07 ||r(i)||/||b|| 2.244071386229e-05 6 KSP unpreconditioned resid norm 1.950841469631e-07 true resid norm 1.950841469624e-07 ||r(i)||/||b|| 4.650273369962e-06 7 KSP unpreconditioned resid norm 3.989489021571e-08 true resid norm 3.989489021477e-08 ||r(i)||/||b|| 9.509852463769e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix 
Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 38 **************************************** Simulation time = 1.4844 sec Time/time step = 4.9318 sec Average time/time step = 5.6900 sec U MAX = 5.031287439701298E-04 V MAX = 5.031287439701298E-04 W MAX = 2.926031410161659E-11 U MIN = -5.031287439701298E-04 V MIN = -5.031287439701298E-04 W MIN = -9.167461546052079E-03 U MAX = 5.031287439701298E-04 V MAX = 5.031287439701298E-04 W MAX = 9.167461546052079E-03 max(|divU|) = 4.389130055351834E-02 sum(divU*dV) = -1.294684815588316E-16 sum(|divU|*dV) = 2.869651481910813E-02 Convective cfl = 6.511180181755097E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.954054876448685E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.653160648367e-04 true resid norm 9.653160648367e-04 ||r(i)||/||b|| 2.251239751711e-02 1 KSP unpreconditioned resid norm 5.674880156214e-04 true resid norm 5.674880156214e-04 ||r(i)||/||b|| 1.323454178298e-02 2 KSP unpreconditioned resid norm 1.030914649249e-04 true resid norm 1.030914649249e-04 ||r(i)||/||b|| 2.404223987926e-03 3 KSP unpreconditioned resid norm 2.502572173112e-05 true resid norm 2.502572173112e-05 ||r(i)||/||b|| 5.836316376429e-04 4 KSP unpreconditioned resid norm 4.870445401901e-06 true resid norm 4.870445401897e-06 
||r(i)||/||b|| 1.135849769489e-04 5 KSP unpreconditioned resid norm 1.215482808089e-06 true resid norm 1.215482808082e-06 ||r(i)||/||b|| 2.834660392334e-05 6 KSP unpreconditioned resid norm 2.551625330565e-07 true resid norm 2.551625330483e-07 ||r(i)||/||b|| 5.950714573914e-06 7 KSP unpreconditioned resid norm 4.861220178024e-08 true resid norm 4.861220177235e-08 ||r(i)||/||b|| 1.133698329849e-06 8 KSP unpreconditioned resid norm 8.144841115238e-09 true resid norm 8.144841112622e-09 ||r(i)||/||b|| 1.899480465728e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: 
(mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 39 **************************************** Simulation time = 1.5234 sec Time/time step = 5.2763 sec Average time/time step = 5.6794 sec U MAX = 5.204528884791881E-04 V MAX = 5.204528884791881E-04 W MAX = 3.612003794311162E-11 U MIN = -5.204528884791881E-04 V MIN = -5.204528884791881E-04 W MIN = -9.415336094758673E-03 U MAX = 5.204528884791881E-04 V MAX = 5.204528884791881E-04 W MAX = 9.415336094758673E-03 max(|divU|) = 4.478460844464908E-02 sum(divU*dV) = 1.137276829931661E-16 sum(|divU|*dV) = 2.948957168820364E-02 Convective cfl = 6.691994797898912E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.946480662031222E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.619817564018e-04 true resid norm 9.619817564018e-04 ||r(i)||/||b|| 2.196164143736e-02 1 KSP unpreconditioned resid norm 5.686026262463e-04 true resid norm 5.686026262463e-04 ||r(i)||/||b|| 1.298096030913e-02 2 KSP unpreconditioned resid norm 1.055538212906e-04 true resid norm 1.055538212906e-04 ||r(i)||/||b|| 2.409749623732e-03 3 KSP unpreconditioned resid norm 2.674695797372e-05 true resid norm 2.674695797372e-05 ||r(i)||/||b|| 6.106218716207e-04 4 KSP unpreconditioned resid norm 5.924085158408e-06 true resid norm 5.924085158412e-06 ||r(i)||/||b|| 1.352443881889e-04 5 KSP unpreconditioned resid norm 1.602272846882e-06 true resid norm 1.602272846877e-06 ||r(i)||/||b|| 3.657921942258e-05 6 KSP unpreconditioned resid norm 3.168298436773e-07 true resid norm 3.168298436703e-07 ||r(i)||/||b|| 7.233092911626e-06 7 KSP unpreconditioned resid norm 5.586014529979e-08 true resid norm 5.586014529694e-08 ||r(i)||/||b|| 1.275263770322e-06 8 KSP unpreconditioned resid norm 9.407129276743e-09 true resid norm 9.407129273877e-09 ||r(i)||/||b|| 2.147608296030e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: 
(mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 40 **************************************** Simulation time = 1.5625 sec Time/time step = 5.2801 sec Average time/time step = 5.6694 sec U MAX = 5.377547834995266E-04 V MAX = 5.377547834995266E-04 W MAX = 4.427026464318359E-11 U MIN = -5.377547834995266E-04 V MIN = -5.377547834995266E-04 W MIN 
= -9.663659288215718E-03 U MAX = 5.377547834995266E-04 V MAX = 5.377547834995266E-04 W MAX = 9.663659288215718E-03 max(|divU|) = 4.566921003955636E-02 sum(divU*dV) = 6.161473781623374E-17 sum(|divU|*dV) = 3.028446612649909E-02 Convective cfl = 6.873068067337454E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.938899375482311E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.588526158093e-04 true resid norm 9.588526158093e-04 ||r(i)||/||b|| 2.144038799229e-02 1 KSP unpreconditioned resid norm 5.697560491174e-04 true resid norm 5.697560491174e-04 ||r(i)||/||b|| 1.274000879032e-02 2 KSP unpreconditioned resid norm 1.085274114053e-04 true resid norm 1.085274114053e-04 ||r(i)||/||b|| 2.426723116738e-03 3 KSP unpreconditioned resid norm 2.919112513317e-05 true resid norm 2.919112513317e-05 ||r(i)||/||b|| 6.527270598919e-04 4 KSP unpreconditioned resid norm 7.501781719811e-06 true resid norm 7.501781719801e-06 ||r(i)||/||b|| 1.677433090906e-04 5 KSP unpreconditioned resid norm 2.097754413791e-06 true resid norm 2.097754413779e-06 ||r(i)||/||b|| 4.690675897672e-05 6 KSP unpreconditioned resid norm 3.750084920979e-07 true resid norm 3.750084920952e-07 ||r(i)||/||b|| 8.385363337763e-06 7 KSP unpreconditioned resid norm 6.407190922772e-08 true resid norm 6.407190922495e-08 ||r(i)||/||b|| 1.432677525764e-06 8 KSP unpreconditioned resid norm 1.109760823695e-08 true resid norm 1.109760823122e-08 ||r(i)||/||b|| 2.481476530813e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - 
linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 41 **************************************** Simulation time = 1.6016 sec Time/time step = 5.2897 sec Average time/time step = 5.6602 sec U MAX = 5.550268693159551E-04 V MAX = 5.550268693159551E-04 W MAX = 5.389611832243758E-11 U MIN = -5.550268693159551E-04 V MIN = -5.550268693159551E-04 W MIN = -9.912434723936841E-03 U MAX = 5.550268693159551E-04 V MAX = 5.550268693159551E-04 W MAX = 9.912434723936841E-03 max(|divU|) = 4.654536534164120E-02 sum(divU*dV) = 7.663481308386309E-17 sum(|divU|*dV) = 3.108213844533452E-02 Convective cfl = 7.054392616044000E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.931311356592495E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.559282785354e-04 true resid norm 9.559282785354e-04 ||r(i)||/||b|| 2.094662207925e-02 1 KSP unpreconditioned resid norm 5.709496722979e-04 true resid norm 5.709496722979e-04 ||r(i)||/||b|| 1.251084132611e-02 2 KSP unpreconditioned resid norm 1.121376956818e-04 true resid norm 1.121376956818e-04 ||r(i)||/||b|| 2.457198918608e-03 3 KSP unpreconditioned resid norm 3.263680051620e-05 true resid norm 3.263680051620e-05 ||r(i)||/||b|| 7.151485541742e-04 4 KSP unpreconditioned resid norm 9.787969787122e-06 true resid norm 9.787969787117e-06 ||r(i)||/||b|| 2.144772873212e-04 5 KSP unpreconditioned resid norm 2.697948086712e-06 true resid norm 2.697948086711e-06 ||r(i)||/||b|| 5.911834625121e-05 6 KSP unpreconditioned resid norm 4.388419847123e-07 true resid norm 
4.388419847060e-07 ||r(i)||/||b|| 9.616053225489e-06 7 KSP unpreconditioned resid norm 7.533046586121e-08 true resid norm 7.533046585203e-08 ||r(i)||/||b|| 1.650666514097e-06 8 KSP unpreconditioned resid norm 1.327386810018e-08 true resid norm 1.327386809678e-08 ||r(i)||/||b|| 2.908614639783e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null 
space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 42 **************************************** Simulation time = 1.6406 sec Time/time step = 5.2955 sec Average time/time step = 5.6515 sec U MAX = 5.722619783523600E-04 V MAX = 5.722619783523600E-04 W MAX = 6.520096218797720E-11 U MIN = -5.722619783523600E-04 V MIN = -5.722619783523600E-04 W MIN = -1.016166551240614E-02 U MAX = 5.722619783523600E-04 V MAX = 5.722619783523600E-04 W MAX = 1.016166551240614E-02 max(|divU|) = 4.741332470628808E-02 sum(divU*dV) = -6.679930840361994E-17 sum(|divU|*dV) = 3.188275111165742E-02 Convective cfl = 7.235961260230947E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.923716936691350E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.532094014231e-04 true resid norm 9.532094014231e-04 ||r(i)||/||b|| 2.047853444210e-02 1 KSP unpreconditioned resid norm 5.721881245947e-04 true resid norm 5.721881245947e-04 ||r(i)||/||b|| 1.229275980638e-02 2 KSP unpreconditioned resid norm 1.165466916946e-04 true resid norm 1.165466916946e-04 ||r(i)||/||b|| 2.503862673215e-03 3 KSP unpreconditioned resid norm 3.745944846755e-05 true resid norm 3.745944846755e-05 ||r(i)||/||b|| 8.047702891721e-04 4 KSP unpreconditioned resid norm 1.305337758017e-05 true resid norm 1.305337758017e-05 ||r(i)||/||b|| 2.804358013697e-04 5 KSP unpreconditioned resid norm 3.436611414495e-06 true resid norm 3.436611414503e-06 ||r(i)||/||b|| 7.383137966426e-05 6 KSP unpreconditioned resid norm 5.329476257960e-07 true resid norm 5.329476257973e-07 ||r(i)||/||b|| 1.144972583614e-05 7 KSP unpreconditioned resid norm 9.361074626380e-08 true resid norm 9.361074626318e-08 ||r(i)||/||b|| 2.011112027052e-06 8 KSP unpreconditioned resid norm 1.654045230649e-08 true resid norm 1.654045231200e-08 ||r(i)||/||b|| 3.553513234904e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP 
Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 43 **************************************** Simulation time = 1.6797 sec Time/time step = 5.2843 sec Average time/time step = 5.6430 sec U MAX = 5.894533070614432E-04 V MAX = 5.894533070614432E-04 W MAX = 7.840728546853753E-11 U MIN = -5.894533070614432E-04 V MIN = -5.894533070614432E-04 W MIN = -1.041135426811980E-02 U MAX = 5.894533070614432E-04 V MAX = 5.894533070614432E-04 W MAX = 1.041135426811980E-02 max(|divU|) = 4.827332879033855E-02 sum(divU*dV) = -1.944361779260113E-17 sum(|divU|*dV) = 3.268620746643017E-02 Convective 
cfl = 7.417766964635319E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.916116440402518E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.506987107380e-04 true resid norm 9.506987107380e-04 ||r(i)||/||b|| 2.003452088410e-02 1 KSP unpreconditioned resid norm 5.734767148944e-04 true resid norm 5.734767148944e-04 ||r(i)||/||b|| 1.208514442202e-02 2 KSP unpreconditioned resid norm 1.219614101567e-04 true resid norm 1.219614101567e-04 ||r(i)||/||b|| 2.570150134044e-03 3 KSP unpreconditioned resid norm 4.417104052445e-05 true resid norm 4.417104052446e-05 ||r(i)||/||b|| 9.308371031371e-04 4 KSP unpreconditioned resid norm 1.774091199672e-05 true resid norm 1.774091199672e-05 ||r(i)||/||b|| 3.738625790554e-04 5 KSP unpreconditioned resid norm 4.454157853553e-06 true resid norm 4.454157853559e-06 ||r(i)||/||b|| 9.386456248470e-05 6 KSP unpreconditioned resid norm 7.200394014205e-07 true resid norm 7.200394014228e-07 ||r(i)||/||b|| 1.517372881886e-05 7 KSP unpreconditioned resid norm 1.379733077289e-07 true resid norm 1.379733077328e-07 ||r(i)||/||b|| 2.907576379351e-06 8 KSP unpreconditioned resid norm 2.662633277832e-08 true resid norm 2.662633278849e-08 ||r(i)||/||b|| 5.611092287101e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down 
solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 44 **************************************** Simulation time = 1.7188 sec Time/time step = 5.2837 sec Average time/time step = 5.6348 sec U MAX = 6.065943917762114E-04 V MAX = 6.065943917762114E-04 W MAX = 9.375756091715709E-11 U MIN = -6.065943917762114E-04 V MIN = -6.065943917762114E-04 W MIN = -1.066150310063001E-02 U MAX = 6.065943917762114E-04 V MAX = 6.065943917762114E-04 W MAX = 1.066150310063001E-02 max(|divU|) = 4.912560853256981E-02 sum(divU*dV) = 6.111669858309849E-17 sum(|divU|*dV) = 3.349197454532234E-02 Convective cfl = 7.599802805876757E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.908510187181871E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.484010305825e-04 true resid norm 9.484010305825e-04 ||r(i)||/||b|| 1.961316094007e-02 1 KSP unpreconditioned resid norm 5.748213015688e-04 true resid norm 5.748213015688e-04 ||r(i)||/||b|| 1.188744248045e-02 2 KSP unpreconditioned resid norm 1.286469364839e-04 true resid norm 1.286469364839e-04 ||r(i)||/||b|| 2.660449523295e-03 3 KSP unpreconditioned resid norm 5.348907510610e-05 true resid norm 5.348907510610e-05 ||r(i)||/||b|| 1.106166911214e-03 4 KSP unpreconditioned resid norm 2.465650792099e-05 true resid norm 2.465650792099e-05 ||r(i)||/||b|| 5.099025016637e-04 5 KSP unpreconditioned resid norm 6.146862394919e-06 true resid norm 6.146862394916e-06 ||r(i)||/||b|| 1.271185896476e-04 6 KSP unpreconditioned resid norm 1.188069342317e-06 true resid norm 1.188069342320e-06 ||r(i)||/||b|| 2.456955915007e-05 7 KSP unpreconditioned resid norm 2.741412806621e-07 true resid norm 2.741412806753e-07 ||r(i)||/||b|| 5.669307481560e-06 8 KSP unpreconditioned resid norm 6.049923165219e-08 true 
resid norm 6.049923166410e-08 ||r(i)||/||b|| 1.251138631355e-06 9 KSP unpreconditioned resid norm 1.176107887875e-08 true resid norm 1.176107888235e-08 ||r(i)||/||b|| 2.432219340870e-07 Linear solve converged due to CONVERGED_RTOL iterations 9 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI 
processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 45 **************************************** Simulation time = 1.7578 sec Time/time step = 5.6541 sec Average time/time step = 5.6352 sec U MAX = 6.236790880062606E-04 V MAX = 6.236790880062606E-04 W MAX = 1.115150678938498E-10 U MIN = -6.236790880062606E-04 V MIN = -6.236790880062606E-04 W MIN = -1.091211360600322E-02 U MAX = 6.236790880062606E-04 V MAX = 6.236790880062606E-04 W MAX = 1.091211360600322E-02 max(|divU|) = 4.997038517101132E-02 sum(divU*dV) = -1.366059071355439E-16 sum(|divU|*dV) = 3.429973463439592E-02 Convective cfl = 7.782061940490073E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.900898492656075E-01 Iterations to convergence = 9 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.463226338129e-04 true resid norm 9.463226338129e-04 ||r(i)||/||b|| 1.921318554716e-02 1 KSP unpreconditioned resid norm 5.762320387465e-04 true resid norm 5.762320387465e-04 ||r(i)||/||b|| 1.169923732464e-02 2 KSP unpreconditioned resid norm 1.369454100402e-04 true resid norm 1.369454100402e-04 ||r(i)||/||b|| 2.780402242240e-03 3 KSP unpreconditioned resid norm 6.645141899166e-05 true resid norm 6.645141899167e-05 ||r(i)||/||b|| 1.349162957051e-03 4 KSP unpreconditioned resid norm 3.542849089818e-05 true resid norm 3.542849089818e-05 ||r(i)||/||b|| 7.193045426169e-04 5 KSP unpreconditioned resid norm 9.634571468376e-06 true resid norm 9.634571468390e-06 ||r(i)||/||b|| 1.956106751286e-04 6 KSP unpreconditioned resid norm 2.490903677247e-06 true resid norm 2.490903677264e-06 ||r(i)||/||b|| 5.057280975999e-05 7 KSP unpreconditioned resid norm 6.460104461267e-07 true resid norm 6.460104461372e-07 ||r(i)||/||b|| 1.311594811701e-05 8 KSP unpreconditioned resid norm 1.308945022433e-07 true resid norm 1.308945022564e-07 ||r(i)||/||b|| 2.657550679965e-06 9 KSP unpreconditioned resid norm 2.332721980281e-08 true resid norm 2.332721980822e-08 ||r(i)||/||b|| 4.736124725971e-07 Linear solve converged due to CONVERGED_RTOL iterations 9 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block 
number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 46 **************************************** Simulation time = 1.7969 sec Time/time step = 5.6396 sec Average time/time step = 5.6353 sec U MAX = 6.407015528670302E-04 V MAX = 6.407015528670302E-04 W MAX = 1.319646761900996E-10 U MIN = -6.407015528670302E-04 V MIN = -6.407015528670302E-04 W MIN = -1.116318685908394E-02 U MAX = 6.407015528670302E-04 V MAX = 6.407015528670302E-04 W MAX = 1.116318685908394E-02 max(|divU|) = 5.080787027504124E-02 sum(divU*dV) = -1.802459127280384E-16 sum(|divU|*dV) = 3.510958606679235E-02 
Convective cfl = 7.964537577483519E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.893281669768407E-01 Iterations to convergence = 9 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.444723150048e-04 true resid norm 9.444723150048e-04 ||r(i)||/||b|| 1.883348287760e-02 1 KSP unpreconditioned resid norm 5.777155860646e-04 true resid norm 5.777155860646e-04 ||r(i)||/||b|| 1.152007996996e-02 2 KSP unpreconditioned resid norm 1.472956246270e-04 true resid norm 1.472956246270e-04 ||r(i)||/||b|| 2.937184690633e-03 3 KSP unpreconditioned resid norm 8.458940194082e-05 true resid norm 8.458940194082e-05 ||r(i)||/||b|| 1.686775808850e-03 4 KSP unpreconditioned resid norm 5.365762440872e-05 true resid norm 5.365762440871e-05 ||r(i)||/||b|| 1.069973078617e-03 5 KSP unpreconditioned resid norm 1.864657810331e-05 true resid norm 1.864657810332e-05 ||r(i)||/||b|| 3.718266844412e-04 6 KSP unpreconditioned resid norm 6.524460622530e-06 true resid norm 6.524460622545e-06 ||r(i)||/||b|| 1.301026144103e-04 7 KSP unpreconditioned resid norm 1.622750138813e-06 true resid norm 1.622750138828e-06 ||r(i)||/||b|| 3.235884892410e-05 8 KSP unpreconditioned resid norm 2.958932862640e-07 true resid norm 2.958932862731e-07 ||r(i)||/||b|| 5.900332971215e-06 9 KSP unpreconditioned resid norm 6.413461809109e-08 true resid norm 6.413461809469e-08 ||r(i)||/||b|| 1.278892152325e-06 10 KSP unpreconditioned resid norm 1.891915012387e-08 true resid norm 1.891915013001e-08 ||r(i)||/||b|| 3.772619740903e-07 Linear solve converged due to CONVERGED_RTOL iterations 10 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - 
- - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 47 **************************************** Simulation time = 1.8359 sec Time/time step = 5.9881 sec Average time/time step = 5.6428 sec U MAX = 6.576562305696783E-04 V MAX = 6.576562305696783E-04 W MAX = 1.554135859239686E-10 U MIN = -6.576562305696783E-04 V MIN = -6.576562305696783E-04 W MIN = -1.141472340685389E-02 U MAX = 6.576562305696783E-04 V MAX = 6.576562305696783E-04 W MAX = 1.141472340685389E-02 max(|divU|) = 5.163826577654140E-02 sum(divU*dV) = -7.239041132354072E-17 sum(|divU|*dV) = 3.592336567287319E-02 Convective cfl = 8.147222955515682E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.885660029728505E-01 Iterations to convergence = 10 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.428598999965e-04 true resid norm 9.428598999965e-04 ||r(i)||/||b|| 1.847305363556e-02 1 KSP unpreconditioned resid norm 5.792840703613e-04 true resid norm 5.792840703613e-04 ||r(i)||/||b|| 1.134966679785e-02 2 KSP unpreconditioned resid norm 1.602663517902e-04 true resid norm 1.602663517901e-04 ||r(i)||/||b|| 3.140030573583e-03 3 KSP unpreconditioned resid norm 1.101142848680e-04 true resid norm 1.101142848680e-04 ||r(i)||/||b|| 2.157422423433e-03 4 KSP unpreconditioned resid norm 8.731041325492e-05 true resid norm 8.731041325492e-05 ||r(i)||/||b|| 1.710635850572e-03 5 KSP unpreconditioned resid norm 5.035997710797e-05 true resid norm 5.035997710799e-05 ||r(i)||/||b|| 9.866816461331e-04 6 KSP unpreconditioned resid norm 2.457879661966e-05 
true resid norm 2.457879661967e-05 ||r(i)||/||b|| 4.815619247933e-04 7 KSP unpreconditioned resid norm 5.634575585722e-06 true resid norm 5.634575585741e-06 ||r(i)||/||b|| 1.103958467312e-04 8 KSP unpreconditioned resid norm 1.134851225677e-06 true resid norm 1.134851225701e-06 ||r(i)||/||b|| 2.223465815104e-05 9 KSP unpreconditioned resid norm 3.682136578226e-07 true resid norm 3.682136578415e-07 ||r(i)||/||b|| 7.214253836309e-06 10 KSP unpreconditioned resid norm 1.353419828913e-07 true resid norm 1.353419829004e-07 ||r(i)||/||b|| 2.651697998050e-06 11 KSP unpreconditioned resid norm 3.039998964113e-08 true resid norm 3.039998964440e-08 ||r(i)||/||b|| 5.956140877596e-07 Linear solve converged due to CONVERGED_RTOL iterations 11 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using 
I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 48 **************************************** Simulation time = 1.8750 sec Time/time step = 6.3739 sec Average time/time step = 5.6581 sec U MAX = 6.745378410411323E-04 V MAX = 6.745378410411323E-04 W MAX = 1.821920190168132E-10 U MIN = -6.745378410411323E-04 V MIN = -6.745378410411323E-04 W MIN = -1.166672326300875E-02 U MAX = 6.745378410411323E-04 V MAX = 6.745378410411323E-04 W MAX = 1.166672326300875E-02 max(|divU|) = 5.252541757016951E-02 sum(divU*dV) = 9.877092826794907E-17 sum(|divU|*dV) = 3.673921294188139E-02 Convective cfl = 8.330111324858251E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.878033882759807E-01 Iterations to convergence = 11 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.414933765577e-04 true resid norm 9.414933765577e-04 ||r(i)||/||b|| 1.813094466660e-02 1 KSP unpreconditioned resid norm 5.809485671766e-04 true resid norm 5.809485671766e-04 ||r(i)||/||b|| 1.118770093118e-02 2 KSP unpreconditioned resid norm 1.765886995066e-04 true resid norm 1.765886995066e-04 ||r(i)||/||b|| 3.400682383137e-03 3 KSP unpreconditioned resid norm 1.457796058411e-04 true resid norm 1.457796058411e-04 ||r(i)||/||b|| 2.807371812521e-03 4 KSP unpreconditioned resid norm 1.438887535261e-04 true resid norm 1.438887535261e-04 ||r(i)||/||b|| 2.770958450994e-03 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 5 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, 
initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 49 **************************************** Simulation time = 1.9141 sec Time/time step = 4.1844 sec Average time/time step = 5.6280 sec U MAX = 6.913413717411347E-04 V MAX = 6.913413717411347E-04 W MAX = 2.126538579577966E-10 U MIN = -6.913413717411347E-04 V MIN = -6.913413717411347E-04 W MIN = -1.191918590367497E-02 U MAX = 6.913413717411347E-04 V MAX = 6.913413717411347E-04 W MAX = 1.191918590367497E-02 max(|divU|) = 5.340673281799226E-02 sum(divU*dV) = -5.499226812896440E-17 sum(|divU|*dV) = 3.755711646535303E-02 Convective cfl = 8.513195934180635E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 
6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.870403538643769E-01 Iterations to convergence = 5 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.519521198687e-04 true resid norm 9.519521198687e-04 ||r(i)||/||b|| 1.802538489112e-02 1 KSP unpreconditioned resid norm 6.431597000660e-04 true resid norm 6.431597000660e-04 ||r(i)||/||b|| 1.217834479085e-02 2 KSP unpreconditioned resid norm 3.702102145333e-04 true resid norm 3.702102145333e-04 ||r(i)||/||b|| 7.009997108370e-03 3 KSP unpreconditioned resid norm 3.869684006500e-04 true resid norm 3.869684006501e-04 ||r(i)||/||b|| 7.327316381605e-03 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 4 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver 
(post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 50 **************************************** Simulation time = 1.9531 sec Time/time step = 3.8233 sec Average time/time step = 5.5919 sec U MAX = 7.080620726354203E-04 V MAX = 7.080620726354203E-04 W MAX = 2.471772277684510E-10 U MIN = -7.080620726354203E-04 V MIN = -7.080620726354203E-04 W MIN = -1.217211026400828E-02 U MAX = 7.080620726354203E-04 V MAX = 7.080620726354203E-04 W MAX = 1.217211026400828E-02 max(|divU|) = 5.428216207969611E-02 sum(divU*dV) = 2.375536962971300E-17 sum(|divU|*dV) = 3.837706418998891E-02 Convective cfl = 8.696470021938640E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.862769307070695E-01 Iterations to convergence = 4 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.016568468033e-03 true resid norm 1.016568468033e-03 ||r(i)||/||b|| 1.893304776508e-02 1 KSP unpreconditioned resid norm 8.594004065921e-04 true resid norm 8.594004065921e-04 ||r(i)||/||b|| 1.600587610083e-02 2 KSP unpreconditioned resid norm 8.414870106025e-04 true resid norm 8.414870106025e-04 ||r(i)||/||b|| 1.567224861525e-02 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 3 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using 
diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 51 **************************************** Simulation time = 1.9922 sec Time/time step = 3.4677 sec Average time/time step = 5.5502 sec U MAX = 7.246954541516481E-04 V MAX = 7.246954541516481E-04 W MAX = 2.861650172945249E-10 U MIN = -7.246954541516481E-04 V MIN = -7.246954541516481E-04 W MIN = -1.242549473530513E-02 U MAX = 7.246954541516481E-04 V MAX = 7.246954541516481E-04 W MAX = 1.242549473530513E-02 max(|divU|) = 5.515171879413290E-02 sum(divU*dV) = -4.605848273045875E-20 sum(|divU|*dV) = 3.919904340353457E-02 Convective cfl = 8.879926811909390E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.855131497816892E-01 Iterations to convergence = 3 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.258666639469e-03 true resid norm 1.258666639469e-03 ||r(i)||/||b|| 2.306487833771e-02 1 KSP 
unpreconditioned resid norm 1.504157912911e-03 true resid norm 1.504157912911e-03 ||r(i)||/||b|| 2.756346928893e-02 2 KSP unpreconditioned resid norm 1.513545089714e-03 true resid norm 1.513545089714e-03 ||r(i)||/||b|| 2.773548790299e-02 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 3 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using 
NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 52 **************************************** Simulation time = 2.0312 sec Time/time step = 3.4708 sec Average time/time step = 5.5103 sec U MAX = 7.412372878275829E-04 V MAX = 7.412372878275829E-04 W MAX = 3.300453361744023E-10 U MIN = -7.412372878275829E-04 V MIN = -7.412372878275829E-04 W MIN = -1.267933716225437E-02 U MAX = 7.412372878275829E-04 V MAX = 7.412372878275829E-04 W MAX = 1.267933716225437E-02 max(|divU|) = 5.601540601387010E-02 sum(divU*dV) = -1.943497884490673E-16 sum(|divU|*dV) = 4.002304072475085E-02 Convective cfl = 9.063559512262100E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.847490420774172E-01 Iterations to convergence = 3 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.761560659866e-03 true resid norm 1.761560659866e-03 ||r(i)||/||b|| 3.177100637149e-02 1 KSP unpreconditioned resid norm 2.631712886238e-03 true resid norm 2.631712886238e-03 ||r(i)||/||b|| 4.746482410829e-02 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, 
allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 53 **************************************** Simulation time = 2.0703 sec Time/time step = 3.1133 sec Average time/time step = 5.4650 sec U MAX = 7.576836087104737E-04 V MAX = 7.576836087104737E-04 W MAX = 3.792719040608200E-10 U MIN = -7.576836087104737E-04 V MIN = -7.576836087104737E-04 W MIN = -1.293363484006254E-02 U MAX = 7.576836087104737E-04 V MAX = 7.576836087104737E-04 W MAX = 1.293363484006254E-02 max(|divU|) = 5.687321627651611E-02 sum(divU*dV) = -4.700826762012491E-17 sum(|divU|*dV) = 4.084904209733418E-02 Convective cfl = 9.247361316789432E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.839846385889424E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 2.799899355961e-03 true resid norm 2.799899355961e-03 ||r(i)||/||b|| 4.971640159634e-02 1 KSP unpreconditioned resid norm 4.527402695250e-03 true resid norm 4.527402695250e-03 ||r(i)||/||b|| 8.039080765750e-02 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using 
Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines 
**************************************** TIME STEP = 54 **************************************** Simulation time = 2.1094 sec Time/time step = 3.1127 sec Average time/time step = 5.4215 sec U MAX = 7.740307160562104E-04 V MAX = 7.740307160562104E-04 W MAX = 4.343243689038809E-10 U MIN = -7.740307160562104E-04 V MIN = -7.740307160562104E-04 W MIN = -1.318838451137465E-02 U MAX = 7.740307160562104E-04 V MAX = 7.740307160562104E-04 W MAX = 1.318838451137465E-02 max(|divU|) = 5.772513218099847E-02 sum(divU*dV) = 1.862846001320152E-16 sum(|divU|*dV) = 4.167703278667138E-02 Convective cfl = 9.431325403831724E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.832199703200240E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 4.697531508109e-03 true resid norm 4.697531508109e-03 ||r(i)||/||b|| 8.214417102469e-02 1 KSP unpreconditioned resid norm 6.214787472969e-03 true resid norm 6.214787472969e-03 ||r(i)||/||b|| 1.086759214239e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, 
divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 55 **************************************** Simulation time = 2.1484 sec Time/time step = 3.1091 sec Average time/time step = 5.3794 sec U MAX = 7.902751731180926E-04 V MAX = 7.902751731180926E-04 W MAX = 4.957085513370435E-10 U MIN = -7.902751731180926E-04 V MIN = -7.902751731180926E-04 W MIN = -1.344358236313642E-02 U MAX = 7.902751731180926E-04 V MAX = 7.902751731180926E-04 W MAX = 1.344358236313642E-02 max(|divU|) = 5.857112669786742E-02 sum(divU*dV) = -1.033909168771564E-16 sum(|divU|*dV) = 4.250699737772344E-02 Convective cfl = 9.615444933998471E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.824550682920411E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 6.443380291215e-03 true resid norm 6.443380291215e-03 ||r(i)||/||b|| 1.109919986786e-01 1 KSP unpreconditioned resid norm 8.054889194093e-03 true resid norm 8.054889194093e-03 ||r(i)||/||b|| 1.387514333130e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum 
iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 56 **************************************** Simulation time = 2.1875 sec Time/time step = 3.1099 sec Average time/time step = 5.3389 sec U MAX = 8.064138093027779E-04 V MAX = 8.064138093027779E-04 W MAX = 5.639566124425724E-10 U MIN = -8.064138093027779E-04 V MIN = -8.064138093027779E-04 W MIN = -1.369922402375323E-02 U MAX = 8.064138093027779E-04 V MAX = 8.064138093027779E-04 W MAX = 1.369922402375323E-02 max(|divU|) = 5.941116286987973E-02 sum(divU*dV) = -1.848161969535551E-16 sum(|divU|*dV) = 4.333892934140238E-02 Convective cfl = 9.799713051109626E-02 Viscous cfl = 7.281777777777779E-01 
Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.816899635391449E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 8.320817865971e-03 true resid norm 8.320817865971e-03 ||r(i)||/||b|| 1.412314148924e-01 1 KSP unpreconditioned resid norm 9.218063404335e-03 true resid norm 9.218063404335e-03 ||r(i)||/||b|| 1.564605977601e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 
1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 57 **************************************** Simulation time = 2.2266 sec Time/time step = 3.1149 sec Average time/time step = 5.2999 sec U MAX = 8.224437242859797E-04 V MAX = 8.224437242859797E-04 W MAX = 6.396271424076921E-10 U MIN = -8.224437242859797E-04 V MIN = -8.224437242859797E-04 W MIN = -1.395530456105393E-02 U MAX = 8.224437242859797E-04 V MAX = 8.224437242859797E-04 W MAX = 1.395530456105393E-02 max(|divU|) = 6.024519384412073E-02 sum(divU*dV) = 1.301277135451108E-16 sum(|divU|*dV) = 4.417280564319946E-02 Convective cfl = 9.984122886160571E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.809246870908851E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.540635888297e-03 true resid norm 9.540635888297e-03 ||r(i)||/||b|| 1.596034321147e-01 1 KSP unpreconditioned resid norm 1.056716658719e-02 true resid norm 1.056716658719e-02 ||r(i)||/||b|| 1.767760634396e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node 
routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 58 **************************************** Simulation time = 2.2656 sec Time/time step = 3.1126 sec Average time/time step = 5.2622 sec U MAX = 8.383622937791725E-04 V MAX = 8.383622937791725E-04 W MAX = 7.233051678163470E-10 U MIN = -8.383622937791725E-04 V MIN = -8.383622937791725E-04 W MIN = -1.421181848163963E-02 U MAX = 8.383622937791725E-04 V MAX = 8.383622937791725E-04 W MAX = 1.421181848163963E-02 max(|divU|) = 6.107316302490359E-02 sum(divU*dV) = 2.143023305597198E-16 sum(|divU|*dV) = 4.500860689979907E-02 Convective cfl = 1.016866756428670E-01 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.801592699424566E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.093039378003e-02 true resid norm 1.093039378003e-02 ||r(i)||/||b|| 1.802637775768e-01 1 KSP unpreconditioned resid norm 1.120586105321e-02 true resid norm 1.120586105321e-02 ||r(i)||/||b|| 1.848067768742e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess 
using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij 
rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 59 **************************************** Simulation time = 2.3047 sec Time/time step = 3.1166 sec Average time/time step = 5.2258 sec U MAX = 8.541671768674378E-04 V MAX = 8.541671768674378E-04 W MAX = 8.156020755531066E-10 U MIN = -8.541671768674378E-04 V MIN = -8.541671768674378E-04 W MIN = -1.446875973217087E-02 U MAX = 8.541671768674378E-04 V MAX = 8.541671768674378E-04 W MAX = 1.446875973217087E-02 max(|divU|) = 6.189500431536687E-02 sum(divU*dV) = -1.994552209034640E-16 sum(|divU|*dV) = 4.584631503542637E-02 Convective cfl = 1.035334021497968E-01 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.793937430115675E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.162684118472e-02 true resid norm 1.162684118472e-02 ||r(i)||/||b|| 1.890803640116e-01 1 KSP unpreconditioned resid norm 1.216327907130e-02 true resid norm 1.216327907130e-02 ||r(i)||/||b|| 1.978041325102e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI 
processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 60 **************************************** Simulation time = 2.3438 sec Time/time step = 3.1097 sec Average time/time step = 5.1905 sec U MAX = 8.745386602467303E-04 V MAX = 8.745386602467303E-04 W MAX = 9.171554515245112E-10 U MIN = -8.745386602467303E-04 V MIN = -8.745386602467303E-04 W MIN = -1.472612170301261E-02 U MAX = 8.745386602467303E-04 V MAX = 8.745386602467303E-04 W MAX = 1.472612170301261E-02 max(|divU|) = 6.271064239595950E-02 sum(divU*dV) = 5.434308178504297E-17 sum(|divU|*dV) = 4.668591125631307E-02 Convective cfl = 1.054412737504388E-01 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.786033133991950E-01 Iterations to convergence = 2 ---------------------------------------- SUMMARY ---------------------------------------- Setup time = 0.0290 min Initialization time = 0.0021 min Processing time = 5.1905 min Post-processing time = 0.0000 min Total simulation time = 5.2216 min Processing time per time step = 5.1905 sec Total number of time steps = 60 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./hit on a interlagos-64idx-gnu-dbg named nid21503 with 1 processor, by Unknown Thu May 15 13:14:15 2014 Using Petsc Release Version 3.4.2, Jul, 02, 2013 Max Max/Min Avg Total Time (sec): 3.135e+02 1.00000 3.135e+02 Objects: 1.264e+03 1.00000 1.264e+03 Flops: 1.133e+10 1.00000 1.133e+10 1.133e+10 Flops/sec: 3.614e+07 1.00000 3.614e+07 3.614e+07 Memory: 1.210e+08 1.00000 1.210e+08 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 2.820e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 3.1346e+02 100.0% 1.1329e+10 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 2.819e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. 
# # # ########################################################## Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKer 1 1.0 5.9605e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ThreadCommBarrie 1 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 20 1.0 1.8802e-02 1.0 1.57e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 835 VecTDot 790 1.0 2.0298e-01 1.0 2.07e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1020 VecNorm 980 1.0 1.0135e-01 1.0 2.54e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 2509 VecScale 3230 1.0 4.8646e-01 1.0 2.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 474 VecCopy 1314 1.0 2.0094e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 3909 1.0 2.9169e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 7196 1.0 1.2969e+00 1.0 1.12e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 10 0 0 0 0 10 0 0 0 864 VecAYPX 7253 1.0 4.8285e+00 1.0 7.25e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 150 VecMAXPY 22 1.0 3.6978e-02 1.0 1.86e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 502 VecAssemblyBegin 62 1.0 1.8764e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 62 1.0 1.7571e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 4834 1.0 2.3412e+00 1.0 3.45e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 3 0 0 0 147 VecScatterBegin 120 1.0 1.4015e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSetRandom 2 1.0 1.5882e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 22 1.0 1.3849e-02 1.0 4.71e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 340 MatMult 5729 1.0 4.0637e+01 1.0 7.52e+09 1.0 0.0e+00 0.0e+00 0.0e+00 13 66 0 0 0 13 66 0 0 0 185 MatMultAdd 802 1.0 2.9998e+00 1.0 4.08e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 136 MatMultTranspose 802 1.0 3.5211e+00 1.0 4.08e+08 1.0 0.0e+00 0.0e+00 8.0e+02 1 4 0 0 3 1 4 0 0 3 116 MatSolve 401 1.0 6.9252e-02 1.0 1.35e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 194 MatLUFactorSym 1 1.0 2.0640e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 9.8920e-03 1.0 1.08e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 109 MatConvert 2 1.0 3.3493e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 6 1.0 2.2011e-02 1.0 3.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 165 MatAssemblyBegin 79 1.0 3.0279e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 79 1.0 5.6627e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRow 428211 1.0 1.1904e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 1.1492e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 5.8603e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 2 1.0 7.4472e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 360 1.0 5.0706e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 2 1.0 7.3698e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMult 2 1.0 2.6457e-01 1.0 3.13e+06 1.0 0.0e+00 0.0e+00 1.2e+01 0 0 
0 0 0 0 0 0 0 0 12 MatMatMultSym 2 1.0 2.0506e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 MatMatMultNum 2 1.0 5.9428e-02 1.0 3.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 53 MatPtAP 2 1.0 8.5463e-01 1.0 1.91e+07 1.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 22 MatPtAPSymbolic 2 1.0 2.9844e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 MatPtAPNumeric 2 1.0 5.5616e-01 1.0 1.91e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 34 MatTrnMatMult 2 1.0 2.5088e+00 1.0 3.99e+07 1.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 0 1 0 0 0 0 16 MatGetSymTrans 4 1.0 5.5090e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 20 1.0 5.0538e-02 1.0 3.14e+07 1.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 0 0 0 0 0 0 621 KSPSetUp 8 1.0 5.7340e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 60 1.0 1.6715e+02 1.0 1.13e+10 1.0 0.0e+00 0.0e+00 2.7e+04 53100 0 0 96 53100 0 0 96 68 PCSetUp 2 1.0 2.2745e+01 1.0 1.32e+08 1.0 0.0e+00 0.0e+00 5.8e+02 7 1 0 0 2 7 1 0 0 2 6 PCSetUpOnBlocks 401 1.0 1.4589e-02 1.0 1.08e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 74 PCApply 401 1.0 1.1839e+02 1.0 8.85e+09 1.0 0.0e+00 0.0e+00 2.1e+04 38 78 0 0 75 38 78 0 0 75 75 PCGAMGgraph_AGG 2 1.0 5.8934e+00 1.0 2.62e+06 1.0 0.0e+00 0.0e+00 3.0e+01 2 0 0 0 0 2 0 0 0 0 0 PCGAMGcoarse_AGG 2 1.0 2.7180e+00 1.0 3.99e+07 1.0 0.0e+00 0.0e+00 3.4e+01 1 0 0 0 0 1 0 0 0 0 15 PCGAMGProl_AGG 2 1.0 1.1115e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 4 0 0 0 0 4 0 0 0 0 0 PCGAMGPOpt_AGG 2 1.0 2.1251e+00 1.0 6.98e+07 1.0 0.0e+00 0.0e+00 3.5e+02 1 1 0 0 1 1 1 0 0 1 33 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Vector 1093 1093 1124012864 0 Vector Scatter 2 2 1384 0 Matrix 16 16 80634240 0 Matrix Coarsen 2 2 1368 0 Matrix Null Space 120 120 78240 0 Distributed Mesh 1 1 2409608 0 Bipartite Graph 2 2 1712 0 Index Set 10 10 1206832 0 IS L to G Mapping 1 1 1202876 0 Krylov Solver 7 7 85064 0 Preconditioner 7 7 7752 0 Viewer 1 0 0 0 PetscRandom 2 2 1344 0 ======================================================================================================================== Average time to get PetscTime(): 1.90735e-07 #PETSc Option Table entries: -finput input_droplet.txt -ksp_converged_reason -ksp_monitor_true_residual -ksp_view -log_summary -options_left -pc_gamg_agg_nsmooths 1 -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure run at: Wed May 14 12:49:45 2014 Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/gfortran64/lib -lacml" --with-x="0 " --with-debugging="1 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-dynamic-loading="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " PETSC_ARCH=interlagos-64idx-gnu-dbg ----------------------------------------- Libraries compiled on Wed May 14 12:49:45 2014 on h2ologin2 Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64 Using PETSc directory: /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2 Using PETSc arch: interlagos-64idx-gnu-dbg ----------------------------------------- Using C compiler: cc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -fno-inline -O0 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: ftn -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/lib -lsuperlu_dist_3.3 -L/opt/acml/5.3.0/gfortran64/lib -lacml -lpthread -lparmetis -lmetis -ldl ----------------------------------------- #PETSc Option Table entries: -ksp_converged_reason 
-ksp_monitor_true_residual -ksp_view -log_summary -options_left -pc_gamg_agg_nsmooths 1 -pc_type gamg #End of PETSc Option Table entries There is no unused database option. Application 4423330 resources: utime ~282s, stime ~33s, Rss ~206632, inblocks ~23827, outblocks ~77441 From mairhofer at itt.uni-stuttgart.de Fri May 16 02:59:58 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Fri, 16 May 2014 09:59:58 +0200 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: References: <5374D184.5070207@itt.uni-stuttgart.de> <87ha4r3xtq.fsf@jedbrown.org> <5374F1CC.3080906@itt.uni-stuttgart.de> Message-ID: <5375C57E.4070001@itt.uni-stuttgart.de> I tried to use ISColoringValue, but when I include IScoloringValue colors into my code, I get an error message from the compiler(gfortran): ISColoringValue colors 1 error: unclassifiable statement at (1) I'm including these header files, am I missing one? #include #include #include #include #include #include #include #include #include #include Thank you for your fast responses! Am 15.05.2014 19:16, schrieb Peter Brune: > You should be using an array of type ISColoringValue. ISColoringValue > is by default a short, not an int, so you're getting nonsense entries. > We should either maintain or remove ex5s if it does something like this. > > - Peter > > > On Thu, May 15, 2014 at 11:56 AM, Jonas Mairhofer > > wrote: > > > If 'colors' can be a dynamically allocated array then I dont know > where > the mistake is in this code: > > > > > > ISColoring iscoloring > Integer, allocatable :: colors(:) > PetscInt maxc > > ... > > > !calculate max. number of colors > maxc = 2*irc+1 !irc is the number of ghost nodes needed to > calculate the function I want to solve > > allocate(colors(user%xm)) !where user%xm is the number of > locally > owned nodes of a global array > > !Set colors > DO i=1,user%xm > colors(i) = mod(i,maxc) > END DO > > call > ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) > > ... > > deallocate(colors) > call ISColoringDestroy(iscoloring,ierr) > > > > > On execution I get the following error message (running the DO > Loop from > 0 to user%xm-1 does not change anything): > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Arguments are incompatible! > [0]PETSC ERROR: Number of colors passed in 291 is less then the actual > number of colors in array 61665! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named > aries.itt.uni-stuttgart.de by > mhofer Thu May 15 18:01:41 2014 > [0]PETSC ERROR: Libraries linked from > /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack --download-mpich > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ISColoringCreate() line 276 in > /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c > > > > > > But when I print out colors, it only has entries from 0 to 218, so no > entry is larger then 291 as stated in the error message. > > > > > > > > > > > Am 15.05.2014 16:45, schrieb Jed Brown: > > Jonas Mairhofer > writes: > > Hi, I'm trying to set the coloring of a matrix using > ISColoringCreate. > Therefore I need an array 'colors' which in C can be > creates as (from > example ex5s.c) > > int *colors > PetscMalloc(...,&colors) > > There is no PetscMalloc in Fortran, due to language > "deficiencies". > > colors(i) = .... > > ISColoringCreate(...) > > How do I have to define the array colors in Fortran? > > I tried: > > Integer, allocatable :: colors(:) and allocate() > instead of > PetscMalloc > > and > > Integer, pointer :: colors > > but neither worked. > > The ISColoringCreate Fortran binding copies from the array you > pass into > one allocated using PetscMalloc. You should pass a normal > Fortran array > (statically or dynamically allocated). > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliver.browne at upm.es Fri May 16 03:39:55 2014 From: oliver.browne at upm.es (Oliver Browne) Date: Fri, 16 May 2014 10:39:55 +0200 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: <0CC4B85D-12C2-4C4B-BEFE-6E42437101E3@mcs.anl.gov> References: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> <373F07AB-A56A-4A9A-BD2E-F854FEC16D50@mcs.anl.gov> <4ede9bde820986a689a2ba2fcb6291db@upm.es> <0CC4B85D-12C2-4C4B-BEFE-6E42437101E3@mcs.anl.gov> Message-ID: <8ae9445ef0992d6905969309fd6fbb56@upm.es> >> >> >> On 14-05-2014 17:34, Barry Smith wrote: >>> See the manual page for MatMPIAIJSetPreallocationCSR() it gives an >>> explicit simple example >>> The format which is used for the sparse matrix input, is >>> equivalent to a >>> row-major ordering.. i.e for the following matrix, the input data >>> expected is >>> as shown: >>> 1 0 0 >>> 2 0 3 P0 >>> ------- >>> 4 5 6 P1 >>> Process0 [P0]: rows_owned=[0,1] >>> i = {0,1,3} [size = nrow+1 = 2+1] >>> j = {0,0,2} [size = nz = 6] >>> v = {1,2,3} [size = nz = 6] >>> Process1 [P1]: rows_owned=[2] >>> i = {0,3} [size = nrow+1 = 1+1] >>> j = {0,1,2} [size = nz = 6] >>> v = {4,5,6} [size = nz = 6] >>> The column indices are global, the numerical values are just >>> numerical values and do not need to be adjusted. On each process the >>> i >>> indices start with 0 because they just point into the local part of >>> the j indices. >>> Are you saying each process of yours HAS the entire matrix? >> >> I am not entirely sure about this and what it means. Each processor >> has a portion of the matrix. 
>> >> >> If so >>> you just need to adjust the local portion of the i vales and pass >>> that >>> plus the appropriate location in j and v to the routine as in the >>> example above. >> >> So this MatMPIAIJSetPreallocationCSR call should be in some sort of >> loop; >> >> Do counter = 1, No of Processors >> >> calculate local numbering for i and isolate parts of j and v needed >> >> Call MatMPIAIJSetPreallocationCSR(A,i,j,v) >> >> END DO >> >> Is this correct? > > Oh boy, oh boy. No absolutely not. Each process is calling > MatMPIAIJSetPreallocationCSR() once with its part of the data. > > Barry so using the above example I should CALL MatMPIAIJSetPreallocationCSR(A,i,j,v) where i = [0, 1, 3, 6] j = [0, 0, 2, 0 , 1 , 2] v = [1,2,3,4,5,6] Ollie > >> >> Ollie >> >>> Barry >>> On May 14, 2014, at 8:36 AM, Oliver Browne >>> wrote: >>>> On 14-05-2014 15:27, Barry Smith wrote: >>>>> On May 14, 2014, at 7:42 AM, Oliver Browne >>>>> wrote: >>>>>> Hi, >>>>>> I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, >>>>>> columns and values for my matrix (efficiency). I have my 3 vectors >>>>>> in CSR format. If I run on a single processor, with my test case, >>>>>> everything works fine. I also worked without >>>>>> MatMPIAIJSetPreallocationCSR, and individually input each value >>>>>> with the call MatSetValues in MPI and this also works fine. >>>>>> If I want to use MatMPIAIJSetPreallocationCSR, do I need to >>>>>> separate the vectors for each processor as they have done here; >>>>> What do you mean by ?separate? the vectors? Each processor >>>>> needs to provide ITS rows to the function call. You cannot have >>>>> processor zero deliver all the rows. >>>> I mean split them so they change from global numbering to local >>>> numbering. >>>> At the moment I just have >>>> CALL MatMPIAIJSetPreallocationCSR(A,NVPN,NNVI,CONT,ierr) - 3 vectors >>>> have global numbering >>>> How can submit this to a specific processor? >>>> Ollie >>>>> Barry >>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR >>>>>> Thanks in advance, >>>>>> Ollie From bsmith at mcs.anl.gov Fri May 16 07:12:25 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 07:12:25 -0500 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: <8ae9445ef0992d6905969309fd6fbb56@upm.es> References: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> <373F07AB-A56A-4A9A-BD2E-F854FEC16D50@mcs.anl.gov> <4ede9bde820986a689a2ba2fcb6291db@upm.es> <0CC4B85D-12C2-4C4B-BEFE-6E42437101E3@mcs.anl.gov> <8ae9445ef0992d6905969309fd6fbb56@upm.es> Message-ID: No, process zero calls with > i = [0, 1, 3 > j = [0, 0, 2] > v = [1,2,3] and process one calls with > i = [0, 3] > j = [ 0 , 1 , 2] > v = [4,5,6] this is how MPI works, each process figures out its own information and calls appropriate routines with its own values. Barry On May 16, 2014, at 3:39 AM, Oliver Browne wrote: > >>> On 14-05-2014 17:34, Barry Smith wrote: >>>> See the manual page for MatMPIAIJSetPreallocationCSR() it gives an >>>> explicit simple example >>>> The format which is used for the sparse matrix input, is equivalent to a >>>> row-major ordering.. 
i.e for the following matrix, the input data >>>> expected is >>>> as shown: >>>> 1 0 0 >>>> 2 0 3 P0 >>>> ------- >>>> 4 5 6 P1 >>>> Process0 [P0]: rows_owned=[0,1] >>>> i = {0,1,3} [size = nrow+1 = 2+1] >>>> j = {0,0,2} [size = nz = 6] >>>> v = {1,2,3} [size = nz = 6] >>>> Process1 [P1]: rows_owned=[2] >>>> i = {0,3} [size = nrow+1 = 1+1] >>>> j = {0,1,2} [size = nz = 6] >>>> v = {4,5,6} [size = nz = 6] >>>> The column indices are global, the numerical values are just >>>> numerical values and do not need to be adjusted. On each process the i >>>> indices start with 0 because they just point into the local part of >>>> the j indices. >>>> Are you saying each process of yours HAS the entire matrix? >>> I am not entirely sure about this and what it means. Each processor has a portion of the matrix. >>> If so >>>> you just need to adjust the local portion of the i vales and pass that >>>> plus the appropriate location in j and v to the routine as in the >>>> example above. >>> So this MatMPIAIJSetPreallocationCSR call should be in some sort of loop; >>> Do counter = 1, No of Processors >>> calculate local numbering for i and isolate parts of j and v needed >>> Call MatMPIAIJSetPreallocationCSR(A,i,j,v) >>> END DO >>> Is this correct? >> Oh boy, oh boy. No absolutely not. Each process is calling >> MatMPIAIJSetPreallocationCSR() once with its part of the data. >> Barry > > > so using the above example > > I should > > CALL MatMPIAIJSetPreallocationCSR(A,i,j,v) > > where > > i = [0, 1, 3, 6] > j = [0, 0, 2, 0 , 1 , 2] > v = [1,2,3,4,5,6] > > Ollie > > >>> Ollie >>>> Barry >>>> On May 14, 2014, at 8:36 AM, Oliver Browne wrote: >>>>> On 14-05-2014 15:27, Barry Smith wrote: >>>>>> On May 14, 2014, at 7:42 AM, Oliver Browne wrote: >>>>>>> Hi, >>>>>>> I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. >>>>>>> If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; >>>>>> What do you mean by ?separate? the vectors? Each processor >>>>>> needs to provide ITS rows to the function call. You cannot have >>>>>> processor zero deliver all the rows. >>>>> I mean split them so they change from global numbering to local numbering. >>>>> At the moment I just have >>>>> CALL MatMPIAIJSetPreallocationCSR(A,NVPN,NNVI,CONT,ierr) - 3 vectors have global numbering >>>>> How can submit this to a specific processor? >>>>> Ollie >>>>>> Barry >>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR >>>>>>> Thanks in advance, >>>>>>> Ollie From dario.isola at newmerical.com Fri May 16 10:46:33 2014 From: dario.isola at newmerical.com (Dario Isola) Date: Fri, 16 May 2014 11:46:33 -0400 Subject: [petsc-users] hypre support Message-ID: <537632D9.2090603@newmerical.com> Dear all, I am investigating the use of hypre+petsc. I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. I tried to run it with the following options -pc_type hypre -pc_type_hypre pilut -ksp_type richardson and, although he did not complain, it does not solve the system either. To what extent is hypre supported by petsc? 
More specifically, what kind of matrices? I am using a *baij*//matrix. Thanks in advance, D -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 16 10:49:55 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 10:49:55 -0500 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: <5375C57E.4070001@itt.uni-stuttgart.de> References: <5374D184.5070207@itt.uni-stuttgart.de> <87ha4r3xtq.fsf@jedbrown.org> <5374F1CC.3080906@itt.uni-stuttgart.de> <5375C57E.4070001@itt.uni-stuttgart.de> Message-ID: <5B38C1F8-60E6-44B0-8268-F23563C437B4@mcs.anl.gov> Sorry it is missing in the fortran includes. You can use a short unsigned integer (16 bit) to represent it. I don?t know how that it is indicated in Fortran but a Fortran programmer would know. Request-assigned: Satish, please add ISColoringValue to Fortran include; note that its value is configure assigned on the C side. Barry On May 16, 2014, at 2:59 AM, Jonas Mairhofer wrote: > > > I tried to use ISColoringValue, but when I include > > IScoloringValue colors > > > into my code, I get an error message from the compiler(gfortran): > > > ISColoringValue colors > 1 > error: unclassifiable statement at (1) > > I'm including these header files, am I missing one? > > > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > > > Thank you for your fast responses! > > > > Am 15.05.2014 19:16, schrieb Peter Brune: >> You should be using an array of type ISColoringValue. ISColoringValue is by default a short, not an int, so you're getting nonsense entries. We should either maintain or remove ex5s if it does something like this. >> >> - Peter >> >> >> On Thu, May 15, 2014 at 11:56 AM, Jonas Mairhofer wrote: >> >> If 'colors' can be a dynamically allocated array then I dont know where >> the mistake is in this code: >> >> >> >> >> >> ISColoring iscoloring >> Integer, allocatable :: colors(:) >> PetscInt maxc >> >> ... >> >> >> !calculate max. number of colors >> maxc = 2*irc+1 !irc is the number of ghost nodes needed to >> calculate the function I want to solve >> >> allocate(colors(user%xm)) !where user%xm is the number of locally >> owned nodes of a global array >> >> !Set colors >> DO i=1,user%xm >> colors(i) = mod(i,maxc) >> END DO >> >> call >> ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) >> >> ... >> >> deallocate(colors) >> call ISColoringDestroy(iscoloring,ierr) >> >> >> >> >> On execution I get the following error message (running the DO Loop from >> 0 to user%xm-1 does not change anything): >> >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Arguments are incompatible! >> [0]PETSC ERROR: Number of colors passed in 291 is less then the actual >> number of colors in array 61665! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named >> aries.itt.uni-stuttgart.de by mhofer Thu May 15 18:01:41 2014 >> [0]PETSC ERROR: Libraries linked from >> /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib >> [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran >> --download-f-blas-lapack --download-mpich >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ISColoringCreate() line 276 in >> /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c >> >> >> >> >> >> But when I print out colors, it only has entries from 0 to 218, so no >> entry is larger then 291 as stated in the error message. >> >> >> >> >> >> >> >> >> >> >> Am 15.05.2014 16:45, schrieb Jed Brown: >> >> Jonas Mairhofer writes: >> >> Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. >> Therefore I need an array 'colors' which in C can be creates as (from >> example ex5s.c) >> >> int *colors >> PetscMalloc(...,&colors) >> There is no PetscMalloc in Fortran, due to language "deficiencies". >> >> colors(i) = .... >> >> ISColoringCreate(...) >> >> How do I have to define the array colors in Fortran? >> >> I tried: >> >> Integer, allocatable :: colors(:) and allocate() instead of >> PetscMalloc >> >> and >> >> Integer, pointer :: colors >> >> but neither worked. >> The ISColoringCreate Fortran binding copies from the array you pass into >> one allocated using PetscMalloc. You should pass a normal Fortran array >> (statically or dynamically allocated). >> >> > From jed at jedbrown.org Fri May 16 10:50:11 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 16 May 2014 09:50:11 -0600 Subject: [petsc-users] hypre support In-Reply-To: <537632D9.2090603@newmerical.com> References: <537632D9.2090603@newmerical.com> Message-ID: <87y4y1wwng.fsf@jedbrown.org> Dario Isola writes: > Dear all, > > I am investigating the use of hypre+petsc. I was able to successfully > configure, install, compile petsc 3.3 with the external package for hypre. > > I tried to run it with the following options > > -pc_type hypre -pc_type_hypre pilut -ksp_type richardson > > and, although he did not complain, it does not solve the system either. There is no reason pilut (which is deprecated; Hypre recommends using euclid) can be expected to create a contractive Richardson iteration. You should probably use Krylov (perhaps GMRES, the default), which will also fix the scaling. Use -ksp_monitor_true_residual -ksp_converged_reason while debugging/tuning the solver. > To what extent is hypre supported by petsc? More specifically, what kind > of matrices? I am using a *baij*//matrix. AIJ and BAIJ work with Hypre. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Fri May 16 10:54:19 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 10:54:19 -0500 Subject: [petsc-users] hypre support In-Reply-To: <537632D9.2090603@newmerical.com> References: <537632D9.2090603@newmerical.com> Message-ID: On May 16, 2014, at 10:46 AM, Dario Isola wrote: > Dear all, > > I am investigating the use of hypre+petsc. 
I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. > > I tried to run it with the following options > -pc_type hypre -pc_type_hypre pilut -ksp_type richardson > and, although he did not complain, it does not solve the system either. Do you meaning it did not converge? At first always run with -ksp_view (or -snes_view if using snes or -ts_view if using ts) and -ksp_monitor_true_residual to see what is going on. > -pc_type_hypre pilut is wrong it is -pc_hypre_type pilut Note that pilut will generally not work with Richardson you need a ?real? Krylov method like GMRES. Also the ilu type preconditioners don?t scale particularly well though occasionally they can be fine. > > To what extent is hypre supported by petsc? More specifically, what kind of matrices? If it cannot handle the matrix type it would give an error message. Hypre uses a format like AIJ so you should use AIJ. Note that you can make the matrix type a runtime option so you don?t have to compile in that it is BAIJ. > I am using a baij matrix. > > Thanks in advance, > D From dario.isola at newmerical.com Fri May 16 11:49:40 2014 From: dario.isola at newmerical.com (Dario Isola) Date: Fri, 16 May 2014 12:49:40 -0400 Subject: [petsc-users] hypre support In-Reply-To: References: <537632D9.2090603@newmerical.com> Message-ID: <537641A4.3050907@newmerical.com> Thanks a lot for your answers. I ran it with -ksp_type gmres -pc_type hypre -pc_hypre_type euclid and it worked very well. Thanks. I then tried to use boomeramg as a preconditioner coupled with Richardson but I was not successful, it failed to solve the system and returned nans. -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_relax_type_all SOR/Jacobi -pc_hypre_boomeramg_print_debug -ksp_view -ksp_monitor_true_residual and i got the following ===== Proc = 0 Level = 0 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 18308 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 7 nc_offd = 0 ===== Proc = 0 Level = 1 ===== Proc = 0 Coarsen 1st pass = 0.010000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 8725 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 16 nc_offd = 0 ===== Proc = 0 Level = 2 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 4721 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 Proc = 0 iter 3 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 ===== Proc = 0 Level = 3 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 2495 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 22 nc_offd = 0 Proc = 0 iter 3 comm. 
and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 ===== Proc = 0 Level = 4 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 1337 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 Proc = 0 iter 3 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 ===== Proc = 0 Level = 5 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 695 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 3 nc_offd = 0 ===== Proc = 0 Level = 6 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 343 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 21 nc_offd = 0 Proc = 0 iter 3 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 ===== Proc = 0 Level = 7 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 174 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 15 nc_offd = 0 Proc = 0 iter 3 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 ===== Proc = 0 Level = 8 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 81 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 ===== Proc = 0 Level = 9 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 37 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 6 nc_offd = 0 ===== Proc = 0 Level = 10 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. 
and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 11 nc_offd = 0 0 KSP preconditioned resid norm 7.299769365830e+14 true resid norm 8.197927963033e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.319459389445e+28 true resid norm 6.152576199945e+12 ||r(i)||/||b|| 7.505038136086e+14 KSP Object: 1 MPI processes type: richardson Richardson: damping factor=1 maximum iterations=90, initial guess is zero tolerances: relative=0.1, absolute=1e-50, divergence=100000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.25 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down SOR/Jacobi HYPRE BoomerAMG: Relax up SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type Falgout HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 Do you guys have any suggestion? Is it possible that I am haven't initialized boomeramg properly? Or it is just my system equations that can not be solved by AMG? Sincerely, Dario On 05/16/2014 11:54 AM, Barry Smith wrote: > On May 16, 2014, at 10:46 AM, Dario Isola wrote: > >> Dear all, >> >> I am investigating the use of hypre+petsc. I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. >> >> I tried to run it with the following options >> -pc_type hypre -pc_type_hypre pilut -ksp_type richardson >> and, although he did not complain, it does not solve the system either. > Do you meaning it did not converge? At first always run with -ksp_view (or -snes_view if using snes or -ts_view if using ts) and -ksp_monitor_true_residual to see what is going on. > >> -pc_type_hypre pilut > is wrong it is -pc_hypre_type pilut > > Note that pilut will generally not work with Richardson you need a ?real? Krylov method like GMRES. > > Also the ilu type preconditioners don?t scale particularly well though occasionally they can be fine. > >> To what extent is hypre supported by petsc? More specifically, what kind of matrices? > If it cannot handle the matrix type it would give an error message. Hypre uses a format like AIJ so you should use AIJ. Note that you can make the matrix type a runtime option so you don?t have to compile in that it is BAIJ. > > >> I am using a baij matrix. >> >> Thanks in advance, >> D -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Fri May 16 12:56:49 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 12:56:49 -0500 Subject: [petsc-users] hypre support In-Reply-To: <537641A4.3050907@newmerical.com> References: <537632D9.2090603@newmerical.com> <537641A4.3050907@newmerical.com> Message-ID: <98D9D6B3-D03E-4254-9634-FA3D229918DD@mcs.anl.gov> Algebraic multigrid is not for everything. On May 16, 2014, at 11:49 AM, Dario Isola wrote: > Thanks a lot for your answers. > > I ran it with > -ksp_type gmres -pc_type hypre -pc_hypre_type euclid > and it worked very well. Thanks. > > I then tried to use boomeramg as a preconditioner coupled with Richardson but I was not successful, it failed to solve the system and returned nans. > -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_relax_type_all SOR/Jacobi -pc_hypre_boomeramg_print_debug -ksp_view -ksp_monitor_true_residual > and i got the following > > ===== Proc = 0 Level = 0 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 18308 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 7 nc_offd = 0 > > ===== Proc = 0 Level = 1 ===== > Proc = 0 Coarsen 1st pass = 0.010000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 8725 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 16 nc_offd = 0 > > ===== Proc = 0 Level = 2 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 4721 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 > > ===== Proc = 0 Level = 3 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 2495 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 22 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 > > ===== Proc = 0 Level = 4 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 1337 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 > > ===== Proc = 0 Level = 5 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 695 nc_offd = 0 > Proc = 0 iter 2 comm. 
and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 3 nc_offd = 0 > > ===== Proc = 0 Level = 6 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 343 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 21 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 > > ===== Proc = 0 Level = 7 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 174 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 15 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 > > ===== Proc = 0 Level = 8 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 81 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 > > ===== Proc = 0 Level = 9 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 37 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 6 nc_offd = 0 > > ===== Proc = 0 Level = 10 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. 
and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 11 nc_offd = 0 > > > 0 KSP preconditioned resid norm 7.299769365830e+14 true resid norm 8.197927963033e-03 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.319459389445e+28 true resid norm 6.152576199945e+12 ||r(i)||/||b|| 7.505038136086e+14 > KSP Object: 1 MPI processes > type: richardson > Richardson: damping factor=1 > maximum iterations=90, initial guess is zero > tolerances: relative=0.1, absolute=1e-50, divergence=100000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down SOR/Jacobi > HYPRE BoomerAMG: Relax up SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type Falgout > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > Do you guys have any suggestion? Is it possible that I am haven't initialized boomeramg properly? Or it is just my system equations that can not be solved by AMG? > > Sincerely, > Dario > > > > > On 05/16/2014 11:54 AM, Barry Smith wrote: >> On May 16, 2014, at 10:46 AM, Dario Isola >> wrote: >> >> >>> Dear all, >>> >>> I am investigating the use of hypre+petsc. I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. >>> >>> I tried to run it with the following options >>> -pc_type hypre -pc_type_hypre pilut -ksp_type richardson >>> and, although he did not complain, it does not solve the system either. >>> >> Do you meaning it did not converge? At first always run with -ksp_view (or -snes_view if using snes or -ts_view if using ts) and -ksp_monitor_true_residual to see what is going on. >> >> >>> -pc_type_hypre pilut >>> >> is wrong it is -pc_hypre_type pilut >> >> Note that pilut will generally not work with Richardson you need a ?real? Krylov method like GMRES. >> >> Also the ilu type preconditioners don?t scale particularly well though occasionally they can be fine. >> >> >>> To what extent is hypre supported by petsc? More specifically, what kind of matrices? >>> >> If it cannot handle the matrix type it would give an error message. Hypre uses a format like AIJ so you should use AIJ. Note that you can make the matrix type a runtime option so you don?t have to compile in that it is BAIJ. >> >> >> >>> I am using a baij matrix. 
>>> >>> Thanks in advance, >>> D >>> > From dario.isola at newmerical.com Fri May 16 13:55:29 2014 From: dario.isola at newmerical.com (Dario Isola) Date: Fri, 16 May 2014 14:55:29 -0400 Subject: [petsc-users] hypre support In-Reply-To: <98D9D6B3-D03E-4254-9634-FA3D229918DD@mcs.anl.gov> References: <537632D9.2090603@newmerical.com> <537641A4.3050907@newmerical.com> <98D9D6B3-D03E-4254-9634-FA3D229918DD@mcs.anl.gov> Message-ID: <53765F21.5080809@newmerical.com> I was eventually able to make it run adopting a very small time-step (Courant number of about 1). So either my problem is not well solved by AMG, as you said, or I am not using it very well. But I guess I should be able to take it from there. Thanks again for the support! Dario On 05/16/2014 01:56 PM, Barry Smith wrote: > Algebraic multigrid is not for everything. > > On May 16, 2014, at 11:49 AM, Dario Isola wrote: > >> Thanks a lot for your answers. >> >> I ran it with >> -ksp_type gmres -pc_type hypre -pc_hypre_type euclid >> and it worked very well. Thanks. >> >> I then tried to use boomeramg as a preconditioner coupled with Richardson but I was not successful, it failed to solve the system and returned nans. >> -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_relax_type_all SOR/Jacobi -pc_hypre_boomeramg_print_debug -ksp_view -ksp_monitor_true_residual >> and i got the following >> >> ===== Proc = 0 Level = 0 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 18308 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 7 nc_offd = 0 >> >> ===== Proc = 0 Level = 1 ===== >> Proc = 0 Coarsen 1st pass = 0.010000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 8725 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 16 nc_offd = 0 >> >> ===== Proc = 0 Level = 2 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 4721 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 >> >> ===== Proc = 0 Level = 3 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 2495 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 22 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 >> >> ===== Proc = 0 Level = 4 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 1337 nc_offd = 0 >> Proc = 0 iter 2 comm. 
and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >> >> ===== Proc = 0 Level = 5 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 695 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 3 nc_offd = 0 >> >> ===== Proc = 0 Level = 6 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 343 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 21 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >> >> ===== Proc = 0 Level = 7 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 174 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 15 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >> >> ===== Proc = 0 Level = 8 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 81 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >> >> ===== Proc = 0 Level = 9 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 37 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 6 nc_offd = 0 >> >> ===== Proc = 0 Level = 10 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. 
and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 11 nc_offd = 0 >> >> >> 0 KSP preconditioned resid norm 7.299769365830e+14 true resid norm 8.197927963033e-03 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 2.319459389445e+28 true resid norm 6.152576199945e+12 ||r(i)||/||b|| 7.505038136086e+14 >> KSP Object: 1 MPI processes >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=90, initial guess is zero >> tolerances: relative=0.1, absolute=1e-50, divergence=100000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: hypre >> HYPRE BoomerAMG preconditioning >> HYPRE BoomerAMG: Cycle type V >> HYPRE BoomerAMG: Maximum number of levels 25 >> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 >> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 >> HYPRE BoomerAMG: Threshold for strong coupling 0.25 >> HYPRE BoomerAMG: Interpolation truncation factor 0 >> HYPRE BoomerAMG: Interpolation: max elements per row 0 >> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 >> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 >> HYPRE BoomerAMG: Maximum row sums 0.9 >> HYPRE BoomerAMG: Sweeps down 1 >> HYPRE BoomerAMG: Sweeps up 1 >> HYPRE BoomerAMG: Sweeps on coarse 1 >> HYPRE BoomerAMG: Relax down SOR/Jacobi >> HYPRE BoomerAMG: Relax up SOR/Jacobi >> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination >> HYPRE BoomerAMG: Relax weight (all) 1 >> HYPRE BoomerAMG: Outer relax weight (all) 1 >> HYPRE BoomerAMG: Using CF-relaxation >> HYPRE BoomerAMG: Measure type local >> HYPRE BoomerAMG: Coarsen type Falgout >> HYPRE BoomerAMG: Interpolation type classical >> linear system matrix = precond matrix: >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 >> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> Do you guys have any suggestion? Is it possible that I am haven't initialized boomeramg properly? Or it is just my system equations that can not be solved by AMG? >> >> Sincerely, >> Dario >> >> >> >> >> On 05/16/2014 11:54 AM, Barry Smith wrote: >>> On May 16, 2014, at 10:46 AM, Dario Isola >>> wrote: >>> >>> >>>> Dear all, >>>> >>>> I am investigating the use of hypre+petsc. I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. >>>> >>>> I tried to run it with the following options >>>> -pc_type hypre -pc_type_hypre pilut -ksp_type richardson >>>> and, although he did not complain, it does not solve the system either. >>>> >>> Do you meaning it did not converge? At first always run with -ksp_view (or -snes_view if using snes or -ts_view if using ts) and -ksp_monitor_true_residual to see what is going on. >>> >>> >>>> -pc_type_hypre pilut >>>> >>> is wrong it is -pc_hypre_type pilut >>> >>> Note that pilut will generally not work with Richardson you need a ?real? Krylov method like GMRES. >>> >>> Also the ilu type preconditioners don?t scale particularly well though occasionally they can be fine. >>> >>> >>>> To what extent is hypre supported by petsc? More specifically, what kind of matrices? >>>> >>> If it cannot handle the matrix type it would give an error message. Hypre uses a format like AIJ so you should use AIJ. Note that you can make the matrix type a runtime option so you don?t have to compile in that it is BAIJ. >>> >>> >>> >>>> I am using a baij matrix. 
>>>> >>>> Thanks in advance, >>>> D >>>> From knepley at gmail.com Fri May 16 14:13:13 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 16 May 2014 14:13:13 -0500 Subject: [petsc-users] hypre support In-Reply-To: <53765F21.5080809@newmerical.com> References: <537632D9.2090603@newmerical.com> <537641A4.3050907@newmerical.com> <98D9D6B3-D03E-4254-9634-FA3D229918DD@mcs.anl.gov> <53765F21.5080809@newmerical.com> Message-ID: On Fri, May 16, 2014 at 1:55 PM, Dario Isola wrote: > I was eventually able to make it run adopting a very small time-step > (Courant number of about 1). > AMG is intended for elliptic systems. Matt > So either my problem is not well solved by AMG, as you said, or I am not > using it very well. > > But I guess I should be able to take it from there. > > Thanks again for the support! > > Dario > > > On 05/16/2014 01:56 PM, Barry Smith wrote: > >> Algebraic multigrid is not for everything. >> >> On May 16, 2014, at 11:49 AM, Dario Isola >> wrote: >> >> Thanks a lot for your answers. >>> >>> I ran it with >>> -ksp_type gmres -pc_type hypre -pc_hypre_type euclid >>> and it worked very well. Thanks. >>> >>> I then tried to use boomeramg as a preconditioner coupled with >>> Richardson but I was not successful, it failed to solve the system and >>> returned nans. >>> -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg >>> -pc_hypre_boomeramg_relax_type_all SOR/Jacobi -pc_hypre_boomeramg_print_debug >>> -ksp_view -ksp_monitor_true_residual >>> and i got the following >>> >>> ===== Proc = 0 Level = 0 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 18308 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 7 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 1 ===== >>> Proc = 0 Coarsen 1st pass = 0.010000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 8725 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 16 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 2 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 4721 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >>> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 3 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 2495 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 22 nc_offd = 0 >>> Proc = 0 iter 3 comm. 
and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 4 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 1337 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >>> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 5 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 695 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 3 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 6 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 343 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 21 nc_offd = 0 >>> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 7 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 174 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 15 nc_offd = 0 >>> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 8 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 81 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 9 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 37 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 6 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 10 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. 
and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 11 nc_offd = 0 >>> >>> >>> 0 KSP preconditioned resid norm 7.299769365830e+14 true resid norm >>> 8.197927963033e-03 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP preconditioned resid norm 2.319459389445e+28 true resid norm >>> 6.152576199945e+12 ||r(i)||/||b|| 7.505038136086e+14 >>> KSP Object: 1 MPI processes >>> type: richardson >>> Richardson: damping factor=1 >>> maximum iterations=90, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-50, divergence=100000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: hypre >>> HYPRE BoomerAMG preconditioning >>> HYPRE BoomerAMG: Cycle type V >>> HYPRE BoomerAMG: Maximum number of levels 25 >>> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 >>> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 >>> HYPRE BoomerAMG: Threshold for strong coupling 0.25 >>> HYPRE BoomerAMG: Interpolation truncation factor 0 >>> HYPRE BoomerAMG: Interpolation: max elements per row 0 >>> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 >>> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 >>> HYPRE BoomerAMG: Maximum row sums 0.9 >>> HYPRE BoomerAMG: Sweeps down 1 >>> HYPRE BoomerAMG: Sweeps up 1 >>> HYPRE BoomerAMG: Sweeps on coarse 1 >>> HYPRE BoomerAMG: Relax down SOR/Jacobi >>> HYPRE BoomerAMG: Relax up SOR/Jacobi >>> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination >>> HYPRE BoomerAMG: Relax weight (all) 1 >>> HYPRE BoomerAMG: Outer relax weight (all) 1 >>> HYPRE BoomerAMG: Using CF-relaxation >>> HYPRE BoomerAMG: Measure type local >>> HYPRE BoomerAMG: Coarsen type Falgout >>> HYPRE BoomerAMG: Interpolation type classical >>> linear system matrix = precond matrix: >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> Do you guys have any suggestion? Is it possible that I am haven't >>> initialized boomeramg properly? Or it is just my system equations that can >>> not be solved by AMG? >>> >>> Sincerely, >>> Dario >>> >>> >>> >>> >>> On 05/16/2014 11:54 AM, Barry Smith wrote: >>> >>>> On May 16, 2014, at 10:46 AM, Dario Isola >>>> wrote: >>>> >>>> >>>> Dear all, >>>>> >>>>> I am investigating the use of hypre+petsc. I was able to successfully >>>>> configure, install, compile petsc 3.3 with the external package for hypre. >>>>> >>>>> I tried to run it with the following options >>>>> -pc_type hypre -pc_type_hypre pilut -ksp_type richardson >>>>> and, although he did not complain, it does not solve the system either. >>>>> >>>>> Do you meaning it did not converge? At first always run with >>>> -ksp_view (or -snes_view if using snes or -ts_view if using ts) and >>>> -ksp_monitor_true_residual to see what is going on. >>>> >>>> >>>> -pc_type_hypre pilut >>>>> >>>>> is wrong it is -pc_hypre_type pilut >>>> >>>> Note that pilut will generally not work with Richardson you need a >>>> ?real? Krylov method like GMRES. >>>> >>>> Also the ilu type preconditioners don?t scale particularly well though >>>> occasionally they can be fine. >>>> >>>> >>>> To what extent is hypre supported by petsc? More specifically, what >>>>> kind of matrices? >>>>> >>>>> If it cannot handle the matrix type it would give an error >>>> message. Hypre uses a format like AIJ so you should use AIJ. 
Note that you >>>> can make the matrix type a runtime option so you don?t have to compile in >>>> that it is BAIJ. >>>> >>>> >>>> >>>> I am using a baij matrix. >>>>> >>>>> Thanks in advance, >>>>> D >>>>> >>>>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Fri May 16 18:21:23 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Sat, 17 May 2014 09:21:23 +1000 Subject: [petsc-users] SlepcInitialize not return Message-ID: Hi, I write a piece of code containing SLEPc calling and compile it into a .so library. Then I call this library in my program. I call MPI_Init in the main program, but it turns out that function SlepcInitialize in SLEPc .so library will not return, and the program stuck at that line. I don't know if this problem related to the place of MPI_Init calling, but the library works well if I only use one process and not initialize MPI outside. Guoxi -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 16 18:28:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 18:28:52 -0500 Subject: [petsc-users] SlepcInitialize not return In-Reply-To: References: Message-ID: <249234C0-A064-42A4-8E96-7AC5259C92C3@mcs.anl.gov> Reproduce this in a small about of code (sounds easy in this case) and then email the code out; your verbal descriptions is not detailed enough to determine the problem. Also send a make file that links the .so library, it could be a problem related to that. Barry On May 16, 2014, at 6:21 PM, ??? wrote: > Hi, > > I write a piece of code containing SLEPc calling and compile it into a .so library. Then I call this library in my program. I call MPI_Init in the main program, but it turns out that function SlepcInitialize in SLEPc .so library will not return, and the program stuck at that line. I don't know if this problem related to the place of MPI_Init calling, but the library works well if I only use one process and not initialize MPI outside. > > Guoxi From bsmith at mcs.anl.gov Fri May 16 19:54:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 19:54:52 -0500 Subject: [petsc-users] SlepcInitialize not return In-Reply-To: References: <249234C0-A064-42A4-8E96-7AC5259C92C3@mcs.anl.gov> Message-ID: <2303C406-C3A0-4FD8-819F-4E8CF9A5AC50@mcs.anl.gov> Sounds like some issue with how you are using the shared libraries. If you don?t show us the offending code it is going to be awfully difficult for us to debug it. Barry On May 16, 2014, at 6:39 PM, ??? wrote: > execute program A calls .so library B, and .so libraryB calls SLEPclibrary. > I used to put MPI_Init() only in A. But this time I try to put it in A,B( and C). > Now it returns to this to me. > I guess maybe this is why it does not return. However I don't know how to fix it. > > INTERNAL ERROR: Invalid error class (59) encountered while returning from > MPI_Init. Please file a bug report. > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > INTERNAL ERROR: Invalid error class (59) encountered while returning from > MPI_Init. Please file a bug report. > Fatal error in MPI_Init: Unknown error. 
Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > [cli_1]: aborting job: > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > INTERNAL ERROR: Invalid error class (59) encountered while returning from > MPI_Init. Please file a bug report. > INTERNAL ERROR: Invalid error class (59) encountered while returning from > MPI_Init. Please file a bug report. > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > [cli_3]: aborting job: > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > [cli_0]: aborting job: > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > [cli_2]: aborting job: > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > > > > 2014-05-17 9:28 GMT+10:00 Barry Smith : > > Reproduce this in a small about of code (sounds easy in this case) and then email the code out; your verbal descriptions is not detailed enough to determine the problem. Also send a make file that links the .so library, it could be a problem related to that. > > Barry > > On May 16, 2014, at 6:21 PM, ??? wrote: > > > Hi, > > > > I write a piece of code containing SLEPc calling and compile it into a .so library. Then I call this library in my program. I call MPI_Init in the main program, but it turns out that function SlepcInitialize in SLEPc .so library will not return, and the program stuck at that line. I don't know if this problem related to the place of MPI_Init calling, but the library works well if I only use one process and not initialize MPI outside. > > > > Guoxi > > From jed at jedbrown.org Sat May 17 02:26:03 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 17 May 2014 01:26:03 -0600 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <53753C7B.8010201@uci.edu> References: <53753C7B.8010201@uci.edu> Message-ID: <87y4y0uar8.fsf@jedbrown.org> Michele Rosso writes: > Hi, > > I am solving an inhomogeneous Laplacian in 3D (basically a slightly > modified version of example ex34). > The laplacian is discretized by using a cell-center finite difference > 7-point stencil with periodic BCs. > I am solving a time-dependent problem so the solution of the laplacian > is repeated at each time step with a different matrix (always SPD > though) and rhs. Also, the laplacian features large magnitude variations > in the coefficients. I solve by means of CG + GAMG as preconditioner. > Everything works fine for a while until I receive a > DIVERGED_INDEFINITE_PC message. What is changing as you time step? Is there a nonlinearity that activates suddenly? Especially a bifurcation or perhaps a source term that is incompatible with the boundary conditions? You could try -mg_levels_ksp_type richardson -mg_levels_pc_type sor. Can you reproduce with a small problem? The configuration looks okay to me. > Before checking my model is incorrect I would like to rule out the > possibility of improper use of the linear solver. I attached the full > output of a serial run with -log-summary -ksp_view > -ksp_converged_reason ksp_monitor_true_residual. 
I would appreciate if > you could help me in locating the issue. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From zonexo at gmail.com Sun May 18 20:18:04 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 19 May 2014 09:18:04 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <534C9DB5.9070407@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> Message-ID: <53795BCC.8020500@gmail.com> Hi Barry, I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? Thank you Yours sincerely, TAY wee-beng On 14/5/2014 12:03 AM, Barry Smith wrote: > Please send you current code. So we may compile and run it. > > Barry > > > > On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: > >> Hi, >> >> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >> >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 21/4/2014 8:58 AM, Barry Smith wrote: >>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >>> >>> Barry >>> >>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>> >>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>>>>> Hmm, >>>>>>>>> >>>>>>>>> Interface DMDAVecGetArrayF90 >>>>>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>>>>> USE_DM_HIDE >>>>>>>>> DM_HIDE da1 >>>>>>>>> VEC_HIDE v >>>>>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>>>>> PetscErrorCode ierr >>>>>>>>> End Subroutine >>>>>>>>> >>>>>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? 
Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>>>>>> >>>>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>>>>>> >>>>>>>>> Also, supposed I call: >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> u_array .... >>>>>>>>> >>>>>>>>> v_array .... etc >>>>>>>>> >>>>>>>>> Now to restore the array, does it matter the sequence they are restored? >>>>>>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>>>>>> >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>>>>> >>>>>>>>> u_array = 0.d0 >>>>>>>>> >>>>>>>>> v_array = 0.d0 >>>>>>>>> >>>>>>>>> w_array = 0.d0 >>>>>>>>> >>>>>>>>> p_array = 0.d0 >>>>>>>>> >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>>>>> >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>>>>>> >>>>>>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>>>> Hi Matt, >>>>>>>> >>>>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>>>> >>>>>>>> It already has DMDAVecGetArray(). Just run it. >>>>>>> Hi, >>>>>>> >>>>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>>>> >>>>>>> No the global/local difference should not matter. >>>>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>>>> >>>>>>> DMGetLocalVector() >>>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>>>> >>>>>> If so, when should I call them? >>>>>> >>>>>> You just need a local vector from somewhere. >>>> Hi, >>>> >>>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>>> >>>> Thanks. 
>>>>> Hi, >>>>> >>>>> I insert part of my error region code into ex11f90: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> u_array = 0.d0 >>>>> v_array = 0.d0 >>>>> w_array = 0.d0 >>>>> p_array = 0.d0 >>>>> >>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>>> >>>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>>> >>>>> module solve >>>>> <- add include file? >>>>> subroutine RRK >>>>> <- add include file? >>>>> end subroutine RRK >>>>> >>>>> end module solve >>>>> >>>>> So where should the include files (#include ) be placed? >>>>> >>>>> After the module or inside the subroutine? >>>>> >>>>> Thanks. >>>>>> Matt >>>>>> Thanks. >>>>>>> Matt >>>>>>> Thanks. >>>>>>>> Matt >>>>>>>> Thanks >>>>>>>> >>>>>>>> Regards. >>>>>>>>> Matt >>>>>>>>> As in w, then v and u? >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> thanks >>>>>>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>>>>> >>>>>>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>>>>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>>>>>> >>>>>>>>> >>>>>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> Thanks. >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>>>>>> >>>>>>>>> However, by re-writing my code, I found out a few things: >>>>>>>>> >>>>>>>>> 1. 
if I write my code this way: >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> u_array = .... >>>>>>>>> >>>>>>>>> v_array = .... >>>>>>>>> >>>>>>>>> w_array = .... >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> The code runs fine. >>>>>>>>> >>>>>>>>> 2. if I write my code this way: >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>>>>> >>>>>>>>> where the subroutine is: >>>>>>>>> >>>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>>> >>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>>> >>>>>>>>> u ... >>>>>>>>> v... >>>>>>>>> w ... >>>>>>>>> >>>>>>>>> end subroutine uvw_array_change. >>>>>>>>> >>>>>>>>> The above will give an error at : >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> So they are now in reversed order. Now it works. >>>>>>>>> >>>>>>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>>>>>> >>>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>>> >>>>>>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>>> >>>>>>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>>> >>>>>>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>>> >>>>>>>>> u ... >>>>>>>>> v... >>>>>>>>> w ... >>>>>>>>> >>>>>>>>> end subroutine uvw_array_change. >>>>>>>>> >>>>>>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>>>>>>> >>>>>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>>>>>> >>>>>>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>>>>>> >>>>>>>>> Thank you. 
>>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>>>> >>>>>>>>> >>>>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> Hi Barry, >>>>>>>>> >>>>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>>>>>> >>>>>>>>> I have attached my code. >>>>>>>>> >>>>>>>>> Thank you >>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>>>>> Please send the code that creates da_w and the declarations of w_array >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Barry, >>>>>>>>> >>>>>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>>>>> >>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>>> >>>>>>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>>>>>> >>>>>>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>>>>>> >>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>>> -------------------------------------------------------------------------- >>>>>>>>> An MPI process has executed an operation involving a call to the >>>>>>>>> "fork()" system call to create a child process. Open MPI is currently >>>>>>>>> operating in a condition that could result in memory corruption or >>>>>>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>>>>>> data corruption. The use of fork() (or system() or other calls that >>>>>>>>> create child processes) is strongly discouraged. >>>>>>>>> >>>>>>>>> The process that invoked fork was: >>>>>>>>> >>>>>>>>> Local host: n12-76 (PID 20235) >>>>>>>>> MPI_COMM_WORLD rank: 2 >>>>>>>>> >>>>>>>>> If you are *absolutely sure* that your application will successfully >>>>>>>>> and correctly survive a call to fork(), you may disable this warning >>>>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>>>>>> -------------------------------------------------------------------------- >>>>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>>>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>>>>>> >>>>>>>>> .... 
>>>>>>>>> >>>>>>>>> 1 >>>>>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>>>> [1]PETSC ERROR: or see >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>>>> [1]PETSC ERROR: to get more information on the crash. >>>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>>>> [3]PETSC ERROR: or see >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>>>> >>>>>>>>> ... >>>>>>>>> Thank you. >>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>>>>> >>>>>>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>>>>>> >>>>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. 
>>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>>>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> -- >>>>>>>>> Thank you. >>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener From knepley at gmail.com Sun May 18 20:53:19 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 May 2014 20:53:19 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <53795BCC.8020500@gmail.com> References: <534C9A2C.5060404@gmail.com> <534C9DB5.9070407@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> Message-ID: On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: > Hi Barry, > > I am trying to sort out the details so that it's easier to pinpoint the > error. However, I tried on gnu gfortran and it worked well. On intel ifort, > it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that > it's a bug in ifort? Do you work with both intel and gnu? > Yes it works with Intel. Is this using optimization? Matt > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 14/5/2014 12:03 AM, Barry Smith wrote: > >> Please send you current code. So we may compile and run it. >> >> Barry >> >> >> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >> >> Hi, >>> >>> I have sent the entire code a while ago. Is there any answer? 
I was also >>> trying myself but it worked for some intel compiler, and some not. I'm >>> still not able to find the answer. gnu compilers for most cluster are old >>> versions so they are not able to compile since I have allocatable >>> structures. >>> >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>> >>>> Please send the entire code. If we can run it and reproduce the >>>> problem we can likely track down the issue much faster than through endless >>>> rounds of email. >>>> >>>> Barry >>>> >>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>> >>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>>> >>>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>>> >>>>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng >>>>>>> wrote: >>>>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>>>> >>>>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng >>>>>>>> wrote: >>>>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>>>> >>>>>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng >>>>>>>>> wrote: >>>>>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>>>>> >>>>>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>>>>>> Hmm, >>>>>>>>>> >>>>>>>>>> Interface DMDAVecGetArrayF90 >>>>>>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>>>>>> USE_DM_HIDE >>>>>>>>>> DM_HIDE da1 >>>>>>>>>> VEC_HIDE v >>>>>>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>>>>>> PetscErrorCode ierr >>>>>>>>>> End Subroutine >>>>>>>>>> >>>>>>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be >>>>>>>>>> treating it as a ?plain old Fortran array?? >>>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> So d1 is a pointer, and it's different if I declare it as "plain >>>>>>>>>> old Fortran array"? Because I declare it as a Fortran array and it works >>>>>>>>>> w/o any problem if I only call DMDAVecGetArrayF90 and >>>>>>>>>> DMDAVecRestoreArrayF90 with "u". >>>>>>>>>> >>>>>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with >>>>>>>>>> "u", "v" and "w", error starts to happen. I wonder why... >>>>>>>>>> >>>>>>>>>> Also, supposed I call: >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> u_array .... >>>>>>>>>> >>>>>>>>>> v_array .... etc >>>>>>>>>> >>>>>>>>>> Now to restore the array, does it matter the sequence they are >>>>>>>>>> restored? >>>>>>>>>> No it should not matter. If it matters that is a sign that >>>>>>>>>> memory has been written to incorrectly earlier in the code. >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Hmm, I have been getting different results on different intel >>>>>>>>>> compilers. I'm not sure if MPI played a part but I'm only using a single >>>>>>>>>> processor. In the debug mode, things run without problem. 
In optimized >>>>>>>>>> mode, in some cases, the code aborts even doing simple initialization: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>>>>>> >>>>>>>>>> u_array = 0.d0 >>>>>>>>>> >>>>>>>>>> v_array = 0.d0 >>>>>>>>>> >>>>>>>>>> w_array = 0.d0 >>>>>>>>>> >>>>>>>>>> p_array = 0.d0 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), >>>>>>>>>> giving segmentation error. But other >>>>>>>>>> version of intel compiler passes thru this part w/o error. >>>>>>>>>> Since the response is different among different compilers, is this PETSc or >>>>>>>>>> intel 's bug? Or mvapich or openmpi? >>>>>>>>>> >>>>>>>>>> We do this is a bunch of examples. Can you reproduce this >>>>>>>>>> different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>>>>>> >>>>>>>>> Hi Matt, >>>>>>>>> >>>>>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>>>>> >>>>>>>>> It already has DMDAVecGetArray(). Just run it. >>>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> It worked. The differences between mine and the code is the way the >>>>>>>> fortran modules are defined, and the ex11f90 only uses global vectors. Does >>>>>>>> it make a difference whether global or local vectors are used? Because the >>>>>>>> way it accesses x1 only touches the local region. >>>>>>>> >>>>>>>> No the global/local difference should not matter. >>>>>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be >>>>>>>> used 1st, is that so? I can't find the equivalent for local vector though. >>>>>>>> >>>>>>>> DMGetLocalVector() >>>>>>>> >>>>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my >>>>>>> code. Does it matter? >>>>>>> >>>>>>> If so, when should I call them? >>>>>>> >>>>>>> You just need a local vector from somewhere. >>>>>>> >>>>>> Hi, >>>>> >>>>> Anyone can help with the questions below? Still trying to find why my >>>>> code doesn't work. >>>>> >>>>> Thanks. >>>>> >>>>>> Hi, >>>>>> >>>>>> I insert part of my error region code into ex11f90: >>>>>> >>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>> >>>>>> u_array = 0.d0 >>>>>> v_array = 0.d0 >>>>>> w_array = 0.d0 >>>>>> p_array = 0.d0 >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> It worked w/o error. I'm going to change the way the modules are >>>>>> defined in my code. >>>>>> >>>>>> My code contains a main program and a number of modules files, with >>>>>> subroutines inside e.g. >>>>>> >>>>>> module solve >>>>>> <- add include file? >>>>>> subroutine RRK >>>>>> <- add include file? 
>>>>>> end subroutine RRK >>>>>> >>>>>> end module solve >>>>>> >>>>>> So where should the include files (#include ) >>>>>> be placed? >>>>>> >>>>>> After the module or inside the subroutine? >>>>>> >>>>>> Thanks. >>>>>> >>>>>>> Matt >>>>>>> Thanks. >>>>>>> >>>>>>>> Matt >>>>>>>> Thanks. >>>>>>>> >>>>>>>>> Matt >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> Regards. >>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> As in w, then v and u? >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> thanks >>>>>>>>>> Note also that the beginning and end indices of the u,v,w, >>>>>>>>>> are different for each process see for example >>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/ >>>>>>>>>> tutorials/ex11f90.F (and they do not start at 1). This is how >>>>>>>>>> to get the loop bounds. >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> In my case, I fixed the u,v,w such that their indices are the >>>>>>>>>> same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the >>>>>>>>>> problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>>>>>> >>>>>>>>>> If I declare them as pointers, their indices follow the C 0 start >>>>>>>>>> convention, is that so? >>>>>>>>>> Not really. It is that in each process you need to access >>>>>>>>>> them from the indices indicated by DMDAGetCorners() for global vectors and >>>>>>>>>> DMDAGetGhostCorners() for local vectors. So really C or Fortran >>>>>>>>>> doesn?t make any difference. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow >>>>>>>>>> the Fortran 1 start convention. Is there some way to manipulate such that I >>>>>>>>>> do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>>>>>> If you code wishes to access them with indices plus one from >>>>>>>>>> the values returned by DMDAGetCorners() for global vectors and >>>>>>>>>> DMDAGetGhostCorners() for local vectors then you need to manually subtract >>>>>>>>>> off the 1. >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> Thanks. >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I tried to pinpoint the problem. I reduced my job size and hence >>>>>>>>>> I can run on 1 processor. Tried using valgrind but perhaps I'm using the >>>>>>>>>> optimized version, it didn't catch the error, besides saying "Segmentation >>>>>>>>>> fault (core dumped)" >>>>>>>>>> >>>>>>>>>> However, by re-writing my code, I found out a few things: >>>>>>>>>> >>>>>>>>>> 1. if I write my code this way: >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> u_array = .... >>>>>>>>>> >>>>>>>>>> v_array = .... >>>>>>>>>> >>>>>>>>>> w_array = .... >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> The code runs fine. >>>>>>>>>> >>>>>>>>>> 2. 
if I write my code this way: >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this >>>>>>>>>> subroutine does the same modification as the above. >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>>>>>> >>>>>>>>>> where the subroutine is: >>>>>>>>>> >>>>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>>>> >>>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>>>> >>>>>>>>>> u ... >>>>>>>>>> v... >>>>>>>>>> w ... >>>>>>>>>> >>>>>>>>>> end subroutine uvw_array_change. >>>>>>>>>> >>>>>>>>>> The above will give an error at : >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> 3. Same as above, except I change the order of the last 3 lines >>>>>>>>>> to: >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> So they are now in reversed order. Now it works. >>>>>>>>>> >>>>>>>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>>>>>>> >>>>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>>>> >>>>>>>>>> real(8), intent(inout) :: u(start_indices(1):end_ >>>>>>>>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>>>>>>>> 3):end_indices(3)) >>>>>>>>>> >>>>>>>>>> real(8), intent(inout) :: v(start_indices(1):end_ >>>>>>>>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>>>>>>>> 3):end_indices(3)) >>>>>>>>>> >>>>>>>>>> real(8), intent(inout) :: w(start_indices(1):end_ >>>>>>>>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>>>>>>>> 3):end_indices(3)) >>>>>>>>>> >>>>>>>>>> u ... >>>>>>>>>> v... >>>>>>>>>> w ... >>>>>>>>>> >>>>>>>>>> end subroutine uvw_array_change. >>>>>>>>>> >>>>>>>>>> The start_indices and end_indices are simply to shift the 0 >>>>>>>>>> indices of C convention to that of the 1 indices of the Fortran convention. >>>>>>>>>> This is necessary in my case because most of my codes start array counting >>>>>>>>>> at 1, hence the "trick". >>>>>>>>>> >>>>>>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 >>>>>>>>>> (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> " >>>>>>>>>> >>>>>>>>>> So did I violate and cause memory corruption due to the trick >>>>>>>>>> above? But I can't think of any way other >>>>>>>>>> than the "trick" to continue using the 1 indices >>>>>>>>>> convention. >>>>>>>>>> >>>>>>>>>> Thank you. >>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/ >>>>>>>>>> documentation/faq.html#valgrind >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi Barry, >>>>>>>>>> >>>>>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode >>>>>>>>>> but fails in non-debug mode. >>>>>>>>>> >>>>>>>>>> I have attached my code. 
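One way to act on the valgrind suggestion for a crash that only shows up in the optimized build (the executable name, rank count and log-file name below are placeholders): rebuild PETSc and the application with debugging enabled, then run the MPI job under valgrind so each rank writes its own log, e.g.

      mpirun -n 4 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./a.out

An "invalid read" or "invalid write" reported before the DMDAVecRestoreArrayF90 call is a better lead than the segmentation fault itself, since the fault usually fires long after the memory was first overwritten.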
>>>>>>>>>> >>>>>>>>>> Thank you >>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>>>>>> Please send the code that creates da_w and the declarations >>>>>>>>>> of w_array >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Barry, >>>>>>>>>> >>>>>>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>>>>>> >>>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>>>> >>>>>>>>>> I got the msg below. Before the gdb windows appear (thru x11), >>>>>>>>>> the program aborts. >>>>>>>>>> >>>>>>>>>> Also I tried running in another cluster and it worked. Also tried >>>>>>>>>> in the current cluster in debug mode and it worked too. >>>>>>>>>> >>>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>>>> ------------------------------------------------------------ >>>>>>>>>> -------------- >>>>>>>>>> An MPI process has executed an operation involving a call to the >>>>>>>>>> "fork()" system call to create a child process. Open MPI is >>>>>>>>>> currently >>>>>>>>>> operating in a condition that could result in memory corruption or >>>>>>>>>> other system errors; your MPI job may hang, crash, or produce >>>>>>>>>> silent >>>>>>>>>> data corruption. The use of fork() (or system() or other calls >>>>>>>>>> that >>>>>>>>>> create child processes) is strongly discouraged. >>>>>>>>>> >>>>>>>>>> The process that invoked fork was: >>>>>>>>>> >>>>>>>>>> Local host: n12-76 (PID 20235) >>>>>>>>>> MPI_COMM_WORLD rank: 2 >>>>>>>>>> >>>>>>>>>> If you are *absolutely sure* that your application will >>>>>>>>>> successfully >>>>>>>>>> and correctly survive a call to fork(), you may disable this >>>>>>>>>> warning >>>>>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>>>>>>> ------------------------------------------------------------ >>>>>>>>>> -------------- >>>>>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on >>>>>>>>>> display localhost:50.0 on machine n12-76 >>>>>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on >>>>>>>>>> display localhost:50.0 on machine n12-76 >>>>>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on >>>>>>>>>> display localhost:50.0 on machine n12-76 >>>>>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on >>>>>>>>>> display localhost:50.0 on machine n12-76 >>>>>>>>>> [n12-76:20232] 3 more processes have sent help message >>>>>>>>>> help-mpi-runtime.txt / mpi_init:warn-fork >>>>>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 >>>>>>>>>> to see all help / error messages >>>>>>>>>> >>>>>>>>>> .... >>>>>>>>>> >>>>>>>>>> 1 >>>>>>>>>> [1]PETSC ERROR: ------------------------------ >>>>>>>>>> ------------------------------------------ >>>>>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>>> Violation, probably memory access out of range >>>>>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>>> -on_error_attach_debugger >>>>>>>>>> [1]PETSC ERROR: or see >>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html# >>>>>>>>>> valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption >>>>>>>>>> errors >>>>>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>>>>>>>> link, and run >>>>>>>>>> [1]PETSC ERROR: to get more information on the crash. 
>>>>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown >>>>>>>>>> directory unknown file (null) >>>>>>>>>> [3]PETSC ERROR: ------------------------------ >>>>>>>>>> ------------------------------------------ >>>>>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>>> Violation, probably memory access out of range >>>>>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>>> -on_error_attach_debugger >>>>>>>>>> [3]PETSC ERROR: or see >>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html# >>>>>>>>>> valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption >>>>>>>>>> errors >>>>>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>>>>>>>> link, and run >>>>>>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown >>>>>>>>>> directory unknown file (null) >>>>>>>>>> >>>>>>>>>> ... >>>>>>>>>> Thank you. >>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>>>>>> >>>>>>>>>> Because IO doesn?t always get flushed immediately it may not >>>>>>>>>> be hanging at this point. It is better to use the option >>>>>>>>>> -start_in_debugger then type cont in each debugger window and then when you >>>>>>>>>> think it is ?hanging? do a control C in each debugger window and type where >>>>>>>>>> to see where each process is you can also look around in the debugger at >>>>>>>>>> variables to see why it is ?hanging? at that point. >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> This routines don?t have any parallel communication in them >>>>>>>>>> so are unlikely to hang. >>>>>>>>>> >>>>>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> My code hangs and I added in mpi_barrier and print to catch the >>>>>>>>>> bug. I found that it hangs after printing "7". Is it because I'm doing >>>>>>>>>> something wrong? I need to access the u,v,w array so I use >>>>>>>>>> DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"3" >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"4" >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"5" >>>>>>>>>> call I_IIB_uv_initial_1st_dm(I_ >>>>>>>>>> cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_ >>>>>>>>>> v1,I_cell_w1,u_array,v_array,w_array) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"6" >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> !must be in reverse order >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"7" >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"8" >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> -- >>>>>>>>>> Thank you. 
>>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Sun May 18 22:28:13 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 19 May 2014 11:28:13 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> Message-ID: <53797A4D.6090602@gmail.com> On 19/5/2014 9:53 AM, Matthew Knepley wrote: > On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng > wrote: > > Hi Barry, > > I am trying to sort out the details so that it's easier to > pinpoint the error. However, I tried on gnu gfortran and it worked > well. On intel ifort, it stopped at one of the > "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in > ifort? Do you work with both intel and gnu? > > > Yes it works with Intel. Is this using optimization? Hi Matt, I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? > > Matt > > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 14/5/2014 12:03 AM, Barry Smith wrote: > > Please send you current code. So we may compile and run it. > > Barry > > > On May 12, 2014, at 9:52 PM, TAY wee-beng > wrote: > > Hi, > > I have sent the entire code a while ago. Is there any > answer? I was also trying myself but it worked for some > intel compiler, and some not. I'm still not able to find > the answer. gnu compilers for most cluster are old > versions so they are not able to compile since I have > allocatable structures. > > Thank you. > > Yours sincerely, > > TAY wee-beng > > On 21/4/2014 8:58 AM, Barry Smith wrote: > > Please send the entire code. 
If we can run it and > reproduce the problem we can likely track down the > issue much faster than through endless rounds of email. > > Barry > > On Apr 20, 2014, at 7:49 PM, TAY wee-beng > > wrote: > > On 20/4/2014 8:39 AM, TAY wee-beng wrote: > > On 20/4/2014 1:02 AM, Matthew Knepley wrote: > > On Sat, Apr 19, 2014 at 10:49 AM, TAY > wee-beng > wrote: > On 19/4/2014 11:39 PM, Matthew Knepley wrote: > > On Sat, Apr 19, 2014 at 10:16 AM, TAY > wee-beng > wrote: > On 19/4/2014 10:55 PM, Matthew Knepley > wrote: > > On Sat, Apr 19, 2014 at 9:14 AM, > TAY wee-beng > wrote: > On 19/4/2014 6:48 PM, Matthew > Knepley wrote: > > On Sat, Apr 19, 2014 at 4:59 > AM, TAY wee-beng > > wrote: > On 19/4/2014 1:17 PM, Barry > Smith wrote: > On Apr 19, 2014, at 12:11 AM, > TAY wee-beng > wrote: > > On 19/4/2014 12:10 PM, Barry > Smith wrote: > On Apr 18, 2014, at 9:57 PM, > TAY wee-beng > wrote: > > On 19/4/2014 3:53 AM, Barry > Smith wrote: > Hmm, > > Interface > DMDAVecGetArrayF90 > Subroutine > DMDAVecGetArrayF903(da1, > v,d1,ierr) > USE_DM_HIDE > DM_HIDE da1 > VEC_HIDE v > > PetscScalar,pointer :: d1(:,:,:) > PetscErrorCode ierr > End Subroutine > > So the d1 is a F90 > POINTER. But your subroutine > seems to be treating it as a > ?plain old Fortran array?? > real(8), intent(inout) :: > u(:,:,:),v(:,:,:),w(:,:,:) > Hi, > > So d1 is a pointer, and it's > different if I declare it as > "plain old Fortran array"? > Because I declare it as a > Fortran array and it works w/o > any problem if I only call > DMDAVecGetArrayF90 and > DMDAVecRestoreArrayF90 with "u". > > But if I call > DMDAVecGetArrayF90 and > DMDAVecRestoreArrayF90 with > "u", "v" and "w", error starts > to happen. I wonder why... > > Also, supposed I call: > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > > u_array .... > > v_array .... etc > > Now to restore the array, does > it matter the sequence they > are restored? > No it should not matter. > If it matters that is a sign > that memory has been written > to incorrectly earlier in the > code. > > Hi, > > Hmm, I have been getting > different results on different > intel compilers. I'm not sure > if MPI played a part but I'm > only using a single processor. > In the debug mode, things run > without problem. In optimized > mode, in some cases, the code > aborts even doing simple > initialization: > > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) > > u_array = 0.d0 > > v_array = 0.d0 > > w_array = 0.d0 > > p_array = 0.d0 > > > call > DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) > > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > The code aborts at call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), > giving segmentation error. But > other version of intel > compiler passes thru this part > w/o error. Since the response > is different among different > compilers, is this PETSc or > intel 's bug? Or mvapich or > openmpi? > > We do this is a bunch of > examples. Can you reproduce > this different behavior in > src/dm/examples/tutorials/ex11f90.F? > > Hi Matt, > > Do you mean putting the above > lines into ex11f90.F and test? 
> > It already has DMDAVecGetArray(). > Just run it. > > Hi, > > It worked. The differences between > mine and the code is the way the > fortran modules are defined, and the > ex11f90 only uses global vectors. Does > it make a difference whether global or > local vectors are used? Because the > way it accesses x1 only touches the > local region. > > No the global/local difference should > not matter. > Also, before using > DMDAVecGetArrayF90, DMGetGlobalVector > must be used 1st, is that so? I can't > find the equivalent for local vector > though. > > DMGetLocalVector() > > Ops, I do not have DMGetLocalVector and > DMRestoreLocalVector in my code. Does it > matter? > > If so, when should I call them? > > You just need a local vector from somewhere. > > Hi, > > Anyone can help with the questions below? Still > trying to find why my code doesn't work. > > Thanks. > > Hi, > > I insert part of my error region code into > ex11f90: > > call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > call > DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) > > u_array = 0.d0 > v_array = 0.d0 > w_array = 0.d0 > p_array = 0.d0 > > call > DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > It worked w/o error. I'm going to change the > way the modules are defined in my code. > > My code contains a main program and a number > of modules files, with subroutines inside e.g. > > module solve > <- add include file? > subroutine RRK > <- add include file? > end subroutine RRK > > end module solve > > So where should the include files (#include > ) be placed? > > After the module or inside the subroutine? > > Thanks. > > Matt > Thanks. > > Matt > Thanks. > > Matt > Thanks > > Regards. > > Matt > As in w, then v and u? > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > thanks > Note also that the > beginning and end indices of > the u,v,w, are different for > each process see for example > http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F > (and they do not start at 1). > This is how to get the loop > bounds. > Hi, > > In my case, I fixed the u,v,w > such that their indices are > the same. I also checked using > DMDAGetCorners and > DMDAGetGhostCorners. Now the > problem lies in my subroutine > treating it as a ?plain old > Fortran array?. > > If I declare them as pointers, > their indices follow the C 0 > start convention, is that so? > Not really. It is that in > each process you need to > access them from the indices > indicated by DMDAGetCorners() > for global vectors and > DMDAGetGhostCorners() for > local vectors. So really C or > Fortran doesn?t make any > difference. > > > So my problem now is that in > my old MPI code, the u(i,j,k) > follow the Fortran 1 start > convention. Is there some way > to manipulate such that I do > not have to change my u(i,j,k) > to u(i-1,j-1,k-1)? > If you code wishes to > access them with indices plus > one from the values returned > by DMDAGetCorners() for global > vectors and > DMDAGetGhostCorners() for > local vectors then you need to > manually subtract off the 1. > > Barry > > Thanks. 
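Put concretely, the advice to use the DMDAGetCorners()/DMDAGetGhostCorners() values directly looks like the sketch below (placeholder names; DMDAGetCorners is used here because a global vector has no ghost points):

      subroutine zero_owned(da_u, u_global, ierr)
#include <finclude/petscsys.h>
#include <finclude/petscvec.h>
#include <finclude/petscdm.h>
#include <finclude/petscdmda.h>
#include <finclude/petscdmda.h90>
      DM da_u
      Vec u_global
      PetscErrorCode ierr
      PetscInt i0, j0, k0, ni, nj, nk, i, j, k
      PetscScalar, pointer :: u_array(:,:,:)

      ! i0,j0,k0 are the first owned indices on this rank; ni,nj,nk the widths
      call DMDAGetCorners(da_u, i0, j0, k0, ni, nj, nk, ierr)
      call DMDAVecGetArrayF90(da_u, u_global, u_array, ierr)
      do k = k0, k0+nk-1
        do j = j0, j0+nj-1
          do i = i0, i0+ni-1
            u_array(i,j,k) = 0.d0
          end do
        end do
      end do
      call DMDAVecRestoreArrayF90(da_u, u_global, u_array, ierr)
      end subroutine zero_owned

Keeping the loops in this numbering removes the need to subtract 1 anywhere; only code that must interoperate with pre-existing 1-based arrays needs the offset, and then it has to be applied consistently on every access.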
> Barry > > On Apr 18, 2014, at 10:58 AM, > TAY wee-beng > wrote: > > Hi, > > I tried to pinpoint the > problem. I reduced my job size > and hence I can run on 1 > processor. Tried using > valgrind but perhaps I'm using > the optimized version, it > didn't catch the error, > besides saying "Segmentation > fault (core dumped)" > > However, by re-writing my > code, I found out a few things: > > 1. if I write my code this way: > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > > u_array = .... > > v_array = .... > > w_array = .... > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > The code runs fine. > > 2. if I write my code this way: > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > > call > uvw_array_change(u_array,v_array,w_array) > -> this subroutine does the > same modification as the above. > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > -> error > > where the subroutine is: > > subroutine uvw_array_change(u,v,w) > > real(8), intent(inout) :: > u(:,:,:),v(:,:,:),w(:,:,:) > > u ... > v... > w ... > > end subroutine uvw_array_change. > > The above will give an error at : > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > 3. Same as above, except I > change the order of the last 3 > lines to: > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > So they are now in reversed > order. Now it works. > > 4. Same as 2 or 3, except the > subroutine is changed to : > > subroutine uvw_array_change(u,v,w) > > real(8), intent(inout) :: > u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) > > real(8), intent(inout) :: > v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) > > real(8), intent(inout) :: > w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) > > u ... > v... > w ... > > end subroutine uvw_array_change. > > The start_indices and > end_indices are simply to > shift the 0 indices of C > convention to that of the 1 > indices of the Fortran > convention. This is necessary > in my case because most of my > codes start array counting at > 1, hence the "trick". > > However, now no matter which > order of the > DMDAVecRestoreArrayF90 (as in > 2 or 3), error will occur at > "call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > " > > So did I violate and cause > memory corruption due to the > trick above? But I can't think > of any way other than > the "trick" to continue using > the 1 indices convention. > > Thank you. > > Yours sincerely, > > TAY wee-beng > > On 15/4/2014 8:00 PM, Barry > Smith wrote: > Try running under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > On Apr 14, 2014, at 9:47 PM, > TAY wee-beng > wrote: > > Hi Barry, > > As I mentioned earlier, the > code works fine in PETSc debug > mode but fails in non-debug mode. > > I have attached my code. 
> > Thank you > > Yours sincerely, > > TAY wee-beng > > On 15/4/2014 2:26 AM, Barry > Smith wrote: > Please send the code that > creates da_w and the > declarations of w_array > > Barry > > On Apr 14, 2014, at 9:40 AM, > TAY wee-beng > > > wrote: > > > Hi Barry, > > I'm not too sure how to do it. > I'm running mpi. So I run: > > mpirun -n 4 ./a.out > -start_in_debugger > > I got the msg below. Before > the gdb windows appear (thru > x11), the program aborts. > > Also I tried running in > another cluster and it worked. > Also tried in the current > cluster in debug mode and it > worked too. > > mpirun -n 4 ./a.out > -start_in_debugger > -------------------------------------------------------------------------- > An MPI process has executed an > operation involving a call to the > "fork()" system call to create > a child process. Open MPI is > currently > operating in a condition that > could result in memory > corruption or > other system errors; your MPI > job may hang, crash, or > produce silent > data corruption. The use of > fork() (or system() or other > calls that > create child processes) is > strongly discouraged. > > The process that invoked fork was: > > Local host: > n12-76 (PID 20235) > MPI_COMM_WORLD rank: 2 > > If you are *absolutely sure* > that your application will > successfully > and correctly survive a call > to fork(), you may disable > this warning > by setting the > mpi_warn_on_fork MCA parameter > to 0. > -------------------------------------------------------------------------- > [2]PETSC ERROR: PETSC: > Attaching gdb to ./a.out of > pid 20235 on display > localhost:50.0 on machine n12-76 > [0]PETSC ERROR: PETSC: > Attaching gdb to ./a.out of > pid 20233 on display > localhost:50.0 on machine n12-76 > [1]PETSC ERROR: PETSC: > Attaching gdb to ./a.out of > pid 20234 on display > localhost:50.0 on machine n12-76 > [3]PETSC ERROR: PETSC: > Attaching gdb to ./a.out of > pid 20236 on display > localhost:50.0 on machine n12-76 > [n12-76:20232] 3 more > processes have sent help > message help-mpi-runtime.txt / > mpi_init:warn-fork > [n12-76:20232] Set MCA > parameter > "orte_base_help_aggregate" to > 0 to see all help / error messages > > .... > > 1 > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal > number 11 SEGV: Segmentation > Violation, probably memory > access out of range > [1]PETSC ERROR: Try option > -start_in_debugger or > -on_error_attach_debugger > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC > ERROR: or try http://valgrind.org > on GNU/linux and Apple Mac > OS X to find memory corruption > errors > [1]PETSC ERROR: configure > using --with-debugging=yes, > recompile, link, and run > [1]PETSC ERROR: to get more > information on the crash. 
> [1]PETSC ERROR: User provided > function() line 0 in unknown > directory unknown file (null) > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal > number 11 SEGV: Segmentation > Violation, probably memory > access out of range > [3]PETSC ERROR: Try option > -start_in_debugger or > -on_error_attach_debugger > [3]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC > ERROR: or try http://valgrind.org > on GNU/linux and Apple Mac > OS X to find memory corruption > errors > [3]PETSC ERROR: configure > using --with-debugging=yes, > recompile, link, and run > [3]PETSC ERROR: to get more > information on the crash. > [3]PETSC ERROR: User provided > function() line 0 in unknown > directory unknown file (null) > > ... > Thank you. > > Yours sincerely, > > TAY wee-beng > > On 14/4/2014 9:05 PM, Barry > Smith wrote: > > Because IO doesn?t always > get flushed immediately it may > not be hanging at this point. > It is better to use the > option -start_in_debugger then > type cont in each debugger > window and then when you think > it is ?hanging? do a control C > in each debugger window and > type where to see where each > process is you can also look > around in the debugger at > variables to see why it is > ?hanging? at that point. > > Barry > > This routines don?t have > any parallel communication in > them so are unlikely to hang. > > On Apr 14, 2014, at 6:52 AM, > TAY wee-beng > > > > > wrote: > > > > Hi, > > My code hangs and I added in > mpi_barrier and print to catch > the bug. I found that it hangs > after printing "7". Is it > because I'm doing something > wrong? I need to access the > u,v,w array so I use > DMDAVecGetArrayF90. After > access, I use > DMDAVecRestoreArrayF90. > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"3" > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"4" > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"5" > call > I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"6" > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > !must be in reverse order > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"7" > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"8" > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > -- > Thank you. > > Yours sincerely, > > TAY wee-beng > > > > > > > > > -- > What most experimenters take > for granted before they begin > their experiments is > infinitely more interesting > than any results to which > their experiments lead. > -- Norbert Wiener > > > > -- > What most experimenters take for > granted before they begin their > experiments is infinitely more > interesting than any results to > which their experiments lead. > -- Norbert Wiener > > > > -- > What most experimenters take for > granted before they begin their > experiments is infinitely more > interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > > > -- > What most experimenters take for granted > before they begin their experiments is > infinitely more interesting than any > results to which their experiments lead. > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.wales at unimelb.edu.au Sun May 18 22:28:51 2014 From: scott.wales at unimelb.edu.au (Scott Wales) Date: Mon, 19 May 2014 13:28:51 +1000 Subject: [petsc-users] DMPlexDistribute error Message-ID: <53797A73.9070106@unimelb.edu.au> Hi, I'm trying to create a distributed unstructured grid in PETSc, and have encountered the following error: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Wrong type of object: Parameter # 1! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: bin/celllist_square on a arch-linux2-c-debug named raijin5 by saw562 Mon May 19 13:00:31 2014 [0]PETSC ERROR: Libraries linked from /home/562/saw562/opt/petsc/3.4.4/lib [0]PETSC ERROR: Configure run at Fri May 16 14:23:02 2014 [0]PETSC ERROR: Configure options --with-shared-libraries=1 --prefix=/home/562/saw562/opt/petsc/3.4.4 --with-blas-lapack-lib="-L/apps/intel-ct/12.1.9.293/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" --with-mpi-dir=/apps/openmpi/1.6.3 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ISGetIndices() line 372 in src/vec/is/is/interface/index.c [0]PETSC ERROR: DMPlexCreatePartitionClosure() line 2637 ihttps://gist.github.com/ScottWales/2758b5ec96573c63e31an src/dm/impls/plex/plex.c [0]PETSC ERROR: DMPlexDistribute() line 2810 in src/dm/impls/plex/plex.c I've created the DMPlex using `DMPlexCreateFromCellList`, added a default section and then called `DMPlexDistribute` to spread the grid points across all of the processors. You can see my test code at https://gist.github.com/ScottWales/2758b5ec96573c63e31a#file-petsc-test-c-L164. Have I missed a step in the grid setup? 
Thanks, Scott From bsmith at mcs.anl.gov Sun May 18 22:36:27 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 18 May 2014 22:36:27 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <53797A4D.6090602@gmail.com> References: <534C9A2C.5060404@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> Message-ID: On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: > On 19/5/2014 9:53 AM, Matthew Knepley wrote: >> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >> Hi Barry, >> >> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >> >> Yes it works with Intel. Is this using optimization? > Hi Matt, > > I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? No. Does it run clean under valgrind? >> >> Matt >> >> >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 14/5/2014 12:03 AM, Barry Smith wrote: >> Please send you current code. So we may compile and run it. >> >> Barry >> >> >> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >> >> Hi, >> >> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >> >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 21/4/2014 8:58 AM, Barry Smith wrote: >> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >> >> Barry >> >> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >> >> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >> On 19/4/2014 1:17 PM, Barry Smith wrote: >> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >> >> On 19/4/2014 12:10 PM, Barry Smith wrote: >> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >> >> On 19/4/2014 3:53 AM, Barry Smith wrote: >> Hmm, >> >> Interface DMDAVecGetArrayF90 >> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >> USE_DM_HIDE >> DM_HIDE da1 >> VEC_HIDE v >> PetscScalar,pointer :: d1(:,:,:) >> PetscErrorCode ierr >> End Subroutine >> >> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? 
>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >> Hi, >> >> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >> >> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >> >> Also, supposed I call: >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> >> u_array .... >> >> v_array .... etc >> >> Now to restore the array, does it matter the sequence they are restored? >> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >> >> Hi, >> >> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >> >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >> >> u_array = 0.d0 >> >> v_array = 0.d0 >> >> w_array = 0.d0 >> >> p_array = 0.d0 >> >> >> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >> >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >> >> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >> Hi Matt, >> >> Do you mean putting the above lines into ex11f90.F and test? >> >> It already has DMDAVecGetArray(). Just run it. >> Hi, >> >> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >> >> No the global/local difference should not matter. >> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >> >> DMGetLocalVector() >> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >> >> If so, when should I call them? >> >> You just need a local vector from somewhere. >> Hi, >> >> Anyone can help with the questions below? Still trying to find why my code doesn't work. >> >> Thanks. 
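For the "local vector from somewhere" remark, the usual borrow/fill/return sequence is sketched below (placeholder names; this assumes the surrounding subroutine already has the standard finclude headers so Vec and INSERT_VALUES are defined):

      Vec u_local

      call DMGetLocalVector(da_u, u_local, ierr)
      ! pull the current global values, including ghost points, into u_local
      call DMGlobalToLocalBegin(da_u, u_global, INSERT_VALUES, u_local, ierr)
      call DMGlobalToLocalEnd(da_u, u_global, INSERT_VALUES, u_local, ierr)
      ! ... DMDAVecGetArrayF90 / DMDAVecRestoreArrayF90 on u_local as above ...
      call DMRestoreLocalVector(da_u, u_local, ierr)

DMRestoreLocalVector hands the vector back to the DM's internal pool, so the same storage is reused on the next DMGetLocalVector call rather than being reallocated.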
>> Hi, >> >> I insert part of my error region code into ex11f90: >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >> >> u_array = 0.d0 >> v_array = 0.d0 >> w_array = 0.d0 >> p_array = 0.d0 >> >> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> It worked w/o error. I'm going to change the way the modules are defined in my code. >> >> My code contains a main program and a number of modules files, with subroutines inside e.g. >> >> module solve >> <- add include file? >> subroutine RRK >> <- add include file? >> end subroutine RRK >> >> end module solve >> >> So where should the include files (#include ) be placed? >> >> After the module or inside the subroutine? >> >> Thanks. >> Matt >> Thanks. >> Matt >> Thanks. >> Matt >> Thanks >> >> Regards. >> Matt >> As in w, then v and u? >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> thanks >> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >> Hi, >> >> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >> >> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >> >> >> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >> >> Barry >> >> Thanks. >> Barry >> >> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >> >> Hi, >> >> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >> >> However, by re-writing my code, I found out a few things: >> >> 1. if I write my code this way: >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> >> u_array = .... >> >> v_array = .... >> >> w_array = .... >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> The code runs fine. >> >> 2. 
if I write my code this way: >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> >> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >> >> where the subroutine is: >> >> subroutine uvw_array_change(u,v,w) >> >> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >> >> u ... >> v... >> w ... >> >> end subroutine uvw_array_change. >> >> The above will give an error at : >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> 3. Same as above, except I change the order of the last 3 lines to: >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> So they are now in reversed order. Now it works. >> >> 4. Same as 2 or 3, except the subroutine is changed to : >> >> subroutine uvw_array_change(u,v,w) >> >> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >> >> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >> >> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >> >> u ... >> v... >> w ... >> >> end subroutine uvw_array_change. >> >> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >> >> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >> >> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >> >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 15/4/2014 8:00 PM, Barry Smith wrote: >> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> >> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >> >> Hi Barry, >> >> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >> >> I have attached my code. >> >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 15/4/2014 2:26 AM, Barry Smith wrote: >> Please send the code that creates da_w and the declarations of w_array >> >> Barry >> >> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >> >> wrote: >> >> >> Hi Barry, >> >> I'm not too sure how to do it. I'm running mpi. So I run: >> >> mpirun -n 4 ./a.out -start_in_debugger >> >> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >> >> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >> >> mpirun -n 4 ./a.out -start_in_debugger >> -------------------------------------------------------------------------- >> An MPI process has executed an operation involving a call to the >> "fork()" system call to create a child process. 
Open MPI is currently >> operating in a condition that could result in memory corruption or >> other system errors; your MPI job may hang, crash, or produce silent >> data corruption. The use of fork() (or system() or other calls that >> create child processes) is strongly discouraged. >> >> The process that invoked fork was: >> >> Local host: n12-76 (PID 20235) >> MPI_COMM_WORLD rank: 2 >> >> If you are *absolutely sure* that your application will successfully >> and correctly survive a call to fork(), you may disable this warning >> by setting the mpi_warn_on_fork MCA parameter to 0. >> -------------------------------------------------------------------------- >> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >> >> .... >> >> 1 >> [1]PETSC ERROR: ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >> on GNU/linux and Apple Mac OS X to find memory corruption errors >> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >> [1]PETSC ERROR: to get more information on the crash. >> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >> [3]PETSC ERROR: ------------------------------------------------------------------------ >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [3]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >> on GNU/linux and Apple Mac OS X to find memory corruption errors >> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >> [3]PETSC ERROR: to get more information on the crash. >> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >> >> ... >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 14/4/2014 9:05 PM, Barry Smith wrote: >> >> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >> >> Barry >> >> This routines don?t have any parallel communication in them so are unlikely to hang. >> >> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >> >> >> >> wrote: >> >> >> >> Hi, >> >> My code hangs and I added in mpi_barrier and print to catch the bug. 
I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> -- >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From zonexo at gmail.com Mon May 19 01:26:59 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 19 May 2014 14:26:59 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> Message-ID: <5379A433.5000401@gmail.com> On 19/5/2014 11:36 AM, Barry Smith wrote: > On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: > >> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>> Hi Barry, >>> >>> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >>> >>> Yes it works with Intel. Is this using optimization? >> Hi Matt, >> >> I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. 
Does it definitely mean that it's a bug in ifort? > No. Does it run clean under valgrind? Hi, Do you mean the debug or optimized version? Thanks. > >>> Matt >>> >>> >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>> Please send you current code. So we may compile and run it. >>> >>> Barry >>> >>> >>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>> >>> Hi, >>> >>> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >>> >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >>> >>> Barry >>> >>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>> >>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>> >>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>> >>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>> Hmm, >>> >>> Interface DMDAVecGetArrayF90 >>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>> USE_DM_HIDE >>> DM_HIDE da1 >>> VEC_HIDE v >>> PetscScalar,pointer :: d1(:,:,:) >>> PetscErrorCode ierr >>> End Subroutine >>> >>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>> Hi, >>> >>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>> >>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>> >>> Also, supposed I call: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> u_array .... >>> >>> v_array .... etc >>> >>> Now to restore the array, does it matter the sequence they are restored? >>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>> >>> Hi, >>> >>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. 
In optimized mode, in some cases, the code aborts even doing simple initialization: >>> >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>> >>> u_array = 0.d0 >>> >>> v_array = 0.d0 >>> >>> w_array = 0.d0 >>> >>> p_array = 0.d0 >>> >>> >>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>> >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>> >>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>> Hi Matt, >>> >>> Do you mean putting the above lines into ex11f90.F and test? >>> >>> It already has DMDAVecGetArray(). Just run it. >>> Hi, >>> >>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>> >>> No the global/local difference should not matter. >>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>> >>> DMGetLocalVector() >>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>> >>> If so, when should I call them? >>> >>> You just need a local vector from somewhere. >>> Hi, >>> >>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>> >>> Thanks. >>> Hi, >>> >>> I insert part of my error region code into ex11f90: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>> >>> u_array = 0.d0 >>> v_array = 0.d0 >>> w_array = 0.d0 >>> p_array = 0.d0 >>> >>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>> >>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>> >>> module solve >>> <- add include file? >>> subroutine RRK >>> <- add include file? >>> end subroutine RRK >>> >>> end module solve >>> >>> So where should the include files (#include ) be placed? >>> >>> After the module or inside the subroutine? >>> >>> Thanks. >>> Matt >>> Thanks. >>> Matt >>> Thanks. >>> Matt >>> Thanks >>> >>> Regards. >>> Matt >>> As in w, then v and u? 
>>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> thanks >>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>> Hi, >>> >>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>> >>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>> >>> >>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>> >>> Barry >>> >>> Thanks. >>> Barry >>> >>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>> >>> Hi, >>> >>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>> >>> However, by re-writing my code, I found out a few things: >>> >>> 1. if I write my code this way: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> u_array = .... >>> >>> v_array = .... >>> >>> w_array = .... >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> The code runs fine. >>> >>> 2. if I write my code this way: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>> >>> where the subroutine is: >>> >>> subroutine uvw_array_change(u,v,w) >>> >>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>> >>> u ... >>> v... >>> w ... >>> >>> end subroutine uvw_array_change. >>> >>> The above will give an error at : >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> 3. Same as above, except I change the order of the last 3 lines to: >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> So they are now in reversed order. Now it works. >>> >>> 4. 
Same as 2 or 3, except the subroutine is changed to : >>> >>> subroutine uvw_array_change(u,v,w) >>> >>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>> >>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>> >>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>> >>> u ... >>> v... >>> w ... >>> >>> end subroutine uvw_array_change. >>> >>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>> >>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>> >>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>> >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> >>> >>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>> >>> Hi Barry, >>> >>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>> >>> I have attached my code. >>> >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>> Please send the code that creates da_w and the declarations of w_array >>> >>> Barry >>> >>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>> >>> wrote: >>> >>> >>> Hi Barry, >>> >>> I'm not too sure how to do it. I'm running mpi. So I run: >>> >>> mpirun -n 4 ./a.out -start_in_debugger >>> >>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>> >>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>> >>> mpirun -n 4 ./a.out -start_in_debugger >>> -------------------------------------------------------------------------- >>> An MPI process has executed an operation involving a call to the >>> "fork()" system call to create a child process. Open MPI is currently >>> operating in a condition that could result in memory corruption or >>> other system errors; your MPI job may hang, crash, or produce silent >>> data corruption. The use of fork() (or system() or other calls that >>> create child processes) is strongly discouraged. >>> >>> The process that invoked fork was: >>> >>> Local host: n12-76 (PID 20235) >>> MPI_COMM_WORLD rank: 2 >>> >>> If you are *absolutely sure* that your application will successfully >>> and correctly survive a call to fork(), you may disable this warning >>> by setting the mpi_warn_on_fork MCA parameter to 0. 
>>> -------------------------------------------------------------------------- >>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>> >>> .... >>> >>> 1 >>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [1]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>> [1]PETSC ERROR: to get more information on the crash. >>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [3]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>> [3]PETSC ERROR: to get more information on the crash. >>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>> >>> ... >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>> >>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>> >>> Barry >>> >>> This routines don?t have any parallel communication in them so are unlikely to hang. >>> >>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>> >>> >>> >>> wrote: >>> >>> >>> >>> Hi, >>> >>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. 
>>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> -- >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener From jon.the.wong at gmail.com Mon May 19 03:57:16 2014 From: jon.the.wong at gmail.com (Jonathan Wong) Date: Mon, 19 May 2014 01:57:16 -0700 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block Message-ID: I'm running into an issue for a symmetric (may not be pd) finite element problem where I am using the cg method and getting an indefinite_mat or indefinite_pc error using the jacobi preconditioner. If I change the pc type to bjacobi, it converges nicely. I am only using 1 process, and I assumed they would produce the same result, as I'm using default options. Does anyone have any ideas why this would happen? It also works fine with gmres + jacobi pc. Thanks, Jon -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Mon May 19 04:14:10 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Mon, 19 May 2014 11:14:10 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero Message-ID: On Thu, May 15, 2014 at 2:08 AM, Jed Brown wrote: > Christophe Ortiz writes: > > > Hi all, > > > > I am experiencing some problems of memory corruption with PetscMemzero(). > > > > I set the values of the Jacobian by blocks using MatSetValuesBlocked(). > To > > do so, I use some temporary two-dimensional arrays[dof][dof] that I must > > reset at each loop. 
> > > > Inside FormIJacobian, for instance, I declare the following > two-dimensional > > array: > > > > PetscScalar diag[dof][dof]; > > > > and then, to zero the array diag[][] I do > > > > ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); > > Note that this can also be spelled > > PetscMemzero(diag,sizeof diag); > Ok. > > > Then, inside main(), once dof is determined, I allocate memory for diag > as > > follows: > > > > diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > > > > for (k = 0; k < dof; k++){ > > diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > > } > > That is, the classical way to allocate memory using the pointer notation. > > Note that you can do a contiguous allocation by creating a Vec, then use > VecGetArray2D to get 2D indexing of it. > Good to know. I'll try. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 05:32:09 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 05:32:09 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <5379A433.5000401@gmail.com> References: <534C9A2C.5060404@gmail.com> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> <5379A433.5000401@gmail.com> Message-ID: On Mon, May 19, 2014 at 1:26 AM, TAY wee-beng wrote: > On 19/5/2014 11:36 AM, Barry Smith wrote: > >> On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: >> >> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>> >>>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>>> Hi Barry, >>>> >>>> I am trying to sort out the details so that it's easier to pinpoint the >>>> error. However, I tried on gnu gfortran and it worked well. On intel ifort, >>>> it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that >>>> it's a bug in ifort? Do you work with both intel and gnu? >>>> >>>> Yes it works with Intel. Is this using optimization? >>>> >>> Hi Matt, >>> >>> I forgot to add that in non-optimized cases, it works with gnu and >>> intel. However, in optimized cases, it works with gnu, but not intel. Does >>> it definitely mean that it's a bug in ifort? >>> >> No. Does it run clean under valgrind? >> > Hi, > > Do you mean the debug or optimized version? > optimized. Have you tried using a lower optimization level? Matt > > Thanks. > > >> Matt >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>>> Please send you current code. So we may compile and run it. >>>> >>>> Barry >>>> >>>> >>>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I have sent the entire code a while ago. Is there any answer? I was >>>> also trying myself but it worked for some intel compiler, and some not. I'm >>>> still not able to find the answer. gnu compilers for most cluster are old >>>> versions so they are not able to compile since I have allocatable >>>> structures. >>>> >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>>> Please send the entire code. 
If we can run it and reproduce the >>>> problem we can likely track down the issue much faster than through endless >>>> rounds of email. >>>> >>>> Barry >>>> >>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>> >>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng >>>> wrote: >>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng >>>> wrote: >>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>> >>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>> >>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>> Hmm, >>>> >>>> Interface DMDAVecGetArrayF90 >>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>> USE_DM_HIDE >>>> DM_HIDE da1 >>>> VEC_HIDE v >>>> PetscScalar,pointer :: d1(:,:,:) >>>> PetscErrorCode ierr >>>> End Subroutine >>>> >>>> So the d1 is a F90 POINTER. But your subroutine seems to be >>>> treating it as a ?plain old Fortran array?? >>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>> Hi, >>>> >>>> So d1 is a pointer, and it's different if I declare it as "plain old >>>> Fortran array"? Because I declare it as a Fortran array and it works w/o >>>> any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 >>>> with "u". >>>> >>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", >>>> "v" and "w", error starts to happen. I wonder why... >>>> >>>> Also, supposed I call: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> u_array .... >>>> >>>> v_array .... etc >>>> >>>> Now to restore the array, does it matter the sequence they are restored? >>>> No it should not matter. If it matters that is a sign that memory >>>> has been written to incorrectly earlier in the code. >>>> >>>> Hi, >>>> >>>> Hmm, I have been getting different results on different intel >>>> compilers. I'm not sure if MPI played a part but I'm only using a single >>>> processor. In the debug mode, things run without problem. In optimized >>>> mode, in some cases, the code aborts even doing simple initialization: >>>> >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> >>>> v_array = 0.d0 >>>> >>>> w_array = 0.d0 >>>> >>>> p_array = 0.d0 >>>> >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), >>>> giving segmentation error. But other >>>> version of intel compiler >>>> passes thru this part w/o error. Since the response is different among >>>> different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>> >>>> We do this is a bunch of examples. 
Can you reproduce this different >>>> behavior in src/dm/examples/tutorials/ex11f90.F? >>>> Hi Matt, >>>> >>>> Do you mean putting the above lines into ex11f90.F and test? >>>> >>>> It already has DMDAVecGetArray(). Just run it. >>>> Hi, >>>> >>>> It worked. The differences between mine and the code is the way the >>>> fortran modules are defined, and the ex11f90 only uses global vectors. Does >>>> it make a difference whether global or local vectors are used? Because the >>>> way it accesses x1 only touches the local region. >>>> >>>> No the global/local difference should not matter. >>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be >>>> used 1st, is that so? I can't find the equivalent for local vector though. >>>> >>>> DMGetLocalVector() >>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my >>>> code. Does it matter? >>>> >>>> If so, when should I call them? >>>> >>>> You just need a local vector from somewhere. >>>> Hi, >>>> >>>> Anyone can help with the questions below? Still trying to find why my >>>> code doesn't work. >>>> >>>> Thanks. >>>> Hi, >>>> >>>> I insert part of my error region code into ex11f90: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> v_array = 0.d0 >>>> w_array = 0.d0 >>>> p_array = 0.d0 >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> It worked w/o error. I'm going to change the way the modules are >>>> defined in my code. >>>> >>>> My code contains a main program and a number of modules files, with >>>> subroutines inside e.g. >>>> >>>> module solve >>>> <- add include file? >>>> subroutine RRK >>>> <- add include file? >>>> end subroutine RRK >>>> >>>> end module solve >>>> >>>> So where should the include files (#include ) >>>> be placed? >>>> >>>> After the module or inside the subroutine? >>>> >>>> Thanks. >>>> Matt >>>> Thanks. >>>> Matt >>>> Thanks. >>>> Matt >>>> Thanks >>>> >>>> Regards. >>>> Matt >>>> As in w, then v and u? >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> thanks >>>> Note also that the beginning and end indices of the u,v,w, are >>>> different for each process see for example >>>> http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/ >>>> tutorials/ex11f90.F (and they do not start at 1). This is how to get >>>> the loop bounds. >>>> Hi, >>>> >>>> In my case, I fixed the u,v,w such that their indices are the same. I >>>> also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem >>>> lies in my subroutine treating it as a ?plain old Fortran array?. >>>> >>>> If I declare them as pointers, their indices follow the C 0 start >>>> convention, is that so? >>>> Not really. It is that in each process you need to access them >>>> from the indices indicated by DMDAGetCorners() for global vectors and >>>> DMDAGetGhostCorners() for local vectors. So >>>> really C or Fortran >>>> doesn?t make any difference. >>>> >>>> >>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the >>>> Fortran 1 start convention. 
Is there some way to manipulate such that I do >>>> not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>> If you code wishes to access them with indices plus one from the >>>> values returned by DMDAGetCorners() for global vectors and >>>> DMDAGetGhostCorners() for local vectors then you need to manually subtract >>>> off the 1. >>>> >>>> Barry >>>> >>>> Thanks. >>>> Barry >>>> >>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I tried to pinpoint the problem. I reduced my job size and hence I can >>>> run on 1 processor. Tried using valgrind but perhaps I'm using the >>>> optimized version, it didn't catch the error, besides saying "Segmentation >>>> fault (core dumped)" >>>> >>>> However, by re-writing my code, I found out a few things: >>>> >>>> 1. if I write my code this way: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> u_array = .... >>>> >>>> v_array = .... >>>> >>>> w_array = .... >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> The code runs fine. >>>> >>>> 2. if I write my code this way: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does >>>> the same modification as the above. >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>> >>>> where the subroutine is: >>>> >>>> subroutine uvw_array_change(u,v,w) >>>> >>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>> >>>> u ... >>>> v... >>>> w ... >>>> >>>> end subroutine uvw_array_change. >>>> >>>> The above will give an error at : >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> 3. Same as above, except I change the order of the last 3 lines to: >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> So they are now in reversed order. Now it works. >>>> >>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>> >>>> subroutine uvw_array_change(u,v,w) >>>> >>>> real(8), intent(inout) :: u(start_indices(1):end_ >>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>> 3):end_indices(3)) >>>> >>>> real(8), intent(inout) :: v(start_indices(1):end_ >>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>> 3):end_indices(3)) >>>> >>>> real(8), intent(inout) :: w(start_indices(1):end_ >>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>> 3):end_indices(3)) >>>> >>>> u ... >>>> v... >>>> w ... >>>> >>>> end subroutine uvw_array_change. >>>> >>>> The start_indices and end_indices are simply to shift the 0 indices of >>>> C convention to that of the 1 indices of the Fortran convention. This is >>>> necessary in my case because most of my codes start array counting at 1, >>>> hence the "trick". 
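One way to keep the 1-based index convention inside the worker routine, while still using the arrays returned by DMDAVecGetArrayF90, is to pass the local array extents in as arguments and declare the dummy arrays with explicit shape starting at 1. The sketch below is only an illustration with made-up names, not the code attached in this thread. The essential requirement is that the dummy extents equal the extents of the array actually returned, which for a local (ghosted) vector are the widths reported by DMDAGetGhostCorners(); if the extents do not match, the (i,j,k) mapping into the vector is wrong even when the code appears to run.

      subroutine uvw_array_change(u, v, w, nx, ny, nz)
      implicit none
      integer, intent(in) :: nx, ny, nz
      real(8), intent(inout) :: u(1:nx, 1:ny, 1:nz)
      real(8), intent(inout) :: v(1:nx, 1:ny, 1:nz)
      real(8), intent(inout) :: w(1:nx, 1:ny, 1:nz)
      integer :: i, j, k

      do k = 1, nz
        do j = 1, ny
          do i = 1, nx
            ! local (i,j,k); the corresponding DMDA global index is
            ! (gxs+i-1, gys+j-1, gzs+k-1), with gxs,gys,gzs taken at the call site
            u(i, j, k) = 0.d0
            v(i, j, k) = 0.d0
            w(i, j, k) = 0.d0
          end do
        end do
      end do
      end subroutine uvw_array_change

At the call site one would first call DMDAGetGhostCorners(da_u, gxs, gys, gzs, gxm, gym, gzm, ierr) and then call uvw_array_change(u_array, v_array, w_array, gxm, gym, gzm), so the sizes always travel together with the arrays instead of living in separate module variables.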
>>>> >>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in >>>> 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> " >>>> >>>> So did I violate and cause memory corruption due to the trick above? >>>> But I can't think of any way other >>>> than the "trick" to continue using the 1 indices >>>> convention. >>>> >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>> Try running under valgrind http://www.mcs.anl.gov/petsc/ >>>> documentation/faq.html#valgrind >>>> >>>> >>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>> >>>> Hi Barry, >>>> >>>> As I mentioned earlier, the code works fine in PETSc debug mode but >>>> fails in non-debug mode. >>>> >>>> I have attached my code. >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>> Please send the code that creates da_w and the declarations of >>>> w_array >>>> >>>> Barry >>>> >>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>> >>>> wrote: >>>> >>>> >>>> Hi Barry, >>>> >>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>> >>>> mpirun -n 4 ./a.out -start_in_debugger >>>> >>>> I got the msg below. Before the gdb windows appear (thru x11), the >>>> program aborts. >>>> >>>> Also I tried running in another cluster and it worked. Also tried in >>>> the current cluster in debug mode and it worked too. >>>> >>>> mpirun -n 4 ./a.out -start_in_debugger >>>> ------------------------------------------------------------ >>>> -------------- >>>> An MPI process has executed an operation involving a call to the >>>> "fork()" system call to create a child process. Open MPI is currently >>>> operating in a condition that could result in memory corruption or >>>> other system errors; your MPI job may hang, crash, or produce silent >>>> data corruption. The use of fork() (or system() or other calls that >>>> create child processes) is strongly discouraged. >>>> >>>> The process that invoked fork was: >>>> >>>> Local host: n12-76 (PID 20235) >>>> MPI_COMM_WORLD rank: 2 >>>> >>>> If you are *absolutely sure* that your application will successfully >>>> and correctly survive a call to fork(), you may disable this warning >>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>> ------------------------------------------------------------ >>>> -------------- >>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display >>>> localhost:50.0 on machine n12-76 >>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display >>>> localhost:50.0 on machine n12-76 >>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display >>>> localhost:50.0 on machine n12-76 >>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display >>>> localhost:50.0 on machine n12-76 >>>> [n12-76:20232] 3 more processes have sent help message >>>> help-mpi-runtime.txt / mpi_init:warn-fork >>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see >>>> all help / error messages >>>> >>>> .... 
>>>> >>>> 1 >>>> [1]PETSC ERROR: ------------------------------ >>>> ------------------------------------------ >>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> [1]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [1]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSCERROR: or try >>>> http://valgrind.org >>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>>> and run >>>> [1]PETSC ERROR: to get more information on the crash. >>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file (null) >>>> [3]PETSC ERROR: ------------------------------ >>>> ------------------------------------------ >>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> [3]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [3]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSCERROR: or try >>>> http://valgrind.org >>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>>> and run >>>> [3]PETSC ERROR: to get more information on the crash. >>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file (null) >>>> >>>> ... >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>> >>>> Because IO doesn?t always get flushed immediately it may not be >>>> hanging at this point. It is better to use the option -start_in_debugger >>>> then type cont in each debugger window and then when you think it is >>>> ?hanging? do a control C in each debugger window and type where to see >>>> where each process is you can also look around in the debugger at variables >>>> to see why it is ?hanging? at that point. >>>> >>>> Barry >>>> >>>> This routines don?t have any parallel communication in them so are >>>> unlikely to hang. >>>> >>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>> >>>> >>>> >>>> wrote: >>>> >>>> >>>> >>>> Hi, >>>> >>>> My code hangs and I added in mpi_barrier and print to catch the bug. I >>>> found that it hangs after printing "7". Is it because I'm doing something >>>> wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After >>>> access, I use DMDAVecRestoreArrayF90. 
>>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"3" >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"4" >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"5" >>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_ >>>> cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"6" >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> !must be in reverse order >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"7" >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"8" >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> -- >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 05:41:13 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 05:41:13 -0500 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: Message-ID: On Mon, May 19, 2014 at 3:57 AM, Jonathan Wong wrote: > I'm running into an issue for a symmetric (may not be pd) finite element > problem where I am using the cg method and getting an indefinite_mat or > indefinite_pc error using the jacobi preconditioner. If I change the pc > type to bjacobi, it converges nicely. I am only using 1 process, and I > assumed they would produce the same result, as I'm using default options. > > Does anyone have any ideas why this would happen? > No, Block-Jacobi and Jacobi are completely different. If you are not positive definite, you should be using MINRES. Matt > It also works fine with gmres + jacobi pc. > > Thanks, > Jon > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 05:46:02 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 05:46:02 -0500 Subject: [petsc-users] DMPlexDistribute error In-Reply-To: <53797A73.9070106@unimelb.edu.au> References: <53797A73.9070106@unimelb.edu.au> Message-ID: On Sun, May 18, 2014 at 10:28 PM, Scott Wales wrote: > Hi, > > I'm trying to create a distributed unstructured grid in PETSc, and have > encountered the following error: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: bin/celllist_square on a arch-linux2-c-debug named > raijin5 by saw562 Mon May 19 13:00:31 2014 > [0]PETSC ERROR: Libraries linked from /home/562/saw562/opt/petsc/3. > 4.4/lib > [0]PETSC ERROR: Configure run at Fri May 16 14:23:02 2014 > [0]PETSC ERROR: Configure options --with-shared-libraries=1 > --prefix=/home/562/saw562/opt/petsc/3.4.4 --with-blas-lapack-lib="-L/ > apps/intel-ct/12.1.9.293/mkl/lib/intel64 -lmkl_intel_lp64 > -lmkl_sequential -lmkl_core -lpthread" --with-mpi-dir=/apps/openmpi/1.6.3 > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: ISGetIndices() line 372 in > src/vec/is/is/interface/index.c > [0]PETSC ERROR: DMPlexCreatePartitionClosure() line 2637 ihttps:// > gist.github.com/ScottWales/2758b5ec96573c63e31an src/dm/impls/plex/plex.c > [0]PETSC ERROR: DMPlexDistribute() line 2810 in > src/dm/impls/plex/plex.c > > I've created the DMPlex using `DMPlexCreateFromCellList`, added a default > section and then called `DMPlexDistribute` to spread the grid points across > all of the processors. You can see my test code at > https://gist.github.com/ScottWales/2758b5ec96573c63e31a#file- > petsc-test-c-L164. Have I missed a step in the grid setup? > 1) I have better checking in the 'master' branch, and we are about to release, so I recommend upgrading 2) You did not install with any mesh partitioner, so it freaked out. You need something like --download-chaco in configure. Thanks, Matt > Thanks, Scott > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 19 08:35:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 19 May 2014 07:35:28 -0600 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: Message-ID: <87egzpsxgf.fsf@jedbrown.org> Matthew Knepley writes: > No, Block-Jacobi and Jacobi are completely different. If you are not > positive definite, you should be using MINRES. MINRES requires an SPD preconditioner. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon May 19 12:43:48 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 19 May 2014 12:43:48 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <5379A433.5000401@gmail.com> References: <534C9A2C.5060404@gmail.com> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> <5379A433.5000401@gmail.com> Message-ID: On May 19, 2014, at 1:26 AM, TAY wee-beng wrote: > On 19/5/2014 11:36 AM, Barry Smith wrote: >> On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: >> >>> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>>> Hi Barry, >>>> >>>> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >>>> >>>> Yes it works with Intel. Is this using optimization? >>> Hi Matt, >>> >>> I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? >> No. Does it run clean under valgrind? > Hi, > > Do you mean the debug or optimized version? Both. > > Thanks. >> >>>> Matt >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>>> Please send you current code. So we may compile and run it. >>>> >>>> Barry >>>> >>>> >>>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >>>> >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. 
>>>> >>>> Barry >>>> >>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>> >>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>> >>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>> >>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>> Hmm, >>>> >>>> Interface DMDAVecGetArrayF90 >>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>> USE_DM_HIDE >>>> DM_HIDE da1 >>>> VEC_HIDE v >>>> PetscScalar,pointer :: d1(:,:,:) >>>> PetscErrorCode ierr >>>> End Subroutine >>>> >>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>> Hi, >>>> >>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>> >>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>> >>>> Also, supposed I call: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> u_array .... >>>> >>>> v_array .... etc >>>> >>>> Now to restore the array, does it matter the sequence they are restored? >>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>> >>>> Hi, >>>> >>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>> >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> >>>> v_array = 0.d0 >>>> >>>> w_array = 0.d0 >>>> >>>> p_array = 0.d0 >>>> >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>> >>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>> Hi Matt, >>>> >>>> Do you mean putting the above lines into ex11f90.F and test? >>>> >>>> It already has DMDAVecGetArray(). 
Just run it. >>>> Hi, >>>> >>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>> >>>> No the global/local difference should not matter. >>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>> >>>> DMGetLocalVector() >>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>> >>>> If so, when should I call them? >>>> >>>> You just need a local vector from somewhere. >>>> Hi, >>>> >>>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>>> >>>> Thanks. >>>> Hi, >>>> >>>> I insert part of my error region code into ex11f90: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> v_array = 0.d0 >>>> w_array = 0.d0 >>>> p_array = 0.d0 >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>> >>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>> >>>> module solve >>>> <- add include file? >>>> subroutine RRK >>>> <- add include file? >>>> end subroutine RRK >>>> >>>> end module solve >>>> >>>> So where should the include files (#include ) be placed? >>>> >>>> After the module or inside the subroutine? >>>> >>>> Thanks. >>>> Matt >>>> Thanks. >>>> Matt >>>> Thanks. >>>> Matt >>>> Thanks >>>> >>>> Regards. >>>> Matt >>>> As in w, then v and u? >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> thanks >>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>> Hi, >>>> >>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>> >>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>> >>>> >>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. 
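For reference, the access pattern described above, written out in C; the F90 pointer version obeys the same bounds. This is only a sketch: it assumes a DMDA called da with dof=1, a global vector U created from it, and the usual ierr declaration.

    #include <petscdmda.h>

    PetscInt    xs, ys, zs, xm, ym, zm, i, j, k;
    PetscScalar ***u;

    ierr = DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm);CHKERRQ(ierr); /* owned range, no ghosts */
    ierr = DMDAVecGetArray(da, U, &u);CHKERRQ(ierr);                       /* U is a global vector of da */
    for (k = zs; k < zs + zm; k++)
      for (j = ys; j < ys + ym; j++)
        for (i = xs; i < xs + xm; i++)
          u[k][j][i] = 0.0;   /* always index with the global indices returned by DMDAGetCorners */
    ierr = DMDAVecRestoreArray(da, U, &u);CHKERRQ(ierr);

For a local (ghosted) vector obtained with DMGetLocalVector, the only change is to take the loop bounds from DMDAGetGhostCorners instead.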
>>>> >>>> Barry >>>> >>>> Thanks. >>>> Barry >>>> >>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>> >>>> However, by re-writing my code, I found out a few things: >>>> >>>> 1. if I write my code this way: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> u_array = .... >>>> >>>> v_array = .... >>>> >>>> w_array = .... >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> The code runs fine. >>>> >>>> 2. if I write my code this way: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>> >>>> where the subroutine is: >>>> >>>> subroutine uvw_array_change(u,v,w) >>>> >>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>> >>>> u ... >>>> v... >>>> w ... >>>> >>>> end subroutine uvw_array_change. >>>> >>>> The above will give an error at : >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> 3. Same as above, except I change the order of the last 3 lines to: >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> So they are now in reversed order. Now it works. >>>> >>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>> >>>> subroutine uvw_array_change(u,v,w) >>>> >>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>> >>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>> >>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>> >>>> u ... >>>> v... >>>> w ... >>>> >>>> end subroutine uvw_array_change. >>>> >>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>> >>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>> >>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>> >>>> Thank you. 
>>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>> >>>> >>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>> >>>> Hi Barry, >>>> >>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>> >>>> I have attached my code. >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>> Please send the code that creates da_w and the declarations of w_array >>>> >>>> Barry >>>> >>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>> >>>> wrote: >>>> >>>> >>>> Hi Barry, >>>> >>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>> >>>> mpirun -n 4 ./a.out -start_in_debugger >>>> >>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>> >>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>> >>>> mpirun -n 4 ./a.out -start_in_debugger >>>> -------------------------------------------------------------------------- >>>> An MPI process has executed an operation involving a call to the >>>> "fork()" system call to create a child process. Open MPI is currently >>>> operating in a condition that could result in memory corruption or >>>> other system errors; your MPI job may hang, crash, or produce silent >>>> data corruption. The use of fork() (or system() or other calls that >>>> create child processes) is strongly discouraged. >>>> >>>> The process that invoked fork was: >>>> >>>> Local host: n12-76 (PID 20235) >>>> MPI_COMM_WORLD rank: 2 >>>> >>>> If you are *absolutely sure* that your application will successfully >>>> and correctly survive a call to fork(), you may disable this warning >>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>> -------------------------------------------------------------------------- >>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>> >>>> .... >>>> >>>> 1 >>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [1]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>> [1]PETSC ERROR: to get more information on the crash. 
>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [3]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>> [3]PETSC ERROR: to get more information on the crash. >>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>> >>>> ... >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>> >>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>> >>>> Barry >>>> >>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>> >>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>> >>>> >>>> >>>> wrote: >>>> >>>> >>>> >>>> Hi, >>>> >>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> -- >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener > From jon.the.wong at gmail.com Mon May 19 13:42:18 2014 From: jon.the.wong at gmail.com (Jonathan Wong) Date: Mon, 19 May 2014 11:42:18 -0700 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: <87egzpsxgf.fsf@jedbrown.org> References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: Thanks for the input. To clarify, I'm trying to compare GPU algorithms to Petsc, and they only have cg/jacobi for what I'm comparing at the moment. This is why I'm not using gmres (which also works well). I can solve the problem with the GPU (custom code) using CG + jacobi for all the meshes. On the CPU side, I can solve everything with cg/bjacobi and almost all of my meshes with cg/jacobi except for my 50k node mesh. I can solve the problem with my finite element built-in direct solver (just takes awhile) on one processor. I've been reading that by default the bjacobi pc uses one block per processor. So I had assumed that for one processor block-jacobi and jacobi would give similar results. cg+bjacobi works fine. cg+jacobi does not. I'll just look into the preconditioner code and use KSPview to try to figure out what the differences are for one processor. I'm not sure why the GPU can consistently solve the problem with cg/jacobi. I'm assuming this is due to the way round-off or the order of operations differences between the two. On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: > Matthew Knepley writes: > > No, Block-Jacobi and Jacobi are completely different. If you are not > > positive definite, you should be using MINRES. > > MINRES requires an SPD preconditioner. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 13:45:32 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 13:45:32 -0500 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong wrote: > Thanks for the input. To clarify, I'm trying to compare GPU algorithms to > Petsc, and they only have cg/jacobi for what I'm comparing at the moment. > This is why I'm not using gmres (which also works well). > > I can solve the problem with the GPU (custom code) using CG + jacobi for > all the meshes. On the CPU side, I can solve everything with cg/bjacobi and > almost all of my meshes with cg/jacobi except for my 50k node mesh. I can > solve the problem with my finite element built-in direct solver (just takes > awhile) on one processor. I've been reading that by default the bjacobi pc > uses one block per processor. So I had assumed that for one processor > block-jacobi and jacobi would give similar results. cg+bjacobi works fine. > cg+jacobi does not. > "Jacobi" means preconditioning by the inverse of the diagonal of the matrix. Block-Jacobi means using a preconditioner formed from each of the blocks, in this case 1 block. By default the inner preconditioner is ILU(0), not jacobi. 
You can make them equivalent using -sub_pc_type jacobi. Matt > I'll just look into the preconditioner code and use KSPview to try to > figure out what the differences are for one processor. I'm not sure why the > GPU can consistently solve the problem with cg/jacobi. I'm assuming this is > due to the way round-off or the order of operations differences between the > two. > > > On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: > >> Matthew Knepley writes: >> > No, Block-Jacobi and Jacobi are completely different. If you are not >> > positive definite, you should be using MINRES. >> >> MINRES requires an SPD preconditioner. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From paulmullowney at gmail.com Mon May 19 14:07:53 2014 From: paulmullowney at gmail.com (Paul Mullowney) Date: Mon, 19 May 2014 13:07:53 -0600 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: I don't think bjacobi is working on GPUs. I know Dominic made a pull request a few months ago, but I don't know if its been integrated into next. -Paul On Mon, May 19, 2014 at 12:45 PM, Matthew Knepley wrote: > On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong wrote: > >> Thanks for the input. To clarify, I'm trying to compare GPU algorithms to >> Petsc, and they only have cg/jacobi for what I'm comparing at the moment. >> This is why I'm not using gmres (which also works well). >> >> I can solve the problem with the GPU (custom code) using CG + jacobi for >> all the meshes. On the CPU side, I can solve everything with cg/bjacobi and >> almost all of my meshes with cg/jacobi except for my 50k node mesh. I can >> solve the problem with my finite element built-in direct solver (just takes >> awhile) on one processor. I've been reading that by default the bjacobi pc >> uses one block per processor. So I had assumed that for one processor >> block-jacobi and jacobi would give similar results. cg+bjacobi works fine. >> cg+jacobi does not. >> > > "Jacobi" means preconditioning by the inverse of the diagonal of the > matrix. Block-Jacobi means using a preconditioner > formed from each of the blocks, in this case 1 block. By default the inner > preconditioner is ILU(0), not jacobi. You can > make them equivalent using -sub_pc_type jacobi. > > Matt > > >> I'll just look into the preconditioner code and use KSPview to try to >> figure out what the differences are for one processor. I'm not sure why the >> GPU can consistently solve the problem with cg/jacobi. I'm assuming this is >> due to the way round-off or the order of operations differences between the >> two. >> >> >> On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: >> >>> Matthew Knepley writes: >>> > No, Block-Jacobi and Jacobi are completely different. If you are not >>> > positive definite, you should be using MINRES. >>> >>> MINRES requires an SPD preconditioner. >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
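For a one-process, apples-to-apples check of the two preconditioners discussed above, a pair of run lines (the executable name is a placeholder; -sub_ksp_type preonly is already the bjacobi default and is spelled out only for clarity):

    ./myapp -ksp_type cg -pc_type jacobi  -ksp_converged_reason -ksp_view
    ./myapp -ksp_type cg -pc_type bjacobi -sub_ksp_type preonly -sub_pc_type jacobi -ksp_converged_reason -ksp_view

With a single block these two should behave identically; -ksp_view prints which sub-preconditioner was actually used, so any remaining difference points back to the default ILU(0) inside bjacobi.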
URL: From jon.the.wong at gmail.com Mon May 19 15:16:16 2014 From: jon.the.wong at gmail.com (Jonathan Wong) Date: Mon, 19 May 2014 13:16:16 -0700 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: Matthew: Thanks for clarifying about the block-jacobi. Paul: I'm only using bjacobi with PETSc to show that the problem is solvable, and to provide some "estimation" as to the performance of the jacobi preconditioner. On the GPU, I am using CUSP to do cg+jacobi which works fine for this 50k node mesh. On Mon, May 19, 2014 at 12:07 PM, Paul Mullowney wrote: > I don't think bjacobi is working on GPUs. I know Dominic made a pull > request a few months ago, but I don't know if its been integrated into next. > -Paul > > > On Mon, May 19, 2014 at 12:45 PM, Matthew Knepley wrote: > >> On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong wrote: >> >>> Thanks for the input. To clarify, I'm trying to compare GPU algorithms >>> to Petsc, and they only have cg/jacobi for what I'm comparing at the >>> moment. This is why I'm not using gmres (which also works well). >>> >>> I can solve the problem with the GPU (custom code) using CG + jacobi for >>> all the meshes. On the CPU side, I can solve everything with cg/bjacobi and >>> almost all of my meshes with cg/jacobi except for my 50k node mesh. I can >>> solve the problem with my finite element built-in direct solver (just takes >>> awhile) on one processor. I've been reading that by default the bjacobi pc >>> uses one block per processor. So I had assumed that for one processor >>> block-jacobi and jacobi would give similar results. cg+bjacobi works fine. >>> cg+jacobi does not. >>> >> >> "Jacobi" means preconditioning by the inverse of the diagonal of the >> matrix. Block-Jacobi means using a preconditioner >> formed from each of the blocks, in this case 1 block. By default the >> inner preconditioner is ILU(0), not jacobi. You can >> make them equivalent using -sub_pc_type jacobi. >> >> Matt >> >> >>> I'll just look into the preconditioner code and use KSPview to try to >>> figure out what the differences are for one processor. I'm not sure why the >>> GPU can consistently solve the problem with cg/jacobi. I'm assuming this is >>> due to the way round-off or the order of operations differences between the >>> two. >>> >>> >>> On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: >>> >>>> Matthew Knepley writes: >>>> > No, Block-Jacobi and Jacobi are completely different. If you are not >>>> > positive definite, you should be using MINRES. >>>> >>>> MINRES requires an SPD preconditioner. >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 15:20:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 15:20:05 -0500 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: On Mon, May 19, 2014 at 3:16 PM, Jonathan Wong wrote: > Matthew: Thanks for clarifying about the block-jacobi. 
> > Paul: I'm only using bjacobi with PETSc to show that the problem is > solvable, and to provide some "estimation" as to the performance of the > jacobi preconditioner. On the GPU, I am using CUSP to do cg+jacobi which > works fine for this 50k node mesh. > Its possible the GPU CG code is just ignoring breakdown and continuing the solve. This may work sometimes, but could give incorrect answers. Also, it seems simply beyond belief that CG+Jacobi could solve any FEM problem other than the identity. For example, the Laplacian has a condition number that is proportional to h^{-2}, so it grows like N for linear finite elements in 2D. Are you trying to solve something with an extremely small timestep so that it looks like the identity? Matt > On Mon, May 19, 2014 at 12:07 PM, Paul Mullowney wrote: > >> I don't think bjacobi is working on GPUs. I know Dominic made a pull >> request a few months ago, but I don't know if its been integrated into next. >> -Paul >> >> >> On Mon, May 19, 2014 at 12:45 PM, Matthew Knepley wrote: >> >>> On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong wrote: >>> >>>> Thanks for the input. To clarify, I'm trying to compare GPU algorithms >>>> to Petsc, and they only have cg/jacobi for what I'm comparing at the >>>> moment. This is why I'm not using gmres (which also works well). >>>> >>>> I can solve the problem with the GPU (custom code) using CG + jacobi >>>> for all the meshes. On the CPU side, I can solve everything with cg/bjacobi >>>> and almost all of my meshes with cg/jacobi except for my 50k node mesh. I >>>> can solve the problem with my finite element built-in direct solver (just >>>> takes awhile) on one processor. I've been reading that by default the >>>> bjacobi pc uses one block per processor. So I had assumed that for one >>>> processor block-jacobi and jacobi would give similar results. cg+bjacobi >>>> works fine. cg+jacobi does not. >>>> >>> >>> "Jacobi" means preconditioning by the inverse of the diagonal of the >>> matrix. Block-Jacobi means using a preconditioner >>> formed from each of the blocks, in this case 1 block. By default the >>> inner preconditioner is ILU(0), not jacobi. You can >>> make them equivalent using -sub_pc_type jacobi. >>> >>> Matt >>> >>> >>>> I'll just look into the preconditioner code and use KSPview to try to >>>> figure out what the differences are for one processor. I'm not sure why the >>>> GPU can consistently solve the problem with cg/jacobi. I'm assuming this is >>>> due to the way round-off or the order of operations differences between the >>>> two. >>>> >>>> >>>> On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: >>>> >>>>> Matthew Knepley writes: >>>>> > No, Block-Jacobi and Jacobi are completely different. If you are not >>>>> > positive definite, you should be using MINRES. >>>>> >>>>> MINRES requires an SPD preconditioner. >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
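A rough scaling sketch behind the condition-number remark above, assuming a quasi-uniform mesh with spacing h: for the Laplacian with linear elements, kappa(A) = O(h^{-2}), and in 2D the number of unknowns is N = O(h^{-2}), so kappa = O(N). Jacobi only rescales the diagonal and does not change this h-dependence, so CG needs on the order of sqrt(kappa) = O(N^{1/2}) iterations, i.e. the iteration count keeps growing under refinement rather than staying bounded the way it does for multigrid.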
URL: From mrosso at uci.edu Mon May 19 17:18:29 2014 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 19 May 2014 15:18:29 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <87y4y0uar8.fsf@jedbrown.org> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> Message-ID: <537A8335.4080702@uci.edu> Jed, thanks for your reply. By using the options you suggested, namely /-mg_levels_ksp_type richardson -mg_levels_pc_type sor/, I was able to solve without bumping into the DIVERGED_INDEFINITE_PC message. Nevertheless, the number of iterations increases drastically as the simulation progresses. The Poisson's equation I am solving arises from a variable-density projection method for incompressible multi-phase flows. At each time step the system matrix coefficients change as a consequence of the change in location of the heavier phase; the rhs changes in time because of the change in the velocity field. Usually the black-box multigrid or the deflated conjugate gradient method are used to solve efficiently this type of problem: it is my understanding - please correct me if I am wrong - that AMG is a generalization of the former. The only source term acting is gravity; the hydrostatic pressure is removed from the governing equation in order to accommodate periodic boundary conditions: this is more a hack than a clean solution. Could it be the reason behind the poor performances/ DIVERGED_INDEFINITE_PC problem I am experiencing? Thanks, Michele On 05/17/2014 12:26 AM, Jed Brown wrote: > Michele Rosso writes: > >> Hi, >> >> I am solving an inhomogeneous Laplacian in 3D (basically a slightly >> modified version of example ex34). >> The laplacian is discretized by using a cell-center finite difference >> 7-point stencil with periodic BCs. >> I am solving a time-dependent problem so the solution of the laplacian >> is repeated at each time step with a different matrix (always SPD >> though) and rhs. Also, the laplacian features large magnitude variations >> in the coefficients. I solve by means of CG + GAMG as preconditioner. >> Everything works fine for a while until I receive a >> DIVERGED_INDEFINITE_PC message. > What is changing as you time step? Is there a nonlinearity that > activates suddenly? Especially a bifurcation or perhaps a source term > that is incompatible with the boundary conditions? You could try > -mg_levels_ksp_type richardson -mg_levels_pc_type sor. Can you > reproduce with a small problem? > > The configuration looks okay to me. > >> Before checking my model is incorrect I would like to rule out the >> possibility of improper use of the linear solver. I attached the full >> output of a serial run with -log-summary -ksp_view >> -ksp_converged_reason ksp_monitor_true_residual. I would appreciate if >> you could help me in locating the issue. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 19 17:30:06 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 19 May 2014 16:30:06 -0600 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <537A8335.4080702@uci.edu> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> Message-ID: <8761l1qu4x.fsf@jedbrown.org> Michele Rosso writes: > Jed, > > thanks for your reply. > By using the options you suggested, namely /-mg_levels_ksp_type > richardson -mg_levels_pc_type sor/, I was able to > solve without bumping into the DIVERGED_INDEFINITE_PC message. 
> Nevertheless, the number of iterations increases drastically as the > simulation progresses. What about SOR with Chebyshev? (A little weird, but sometimes it's a good choice.) If the solve is expensive, you can add a few more iterations for eigenvalue estimation. > The Poisson's equation I am solving arises from a variable-density > projection method for incompressible multi-phase flows. > At each time step the system matrix coefficients change as a consequence > of the change in location of the heavier phase; the rhs changes > in time because of the change in the velocity field. Usually the > black-box multigrid or the deflated conjugate gradient method are used > to solve efficiently this type of problem: it is my understanding - > please correct me if I am wrong - that AMG is a generalization of the > former. Dendy's "black-box MG" is a semi-geometric method for cell-centered discretizations. AMG is not a superset or subset of those methods. > The only source term acting is gravity; the hydrostatic pressure is > removed from the governing equation in order to accommodate periodic > boundary conditions: this is more a hack than a clean solution. Could it > be the reason behind the poor performances/ DIVERGED_INDEFINITE_PC > problem I am experiencing? If you have periodic boundary conditions, then you also have a pressure null space. Have you removed the null space from the RHS and supplied the null space to the solver? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From mrosso at uci.edu Mon May 19 17:41:48 2014 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 19 May 2014 15:41:48 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <8761l1qu4x.fsf@jedbrown.org> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> Message-ID: <537A88AC.3060308@uci.edu> Jed, thank you very much! I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type sor/ and report back. Yes, I removed the nullspace from both the system matrix and the rhs. Is there a way to have something similar to Dendy's multigrid or the deflated conjugate gradient method with PETSc? Thank you, Michele // On 05/19/2014 03:30 PM, Jed Brown wrote: > Michele Rosso writes: > >> Jed, >> >> thanks for your reply. >> By using the options you suggested, namely /-mg_levels_ksp_type >> richardson -mg_levels_pc_type sor/, I was able to >> solve without bumping into the DIVERGED_INDEFINITE_PC message. >> Nevertheless, the number of iterations increases drastically as the >> simulation progresses. > What about SOR with Chebyshev? (A little weird, but sometimes it's a > good choice.) If the solve is expensive, you can add a few more > iterations for eigenvalue estimation. > >> The Poisson's equation I am solving arises from a variable-density >> projection method for incompressible multi-phase flows. >> At each time step the system matrix coefficients change as a consequence >> of the change in location of the heavier phase; the rhs changes >> in time because of the change in the velocity field. Usually the >> black-box multigrid or the deflated conjugate gradient method are used >> to solve efficiently this type of problem: it is my understanding - >> please correct me if I am wrong - that AMG is a generalization of the >> former. 
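For completeness, a minimal C sketch of what supplying the constant (pressure) null space and making the right-hand side consistent looks like; A, b and ierr are assumed to exist already, and the two-argument MatNullSpaceRemove is the newer calling sequence (petsc-3.4 takes an extra trailing NULL):

    MatNullSpace nullsp;

    ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, NULL, &nullsp);CHKERRQ(ierr); /* constant vector */
    ierr = MatSetNullSpace(A, nullsp);CHKERRQ(ierr);    /* petsc-3.6+ KSP picks this up from the matrix;  */
                                                        /* older versions use KSPSetNullSpace(ksp,nullsp) */
    ierr = MatNullSpaceRemove(nullsp, b);CHKERRQ(ierr); /* project the constant out of the rhs            */
    ierr = MatNullSpaceDestroy(&nullsp);CHKERRQ(ierr);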
> Dendy's "black-box MG" is a semi-geometric method for cell-centered > discretizations. AMG is not a superset or subset of those methods. > >> The only source term acting is gravity; the hydrostatic pressure is >> removed from the governing equation in order to accommodate periodic >> boundary conditions: this is more a hack than a clean solution. Could it >> be the reason behind the poor performances/ DIVERGED_INDEFINITE_PC >> problem I am experiencing? > If you have periodic boundary conditions, then you also have a pressure > null space. Have you removed the null space from the RHS and supplied > the null space to the solver? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 19 17:49:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 19 May 2014 16:49:21 -0600 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <537A88AC.3060308@uci.edu> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> Message-ID: <871tvpqt8u.fsf@jedbrown.org> Michele Rosso writes: > Jed, > > thank you very much! > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type > sor/ and report back. > Yes, I removed the nullspace from both the system matrix and the rhs. > Is there a way to have something similar to Dendy's multigrid or the > deflated conjugate gradient method with PETSc? Dendy's MG needs geometry. The algorithm to produce the interpolation operators is not terribly complicated so it could be done, though DMDA support for cell-centered is a somewhat awkward. "Deflated CG" can mean lots of things so you'll have to be more precise. (Most everything in the "deflation" world has a clear analogue in the MG world, but the deflation community doesn't have a precise language to talk about their methods so you always have to read the paper carefully to find out if it's completely standard or if there is something new.) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From andrewdalecramer at gmail.com Mon May 19 20:50:38 2014 From: andrewdalecramer at gmail.com (Andrew Cramer) Date: Tue, 20 May 2014 11:50:38 +1000 Subject: [petsc-users] Accessing Global Vectors Message-ID: Hi All, I'm new to PETSc and would like to use it as my linear elasticity solver within a structural optimisation program. Originally I was using GP-GPUs and CUDA for my solver but I would like to shift to using PETSc to leverage it's breadth of trustworthy solvers. We have some SMP servers and a couple compute clusters (one with GPUs, one without). I've been digging through the docs and I'd like some feedback on my plan and perhaps some pointers if at all possible. The plan is to keep the 6000 lines or so of current code and try as much as possible to use PETSc as a 'drop-in'. This would require giving one field (array) of densities and receiving a 3d field (array) of displacements back. Providing the density field would be easy with the usual array construction functions on one node/process but pulling the displacements back to the 'controlling' node would be difficult. I understand that this goes against the ethos of PETSc which is distributed all the way. My code is highly modular with differing objective functions and optimisers (some of which are written by other research groups) that I drop in and pull out. 
I don't want to throw all that away. I would need to relearn object oriented programming within PETSc (currently I use c++) and rewrite my entire code base. In terms of performance the optimisers typically rely heavily on tight loops of reductions once the solve is completed so I'm not sure that the speed-up would be too great rewriting them as distributed anyway. Sorry for the long winded post but I'm just not sure how to move forward, I'm sick of implementing every solver I want to try in CUDA especially knowing that people have done it better than I can in PETSc. But it's a framework that I don't know how to interface with, all the examples seem to have the solve as the main thing rather than one part of a broader program. Andrew Cramer University of Queensland, Australia PhD Candidate -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 21:27:36 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 21:27:36 -0500 Subject: [petsc-users] Accessing Global Vectors In-Reply-To: References: Message-ID: On Mon, May 19, 2014 at 8:50 PM, Andrew Cramer wrote: > Hi All, > > I'm new to PETSc and would like to use it as my linear elasticity solver > within a structural optimisation program. Originally I was using GP-GPUs > and CUDA for my solver but I would like to shift to using PETSc to leverage > it's breadth of trustworthy solvers. We have some SMP servers and a couple > compute clusters (one with GPUs, one without). I've been digging through > the docs and I'd like some feedback on my plan and perhaps some pointers if > at all possible. > > The plan is to keep the 6000 lines or so of current code and try as much > as possible to use PETSc as a 'drop-in'. This would require giving one > field (array) of densities and receiving a 3d field (array) of > displacements back. Providing the density field would be easy with the > usual array construction functions on one node/process but pulling the > displacements back to the 'controlling' node would be difficult. > > I understand that this goes against the ethos of PETSc which is > distributed all the way. My code is highly modular with differing objective > functions and optimisers (some of which are written by other research > groups) that I drop in and pull out. I don't want to throw all that away. I > would need to relearn object oriented programming within PETSc (currently I > use c++) and rewrite my entire code base. In terms of performance the > optimisers typically rely heavily on tight loops of reductions once the > solve is completed so I'm not sure that the speed-up would be too great > rewriting them as distributed anyway. > > Sorry for the long winded post but I'm just not sure how to move forward, > I'm sick of implementing every solver I want to try in CUDA especially > knowing that people have done it better than I can in PETSc. But it's a > framework that I don't know how to interface with, all the examples seem to > have the solve as the main thing rather than one part of a broader program. > 1) PETSc can do a good job on linear elasticity. GAMG is particularly effective, and we have an example: http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex56.c.html 2) You can use this function to go back and forth from 1 process http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Vec/VecScatterCreateToZero.html 3) The expense of pushing all that data to nodes can large. 
You might be better off just using GAMG on 1 process, which is how I would start. Matt > Andrew Cramer > University of Queensland, Australia > PhD Candidate > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewdalecramer at gmail.com Tue May 20 00:57:58 2014 From: andrewdalecramer at gmail.com (Andrew Cramer) Date: Tue, 20 May 2014 15:57:58 +1000 Subject: [petsc-users] Accessing Global Vectors In-Reply-To: References: Message-ID: On 20 May 2014 12:27, Matthew Knepley wrote: > On Mon, May 19, 2014 at 8:50 PM, Andrew Cramer > wrote: > >> Hi All, >> >> I'm new to PETSc and would like to use it as my linear elasticity solver >> within a structural optimisation program. Originally I was using GP-GPUs >> and CUDA for my solver but I would like to shift to using PETSc to leverage >> it's breadth of trustworthy solvers. We have some SMP servers and a couple >> compute clusters (one with GPUs, one without). I've been digging through >> the docs and I'd like some feedback on my plan and perhaps some pointers if >> at all possible. >> >> The plan is to keep the 6000 lines or so of current code and try as much >> as possible to use PETSc as a 'drop-in'. This would require giving one >> field (array) of densities and receiving a 3d field (array) of >> displacements back. Providing the density field would be easy with the >> usual array construction functions on one node/process but pulling the >> displacements back to the 'controlling' node would be difficult. >> >> I understand that this goes against the ethos of PETSc which is >> distributed all the way. My code is highly modular with differing objective >> functions and optimisers (some of which are written by other research >> groups) that I drop in and pull out. I don't want to throw all that away. I >> would need to relearn object oriented programming within PETSc (currently I >> use c++) and rewrite my entire code base. In terms of performance the >> optimisers typically rely heavily on tight loops of reductions once the >> solve is completed so I'm not sure that the speed-up would be too great >> rewriting them as distributed anyway. >> >> Sorry for the long winded post but I'm just not sure how to move forward, >> I'm sick of implementing every solver I want to try in CUDA especially >> knowing that people have done it better than I can in PETSc. But it's a >> framework that I don't know how to interface with, all the examples seem to >> have the solve as the main thing rather than one part of a broader program. >> > > 1) PETSc can do a good job on linear elasticity. GAMG is particularly > effective, and we have an example: > > > http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex56.c.html > > 2) You can use this function to go back and forth from 1 process > > > http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Vec/VecScatterCreateToZero.html > > 3) The expense of pushing all that data to nodes can large. You might be > better off just using GAMG on 1 process, which is how I would start. > > Matt > > >> Andrew Cramer >> University of Queensland, Australia >> PhD Candidate >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > Thanks for your help, I was eyeing off ksp/ex29 as it uses DMDA which I thought would simplify things. I'll take a look at ex56 instead and see what I can do. Andrew -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 20 05:22:35 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 05:22:35 -0500 Subject: [petsc-users] Accessing Global Vectors In-Reply-To: References: Message-ID: On Tue, May 20, 2014 at 12:57 AM, Andrew Cramer wrote: > On 20 May 2014 12:27, Matthew Knepley wrote: > >> On Mon, May 19, 2014 at 8:50 PM, Andrew Cramer < >> andrewdalecramer at gmail.com> wrote: >> >>> Hi All, >>> >>> I'm new to PETSc and would like to use it as my linear elasticity solver >>> within a structural optimisation program. Originally I was using GP-GPUs >>> and CUDA for my solver but I would like to shift to using PETSc to leverage >>> it's breadth of trustworthy solvers. We have some SMP servers and a couple >>> compute clusters (one with GPUs, one without). I've been digging through >>> the docs and I'd like some feedback on my plan and perhaps some pointers if >>> at all possible. >>> >>> The plan is to keep the 6000 lines or so of current code and try as much >>> as possible to use PETSc as a 'drop-in'. This would require giving one >>> field (array) of densities and receiving a 3d field (array) of >>> displacements back. Providing the density field would be easy with the >>> usual array construction functions on one node/process but pulling the >>> displacements back to the 'controlling' node would be difficult. >>> >>> I understand that this goes against the ethos of PETSc which is >>> distributed all the way. My code is highly modular with differing objective >>> functions and optimisers (some of which are written by other research >>> groups) that I drop in and pull out. I don't want to throw all that away. I >>> would need to relearn object oriented programming within PETSc (currently I >>> use c++) and rewrite my entire code base. In terms of performance the >>> optimisers typically rely heavily on tight loops of reductions once the >>> solve is completed so I'm not sure that the speed-up would be too great >>> rewriting them as distributed anyway. >>> >>> Sorry for the long winded post but I'm just not sure how to move >>> forward, I'm sick of implementing every solver I want to try in CUDA >>> especially knowing that people have done it better than I can in PETSc. But >>> it's a framework that I don't know how to interface with, all the examples >>> seem to have the solve as the main thing rather than one part of a broader >>> program. >>> >> >> 1) PETSc can do a good job on linear elasticity. GAMG is particularly >> effective, and we have an example: >> >> >> http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex56.c.html >> >> 2) You can use this function to go back and forth from 1 process >> >> >> http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Vec/VecScatterCreateToZero.html >> >> 3) The expense of pushing all that data to nodes can large. You might be >> better off just using GAMG on 1 process, which is how I would start. >> >> Matt >> >> >>> Andrew Cramer >>> University of Queensland, Australia >>> PhD Candidate >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
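Since the plan above is to pull the displacement field back to a single controlling rank, here is a minimal sketch of the VecScatterCreateToZero pattern pointed to earlier in the thread; x is assumed to be the distributed solution vector and ierr the usual error code:

    #include <petscvec.h>

    Vec        xseq;     /* on rank 0 this holds every entry of x; on other ranks it has length 0 */
    VecScatter scatter;

    ierr = VecScatterCreateToZero(x, &scatter, &xseq);CHKERRQ(ierr);
    ierr = VecScatterBegin(scatter, x, xseq, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecScatterEnd(scatter, x, xseq, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
    /* rank 0 can now VecGetArray(xseq,...) and hand the full field to the optimiser */
    ierr = VecScatterDestroy(&scatter);CHKERRQ(ierr);
    ierr = VecDestroy(&xseq);CHKERRQ(ierr);

The same scatter run with SCATTER_REVERSE pushes data from the rank-0 vector back out to the distributed one, which covers the density field going the other way.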
>> -- Norbert Wiener >> > > > Thanks for your help, I was eyeing off ksp/ex29 as it uses DMDA which I > thought would simplify things. I'll take a look at ex56 instead and see > what I can do. > If you have a completely structured grid, DMDA is definitely simpler, although it is a little awkward for cell-centered discretizations. We have some really new support for arbitrary discretizations on top of DMDA, but it is alpha. Thanks, Matt > Andrew > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Tue May 20 07:12:45 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Tue, 20 May 2014 14:12:45 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87zjij6gzu.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> Message-ID: I found another problem when using two-dimensional arrays defined using pointers of pointers. When I use a "classical" two-dimensional array defined by PetscScalar array[dof][dof]; and then build the Jacobian using ierr = MatSetValuesBlocked(*Jpre,1,&row,1,&col,&array[0][0],INSERT_VALUES); It works fine. The problem comes when I define the two-dimensional array as follows: PetscScalar **array; array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); for (k = 0; k < dof; k++){ array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); } When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc complains because I am not passing it the right way or when it accepts it, wrong data is passed because the solution is not correct. Maybe Petsc expect dof*dof values and only dof are passed ? How a two-dimensional array declared with pointers of pointers should be passed to MatSetValuesBlocked() ? Many thanks in advance. Christophe On Thu, May 15, 2014 at 2:08 AM, Jed Brown wrote: > Christophe Ortiz writes: > > > Hi all, > > > > I am experiencing some problems of memory corruption with PetscMemzero(). > > > > I set the values of the Jacobian by blocks using MatSetValuesBlocked(). > To > > do so, I use some temporary two-dimensional arrays[dof][dof] that I must > > reset at each loop. > > > > Inside FormIJacobian, for instance, I declare the following > two-dimensional > > array: > > > > PetscScalar diag[dof][dof]; > > > > and then, to zero the array diag[][] I do > > > > ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); > > Note that this can also be spelled > > PetscMemzero(diag,sizeof diag); > > > Then, inside main(), once dof is determined, I allocate memory for diag > as > > follows: > > > > diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > > > > for (k = 0; k < dof; k++){ > > diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > > } > > That is, the classical way to allocate memory using the pointer notation. > > Note that you can do a contiguous allocation by creating a Vec, then use > VecGetArray2D to get 2D indexing of it. > -------------- next part -------------- An HTML attachment was scrubbed... 
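A possible workaround for the question above, sketched in C with the names from that message (row, col, dof, Jpre) and an identity block purely as a placeholder: allocate the dof x dof block as one flat, contiguous array and index it by hand, since MatSetValuesBlocked() reads dof*dof contiguous values (row-major by default) for a single block:

    PetscScalar *block;
    PetscInt    i;

    block = (PetscScalar*) malloc(dof*dof*sizeof(PetscScalar));    /* one allocation, so it is contiguous      */
    ierr  = PetscMemzero(block, dof*dof*sizeof(PetscScalar));CHKERRQ(ierr);
    for (i = 0; i < dof; i++) block[i*dof + i] = 1.0;              /* block[i*dof+j] stands in for array[i][j] */
    ierr  = MatSetValuesBlocked(*Jpre, 1, &row, 1, &col, block, INSERT_VALUES);CHKERRQ(ierr);
    free(block);

If the array[i][j] notation is wanted as well, the row pointers can simply be pointed into this single slab instead of being malloc'ed one by one.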
URL: From knepley at gmail.com Tue May 20 07:16:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 07:16:05 -0500 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 7:12 AM, Christophe Ortiz < christophe.ortiz at ciemat.es> wrote: > I found another problem when using two-dimensional arrays defined using > pointers of pointers. > > When I use a "classical" two-dimensional array defined by > > PetscScalar array[dof][dof]; > This declaration will use contiguous memory since its on the stack. > and then build the Jacobian using > > ierr = MatSetValuesBlocked(*Jpre,1,&row,1,&col,&array[0][0],INSERT_VALUES); > > It works fine. > > The problem comes when I define the two-dimensional array as follows: > > PetscScalar **array; > This one uses non-contiguous memory since its on the heap. > array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > for (k = 0; k < dof; k++){ > array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > } > > When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc > complains because I am not passing it the right way or when it accepts it, > wrong data is passed because the solution is not correct. Maybe Petsc > expect dof*dof values and only dof are passed ? > You can only pass contiguous memory to MatSetValues(). Matt > How a two-dimensional array declared with pointers of pointers should be > passed to MatSetValuesBlocked() ? > > Many thanks in advance. > Christophe > > > > > On Thu, May 15, 2014 at 2:08 AM, Jed Brown wrote: > >> Christophe Ortiz writes: >> >> > Hi all, >> > >> > I am experiencing some problems of memory corruption with >> PetscMemzero(). >> > >> > I set the values of the Jacobian by blocks using MatSetValuesBlocked(). >> To >> > do so, I use some temporary two-dimensional arrays[dof][dof] that I must >> > reset at each loop. >> > >> > Inside FormIJacobian, for instance, I declare the following >> two-dimensional >> > array: >> > >> > PetscScalar diag[dof][dof]; >> > >> > and then, to zero the array diag[][] I do >> > >> > ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); >> >> Note that this can also be spelled >> >> PetscMemzero(diag,sizeof diag); >> >> > Then, inside main(), once dof is determined, I allocate memory for diag >> as >> > follows: >> > >> > diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); >> > >> > for (k = 0; k < dof; k++){ >> > diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); >> > } >> > That is, the classical way to allocate memory using the pointer >> notation. >> >> Note that you can do a contiguous allocation by creating a Vec, then use >> VecGetArray2D to get 2D indexing of it. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Tue May 20 07:28:34 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 20 May 2014 22:28:34 +1000 Subject: [petsc-users] Execution time of superlu_dist increase in multiprocessing Message-ID: Hi, I'm working on standard eigensolving with spectrum transform. I tried mumps and superlu_dist for ST. But I found that when I run my program with more process, execution time of mumps decrease, but time of superlu_dist increase. 
Both of them are called by options like char common_options[] = "-st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps"; ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); Shall I set more parameters to get benefit of parallel computing when using superlu_dist? My mattype is mpiaij. Your sincerely Guoxi -------------- next part -------------- An HTML attachment was scrubbed... URL: From zyzhang at nuaa.edu.cn Tue May 20 07:31:11 2014 From: zyzhang at nuaa.edu.cn (Zhang) Date: Tue, 20 May 2014 20:31:11 +0800 (GMT+08:00) Subject: [petsc-users] How to run snes ex12 with petsc-3.4.4 Message-ID: <82f67e.5b78.146199d356a.Coremail.zyzhang@nuaa.edu.cn> Dear All, I am trying the PetscFEM solver with petsc-3.4.4. But when I run snes/ex12, I always got run time errors. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] DMPlexProjectFunctionLocal line 230 /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c [1]PETSC ERROR: [1] DMPlexProjectFunction line 338 /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] DMPlexProjectFunctionLocal line 230 /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c [0]PETSC ERROR: [0] DMPlexProjectFunction line 338 /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: Signal received! [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [1]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. 
[1]PETSC ERROR: See docs/index.html for manual pages. [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. ex12 on a arch-linux2-c-opt named toshiba by zhenyu Tue May 20 20:26:56 2014 [1]PETSC ERROR: Libraries linked from /home/zhenyu/petsc-3.4.4/arch-linux2-c-opt/lib [1]PETSC ERROR: Configure run at Mon May 19 23:24:37 2014 [1]PETSC ERROR: Configure options --download-cmake=1 --download-fblaslapack=1 --download-f2cblaslapack=1 --download-fftw=1 --download-ptscotch=1 --download-ctetgen=1 --download-petsc4py=1 --download-ml=1 --download-parmetis=1 --download-metis=1 --download-superlu_dist=1 --download-hypre=1 --download-c2html=1 --download-generator=1 --download-fiat=1 --download-scientificpython=1 --download-sowing=1 --download-triangle=1 --download-chaco=1 --download-boost=1 --download-exodusii=1 --download-netcdf=1 --download-netcdf-shared=1 --download-hdf5=1 --download-moab-shared=1 --download-suitesparse=1 --with-mpi-dir=/home/zhenyu/deps/openmpi-1.6.5 --with-pthread=1 --with-valgrind=1 [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ex12 on a arch-linux2-c-opt named toshiba by zhenyu Tue May 20 20:26:56 2014 [0]PETSC ERROR: Libraries linked from /home/zhenyu/petsc-3.4.4/arch-linux2-c-opt/lib [0]PETSC ERROR: Configure run at Mon May 19 23:24:37 2014 [0]PETSC ERROR: Configure options --download-cmake=1 --download-fblaslapack=1 --download-f2cblaslapack=1 --download-fftw=1 --download-ptscotch=1 --download-ctetgen=1 --download-petsc4py=1 --download-ml=1 --download-parmetis=1 --download-metis=1 --download-superlu_dist=1 --download-hypre=1 --download-c2html=1 --download-generator=1 --download-fiat=1 --download-scientificpython=1 --download-sowing=1 --download-triangle=1 --download-chaco=1 --download-boost=1 --download-exodusii=1 --download-netcdf=1 --download-netcdf-shared=1 --download-hdf5=1 --download-moab-shared=1 --download-suitesparse=1 --with-mpi-dir=/home/zhenyu/deps/openmpi-1.6.5 --with-pthread=1 --with-valgrind=1 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpirun has exited due to process rank 1 with PID 3027 on node toshiba exiting improperly. There are two reasons this could occur: 1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination. 2. this process called "init", but exited without calling "finalize". 
By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination" This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here). -------------------------------------------------------------------------- [toshiba:03025] 1 more process has sent help message help-mpi-api.txt / mpi-abort [toshiba:03025] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages Well, for a smooth compiling, I made two correction to ex12.c Line 195: options->fem.bcFuncs = (void (**)(const PetscReal[], PetscScalar *)) &options->exactFuncs; Line 574: void (*initialGuess[numComponents])(const PetscReal x[],PetscScalar* u); then generate ex12.h by PETSC_DIR=$HOME/petsc-3.4.4 DIM=2 ORDER=1 CASE=ex12 $PETSC_DIR/bin/pythonscripts/PetscGenerateFEMQuadrature.py \ $DIM $ORDER $DIM 1 laplacian \ $DIM $ORDER $DIM 1 boundary \ $PETSC_DIR/src/snes/examples/tutorials/$CASE.h Since I am still not fully master the machnism of PetscFEM, could anyone show me a proper way to run this demo? Many thanks. Zhenyu -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue May 20 08:13:40 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Tue, 20 May 2014 09:13:40 -0400 Subject: [petsc-users] Execution time of superlu_dist increase in multiprocessing In-Reply-To: <22c24f97a03440af9e5ab46fd6ff25e9@LUCKMAN.anl.gov> References: <22c24f97a03440af9e5ab46fd6ff25e9@LUCKMAN.anl.gov> Message-ID: ?? : You have to experiment to find out which package and options to give better performance. Run your code with '-help' to see available runtime options for mumps and superlu_dist. Then try different options. I would try different matrix ordering first. Different packages and solvers give different performance for an application. One cannot expect same performance. Hong > > I'm working on standard eigensolving with spectrum transform. I tried mumps > and superlu_dist for ST. But I found that when I run my program with more > process, execution time of mumps decrease, but time of superlu_dist > increase. Both of them are called by options like > > char common_options[] = "-st_ksp_type preonly -st_pc_type lu > -st_pc_factor_mat_solver_package mumps"; > > ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); > > Shall I set more parameters to get benefit of parallel computing when using > superlu_dist? My mattype is mpiaij. > > Your sincerely > Guoxi From jed at jedbrown.org Tue May 20 08:17:27 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 20 May 2014 07:17:27 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> Message-ID: <87wqdgpp20.fsf@jedbrown.org> Matthew Knepley writes: >> array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); >> for (k = 0; k < dof; k++){ >> array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); >> } >> >> When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc >> complains because I am not passing it the right way or when it accepts it, >> wrong data is passed because the solution is not correct. Maybe Petsc >> expect dof*dof values and only dof are passed ? >> > > You can only pass contiguous memory to MatSetValues(). And, while perhaps atypical, VecGetArray2D will give you contiguous memory behind the scenes, so it would work in this case. 
(Make a Vec of the right size using COMM_SELF instead of malloc.) With C99, you can use VLA pointers to get the "2D indexing" without setting up explicit pointers. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From christophe.ortiz at ciemat.es Tue May 20 08:24:31 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Tue, 20 May 2014 15:24:31 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87wqdgpp20.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 3:17 PM, Jed Brown wrote: > Matthew Knepley writes: > >> array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > >> for (k = 0; k < dof; k++){ > >> array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > >> } > >> > >> When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc > >> complains because I am not passing it the right way or when it accepts > it, > >> wrong data is passed because the solution is not correct. Maybe Petsc > >> expect dof*dof values and only dof are passed ? > >> > > > > You can only pass contiguous memory to MatSetValues(). > > And, while perhaps atypical, VecGetArray2D will give you contiguous > memory behind the scenes, so it would work in this case. (Make a Vec of > the right size using COMM_SELF instead of malloc.) > > With C99, you can use VLA pointers to get the "2D indexing" without > setting up explicit pointers. > Since for some reasons I need global two-dimensional arrays, what I did is the following. I declared a PetscScalar **array outside main(), ie before dof is determined. Then, knowing dof I use malloc inside main() to allocate memory to the array. I use then array in different functions and in order to pass it to MatSetValues, I copy it to a local and classical two-dimensional array[dof][dof] (contiguous memory) which is passed to MatSetValues. It works. But I'll try with VecGetArray2D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue May 20 08:31:36 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 20 May 2014 07:31:36 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> Message-ID: <87tx8kpoef.fsf@jedbrown.org> Christophe Ortiz writes: > Since for some reasons I need global two-dimensional arrays, what I did is > the following. > I declared a PetscScalar **array outside main(), ie before dof is > determined. > Then, knowing dof I use malloc inside main() to allocate memory to the > array. I use then array in different functions and in order to pass it to > MatSetValues, I copy it to a local and classical two-dimensional > array[dof][dof] (contiguous memory) which is passed to MatSetValues. It > works. This sounds like a perverse way to structure your code, but if you insist... -------------- next part -------------- A non-text attachment was scrubbed... 
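The C99 VLA-pointer idea mentioned above could look roughly like this: 2D indexing over a single flat buffer, with no per-row allocations (dof, row, col and Jpre are reused from the thread, and the snippet is only a sketch):

  PetscScalar    *flat = (PetscScalar*)malloc(sizeof(PetscScalar) * dof * dof);
  PetscScalar    (*a)[dof] = (PetscScalar (*)[dof])flat;   /* a[i][j] indexes flat */
  PetscInt       i, j;
  PetscErrorCode ierr;

  for (i = 0; i < dof; i++)
    for (j = 0; j < dof; j++)
      a[i][j] = 0.0;                                        /* fill the block */

  ierr = MatSetValuesBlocked(*Jpre, 1, &row, 1, &col, flat, INSERT_VALUES);CHKERRQ(ierr);
  free(flat);
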
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From christophe.ortiz at ciemat.es Tue May 20 08:34:26 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Tue, 20 May 2014 15:34:26 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87tx8kpoef.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87tx8kpoef.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 3:31 PM, Jed Brown wrote: > Christophe Ortiz writes: > > Since for some reasons I need global two-dimensional arrays, what I did > is > > the following. > > I declared a PetscScalar **array outside main(), ie before dof is > > determined. > > Then, knowing dof I use malloc inside main() to allocate memory to the > > array. I use then array in different functions and in order to pass it to > > MatSetValues, I copy it to a local and classical two-dimensional > > array[dof][dof] (contiguous memory) which is passed to MatSetValues. It > > works. > > This sounds like a perverse way to structure your code, but if you > insist... > Jeje. Just trying different options... -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Tue May 20 09:03:37 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Tue, 20 May 2014 16:03:37 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87wqdgpp20.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 3:17 PM, Jed Brown wrote: > Matthew Knepley writes: > >> array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > >> for (k = 0; k < dof; k++){ > >> array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > >> } > >> > >> When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc > >> complains because I am not passing it the right way or when it accepts > it, > >> wrong data is passed because the solution is not correct. Maybe Petsc > >> expect dof*dof values and only dof are passed ? > >> > > > > You can only pass contiguous memory to MatSetValues(). > > And, while perhaps atypical, VecGetArray2D will give you contiguous > memory behind the scenes, so it would work in this case. (Make a Vec of > the right size using COMM_SELF instead of malloc.) > > With C99, you can use VLA pointers to get the "2D indexing" without > setting up explicit pointers. > Would the following be ok ? //Creation of vector X of size dof*dof: VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X); // Using two-dimensional array style: PetscScalar *x; VecGetArray2d(X,dof,dof,0,0,&x); x[i][j] = ...; Is it ok ? Then, what should be passed to MatSetValuesBlocked() ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 20 09:20:06 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 09:20:06 -0500 Subject: [petsc-users] How to run snes ex12 with petsc-3.4.4 In-Reply-To: <82f67e.5b78.146199d356a.Coremail.zyzhang@nuaa.edu.cn> References: <82f67e.5b78.146199d356a.Coremail.zyzhang@nuaa.edu.cn> Message-ID: On Tue, May 20, 2014 at 7:31 AM, Zhang wrote: > Dear All, > > I am trying the PetscFEM solver with petsc-3.4.4. > That is changing quickly since it is very new. Can you use 'master'? http://www.mcs.anl.gov/petsc/developers/index.html If you use that, Python is no longer required. 
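For reference, a corrected version of the snippet above could look like the following: the pointer has to be PetscScalar **, and the array must be restored and the Vec destroyed afterwards, as the reply further down in the thread points out. Treat it as a sketch only:

  Vec            X;
  PetscScalar    **x;
  PetscInt       i, j;
  PetscErrorCode ierr;

  ierr = VecCreateSeq(PETSC_COMM_SELF, dof*dof, &X);CHKERRQ(ierr);
  ierr = VecGetArray2d(X, dof, dof, 0, 0, &x);CHKERRQ(ierr);
  for (i = 0; i < dof; i++)
    for (j = 0; j < dof; j++)
      x[i][j] = 0.0;                       /* fill the dof x dof block */
  /* the storage behind x is contiguous, so the block can go straight in */
  ierr = MatSetValuesBlocked(*Jpre, 1, &row, 1, &col, &x[0][0], INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecRestoreArray2d(X, dof, dof, 0, 0, &x);CHKERRQ(ierr);
  ierr = VecDestroy(&X);CHKERRQ(ierr);
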
Also, we will release very soon, so its not a waste. Thanks, Matt > But when I run snes/ex12, I always got run time errors. > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSCERROR: or try > http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try > http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: [0]PETSC ERROR: Note: The EXACT line numbers in the stack > are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] DMPlexProjectFunctionLocal line 230 > /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c > [1]PETSC ERROR: [1] DMPlexProjectFunction line 338 > /home/zhenyu/petsc-3.4.4/src/dm/impl s/plex/plexfem.c > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] DMPlexProjectFunctionLocal line 230 > /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c > [0]PETSC ERROR: [0] DMPlexProjectFunction line 338 > /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: Signal received! > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: [0]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> ex12 on a arch-linux2-c-opt named toshiba by zhenyu Tue May 20 20:26:56 > 2014 > [1]PETSC ERROR: Libraries linked from > /home/zhenyu/petsc-3.4.4/arch-linux2-c-opt/lib > [1]PETSC ERROR: Configure run at Mon May 19 23:24:37 2014 > [1]PETSC ERROR: Configure options --download-cmake=1 > --download-fblaslapack=1 --download-f2cblaslapack=1 --download-fftw=1 > --download-ptscotch=1 --download-ctetgen=1 --download-petsc4py=1 > --download-ml=1 --download-parmetis=1 --download-metis=1 > --download-superlu_dist=1 --download-hypre=1 --download-c2html=1 > --download-generator=1 --download-fiat=1 --download-scientificpython=1 > --download-sowing=1 --download -triangle=1 --download-chaco=1 > --download-boost=1 --download-exodusii=1 --download-netcdf=1 > --download-netcdf-shared=1 --download-hdf5=1 --download-moab-shared=1 > --download-suitesparse=1 --with-mpi-dir=/home/zhenyu/deps/openmpi-1.6.5 > --with-pthread=1 --with-valgrind=1 > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ex12 on a arch-linux2-c-opt named toshiba by zhenyu Tue > May 20 20:26:56 2014 > [0]PETSC ERROR: Libraries linked from > /home/zhenyu/petsc-3.4.4/arch-linux2-c-opt/lib > [0]PETSC ERROR: Configure run at Mon May 19 23:24:37 2014 > [0]PETSC ERROR: Configure options --download-cmake=1 > --download-fblaslapack=1 --download-f2cblaslapack=1 --download-fftw=1 > --download-ptscotch=1 --download-ctetgen=1 --download-petsc4py=1 > --download-ml=1 --download-parmetis=1 --download-metis=1 > --download-superlu_dist=1 --download-hypre=1 --download-c2html=1 > --download-generator=1 --download-fiat=1 --download-scientificpython=1 > --download-sowing=1 --download-triangle=1 --download-chaco=1 > --download-boost=1 --download-exodusii=1 --download-netcdf=1 > --download-netcdf-shared=1 --download-hdf5=1 --download-moab-shared=1 > --download-suitesparse=1 --with-mpi-dir=/home/zhenyu/deps/openmpi-1.6.5 > --with-pthread=1 --with-valgrind=1 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > mpirun has exited due to process rank 1 with PID 3027 on > node toshiba exiting improperly. There are two reasons this could occur: > > 1. this process did not call "init" before exiting, but others in > the job did. This can cause a job to hang indefinitely while it waits > for all processes to call "init". By rule, if one process calls "init", > then ALL processes must call "init" prior to termination. > > 2. this process called "init", but exited without calling "finalize". 
> By rule, all processes that call "init" MUST call "finalize" prior to > exiting or it will be considered an "abnormal termination" > > This may have caused other processes in the application to be > terminated by signals sent by mpirun (as reported here). > ----------------------------------------------------------- --------------- > [toshiba:03025] 1 more process has sent help message help-mpi-api.txt / > mpi-abort > [toshiba:03025] Set MCA parameter "orte_base_help_aggregate" to 0 to see > all help / error messages > > > Well, for a smooth compiling, I made two correction to ex12.c > > Line 195: options->fem.bcFuncs = (void (**)(const PetscReal[], > PetscScalar *)) &options->exactFuncs; > > Line 574: void (*initialGuess[numComponents])(const PetscReal > x[],PetscScalar* u); > > then generate ex12.h by > > PETSC_DIR=$HOME/petsc-3.4.4 > DIM=2 > ORDER=1 > > CASE=ex12 > $PETSC_DIR/bin/pythonscripts/PetscGenerateFEMQuadrature.py \ > $DIM $ORDER $DIM 1 laplacian \ > $DIM $ORDER $DIM 1 boundary \ > $PETSC_DIR/src/snes/examples/tutorials/$CASE.h > > Since I am still not fully master the machnism of PetscFEM, > > could anyone show me a proper way to run this demo? Many thanks. > > > Zhenyu > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue May 20 09:25:00 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 20 May 2014 08:25:00 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> Message-ID: <87lhtwplxf.fsf@jedbrown.org> Christophe Ortiz writes: > Would the following be ok ? > > //Creation of vector X of size dof*dof: > VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X); > > // Using two-dimensional array style: > PetscScalar *x; This needs to be PetscScalar **x; as you would have noticed if you tried to compile. > VecGetArray2d(X,dof,dof,0,0,&x); > > x[i][j] = ...; Yes. > Is it ok ? > Then, what should be passed to MatSetValuesBlocked() ? Since the array starts are (0,0), you can just pass &x[0][0]. Remember to call VecRestoreArray2d() and eventually VecDestroy(). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue May 20 10:20:54 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 20 May 2014 10:20:54 -0500 Subject: [petsc-users] Execution time of superlu_dist increase in multiprocessing In-Reply-To: References: Message-ID: <4E4F2903-433E-4630-B49C-BD3CA0F76A1F@mcs.anl.gov> The time of a direct solver depends on the specific algorithms used by the software and very importantly the nonzero structure of the matrix. We sometimes find that one package scales better than a different package on a particular matrix but then the other package works better on a different matrix. So this is not particularly surprising what you report. Barry On May 20, 2014, at 7:28 AM, ??? wrote: > Hi, > > I'm working on standard eigensolving with spectrum transform. I tried mumps and superlu_dist for ST. But I found that when I run my program with more process, execution time of mumps decrease, but time of superlu_dist increase. 
Both of them are called by options like > > char common_options[] = "-st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps"; > > ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); > > Shall I set more parameters to get benefit of parallel computing when using superlu_dist? My mattype is mpiaij. > > Your sincerely > Guoxi From danyang.su at gmail.com Tue May 20 13:31:57 2014 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 20 May 2014 11:31:57 -0700 Subject: [petsc-users] Question about DMDA local vector and global vector Message-ID: <537B9F9D.2070609@gmail.com> Hi All, I use DMDA for a flow problem and found the local vector and global vector does not match for 2D and 3D problem when dof >1. For example, the mesh is as follows: |proc 1| proc 2 | proc 3 | |7 8 9|16 17 18|25 26 27| |4 5 6|13 14 15|22 23 24| |1 2 3|10 11 12|19 20 21| /The following functions are used to create DMDA object, global vector and local vector./ call DMDACreate2d(Petsc_Comm_World,DMDA_BOUNDARY_NONE, & DMDA_BOUNDARY_NONE, DMDA_STENCIL_BOX, & nvxgbl,nvzgbl,PETSC_DECIDE,PETSC_DECIDE, & dmda_flow%dof, dmda_flow%swidth, & PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & dmda_flow%da,ierr) call DMCreateGlobalVector(dmda_flow%da,x_flow,ierr) call VecDuplicate(x_flow,b_flow,ierr) call DMCreateLocalVector(dmda_flow%da,x_flow_loc,ierr) call VecDuplicate(x_flow_loc,b_flow_loc,ierr) /The following functions are used to compute the function (b_flow_loc)/ call VecGetArrayF90(b_flow_loc, vecpointer, ierr) vecpointer = (compute the values here...) call VecRestoreArrayF90(b_flow_loc,vecpointer,ierr) call DMLocalToGlobalBegin(dmda_flow%,b_flow_loc,INSERT_VALUES, & b_flow,ierr) call DMLocalToGlobalEnd(dmda_flow%,b_flow_loc,INSERT_VALUES, & b_flow,ierr) /The data of local vector b_flow_loc for proc1, proc2 and proc3 are as follows (just an example, without ghost value)/ proc 1 proc 2 proc 3 1 10 19 2 11 20 3 12 21 4 13 22 5 14 23 6 15 24 ... ... ... /But the global vector b_flow from Vecview shows that the data is stored as follows (left column). I thought the global vector b_flow is like the right column. Is anything wrong here?/ Process [0] Process [0] 1 1 2 2 3 3 10 4 11 5 12 6 ... ... Process [1] Process [1] 4 10 5 11 6 12 13 13 14 14 15 15 ... ... Process [2] Process [2] ... ... Though the data distribution is different from what I thought before, the code works well for 1D problem and most of the 2D and 3D problem, but failed in newton iteration for some 2D problem with dof > 1. I use KSP solver, not SNES solver at present. Thanks and regards, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 20 14:25:43 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 14:25:43 -0500 Subject: [petsc-users] Question about DMDA local vector and global vector In-Reply-To: <537B9F9D.2070609@gmail.com> References: <537B9F9D.2070609@gmail.com> Message-ID: On Tue, May 20, 2014 at 1:31 PM, Danyang Su wrote: > Hi All, > > I use DMDA for a flow problem and found the local vector and global vector > does not match for 2D and 3D problem when dof >1. 
> > For example, the mesh is as follows: > > |proc 1| proc 2 | proc 3 | > |7 8 9|16 17 18|25 26 27| > |4 5 6|13 14 15|22 23 24| > |1 2 3|10 11 12|19 20 21| > > *The following functions are used to create DMDA object, global vector and > local vector.* > > call DMDACreate2d(Petsc_Comm_World,DMDA_BOUNDARY_NONE, & > DMDA_BOUNDARY_NONE, DMDA_STENCIL_BOX, & > nvxgbl,nvzgbl,PETSC_DECIDE,PETSC_DECIDE, & > dmda_flow%dof, dmda_flow%swidth, & > PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & > dmda_flow%da,ierr) > call DMCreateGlobalVector(dmda_flow%da,x_flow,ierr) > call VecDuplicate(x_flow,b_flow,ierr) > call DMCreateLocalVector(dmda_flow%da,x_flow_loc,ierr) > call VecDuplicate(x_flow_loc,b_flow_loc,ierr) > > *The following functions are used to compute the function (b_flow_loc)* > > call VecGetArrayF90(b_flow_loc, vecpointer, ierr) > vecpointer = (compute the values here...) > call VecRestoreArrayF90(b_flow_loc,vecpointer,ierr) > call DMLocalToGlobalBegin(dmda_flow%,b_flow_loc,INSERT_VALUES, & > b_flow,ierr) > call DMLocalToGlobalEnd(dmda_flow%,b_flow_loc,INSERT_VALUES, & > b_flow,ierr) > > > *The data of local vector b_flow_loc for proc1, proc2 and proc3 are as > follows (just an example, without ghost value)* > proc 1 proc 2 proc 3 > 1 10 19 > 2 11 20 > 3 12 21 > 4 13 22 > 5 14 23 > 6 15 24 > ... ... ... > > *But the global vector b_flow from Vecview shows that the data is stored > as follows (left column). I thought the global vector b_flow is like the > right column. Is anything wrong here?* > On output, the global vectors are automatically permuted to the natural ordering. Matt > > Process [0] Process [0] > 1 1 > 2 2 > 3 3 > 10 4 > 11 5 > 12 6 > ... ... > Process [1] Process [1] > 4 10 > 5 11 > 6 12 > 13 13 > 14 14 > 15 15 > ... ... > Process [2] Process [2] > ... ... > > Though the data distribution is different from what I thought before, the > code works well for 1D problem and most of the 2D and 3D problem, but > failed in newton iteration for some 2D problem with dof > 1. I use KSP > solver, not SNES solver at present. > > Thanks and regards, > > Danyang > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lb2653 at columbia.edu Tue May 20 16:33:18 2014 From: lb2653 at columbia.edu (Luc Berger-Vergiat) Date: Tue, 20 May 2014 17:33:18 -0400 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S Message-ID: <537BCA1E.6010508@columbi.edu> Hi all, I am running an FEM simulation that uses Petsc as a linear solver. I am setting up ISs and pass them to a DMShell in order to use the FieldSplit capabilities of Petsc. When I pass the following options to Petsc: " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_0_fields 1,2 -pc_fieldsplit_1_fields 0 -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type ilu -fieldsplit_Field_0_ksp_type gmres -fieldsplit_Field_0_pc_type mg -malloc_log mlog -log_summary time.log" I get an error message: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() [0]PETSC ERROR: See http://http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-5071-g1163a46 GIT Date: 2014-03-26 22:20:51 -0500 [0]PETSC ERROR: /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap on a arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 [0]PETSC ERROR: Configure options --download-cmake --download-hypre --download-metis --download-mpich --download-parmetis --with-debugging=no --with-share-libraries=no [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in /home/luc/research/petsc/src/dm/impls/shell/dmshell.c [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in /home/luc/research/petsc/src/dm/interface/dm.c [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in /home/luc/research/petsc/src/dm/interface/dmget.c I am not really sure why this happens but it only happens when -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have no problems. I attached the ksp_view in case that's any use. -- Best, Luc -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-16, divergence=1e+16 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=2000, cols=2000 package used to perform factorization: petsc total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=2000, cols=2000 total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_Field_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_Field_0_) 1 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=1 cycles=v Cycles per PCApply=1 Not using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (fieldsplit_Field_0_mg_levels_0_) 1 MPI processes type: chebyshev 
Chebyshev: eigenvalue estimates: min = 0.937483, max = 10.3123 Chebyshev: estimated using: [0 0.1; 0 1.1] KSP Object: (fieldsplit_Field_0_mg_levels_0_est_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_Field_0_mg_levels_0_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: schurcomplement rows=209, cols=209 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=209, cols=2000 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=2000, cols=2000 package used to perform factorization: petsc total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=2000, cols=2000 total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=2000, cols=209 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_Field_0_mg_levels_0_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: schurcomplement rows=209, cols=209 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 
119 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=209, cols=2000 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=2000, cols=2000 package used to perform factorization: petsc total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=2000, cols=2000 total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=2000, cols=209 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: schurcomplement rows=209, cols=209 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=209, cols=2000 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=2000, cols=2000 package used to perform factorization: petsc total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=2000, cols=2000 total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: 
seqaij rows=2000, cols=209 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=2209, cols=2209 total: nonzeros=72809, allocated nonzeros=72809 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 519 nodes, limit used is 5 From danyang.su at gmail.com Tue May 20 16:49:31 2014 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 20 May 2014 14:49:31 -0700 Subject: [petsc-users] Question about DMDA local vector and global vector In-Reply-To: References: <537B9F9D.2070609@gmail.com> Message-ID: <537BCDEB.1070508@gmail.com> Hi Matthew, How about the matview output? Is this automatically permuted to the natural ordering too? Thanks, Danyang On 20/05/2014 12:25 PM, Matthew Knepley wrote: > On Tue, May 20, 2014 at 1:31 PM, Danyang Su > wrote: > > Hi All, > > I use DMDA for a flow problem and found the local vector and > global vector does not match for 2D and 3D problem when dof >1. > > For example, the mesh is as follows: > > |proc 1| proc 2 | proc 3 | > |7 8 9|16 17 18|25 26 27| > |4 5 6|13 14 15|22 23 24| > |1 2 3|10 11 12|19 20 21| > > /The following functions are used to create DMDA object, global > vector and local vector./ > > call DMDACreate2d(Petsc_Comm_World,DMDA_BOUNDARY_NONE, & > DMDA_BOUNDARY_NONE, > DMDA_STENCIL_BOX, & > nvxgbl,nvzgbl,PETSC_DECIDE,PETSC_DECIDE, & > dmda_flow%dof, > dmda_flow%swidth, & > PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & > dmda_flow%da,ierr) > call DMCreateGlobalVector(dmda_flow%da,x_flow,ierr) > call VecDuplicate(x_flow,b_flow,ierr) > call DMCreateLocalVector(dmda_flow%da,x_flow_loc,ierr) > call VecDuplicate(x_flow_loc,b_flow_loc,ierr) > > /The following functions are used to compute the function > (b_flow_loc)/ > > call VecGetArrayF90(b_flow_loc, vecpointer, ierr) > vecpointer = (compute the values here...) > call VecRestoreArrayF90(b_flow_loc,vecpointer,ierr) > call DMLocalToGlobalBegin(dmda_flow%,b_flow_loc,INSERT_VALUES, & > b_flow,ierr) > call DMLocalToGlobalEnd(dmda_flow%,b_flow_loc,INSERT_VALUES, & > b_flow,ierr) > > > /The data of local vector b_flow_loc for proc1, proc2 and proc3 > are as follows (just an example, without ghost value)/ > proc 1 proc 2 proc 3 > 1 10 19 > 2 11 20 > 3 12 21 > 4 13 22 > 5 14 23 > 6 15 24 > ... ... ... > > /But the global vector b_flow from Vecview shows that the data is > stored as follows (left column). I thought the global vector > b_flow is like the right column. Is anything wrong here?/ > > > On output, the global vectors are automatically permuted to the > natural ordering. > > Matt > > > Process [0] Process [0] > 1 1 > 2 2 > 3 3 > 10 4 > 11 5 > 12 6 > ... ... > Process [1] Process [1] > 4 10 > 5 11 > 6 12 > 13 13 > 14 14 > 15 15 > ... ... > Process [2] Process [2] > ... ... > > Though the data distribution is different from what I thought > before, the code works well for 1D problem and most of the 2D and > 3D problem, but failed in newton iteration for some 2D problem > with dof > 1. I use KSP solver, not SNES solver at present. 
> > Thanks and regards, > > Danyang > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue May 20 16:54:14 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 20 May 2014 15:54:14 -0600 Subject: [petsc-users] Question about DMDA local vector and global vector In-Reply-To: <537BCDEB.1070508@gmail.com> References: <537B9F9D.2070609@gmail.com> <537BCDEB.1070508@gmail.com> Message-ID: <8738g4nmk9.fsf@jedbrown.org> Danyang Su writes: > Hi Matthew, > > How about the matview output? Is this automatically permuted to the > natural ordering too? Yes. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From stali at geology.wisc.edu Tue May 20 18:16:56 2014 From: stali at geology.wisc.edu (Tabrez Ali) Date: Tue, 20 May 2014 18:16:56 -0500 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S In-Reply-To: <537BCA1E.6010508@columbi.edu> References: <537BCA1E.6010508@columbi.edu> Message-ID: <537BE268.5080806@geology.wisc.edu> I saw a similar error sometime back while fooling around with fieldsplit. Can you update petsc-dev (git-pull), rebuild and try again. T On 05/20/2014 04:33 PM, Luc Berger-Vergiat wrote: > Hi all, > I am running an FEM simulation that uses Petsc as a linear solver. > I am setting up ISs and pass them to a DMShell in order to use the > FieldSplit capabilities of Petsc. > > When I pass the following options to Petsc: > > " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur > -pc_fieldsplit_schur_factorization_type full > -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_0_fields > 1,2 -pc_fieldsplit_1_fields 0 -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type ilu -fieldsplit_Field_0_ksp_type gmres > -fieldsplit_Field_0_pc_type mg -malloc_log mlog -log_summary time.log" > > I get an error message: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: Must call DMShellSetGlobalVector() or > DMShellSetCreateGlobalVector() > [0]PETSC ERROR: See > http://http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-5071-g1163a46 > GIT Date: 2014-03-26 22:20:51 -0500 > [0]PETSC ERROR: > /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap on a > arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 > [0]PETSC ERROR: Configure options --download-cmake --download-hypre > --download-metis --download-mpich --download-parmetis > --with-debugging=no --with-share-libraries=no > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in > /home/luc/research/petsc/src/dm/impls/shell/dmshell.c > [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in > /home/luc/research/petsc/src/dm/interface/dm.c > [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in > /home/luc/research/petsc/src/dm/interface/dmget.c > > I am not really sure why this happens but it only happens when > -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have no > problems. I attached the ksp_view in case that's any use. 
> -- > Best, > Luc -- No one trusts a model except the one who wrote it; Everyone trusts an observation except the one who made it - Harlow Shapley -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 20 20:14:16 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 20:14:16 -0500 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S In-Reply-To: <537BCA1E.6010508@columbi.edu> References: <537BCA1E.6010508@columbi.edu> Message-ID: On Tue, May 20, 2014 at 4:33 PM, Luc Berger-Vergiat wrote: > Hi all, > I am running an FEM simulation that uses Petsc as a linear solver. > I am setting up ISs and pass them to a DMShell in order to use the > FieldSplit capabilities of Petsc. > > When I pass the following options to Petsc: > > " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur > -pc_fieldsplit_schur_factorization_type full > -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_0_fields 1,2 > -pc_fieldsplit_1_fields 0 -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type ilu -fieldsplit_Field_0_ksp_type gmres > -fieldsplit_Field_0_pc_type mg -malloc_log mlog -log_summary time.log" > > I get an error message: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: Must call DMShellSetGlobalVector() or > DMShellSetCreateGlobalVector() > [0]PETSC ERROR: See > http://http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-5071-g1163a46 GIT > Date: 2014-03-26 22:20:51 -0500 > [0]PETSC ERROR: /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap > on a arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 > [0]PETSC ERROR: Configure options --download-cmake --download-hypre > --download-metis --download-mpich --download-parmetis --with-debugging=no > --with-share-libraries=no > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in > /home/luc/research/petsc/src/dm/impls/shell/dmshell.c > [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in > /home/luc/research/petsc/src/dm/interface/dm.c > [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in > /home/luc/research/petsc/src/dm/interface/dmget.c > Always always always give the entire error message. Matt > I am not really sure why this happens but it only happens when > -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have no > problems. I attached the ksp_view in case that's any use. > > -- > Best, > Luc > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Wed May 21 08:18:06 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 21 May 2014 15:18:06 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87lhtwplxf.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 4:25 PM, Jed Brown wrote: > Christophe Ortiz writes: > > Would the following be ok ? 
> > > > //Creation of vector X of size dof*dof: > > VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X); > > > > // Using two-dimensional array style: > > PetscScalar *x; > > This needs to be > > PetscScalar **x; > > as you would have noticed if you tried to compile. > > > VecGetArray2d(X,dof,dof,0,0,&x); > > > > x[i][j] = ...; > > Yes. > > > Is it ok ? > > Then, what should be passed to MatSetValuesBlocked() ? > > Since the array starts are (0,0), you can just pass &x[0][0]. > > Remember to call VecRestoreArray2d() and eventually VecDestroy(). > I tried and it works. The advantage is that it avoids setting up and using pointers. However, I found out that it is significantly slower than using explicit pointers of pointers **. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 21 08:21:47 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 21 May 2014 07:21:47 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> Message-ID: <87egznmfmc.fsf@jedbrown.org> Christophe Ortiz writes: > I tried and it works. The advantage is that it avoids setting up and > using pointers. However, I found out that it is significantly slower > than using explicit pointers of pointers **. Are you creating and destroying in an inner loop? What gets set up is the same, and it's fewer allocations than what you were doing with many calls to malloc. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From christophe.ortiz at ciemat.es Wed May 21 08:31:47 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 21 May 2014 15:31:47 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87egznmfmc.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> <87egznmfmc.fsf@jedbrown.org> Message-ID: On Wed, May 21, 2014 at 3:21 PM, Jed Brown wrote: > Christophe Ortiz writes: > > I tried and it works. The advantage is that it avoids setting up and > > using pointers. However, I found out that it is significantly slower > > than using explicit pointers of pointers **. > > Are you creating and destroying in an inner loop? In some sense, yes. I create and destroy inside FormIJacobian() (my Jacobian evaluation routine). Therefore it is called at each timestep. I guess this takes time. But it is slower than doing the many malloc. How can I create a global vector that would be passed to FormIJacobian() ? Creating it only once instead of doing it at each timestep would save time. I need to use this vector (size dof*dof) with classes and methods inside FormIJacobian() to calculate the different blocks that are passed to the Jacobian with MatSetValuesBlocked(). However, I cannot pass it as argument of FormIJacobian() since there is no room for it in the arguments. > What gets set up is > the same, and it's fewer allocations than what you were doing with many > calls to malloc. > -------------- next part -------------- An HTML attachment was scrubbed... 
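Putting the pieces of this exchange together, a minimal sketch of the pattern under discussion: one dof*dof work Vec viewed as a two-dimensional array with VecGetArray2d(), filled, and handed to MatSetValuesBlocked() through &x[0][0]. The routine name InsertOneBlock, the block indices row/col and the placeholder identity fill are illustrative only; X is the work vector created elsewhere with VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X) and destroyed when no longer needed.

#include <petscmat.h>

PetscErrorCode InsertOneBlock(Mat J, Vec X, PetscInt dof, PetscInt row, PetscInt col)
{
  PetscScalar  **x;
  PetscInt       i, j;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArray2d(X, dof, dof, 0, 0, &x);CHKERRQ(ierr);   /* view the dof*dof Vec as x[i][j] */
  for (i = 0; i < dof; i++) {
    for (j = 0; j < dof; j++) {
      x[i][j] = (i == j) ? 1.0 : 0.0;                          /* placeholder block entries */
    }
  }
  ierr = MatSetValuesBlocked(J, 1, &row, 1, &col, &x[0][0], INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecRestoreArray2d(X, dof, dof, 0, 0, &x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

MatAssemblyBegin()/MatAssemblyEnd() still have to be called once all blocks have been inserted.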
URL: From jed at jedbrown.org Wed May 21 08:44:13 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 21 May 2014 07:44:13 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> <87egznmfmc.fsf@jedbrown.org> Message-ID: <87a9abmeky.fsf@jedbrown.org> Christophe Ortiz writes: > In some sense, yes. I create and destroy inside FormIJacobian() (my > Jacobian evaluation routine). Therefore it is called at each timestep. I > guess this takes time. But it is slower than doing the many malloc. What communicator (you should use VecCreateSeq)? Be sure to profile a configure --with-debugging=0. How many elements do you have on each process? How big are the elements? > How can I create a global vector that would be passed to FormIJacobian() ? > Creating it only once instead of doing it at each timestep would save time. You can/should always put this stuff in the user context (which comes in via the last argument). > I need to use this vector (size dof*dof) with classes and methods inside > FormIJacobian() to calculate the different blocks that are passed to the > Jacobian with MatSetValuesBlocked(). However, I cannot pass it as argument > of FormIJacobian() since there is no room for it in the arguments. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From christophe.ortiz at ciemat.es Wed May 21 08:49:58 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 21 May 2014 15:49:58 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87a9abmeky.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> <87egznmfmc.fsf@jedbrown.org> <87a9abmeky.fsf@jedbrown.org> Message-ID: On Wed, May 21, 2014 at 3:44 PM, Jed Brown wrote: > Christophe Ortiz writes: > > In some sense, yes. I create and destroy inside FormIJacobian() (my > > Jacobian evaluation routine). Therefore it is called at each timestep. I > > guess this takes time. But it is slower than doing the many malloc. > > What communicator (you should use VecCreateSeq)? I used: VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X); inside FormIJacobian() > Be sure to profile a > configure --with-debugging=0. > > How many elements do you have on each process? How big are the > elements? > For the moment, dof is small (dof=4). Still doing some tests with the classes and methods. But should reach 1000-10000 in production. > > > How can I create a global vector that would be passed to FormIJacobian() > ? > > Creating it only once instead of doing it at each timestep would save > time. > > You can/should always put this stuff in the user context (which comes in > via the last argument). > Ahhh...did not think about it ! This would allow to create the vector only once in the main (after dof is determined) and pass it as argument to FormIJacobian(). I will try. Thanks ! > > > I need to use this vector (size dof*dof) with classes and methods inside > > FormIJacobian() to calculate the different blocks that are passed to the > > Jacobian with MatSetValuesBlocked(). However, I cannot pass it as > argument > > of FormIJacobian() since there is no room for it in the arguments. > > -------------- next part -------------- An HTML attachment was scrubbed... 
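A sketch of the user-context pattern Jed is pointing at, written against the PETSc 3.4-style IJacobian signature; the AppCtx layout and the names are assumptions, not Christophe's code. The work vector lives in the context, is created once in main() after dof is known, and simply comes back through the last argument of FormIJacobian().

#include <petscts.h>

typedef struct {
  PetscInt dof;
  Vec      Xblock;          /* dof*dof work vector, created once in main() */
} AppCtx;

/* PETSc 3.4-style IJacobian; the assembly body is elided except for the context use. */
PetscErrorCode FormIJacobian(TS ts, PetscReal t, Vec U, Vec Udot, PetscReal a,
                             Mat *J, Mat *Jpre, MatStructure *flag, void *ptr)
{
  AppCtx        *user = (AppCtx*)ptr;   /* arrives via the last argument */
  PetscScalar  **x;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr  = VecGetArray2d(user->Xblock, user->dof, user->dof, 0, 0, &x);CHKERRQ(ierr);
  /* ... fill x[i][j] and call MatSetValuesBlocked(*Jpre, ...) block by block ... */
  ierr  = VecRestoreArray2d(user->Xblock, user->dof, user->dof, 0, 0, &x);CHKERRQ(ierr);
  ierr  = MatAssemblyBegin(*Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr  = MatAssemblyEnd(*Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  *flag = SAME_NONZERO_PATTERN;
  PetscFunctionReturn(0);
}

/* in main(), after dof is determined:
     AppCtx user;
     user.dof = dof;
     ierr = VecCreateSeq(PETSC_COMM_SELF, dof*dof, &user.Xblock);CHKERRQ(ierr);
     ierr = TSSetIJacobian(ts, J, J, FormIJacobian, &user);CHKERRQ(ierr);
     ...
     ierr = VecDestroy(&user.Xblock);CHKERRQ(ierr);                           */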
URL: From jed at jedbrown.org Wed May 21 08:55:40 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 21 May 2014 07:55:40 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> <87egznmfmc.fsf@jedbrown.org> <87a9abmeky.fsf@jedbrown.org> Message-ID: <874n0jme1v.fsf@jedbrown.org> Christophe Ortiz writes: > For the moment, dof is small (dof=4). Still doing some tests with the > classes and methods. But should reach 1000-10000 in production. This is a huge difference. It's a waste of time to profile cases that don't matter, so try to profile the cases that matter (even if the "physics" is mocked). Note that for 4x4, you can use PetscScalar emat[4][4]; > Ahhh...did not think about it ! This would allow to create the vector only > once in the main (after dof is determined) and pass it as argument to > FormIJacobian(). I will try. Thanks ! Yes, that's what the context is for. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From lb2653 at columbia.edu Wed May 21 14:09:23 2014 From: lb2653 at columbia.edu (Luc Berger-Vergiat) Date: Wed, 21 May 2014 15:09:23 -0400 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S In-Reply-To: References: <537BCA1E.6010508@columbi.edu> Message-ID: <537CF9E3.4010908@columbi.edu> So I just pulled an updated version of petsc-dev today (I switched from the *next* branch to the *master* branch due to some compilation error existing with the last commit on *next*). I still have the same error and I believe this is the whole error message I have. I mean I am running multiple time steps for my simulation so I have the same message at each time step, but I don't think that it is important to report these duplicates, is it? Best, Luc On 05/20/2014 09:14 PM, Matthew Knepley wrote: > > On Tue, May 20, 2014 at 4:33 PM, Luc Berger-Vergiat > > wrote: > > Hi all, > I am running an FEM simulation that uses Petsc as a linear solver. > I am setting up ISs and pass them to a DMShell in order to use the > FieldSplit capabilities of Petsc. > > When I pass the following options to Petsc: > > " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type > schur -pc_fieldsplit_schur_factorization_type full > -pc_fieldsplit_schur_precondition selfp > -pc_fieldsplit_0_fields 1,2 -pc_fieldsplit_1_fields 0 > -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type ilu > -fieldsplit_Field_0_ksp_type gmres -fieldsplit_Field_0_pc_type > mg -malloc_log mlog -log_summary time.log" > > I get an error message: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: Must call DMShellSetGlobalVector() or > DMShellSetCreateGlobalVector() > [0]PETSC ERROR: See > http://http://www.mcs.anl.gov/petsc/documentation/faq.html for > trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: > v3.4.4-5071-g1163a46 GIT Date: 2014-03-26 22:20:51 -0500 > [0]PETSC ERROR: > /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap on a > arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 > [0]PETSC ERROR: Configure options --download-cmake > --download-hypre --download-metis --download-mpich > --download-parmetis --with-debugging=no --with-share-libraries=no > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in > /home/luc/research/petsc/src/dm/impls/shell/dmshell.c > [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in > /home/luc/research/petsc/src/dm/interface/dm.c > [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in > /home/luc/research/petsc/src/dm/interface/dmget.c > > > Always always always give the entire error message. > > Matt > > I am not really sure why this happens but it only happens when > -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have > no problems. I attached the ksp_view in case that's any use. > > -- > Best, > Luc > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From stali at geology.wisc.edu Wed May 21 16:48:51 2014 From: stali at geology.wisc.edu (Tabrez Ali) Date: Wed, 21 May 2014 16:48:51 -0500 Subject: [petsc-users] Valgrind unhandled instruction Message-ID: <537D1F43.8010507@geology.wisc.edu> Hello With petsc-dev I get the following error with my own code and also with ex56 as shown below. Both run fine otherwise. This is with Valgrind 3.7 (in Debian stable). Is this a PETSc or Valgrind issue? T stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ valgrind ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace ==16123== Memcheck, a memory error detector ==16123== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. ==16123== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info ==16123== Command: ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace ==16123== vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x38 0x39 ==16123== valgrind: Unrecognised instruction at address 0x4228928. 
==16123== at 0x4228928: ISCreateGeneral_Private (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4228D54: ISGeneralSetIndices_General (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4229504: ISGeneralSetIndices (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x422976F: ISCreateGeneral (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4A94CAA: PCGAMGCoarsen_AGG (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4A84FD6: PCSetUp_GAMG (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x49E8163: PCSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4AE6023: KSPSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x804C7D6: main (in /home/stali/petsc-dev/src/ksp/ksp/examples/tutorials/ex56) ==16123== Your program just tried to execute an instruction that Valgrind ==16123== did not recognise. There are two possible reasons for this. ==16123== 1. Your program has a bug and erroneously jumped to a non-code ==16123== location. If you are running Memcheck and you just saw a ==16123== warning about a bad jump, it's probably your program's fault. ==16123== 2. The instruction is legitimate but Valgrind doesn't handle it, ==16123== i.e. it's Valgrind's fault. If you think this is the case or ==16123== you are not sure, please let us know and we'll try to fix it. ==16123== Either way, Valgrind will now raise a SIGILL signal which will ==16123== probably kill your program. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 4 Illegal instruction: Likely due to memory corruption [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] ISCreateGeneral_Private line 575 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c [0]PETSC ERROR: [0] ISGeneralSetIndices_General line 674 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c [0]PETSC ERROR: [0] ISGeneralSetIndices line 662 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c [0]PETSC ERROR: [0] ISCreateGeneral line 631 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c [0]PETSC ERROR: [0] PCGAMGCoarsen_AGG line 976 /home/stali/petsc-dev/src/ksp/pc/impls/gamg/agg.c [0]PETSC ERROR: [0] PCSetUp_GAMG line 487 /home/stali/petsc-dev/src/ksp/pc/impls/gamg/gamg.c [0]PETSC ERROR: [0] KSPSetUp line 219 /home/stali/petsc-dev/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4344-ge0d8a6f GIT Date: 2014-05-21 16:02:44 -0500 [0]PETSC ERROR: ./ex56 on a arch-linux2-c-debug named i5 by stali Wed May 21 16:41:07 2014 [0]PETSC ERROR: Configure options --with-fc=gfortran --with-cc=gcc --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native" --with-shared-libraries --with-debugging=1 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 ==16123== ==16123== HEAP SUMMARY: ==16123== in use at exit: 4,627,684 bytes in 1,188 blocks ==16123== total heap usage: 1,649 allocs, 461 frees, 6,073,192 bytes allocated ==16123== ==16123== LEAK SUMMARY: ==16123== definitely lost: 0 bytes in 0 blocks ==16123== indirectly lost: 0 bytes in 0 blocks ==16123== possibly lost: 0 bytes in 0 blocks ==16123== still reachable: 4,627,684 bytes in 1,188 blocks ==16123== suppressed: 0 bytes in 0 blocks ==16123== Rerun with --leak-check=full to see details of leaked memory ==16123== ==16123== For counts of detected and suppressed errors, rerun with: -v ==16123== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 61 from 8) From balay at mcs.anl.gov Wed May 21 17:27:53 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 21 May 2014 17:27:53 -0500 Subject: [petsc-users] Valgrind unhandled instruction In-Reply-To: <537D1F43.8010507@geology.wisc.edu> References: <537D1F43.8010507@geology.wisc.edu> Message-ID: Looks like valgrind-3.7 doesn't know all instructions generated by "-O3 -march=native". And generally one should run valgrind with code compiled with '-g' anyway. I see similar issue with valgrind-3.7 - but the error goes away with valgrind-3.9 [compiled from source] Satish ----------- balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --version valgrind-3.7.0 balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --tool=memcheck -q ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace vex amd64->IR: unhandled instruction bytes: 0xC5 0xFB 0x2A 0xC2 0xBA 0x1 0x0 0x0 ==19041== valgrind: Unrecognised instruction at address 0x4ef760e. ==19041== at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56) ==19041== Your program just tried to execute an instruction that Valgrind ==19041== did not recognise. There are two possible reasons for this. ==19041== 1. Your program has a bug and erroneously jumped to a non-code ==19041== location. If you are running Memcheck and you just saw a ==19041== warning about a bad jump, it's probably your program's fault. ==19041== 2. The instruction is legitimate but Valgrind doesn't handle it, ==19041== i.e. it's Valgrind's fault. If you think this is the case or ==19041== you are not sure, please let us know and we'll try to fix it. ==19041== Either way, Valgrind will now raise a SIGILL signal which will ==19041== probably kill your program. 
==19041== ==19041== Process terminating with default action of signal 4 (SIGILL) ==19041== Illegal opcode at address 0x4EF760E ==19041== at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56) Illegal instruction (core dumped) balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --version valgrind-3.9.0 balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --tool=memcheck -q ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace 0 KSP Residual norm 740.547 1 KSP Residual norm 104.004 2 KSP Residual norm 79.1334 3 KSP Residual norm 50.0497 4 KSP Residual norm 4.40859 5 KSP Residual norm 1.56451 6 KSP Residual norm 0.601773 7 KSP Residual norm 0.225864 8 KSP Residual norm 0.0122203 9 KSP Residual norm 0.00290625 0 KSP Residual norm 0.00740547 1 KSP Residual norm 0.00104004 2 KSP Residual norm 0.000791334 3 KSP Residual norm 0.000500497 4 KSP Residual norm 4.40859e-05 5 KSP Residual norm 1.56451e-05 6 KSP Residual norm 6.01773e-06 7 KSP Residual norm 2.25864e-06 8 KSP Residual norm 1.22203e-07 9 KSP Residual norm 2.90625e-08 0 KSP Residual norm 7.40547e-08 1 KSP Residual norm 1.04004e-08 2 KSP Residual norm 7.91334e-09 3 KSP Residual norm 5.00497e-09 4 KSP Residual norm 4.409e-10 5 KSP Residual norm 1.565e-10 6 KSP Residual norm 6.018e-11 7 KSP Residual norm 2.259e-11 8 KSP Residual norm < 1.e-11 9 KSP Residual norm < 1.e-11 [0]main |b-Ax|/|b|=6.068344e-05, |b|=5.391826e+00, emax=9.964453e-01 balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ On Wed, 21 May 2014, Tabrez Ali wrote: > Hello > > With petsc-dev I get the following error with my own code and also with ex56 > as shown below. Both run fine otherwise. This is with Valgrind 3.7 (in Debian > stable). > > Is this a PETSc or Valgrind issue? > > T > > stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ valgrind ./ex56 -ne 9 > -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 > -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves > -ksp_monitor_short -use_mat_nearnullspace > ==16123== Memcheck, a memory error detector > ==16123== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. > ==16123== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info > ==16123== Command: ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg > -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 > -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short > -use_mat_nearnullspace > ==16123== > vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x38 0x39 > ==16123== valgrind: Unrecognised instruction at address 0x4228928. 
> ==16123== at 0x4228928: ISCreateGeneral_Private (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4228D54: ISGeneralSetIndices_General (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4229504: ISGeneralSetIndices (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x422976F: ISCreateGeneral (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4A94CAA: PCGAMGCoarsen_AGG (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4A84FD6: PCSetUp_GAMG (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x49E8163: PCSetUp (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4AE6023: KSPSetUp (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x804C7D6: main (in > /home/stali/petsc-dev/src/ksp/ksp/examples/tutorials/ex56) > ==16123== Your program just tried to execute an instruction that Valgrind > ==16123== did not recognise. There are two possible reasons for this. > ==16123== 1. Your program has a bug and erroneously jumped to a non-code > ==16123== location. If you are running Memcheck and you just saw a > ==16123== warning about a bad jump, it's probably your program's fault. > ==16123== 2. The instruction is legitimate but Valgrind doesn't handle it, > ==16123== i.e. it's Valgrind's fault. If you think this is the case or > ==16123== you are not sure, please let us know and we'll try to fix it. > ==16123== Either way, Valgrind will now raise a SIGILL signal which will > ==16123== probably kill your program. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 4 Illegal instruction: Likely due to > memory corruption > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or > try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] ISCreateGeneral_Private line 575 > /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c > [0]PETSC ERROR: [0] ISGeneralSetIndices_General line 674 > /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c > [0]PETSC ERROR: [0] ISGeneralSetIndices line 662 > /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c > [0]PETSC ERROR: [0] ISCreateGeneral line 631 > /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c > [0]PETSC ERROR: [0] PCGAMGCoarsen_AGG line 976 > /home/stali/petsc-dev/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: [0] PCSetUp_GAMG line 487 > /home/stali/petsc-dev/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: [0] KSPSetUp line 219 > /home/stali/petsc-dev/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4344-ge0d8a6f GIT > Date: 2014-05-21 16:02:44 -0500 > [0]PETSC ERROR: ./ex56 on a arch-linux2-c-debug named i5 by stali Wed May 21 > 16:41:07 2014 > [0]PETSC ERROR: Configure options --with-fc=gfortran --with-cc=gcc > --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS="-O3 > -march=native" --FOPTFLAGS="-O3 -march=native" --with-shared-libraries > --with-debugging=1 > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > ==16123== > ==16123== HEAP SUMMARY: > ==16123== in use at exit: 4,627,684 bytes in 1,188 blocks > ==16123== total heap usage: 1,649 allocs, 461 frees, 6,073,192 bytes > allocated > ==16123== > ==16123== LEAK SUMMARY: > ==16123== definitely lost: 0 bytes in 0 blocks > ==16123== indirectly lost: 0 bytes in 0 blocks > ==16123== possibly lost: 0 bytes in 0 blocks > ==16123== still reachable: 4,627,684 bytes in 1,188 blocks > ==16123== suppressed: 0 bytes in 0 blocks > ==16123== Rerun with --leak-check=full to see details of leaked memory > ==16123== > ==16123== For counts of detected and suppressed errors, rerun with: -v > ==16123== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 61 from 8) > > From stali at geology.wisc.edu Wed May 21 17:57:52 2014 From: stali at geology.wisc.edu (Tabrez Ali) Date: Wed, 21 May 2014 17:57:52 -0500 Subject: [petsc-users] Valgrind unhandled instruction In-Reply-To: References: <537D1F43.8010507@geology.wisc.edu> Message-ID: <537D2F70.5000703@geology.wisc.edu> Sorry I missed the flags. Thanks for the clarification. Tabrez On 05/21/2014 05:27 PM, Satish Balay wrote: > Looks like valgrind-3.7 doesn't know all instructions generated by > "-O3 -march=native". > > And generally one should run valgrind with code compiled with '-g' > anyway. > > I see similar issue with valgrind-3.7 - but the error goes away with > valgrind-3.9 [compiled from source] > > Satish > > ----------- > > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --version > valgrind-3.7.0 > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --tool=memcheck -q ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace > vex amd64->IR: unhandled instruction bytes: 0xC5 0xFB 0x2A 0xC2 0xBA 0x1 0x0 0x0 > ==19041== valgrind: Unrecognised instruction at address 0x4ef760e. > ==19041== at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56) > ==19041== Your program just tried to execute an instruction that Valgrind > ==19041== did not recognise. There are two possible reasons for this. > ==19041== 1. Your program has a bug and erroneously jumped to a non-code > ==19041== location. If you are running Memcheck and you just saw a > ==19041== warning about a bad jump, it's probably your program's fault. > ==19041== 2. The instruction is legitimate but Valgrind doesn't handle it, > ==19041== i.e. it's Valgrind's fault. 
If you think this is the case or > ==19041== you are not sure, please let us know and we'll try to fix it. > ==19041== Either way, Valgrind will now raise a SIGILL signal which will > ==19041== probably kill your program. > ==19041== > ==19041== Process terminating with default action of signal 4 (SIGILL) > ==19041== Illegal opcode at address 0x4EF760E > ==19041== at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56) > Illegal instruction (core dumped) > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --version > valgrind-3.9.0 > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --tool=memcheck -q ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace > 0 KSP Residual norm 740.547 > 1 KSP Residual norm 104.004 > 2 KSP Residual norm 79.1334 > 3 KSP Residual norm 50.0497 > 4 KSP Residual norm 4.40859 > 5 KSP Residual norm 1.56451 > 6 KSP Residual norm 0.601773 > 7 KSP Residual norm 0.225864 > 8 KSP Residual norm 0.0122203 > 9 KSP Residual norm 0.00290625 > 0 KSP Residual norm 0.00740547 > 1 KSP Residual norm 0.00104004 > 2 KSP Residual norm 0.000791334 > 3 KSP Residual norm 0.000500497 > 4 KSP Residual norm 4.40859e-05 > 5 KSP Residual norm 1.56451e-05 > 6 KSP Residual norm 6.01773e-06 > 7 KSP Residual norm 2.25864e-06 > 8 KSP Residual norm 1.22203e-07 > 9 KSP Residual norm 2.90625e-08 > 0 KSP Residual norm 7.40547e-08 > 1 KSP Residual norm 1.04004e-08 > 2 KSP Residual norm 7.91334e-09 > 3 KSP Residual norm 5.00497e-09 > 4 KSP Residual norm 4.409e-10 > 5 KSP Residual norm 1.565e-10 > 6 KSP Residual norm 6.018e-11 > 7 KSP Residual norm 2.259e-11 > 8 KSP Residual norm< 1.e-11 > 9 KSP Residual norm< 1.e-11 > [0]main |b-Ax|/|b|=6.068344e-05, |b|=5.391826e+00, emax=9.964453e-01 > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ > > > On Wed, 21 May 2014, Tabrez Ali wrote: > >> Hello >> >> With petsc-dev I get the following error with my own code and also with ex56 >> as shown below. Both run fine otherwise. This is with Valgrind 3.7 (in Debian >> stable). >> >> Is this a PETSc or Valgrind issue? >> >> T >> >> stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ valgrind ./ex56 -ne 9 >> -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 >> -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves >> -ksp_monitor_short -use_mat_nearnullspace >> ==16123== Memcheck, a memory error detector >> ==16123== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. >> ==16123== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info >> ==16123== Command: ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg >> -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 >> -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short >> -use_mat_nearnullspace >> ==16123== >> vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x38 0x39 >> ==16123== valgrind: Unrecognised instruction at address 0x4228928. 
>> ==16123== at 0x4228928: ISCreateGeneral_Private (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4228D54: ISGeneralSetIndices_General (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4229504: ISGeneralSetIndices (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x422976F: ISCreateGeneral (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4A94CAA: PCGAMGCoarsen_AGG (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4A84FD6: PCSetUp_GAMG (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x49E8163: PCSetUp (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4AE6023: KSPSetUp (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x804C7D6: main (in >> /home/stali/petsc-dev/src/ksp/ksp/examples/tutorials/ex56) >> ==16123== Your program just tried to execute an instruction that Valgrind >> ==16123== did not recognise. There are two possible reasons for this. >> ==16123== 1. Your program has a bug and erroneously jumped to a non-code >> ==16123== location. If you are running Memcheck and you just saw a >> ==16123== warning about a bad jump, it's probably your program's fault. >> ==16123== 2. The instruction is legitimate but Valgrind doesn't handle it, >> ==16123== i.e. it's Valgrind's fault. If you think this is the case or >> ==16123== you are not sure, please let us know and we'll try to fix it. >> ==16123== Either way, Valgrind will now raise a SIGILL signal which will >> ==16123== probably kill your program. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 4 Illegal instruction: Likely due to >> memory corruption >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or >> try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory >> corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. 
>> [0]PETSC ERROR: [0] ISCreateGeneral_Private line 575 >> /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c >> [0]PETSC ERROR: [0] ISGeneralSetIndices_General line 674 >> /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c >> [0]PETSC ERROR: [0] ISGeneralSetIndices line 662 >> /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c >> [0]PETSC ERROR: [0] ISCreateGeneral line 631 >> /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c >> [0]PETSC ERROR: [0] PCGAMGCoarsen_AGG line 976 >> /home/stali/petsc-dev/src/ksp/pc/impls/gamg/agg.c >> [0]PETSC ERROR: [0] PCSetUp_GAMG line 487 >> /home/stali/petsc-dev/src/ksp/pc/impls/gamg/gamg.c >> [0]PETSC ERROR: [0] KSPSetUp line 219 >> /home/stali/petsc-dev/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Signal received >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for >> trouble shooting. >> [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4344-ge0d8a6f GIT >> Date: 2014-05-21 16:02:44 -0500 >> [0]PETSC ERROR: ./ex56 on a arch-linux2-c-debug named i5 by stali Wed May 21 >> 16:41:07 2014 >> [0]PETSC ERROR: Configure options --with-fc=gfortran --with-cc=gcc >> --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS="-O3 >> -march=native" --FOPTFLAGS="-O3 -march=native" --with-shared-libraries >> --with-debugging=1 >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> [unset]: aborting job: >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> ==16123== >> ==16123== HEAP SUMMARY: >> ==16123== in use at exit: 4,627,684 bytes in 1,188 blocks >> ==16123== total heap usage: 1,649 allocs, 461 frees, 6,073,192 bytes >> allocated >> ==16123== >> ==16123== LEAK SUMMARY: >> ==16123== definitely lost: 0 bytes in 0 blocks >> ==16123== indirectly lost: 0 bytes in 0 blocks >> ==16123== possibly lost: 0 bytes in 0 blocks >> ==16123== still reachable: 4,627,684 bytes in 1,188 blocks >> ==16123== suppressed: 0 bytes in 0 blocks >> ==16123== Rerun with --leak-check=full to see details of leaked memory >> ==16123== >> ==16123== For counts of detected and suppressed errors, rerun with: -v >> ==16123== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 61 from 8) >> >> From likunt at caltech.edu Thu May 22 09:32:10 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Thu, 22 May 2014 07:32:10 -0700 (PDT) Subject: [petsc-users] output vec Message-ID: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> Dear Petsc developers, I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like this: m1 m2 m3 m4 m5 m6 ... Here is my code to do this: ================================================================== PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); for(int step=0; step References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> Message-ID: <878uptj2n2.fsf@jedbrown.org> likunt at caltech.edu writes: > Dear Petsc developers, > > I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like this: > > m1 m2 m3 > m4 m5 m6 > ... 
> > Here is my code to do this: > > ================================================================== > PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); > PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); > PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); > for(int step=0; step { > //calculate M at current step > DMDAVecGetArray(da, M, &aM); > DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); > for(int node=xs; node { > PetscViewerASCIISynchronizedPrintf(view, "%3.12f %3.12f %3.12f\n", > aM[node].x, aM[node].y, aM[node].z); > PetscViewerFlush(view); > } > DMDAVecRestoreArray(da, M, &aM); > } > ================================================================= > > but this turns out to be very slow. Yes, ASCII output is slow. > I am trying to write it in a binary file, but I cannot find the > corresponding functionality (such as PETSC_VIEWER_ASCII_SYMMODU and > PetscViewerASCIISynchronizedPrintf in binary form). Just use VecView to a binary viewer. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From knepley at gmail.com Thu May 22 09:42:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 09:42:52 -0500 Subject: [petsc-users] output vec In-Reply-To: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> Message-ID: On Thu, May 22, 2014 at 9:32 AM, wrote: > Dear Petsc developers, > > I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like this: > > m1 m2 m3 > m4 m5 m6 > ... > > Here is my code to do this: > > ================================================================== > PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); > PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); > PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); > for(int step=0; step { > //calculate M at current step > DMDAVecGetArray(da, M, &aM); > DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); > for(int node=xs; node { > PetscViewerASCIISynchronizedPrintf(view, "%3.12f %3.12f %3.12f\n", > aM[node].x, aM[node].y, aM[node].z); > PetscViewerFlush(view); > } > DMDAVecRestoreArray(da, M, &aM); > } > ================================================================= > > but this turns out to be very slow. I am trying to write it in a binary > file, but I cannot find the corresponding functionality (such as > PETSC_VIEWER_ASCII_SYMMODU and PetscViewerASCIISynchronizedPrintf in > binary form). Thanks. > There is PetscViewerBindaryWrite(), but what do you really want to do? What you suggest doing will be very slow. Why not just use PETSc binary output? Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From likunt at caltech.edu Thu May 22 10:02:29 2014 From: likunt at caltech.edu (Likun Tan) Date: Thu, 22 May 2014 11:02:29 -0400 Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> Message-ID: <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> Thanks for your suggestion. Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. m1 m2 m3 m4 m5 m6 But I prefer the form m1 m2 m3 m4 m5 m6 Since in the end I will have about 1e+7 elements in the vec. 
If there is no way to output the vec in the second form, I will simply use VecView. Thanks. > On May 22, 2014, at 10:42 AM, Matthew Knepley wrote: > >> On Thu, May 22, 2014 at 9:32 AM, wrote: >> Dear Petsc developers, >> >> I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like this: >> >> m1 m2 m3 >> m4 m5 m6 >> ... >> >> Here is my code to do this: >> >> ================================================================== >> PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); >> PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); >> PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); >> for(int step=0; step> { >> //calculate M at current step >> DMDAVecGetArray(da, M, &aM); >> DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); >> for(int node=xs; node> { >> PetscViewerASCIISynchronizedPrintf(view, "%3.12f %3.12f %3.12f\n", >> aM[node].x, aM[node].y, aM[node].z); >> PetscViewerFlush(view); >> } >> DMDAVecRestoreArray(da, M, &aM); >> } >> ================================================================= >> >> but this turns out to be very slow. I am trying to write it in a binary >> file, but I cannot find the corresponding functionality (such as >> PETSC_VIEWER_ASCII_SYMMODU and PetscViewerASCIISynchronizedPrintf in >> binary form). Thanks. > > There is PetscViewerBindaryWrite(), but what do you really want to do? What you suggest > doing will be very slow. Why not just use PETSc binary output? > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 22 10:07:06 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 09:07:06 -0600 Subject: [petsc-users] output vec In-Reply-To: <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> Message-ID: <874n0hj1id.fsf@jedbrown.org> Likun Tan writes: > Thanks for your suggestion. > Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. > m1 > m2 > m3 > m4 > m5 > m6 The binary viewer writes a *binary* file. No formatting or line breaks. > But I prefer the form > > m1 m2 m3 > m4 m5 m6 > > Since in the end I will have about 1e+7 elements in the vec. If there is no way to output the vec in the second form, I will simply use VecView. Thanks. Use VecView to write a binary (not ASCII) file. See PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, etc. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From knepley at gmail.com Thu May 22 10:20:42 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 10:20:42 -0500 Subject: [petsc-users] output vec In-Reply-To: <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> Message-ID: On Thu, May 22, 2014 at 10:02 AM, Likun Tan wrote: > Thanks for your suggestion. > Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. > No it won't. Binary files have no newlines or spaces. 
Matt > m1 > m2 > m3 > m4 > m5 > m6 > > But I prefer the form > > m1 m2 m3 > m4 m5 m6 > > Since in the end I will have about 1e+7 elements in the vec. If there is > no way to output the vec in the second form, I will simply use VecView. > Thanks. > > On May 22, 2014, at 10:42 AM, Matthew Knepley wrote: > > On Thu, May 22, 2014 at 9:32 AM, wrote: > >> Dear Petsc developers, >> >> I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like >> this: >> >> m1 m2 m3 >> m4 m5 m6 >> ... >> >> Here is my code to do this: >> >> ================================================================== >> PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); >> PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); >> PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); >> for(int step=0; step> { >> //calculate M at current step >> DMDAVecGetArray(da, M, &aM); >> DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); >> for(int node=xs; node> { >> PetscViewerASCIISynchronizedPrintf(view, "%3.12f %3.12f %3.12f\n", >> aM[node].x, aM[node].y, aM[node].z); >> PetscViewerFlush(view); >> } >> DMDAVecRestoreArray(da, M, &aM); >> } >> ================================================================= >> >> but this turns out to be very slow. I am trying to write it in a binary >> file, but I cannot find the corresponding functionality (such as >> PETSC_VIEWER_ASCII_SYMMODU and PetscViewerASCIISynchronizedPrintf in >> binary form). Thanks. >> > > There is PetscViewerBindaryWrite(), but what do you really want to do? > What you suggest > doing will be very slow. Why not just use PETSc binary output? > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From likunt at caltech.edu Thu May 22 11:20:13 2014 From: likunt at caltech.edu (Likun Tan) Date: Thu, 22 May 2014 12:20:13 -0400 Subject: [petsc-users] output vec In-Reply-To: <874n0hj1id.fsf@jedbrown.org> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: I am using VecView to output the vec in a binary file and tried to open it in Matlab. I define the precision to be double, but Matlab does not give reasonable values of my vec (almost extremely large or small or NaN values). Here is my code ======================================== PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); for(step=0; step On May 22, 2014, at 11:07 AM, Jed Brown wrote: > > Likun Tan writes: > >> Thanks for your suggestion. >> Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. >> m1 >> m2 >> m3 >> m4 >> m5 >> m6 > > The binary viewer writes a *binary* file. No formatting or line breaks. > >> But I prefer the form >> >> m1 m2 m3 >> m4 m5 m6 >> >> Since in the end I will have about 1e+7 elements in the vec. If there is no way to output the vec in the second form, I will simply use VecView. Thanks. > > Use VecView to write a binary (not ASCII) file. See > PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, > etc. 
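For reference, a sketch of the write side being discussed, with a note on why a raw read gives garbage: each VecView() record begins with a small header (class id and vector length) and the data are written big-endian, so loading the file as a flat stream of native doubles in MATLAB produces the wildly wrong or NaN values described above. The PetscBinaryRead.m script shipped with PETSc (under bin/matlab in this release series) understands the format. NSTEPS and the file name are placeholders.

#include <petscvec.h>

PetscErrorCode WriteHistory(Vec M, PetscInt NSTEPS)
{
  PetscViewer    viewer;
  PetscInt       step;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "M.bin", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  for (step = 0; step < NSTEPS; step++) {
    /* ... update M for this step ... */
    ierr = VecView(M, viewer);CHKERRQ(ierr);   /* appends one header + data record per step */
  }
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Once a vector has been read back correctly, reshaping it into three columns (the m1 m2 m3 layout) is a one-line operation on the MATLAB side.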
From knepley at gmail.com Thu May 22 11:26:14 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 11:26:14 -0500 Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: On Thu, May 22, 2014 at 11:20 AM, Likun Tan wrote: > I am using VecView to output the vec in a binary file and tried to open it > in Matlab. I define the precision to be double, but Matlab does not give > reasonable values of my vec (almost extremely large or small or NaN > values). Here is my code > Are you using PetscBinaryRead.m in Matlab? If so, send the code snippet for a small vector, all the output, and the binary file. Matt > ======================================== > PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); > for(step=0; step { > //compute M at current step > VecView(M, view); > } > PetscViewerDestroy(&view); > ======================================= > > I am not sure if there is any problem of my Petsc code. Your comment is > well appreciated. > > > On May 22, 2014, at 11:07 AM, Jed Brown wrote: > > > > Likun Tan writes: > > > >> Thanks for your suggestion. > >> Using VecView or PetscViewerBinaryWrite will print the vec vertically, > i.e. > >> m1 > >> m2 > >> m3 > >> m4 > >> m5 > >> m6 > > > > The binary viewer writes a *binary* file. No formatting or line breaks. > > > >> But I prefer the form > >> > >> m1 m2 m3 > >> m4 m5 m6 > >> > >> Since in the end I will have about 1e+7 elements in the vec. If there > is no way to output the vec in the second form, I will simply use VecView. > Thanks. > > > > Use VecView to write a binary (not ASCII) file. See > > PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, > > etc. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 22 11:34:39 2014 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 22 May 2014 12:34:39 -0400 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <871tvpqt8u.fsf@jedbrown.org> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> Message-ID: If the solver is degrading as the coefficients change, and I would assume get more nasty, you can try deleting the solver at each time step. This will be about 2x more expensive, because it does the setup each solve, but it might fix your problem. You also might try: -pc_type hypre -pc_hypre_type boomeramg On Mon, May 19, 2014 at 6:49 PM, Jed Brown wrote: > Michele Rosso writes: > > > Jed, > > > > thank you very much! > > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type > > sor/ and report back. > > Yes, I removed the nullspace from both the system matrix and the rhs. > > Is there a way to have something similar to Dendy's multigrid or the > > deflated conjugate gradient method with PETSc? > > Dendy's MG needs geometry. The algorithm to produce the interpolation > operators is not terribly complicated so it could be done, though DMDA > support for cell-centered is a somewhat awkward. "Deflated CG" can mean > lots of things so you'll have to be more precise. 
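A sketch of Mark's first suggestion above ("deleting the solver at each time step") in terms of KSPReset(), which throws away the built preconditioner so the next KSPSolve() redoes the setup with the current coefficients; destroying and recreating the KSP would have the same effect. The routine name is a placeholder and the KSPSetOperators() call uses the PETSc 3.4-style signature.

#include <petscksp.h>

PetscErrorCode SolveStep(KSP ksp, Mat A, Vec b, Vec x)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPReset(ksp);CHKERRQ(ierr);                                          /* drop the built PC data */
  ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);  /* re-attach the updated operators */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* e.g. -pc_type hypre -pc_hypre_type boomeramg, as suggested above */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}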
(Most everything in > the "deflation" world has a clear analogue in the MG world, but the > deflation community doesn't have a precise language to talk about their > methods so you always have to read the paper carefully to find out if > it's completely standard or if there is something new.) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fiona at epcc.ed.ac.uk Thu May 22 11:59:56 2014 From: fiona at epcc.ed.ac.uk (Fiona Reid) Date: Thu, 22 May 2014 17:59:56 +0100 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps Message-ID: <537E2D0C.5050606@epcc.ed.ac.uk> Dear PETSc Users, Can anyone advise as to how I can obtain output from the TS rosw solver at regular time steps, e.g. every 0.05 seconds? I'm using a very slightly modified version of the example code from petsc-3.4.3/src/ts/examples/tutorials/ex20.c (changes are user.mu = 1.0 and user->next_output += 0.05). I set the initial time step to be 0.05 via TSSetInitialTimeStep. If I use the default solver (beuler) and the -monitor option I get output looking like: ./ex20 -ts_type beuler -monitor | more [0.0] 0 TS 0.000000 (dt = 0.050000) X 2.000000e+00 0.000000e+00 [0.1] 1 TS 0.050000 (dt = 0.050000) X 1.995658e+00 -8.683325e-02 [0.1] 2 TS 0.100000 (dt = 0.050000) X 1.987545e+00 -1.622726e-01 [0.2] 3 TS 0.150000 (dt = 0.050000) X 1.976146e+00 -2.279661e-01 [0.2] 4 TS 0.200000 (dt = 0.050000) X 1.961876e+00 -2.854046e-01 [0.2] 5 TS 0.250000 (dt = 0.050000) X 1.945081e+00 -3.359109e-01 [0.3] 6 TS 0.300000 (dt = 0.050000) X 1.926048e+00 -3.806427e-01 [0.3] 7 TS 0.350000 (dt = 0.050000) X 1.905018e+00 -4.206033e-01 [0.4] 8 TS 0.400000 (dt = 0.050000) X 1.882185e+00 -4.566572e-01 [0.4] 9 TS 0.450000 (dt = 0.050000) X 1.857708e+00 -4.895467e-01 [0.5] 10 TS 0.500000 (dt = 0.050000) X 1.831713e+00 -5.199087e-01 However if I switch to using the rosw solver instead I get: ./ex20 -ts_type rosw -monitor | more [0.0] 0 TS 0.000000 (dt = 0.050000) X 0.000000e+00 0.000000e+00 [0.1] 1 TS 0.050000 (dt = 0.061949) X 1.997620e+00 -9.284729e-02 [0.1] 2 TS 0.111949 (dt = 0.065192) X 1.990961e+00 -1.726821e-01 [0.2] 3 TS 0.177141 (dt = 0.068763) X 1.980577e+00 -2.414006e-01 [0.2] 4 TS 0.245904 (dt = 0.073732) X 1.966977e+00 -3.007620e-01 [0.2] 5 TS 0.319635 (dt = 0.080204) X 1.950593e+00 -3.523419e-01 [0.3] 5 TS 0.319635 (dt = 0.080204) X 1.931848e+00 -3.974691e-01 [0.3] 6 TS 0.399840 (dt = 0.088357) X 1.910959e+00 -4.373211e-01 [0.4] 7 TS 0.488197 (dt = 0.098465) X 1.888151e+00 -4.729449e-01 [0.4] 7 TS 0.488197 (dt = 0.098465) X 1.863723e+00 -5.051196e-01 [0.5] 8 TS 0.586663 (dt = 0.110786) X 1.837695e+00 -5.346152e-01 [0.5] 8 TS 0.586663 (dt = 0.110786) X 1.810291e+00 -5.620222e-01 Sometimes I get two different X values for the same value of TS. Ideally I'd like to have the output from rosw for exactly the same time values as beuler, e.g. every 0.05 seconds, such that it's possible to directly compare the two solvers. Is there a way to fix the time step in PETSc for the rosw solver such that I can get output every 0.05 seconds? if so how can I do this? Thank you very much in advance. Fiona -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
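A sketch of one way to get output at fixed intervals from an adaptive integrator like rosw: keep the adaptive steps, but have a monitor interpolate the dense output to the requested times, much as the monitor in ex20 (with its user->next_output) already does. The 0.05 interval, the routine name and the context layout are assumptions. Alternatively, -ts_adapt_type none keeps dt fixed at the initial 0.05, at the cost of losing the adaptive stepping.

#include <petscts.h>

typedef struct {
  PetscReal next_output;    /* next requested output time, starts at 0 */
} MonitorCtx;

PetscErrorCode RegularMonitor(TS ts, PetscInt step, PetscReal t, Vec X, void *ctx)
{
  MonitorCtx    *mon = (MonitorCtx*)ctx;
  Vec            Xout;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecDuplicate(X, &Xout);CHKERRQ(ierr);
  while (mon->next_output <= t) {                    /* catch up on all output times passed this step */
    ierr = TSInterpolate(ts, mon->next_output, Xout);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "interpolated t = %g\n", (double)mon->next_output);CHKERRQ(ierr);
    ierr = VecView(Xout, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    mon->next_output += 0.05;
  }
  ierr = VecDestroy(&Xout);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* registered before TSSolve() with:  TSMonitorSet(ts, RegularMonitor, &mon, NULL);  */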
Name: not available URL: From jaolive at MIT.EDU Thu May 22 12:47:55 2014 From: jaolive at MIT.EDU (Jean-Arthur Louis Olive) Date: Thu, 22 May 2014 17:47:55 +0000 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> Message-ID: <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> Hi Barry, sorry about the late reply- We indeed use structured grids (DMDA 2d) - but do not ever provide a Jacobian for our non-linear stokes problem (instead just rely on petsc's FD approximation). I understand "snes_type test" is meant to compare petsc?s Jacobian with a user-provided analytical Jacobian. Are you saying we should provide an exact Jacobian for our simple linear test and see if there?s a problem with the approximate Jacobian? Thanks, Arthur & Eric > If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. > > Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. > > Barry > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: > >> Hi all, >> we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. >> >> So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
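Since the question of supplying an exact Jacobian comes up here: for a test residual as simple as the "one coupling term" case quoted below, the exact Jacobian is a single constant 4x4 block per grid point, roughly as in the sketch that follows. The field ordering (P, vx, vy, T) inside the block and the routine name are assumptions and must match the application's DMDA layout; Jpre is assumed to come from DMCreateMatrix() on the same DMDA so that stencil-based insertion works.

#include <petscsnes.h>
#include <petscdmda.h>

PetscErrorCode FillExactJacobian(DMDALocalInfo *info, Mat Jpre)
{
  PetscInt       i, j;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  for (j = info->ys; j < info->ys + info->ym; j++) {
    for (i = info->xs; i < info->xs + info->xm; i++) {
      MatStencil  row = {0}, col = {0};
      PetscScalar B[16] = {1, 0, 0, 0,      /* dfP /d(P,vx,vy,T) */
                           0, 1,-3, 0,      /* dfvx/d(P,vx,vy,T) */
                           0, 0, 1, 0,      /* dfvy/d(P,vx,vy,T) */
                           0, 0, 0, 1};     /* dfT /d(P,vx,vy,T) */
      row.i = i; row.j = j;
      col.i = i; col.j = j;
      ierr = MatSetValuesBlockedStencil(Jpre, 1, &row, 1, &col, B, INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}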
>> >> RESIDUAL 1 (NO COUPLING): >> for (j=info->ys; jys+info->ym; j++) { >> for (i=info->xs; ixs+info->xm; i++) { >> f[j][i].P = x[j][i].P - 3000000; >> f[j][i].vx= 2*x[j][i].vx; >> f[j][i].vy= 3*x[j][i].vy - 2; >> f[j][i].T = x[j][i].T; >> } >> >> RESIDUAL 2 (ONE COUPLING TERM): >> for (j=info->ys; jys+info->ym; j++) { >> for (i=info->xs; ixs+info->xm; i++) { >> f[j][i].P = x[j][i].P - 3; >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; >> f[j][i].vy= x[j][i].vy - 2; >> f[j][i].T = x[j][i].T; >> } >> } >> >> >> and our default set of options is: >> >> >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp >> >> >> With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: >> >> >> Result from Solve - RESIDUAL 1 >> 0 SNES Function norm 8.485281374240e+07 >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 >> 1 SNES Function norm 1.131370849896e+02 >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 2 SNES Function norm 1.131370849896e+02 >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 >> >> >> With the coupled residual (Residual 2), the norms do not match, see below >> >> >> Result from Solve - RESIDUAL 2: >> 0 SNES Function norm 1.019803902719e+02 >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 >> 1 SNES Function norm 1.697056274848e+02 >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 >> 2 SNES Function norm 3.236770473841e-07 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> >> >> Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. 
>> >> >> Result from Solve with -snes_fd - RESIDUAL 2 >> 0 SNES Function norm 8.485281374240e+07 >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 >> 1 SNES Function norm 2.039607805429e+02 >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 >> 3 SNES Function norm 2.549509757105e+01 >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 >> >> >> Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? >> >> Thanks a lot, >> Arthur and Eric > -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1855 bytes Desc: not available URL: From knepley at gmail.com Thu May 22 12:52:07 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 12:52:07 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> Message-ID: On Thu, May 22, 2014 at 12:47 PM, Jean-Arthur Louis Olive wrote: > Hi Barry, > sorry about the late reply- > We indeed use structured grids (DMDA 2d) - but do not ever provide a > Jacobian for our non-linear stokes problem (instead just rely on petsc's FD > approximation). I understand "snes_type test" is meant to compare petsc?s > Jacobian with a user-provided analytical Jacobian. > Are you saying we should provide an exact Jacobian for our simple linear > test and see if there?s a problem with the approximate Jacobian? > The Jacobian computed by PETSc uses a finite-difference approximation, and thus is only accurate to maybe 1.0e-7 depending on the conditioning of your system. Are you trying to compare things that are more precise than that? You can provide an exact Jacobian to get machine accuracy. Matt > Thanks, > Arthur & Eric > > > > > If you are using DMDA and either DMGetColoring or the SNESSetDM > approach and dof is 4 then we color each of the 4 variables per grid point > with a different color so coupling between variables within a grid point is > not a problem. This would not explain the problem you are seeing below. > > > > Run your code with -snes_type test and read the results and follow the > directions to debug your Jacobian. > > > > Barry > > > > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive > wrote: > > > >> Hi all, > >> we are using PETSc to solve the steady state Stokes equations with > non-linear viscosities using finite difference. Recently we have realized > that our true residual norm after the last KSP solve did not match next > SNES function norm when solving the linear Stokes equations. 
> >> > >> So to understand this better, we set up two extremely simple linear > residuals, one with no coupling between variables (vx, vy, P and T), the > other with one coupling term (shown below). > >> > >> RESIDUAL 1 (NO COUPLING): > >> for (j=info->ys; jys+info->ym; j++) { > >> for (i=info->xs; ixs+info->xm; i++) { > >> f[j][i].P = x[j][i].P - 3000000; > >> f[j][i].vx= 2*x[j][i].vx; > >> f[j][i].vy= 3*x[j][i].vy - 2; > >> f[j][i].T = x[j][i].T; > >> } > >> > >> RESIDUAL 2 (ONE COUPLING TERM): > >> for (j=info->ys; jys+info->ym; j++) { > >> for (i=info->xs; ixs+info->xm; i++) { > >> f[j][i].P = x[j][i].P - 3; > >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > >> f[j][i].vy= x[j][i].vy - 2; > >> f[j][i].T = x[j][i].T; > >> } > >> } > >> > >> > >> and our default set of options is: > >> > >> > >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 > -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor > -snes_converged_reason -snes_view -log_summary -options_left 1 > -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > >> > >> > >> With the uncoupled residual (Residual 1), we get matching KSP and SNES > norm, highlighted below: > >> > >> > >> Result from Solve - RESIDUAL 1 > >> 0 SNES Function norm 8.485281374240e+07 > >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm > 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm > 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > >> 1 SNES Function norm 1.131370849896e+02 > >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm > 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 2 SNES Function norm 1.131370849896e+02 > >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > >> > >> > >> With the coupled residual (Residual 2), the norms do not match, see > below > >> > >> > >> Result from Solve - RESIDUAL 2: > >> 0 SNES Function norm 1.019803902719e+02 > >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm > 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm > 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > >> 1 SNES Function norm 1.697056274848e+02 > >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm > 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm > 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > >> 2 SNES Function norm 3.236770473841e-07 > >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > >> > >> > >> Lastly, if we add -snes_fd to our options, the norms for residual 2 get > better - they match after the first iteration but not after the second. 
> >> > >> > >> Result from Solve with -snes_fd - RESIDUAL 2 > >> 0 SNES Function norm 8.485281374240e+07 > >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm > 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm > 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > >> 1 SNES Function norm 2.039607805429e+02 > >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm > 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm > 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm > 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > >> 3 SNES Function norm 2.549509757105e+01 > >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > >> > >> > >> Does this mean that our Jacobian is not approximated properly by the > default ?coloring? method when it has off-diagonal terms? > >> > >> Thanks a lot, > >> Arthur and Eric > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 22 12:57:37 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 11:57:37 -0600 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <537E2D0C.5050606@epcc.ed.ac.uk> References: <537E2D0C.5050606@epcc.ed.ac.uk> Message-ID: <87vbsxhf1q.fsf@jedbrown.org> Fiona Reid writes: > Dear PETSc Users, > > Can anyone advise as to how I can obtain output from the TS rosw solver > at regular time steps, e.g. every 0.05 seconds? > > I'm using a very slightly modified version of the example code from > petsc-3.4.3/src/ts/examples/tutorials/ex20.c (changes are user.mu = 1.0 > and user->next_output += 0.05). I set the initial time step to be 0.05 > via TSSetInitialTimeStep. If you want a constant time step size, use -ts_adapt_type none. By default, RosW uses an adaptive step size with an embedded error estimator. 
Note that RosW is a multi-stage method, so you might, for example, compare the accuracy and efficiency of -ts_type rosw -ts_rosw_type ra34pw2 -ts_dt 0.2 to -ts_type beuler -ts_dt 0.05 > If I use the default solver (beuler) and the -monitor option I get > output looking like: > > ./ex20 -ts_type beuler -monitor | more > [0.0] 0 TS 0.000000 (dt = 0.050000) X 2.000000e+00 0.000000e+00 > [0.1] 1 TS 0.050000 (dt = 0.050000) X 1.995658e+00 -8.683325e-02 > [0.1] 2 TS 0.100000 (dt = 0.050000) X 1.987545e+00 -1.622726e-01 > [0.2] 3 TS 0.150000 (dt = 0.050000) X 1.976146e+00 -2.279661e-01 > [0.2] 4 TS 0.200000 (dt = 0.050000) X 1.961876e+00 -2.854046e-01 > [0.2] 5 TS 0.250000 (dt = 0.050000) X 1.945081e+00 -3.359109e-01 > [0.3] 6 TS 0.300000 (dt = 0.050000) X 1.926048e+00 -3.806427e-01 > [0.3] 7 TS 0.350000 (dt = 0.050000) X 1.905018e+00 -4.206033e-01 > [0.4] 8 TS 0.400000 (dt = 0.050000) X 1.882185e+00 -4.566572e-01 > [0.4] 9 TS 0.450000 (dt = 0.050000) X 1.857708e+00 -4.895467e-01 > [0.5] 10 TS 0.500000 (dt = 0.050000) X 1.831713e+00 -5.199087e-01 > > However if I switch to using the rosw solver instead I get: > > ./ex20 -ts_type rosw -monitor | more > [0.0] 0 TS 0.000000 (dt = 0.050000) X 0.000000e+00 0.000000e+00 > [0.1] 1 TS 0.050000 (dt = 0.061949) X 1.997620e+00 -9.284729e-02 > [0.1] 2 TS 0.111949 (dt = 0.065192) X 1.990961e+00 -1.726821e-01 > [0.2] 3 TS 0.177141 (dt = 0.068763) X 1.980577e+00 -2.414006e-01 > [0.2] 4 TS 0.245904 (dt = 0.073732) X 1.966977e+00 -3.007620e-01 > [0.2] 5 TS 0.319635 (dt = 0.080204) X 1.950593e+00 -3.523419e-01 > [0.3] 5 TS 0.319635 (dt = 0.080204) X 1.931848e+00 -3.974691e-01 > [0.3] 6 TS 0.399840 (dt = 0.088357) X 1.910959e+00 -4.373211e-01 > [0.4] 7 TS 0.488197 (dt = 0.098465) X 1.888151e+00 -4.729449e-01 > [0.4] 7 TS 0.488197 (dt = 0.098465) X 1.863723e+00 -5.051196e-01 > [0.5] 8 TS 0.586663 (dt = 0.110786) X 1.837695e+00 -5.346152e-01 > [0.5] 8 TS 0.586663 (dt = 0.110786) X 1.810291e+00 -5.620222e-01 > > Sometimes I get two different X values for the same value of TS. Ideally > I'd like to have the output from rosw for exactly the same time values > as beuler, e.g. every 0.05 seconds, such that it's possible to directly > compare the two solvers. > > Is there a way to fix the time step in PETSc for the rosw solver such > that I can get output every 0.05 seconds? if so how can I do this? > > Thank you very much in advance. > > Fiona > > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jaolive at MIT.EDU Thu May 22 13:01:52 2014 From: jaolive at MIT.EDU (Jean-Arthur Louis Olive) Date: Thu, 22 May 2014 18:01:52 +0000 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> Message-ID: <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> Hi Matt, our underlying problem is the mismatch between KSP ans SNES norms, even when solving a simple linear system, e.g., for (j=info->ys; jys+info->ym; j++) { for (i=info->xs; ixs+info->xm; i++) { f[j][i].P = x[j][i].P - 3; f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; f[j][i].vy= x[j][i].vy - 2; f[j][i].T = x[j][i].T; } } which should not have any conditioning issue. 
So I don?t think in this case it?s an accuracy problem- but something could be wrong with the FD estimation of our Jacobian (?) Arthur On May 22, 2014, at 11:52 AM, Matthew Knepley wrote: > On Thu, May 22, 2014 at 12:47 PM, Jean-Arthur Louis Olive wrote: > Hi Barry, > sorry about the late reply- > We indeed use structured grids (DMDA 2d) - but do not ever provide a Jacobian for our non-linear stokes problem (instead just rely on petsc's FD approximation). I understand "snes_type test" is meant to compare petsc?s Jacobian with a user-provided analytical Jacobian. > Are you saying we should provide an exact Jacobian for our simple linear test and see if there?s a problem with the approximate Jacobian? > > The Jacobian computed by PETSc uses a finite-difference approximation, and thus is only accurate to maybe 1.0e-7 > depending on the conditioning of your system. Are you trying to compare things that are more precise than that? You > can provide an exact Jacobian to get machine accuracy. > > Matt > > Thanks, > Arthur & Eric > > > > > If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. > > > > Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. > > > > Barry > > > > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: > > > >> Hi all, > >> we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. > >> > >> So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
> >> > >> RESIDUAL 1 (NO COUPLING): > >> for (j=info->ys; jys+info->ym; j++) { > >> for (i=info->xs; ixs+info->xm; i++) { > >> f[j][i].P = x[j][i].P - 3000000; > >> f[j][i].vx= 2*x[j][i].vx; > >> f[j][i].vy= 3*x[j][i].vy - 2; > >> f[j][i].T = x[j][i].T; > >> } > >> > >> RESIDUAL 2 (ONE COUPLING TERM): > >> for (j=info->ys; jys+info->ym; j++) { > >> for (i=info->xs; ixs+info->xm; i++) { > >> f[j][i].P = x[j][i].P - 3; > >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > >> f[j][i].vy= x[j][i].vy - 2; > >> f[j][i].T = x[j][i].T; > >> } > >> } > >> > >> > >> and our default set of options is: > >> > >> > >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > >> > >> > >> With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: > >> > >> > >> Result from Solve - RESIDUAL 1 > >> 0 SNES Function norm 8.485281374240e+07 > >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > >> 1 SNES Function norm 1.131370849896e+02 > >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 2 SNES Function norm 1.131370849896e+02 > >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > >> > >> > >> With the coupled residual (Residual 2), the norms do not match, see below > >> > >> > >> Result from Solve - RESIDUAL 2: > >> 0 SNES Function norm 1.019803902719e+02 > >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > >> 1 SNES Function norm 1.697056274848e+02 > >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > >> 2 SNES Function norm 3.236770473841e-07 > >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > >> > >> > >> Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. 
> >> > >> > >> Result from Solve with -snes_fd - RESIDUAL 2 > >> 0 SNES Function norm 8.485281374240e+07 > >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > >> 1 SNES Function norm 2.039607805429e+02 > >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > >> 3 SNES Function norm 2.549509757105e+01 > >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > >> > >> > >> Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? > >> > >> Thanks a lot, > >> Arthur and Eric > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1855 bytes Desc: not available URL: From knepley at gmail.com Thu May 22 13:19:50 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 13:19:50 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> Message-ID: On Thu, May 22, 2014 at 1:01 PM, Jean-Arthur Louis Olive wrote: > Hi Matt, > our underlying problem is the mismatch between KSP ans SNES norms, even > when solving a simple linear system, e.g., > > for (j=info->ys; jys+info->ym; j++) { > for (i=info->xs; ixs+info->xm; i++) { > f[j][i].P = x[j][i].P - 3; > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > f[j][i].vy= x[j][i].vy - 2; > f[j][i].T = x[j][i].T; > } > } > > which should not have any conditioning issue. So I don?t think in this > case it?s an accuracy problem- but something could be wrong with the FD > estimation of our Jacobian (?) > I think you are misinterpreting the output. As I said before, the FD Jacobian will only be accurate to about 1.0e-7 (which is what I see with my own code). Thus it will only match the SNES residual to this precision. If you want an exact match, you need to code up the exact Jacobian. Matt > Arthur > > > On May 22, 2014, at 11:52 AM, Matthew Knepley wrote: > > On Thu, May 22, 2014 at 12:47 PM, Jean-Arthur Louis Olive > wrote: > >> Hi Barry, >> sorry about the late reply- >> We indeed use structured grids (DMDA 2d) - but do not ever provide a >> Jacobian for our non-linear stokes problem (instead just rely on petsc's FD >> approximation). I understand "snes_type test" is meant to compare petsc?s >> Jacobian with a user-provided analytical Jacobian. 
>> Are you saying we should provide an exact Jacobian for our simple linear >> test and see if there?s a problem with the approximate Jacobian? >> > > The Jacobian computed by PETSc uses a finite-difference approximation, and > thus is only accurate to maybe 1.0e-7 > depending on the conditioning of your system. Are you trying to compare > things that are more precise than that? You > can provide an exact Jacobian to get machine accuracy. > > Matt > > >> Thanks, >> Arthur & Eric >> >> >> >> > If you are using DMDA and either DMGetColoring or the SNESSetDM >> approach and dof is 4 then we color each of the 4 variables per grid point >> with a different color so coupling between variables within a grid point is >> not a problem. This would not explain the problem you are seeing below. >> > >> > Run your code with -snes_type test and read the results and follow >> the directions to debug your Jacobian. >> > >> > Barry >> > >> > >> > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive >> wrote: >> > >> >> Hi all, >> >> we are using PETSc to solve the steady state Stokes equations with >> non-linear viscosities using finite difference. Recently we have realized >> that our true residual norm after the last KSP solve did not match next >> SNES function norm when solving the linear Stokes equations. >> >> >> >> So to understand this better, we set up two extremely simple linear >> residuals, one with no coupling between variables (vx, vy, P and T), the >> other with one coupling term (shown below). >> >> >> >> RESIDUAL 1 (NO COUPLING): >> >> for (j=info->ys; jys+info->ym; j++) { >> >> for (i=info->xs; ixs+info->xm; i++) { >> >> f[j][i].P = x[j][i].P - 3000000; >> >> f[j][i].vx= 2*x[j][i].vx; >> >> f[j][i].vy= 3*x[j][i].vy - 2; >> >> f[j][i].T = x[j][i].T; >> >> } >> >> >> >> RESIDUAL 2 (ONE COUPLING TERM): >> >> for (j=info->ys; jys+info->ym; j++) { >> >> for (i=info->xs; ixs+info->xm; i++) { >> >> f[j][i].P = x[j][i].P - 3; >> >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; >> >> f[j][i].vy= x[j][i].vy - 2; >> >> f[j][i].T = x[j][i].T; >> >> } >> >> } >> >> >> >> >> >> and our default set of options is: >> >> >> >> >> >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 >> -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor >> -snes_converged_reason -snes_view -log_summary -options_left 1 >> -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp >> >> >> >> >> >> With the uncoupled residual (Residual 1), we get matching KSP and SNES >> norm, highlighted below: >> >> >> >> >> >> Result from Solve - RESIDUAL 1 >> >> 0 SNES Function norm 8.485281374240e+07 >> >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid >> norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid >> norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 >> >> 1 SNES Function norm 1.131370849896e+02 >> >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid >> norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 2 SNES Function norm 1.131370849896e+02 >> >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 >> >> >> >> >> >> With the coupled residual (Residual 2), the norms do not match, see >> below >> >> >> >> >> >> Result from Solve - RESIDUAL 2: >> >> 0 SNES Function norm 1.019803902719e+02 >> >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid >> norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 
8.741176309016e+01 true resid >> norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 >> >> 1 SNES Function norm 1.697056274848e+02 >> >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid >> norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid >> norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 >> >> 2 SNES Function norm 3.236770473841e-07 >> >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> >> >> >> >> >> Lastly, if we add -snes_fd to our options, the norms for residual 2 >> get better - they match after the first iteration but not after the second. >> >> >> >> >> >> Result from Solve with -snes_fd - RESIDUAL 2 >> >> 0 SNES Function norm 8.485281374240e+07 >> >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid >> norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid >> norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 >> >> 1 SNES Function norm 2.039607805429e+02 >> >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid >> norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid >> norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 >> >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] >> >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid >> norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 >> >> 3 SNES Function norm 2.549509757105e+01 >> >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 >> >> >> >> >> >> Does this mean that our Jacobian is not approximated properly by the >> default ?coloring? method when it has off-diagonal terms? >> >> >> >> Thanks a lot, >> >> Arthur and Eric >> > >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 22 13:21:19 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 May 2014 13:21:19 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> Message-ID: 1 SNES Function norm 1.697056274848e+02 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 2 SNES Function norm 3.236770473841e-07 With matrix free multiply 2 SNES Function norm 3.236770473841e-07 will not and should not match true resid norm 5.777940247956e-12 they are computed in completely different ways and one has lost half the digits in the finite differencing thus the linear system only approximates the ?nonlinear system? 
(which also happens to be linear) to roughly half the decimal digits so the results above are completely reasonable and expected (in single precision 5.777940247956e-12 and 3.236770473841e-07 compared to O(1) are both zero . Of course with "exact arithmetic? and no differencing they will be same. Barry -snes_fd is not practical in any way. Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. Result from Solve with -snes_fd - RESIDUAL 2 0 SNES Function norm 8.485281374240e+07 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 1 SNES Function norm 2.039607805429e+02 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 3 SNES Function norm 2.549509757105e+01 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 On May 22, 2014, at 1:01 PM, Jean-Arthur Louis Olive wrote: > Hi Matt, > our underlying problem is the mismatch between KSP ans SNES norms, even when solving a simple linear system, e.g., > > for (j=info->ys; jys+info->ym; j++) { > for (i=info->xs; ixs+info->xm; i++) { > f[j][i].P = x[j][i].P - 3; > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > f[j][i].vy= x[j][i].vy - 2; > f[j][i].T = x[j][i].T; > } > } > > which should not have any conditioning issue. So I don?t think in this case it?s an accuracy problem- but something could be wrong with the FD estimation of our Jacobian (?) > > Arthur > > > On May 22, 2014, at 11:52 AM, Matthew Knepley wrote: > >> On Thu, May 22, 2014 at 12:47 PM, Jean-Arthur Louis Olive wrote: >> Hi Barry, >> sorry about the late reply- >> We indeed use structured grids (DMDA 2d) - but do not ever provide a Jacobian for our non-linear stokes problem (instead just rely on petsc's FD approximation). I understand "snes_type test" is meant to compare petsc?s Jacobian with a user-provided analytical Jacobian. >> Are you saying we should provide an exact Jacobian for our simple linear test and see if there?s a problem with the approximate Jacobian? >> >> The Jacobian computed by PETSc uses a finite-difference approximation, and thus is only accurate to maybe 1.0e-7 >> depending on the conditioning of your system. Are you trying to compare things that are more precise than that? You >> can provide an exact Jacobian to get machine accuracy. >> >> Matt >> >> Thanks, >> Arthur & Eric >> >> >> >> > If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. >> > >> > Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. 
>> > >> > Barry >> > >> > >> > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: >> > >> >> Hi all, >> >> we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. >> >> >> >> So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). >> >> >> >> RESIDUAL 1 (NO COUPLING): >> >> for (j=info->ys; jys+info->ym; j++) { >> >> for (i=info->xs; ixs+info->xm; i++) { >> >> f[j][i].P = x[j][i].P - 3000000; >> >> f[j][i].vx= 2*x[j][i].vx; >> >> f[j][i].vy= 3*x[j][i].vy - 2; >> >> f[j][i].T = x[j][i].T; >> >> } >> >> >> >> RESIDUAL 2 (ONE COUPLING TERM): >> >> for (j=info->ys; jys+info->ym; j++) { >> >> for (i=info->xs; ixs+info->xm; i++) { >> >> f[j][i].P = x[j][i].P - 3; >> >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; >> >> f[j][i].vy= x[j][i].vy - 2; >> >> f[j][i].T = x[j][i].T; >> >> } >> >> } >> >> >> >> >> >> and our default set of options is: >> >> >> >> >> >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp >> >> >> >> >> >> With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: >> >> >> >> >> >> Result from Solve - RESIDUAL 1 >> >> 0 SNES Function norm 8.485281374240e+07 >> >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 >> >> 1 SNES Function norm 1.131370849896e+02 >> >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 2 SNES Function norm 1.131370849896e+02 >> >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 >> >> >> >> >> >> With the coupled residual (Residual 2), the norms do not match, see below >> >> >> >> >> >> Result from Solve - RESIDUAL 2: >> >> 0 SNES Function norm 1.019803902719e+02 >> >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 >> >> 1 SNES Function norm 1.697056274848e+02 >> >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 >> >> 2 SNES Function norm 3.236770473841e-07 >> >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> >> >> >> >> >> Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. 
>> >> >> >> >> >> Result from Solve with -snes_fd - RESIDUAL 2 >> >> 0 SNES Function norm 8.485281374240e+07 >> >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 >> >> 1 SNES Function norm 2.039607805429e+02 >> >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 >> >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] >> >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 >> >> 3 SNES Function norm 2.549509757105e+01 >> >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 >> >> >> >> >> >> Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? >> >> >> >> Thanks a lot, >> >> Arthur and Eric >> > >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From fiona at epcc.ed.ac.uk Thu May 22 13:26:38 2014 From: fiona at epcc.ed.ac.uk (Fiona Reid) Date: Thu, 22 May 2014 19:26:38 +0100 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <87vbsxhf1q.fsf@jedbrown.org> References: <537E2D0C.5050606@epcc.ed.ac.uk> <87vbsxhf1q.fsf@jedbrown.org> Message-ID: <537E415E.7010409@epcc.ed.ac.uk> On 22/05/2014 18:57, Jed Brown wrote: > If you want a constant time step size, use -ts_adapt_type none. By > default, RosW uses an adaptive step size with an embedded error > estimator. Note that RosW is a multi-stage method, so you might, for > example, compare the accuracy and efficiency of > > -ts_type rosw -ts_rosw_type ra34pw2 -ts_dt 0.2 > > to > > -ts_type beuler -ts_dt 0.05 Many thanks Jed, that's brilliant and does exactly what I need. Fiona -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From fiona at epcc.ed.ac.uk Thu May 22 13:53:52 2014 From: fiona at epcc.ed.ac.uk (Fiona Reid) Date: Thu, 22 May 2014 19:53:52 +0100 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <87vbsxhf1q.fsf@jedbrown.org> References: <537E2D0C.5050606@epcc.ed.ac.uk> <87vbsxhf1q.fsf@jedbrown.org> Message-ID: <537E47C0.5010205@epcc.ed.ac.uk> Apologies everyone, I have another somewhat related question. If I actually want to use a variable time step with RosW is there any way to output the results at regular 0.05 seconds intervals? I realise this will interpolate between two points but it would be good to be able to plot all my data for the same time values. Using "-ts_adapt_type none" doesn't quite give me a "good enough" solution. Many thanks, Fiona -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From likunt at caltech.edu Thu May 22 13:55:16 2014 From: likunt at caltech.edu (Likun Tan) Date: Thu, 22 May 2014 14:55:16 -0400 Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: Hi Matt, I am not using PetscBinaryRead. I wrote a binary file from Petsc and use Matlab's function to read it, I.e. fileID = fopen('result.bin', 'w'); data = fread(fileID, 'double'); But this gives me unreasonable values of data. I checked this example http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex54f.F.html which is exactly what I need for my problem. Do you have a C version of it ? Many thanks. > On May 22, 2014, at 12:26 PM, Matthew Knepley wrote: > >> On Thu, May 22, 2014 at 11:20 AM, Likun Tan wrote: >> I am using VecView to output the vec in a binary file and tried to open it in Matlab. I define the precision to be double, but Matlab does not give reasonable values of my vec (almost extremely large or small or NaN values). Here is my code > > Are you using PetscBinaryRead.m in Matlab? If so, send the code snippet for a small vector, all the output, and the binary file. > > Matt > >> ======================================== >> PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); >> for(step=0; step> { >> //compute M at current step >> VecView(M, view); >> } >> PetscViewerDestroy(&view); >> ======================================= >> >> I am not sure if there is any problem of my Petsc code. Your comment is well appreciated. >> >> > On May 22, 2014, at 11:07 AM, Jed Brown wrote: >> > >> > Likun Tan writes: >> > >> >> Thanks for your suggestion. >> >> Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. >> >> m1 >> >> m2 >> >> m3 >> >> m4 >> >> m5 >> >> m6 >> > >> > The binary viewer writes a *binary* file. No formatting or line breaks. >> > >> >> But I prefer the form >> >> >> >> m1 m2 m3 >> >> m4 m5 m6 >> >> >> >> Since in the end I will have about 1e+7 elements in the vec. If there is no way to output the vec in the second form, I will simply use VecView. Thanks. >> > >> > Use VecView to write a binary (not ASCII) file. See >> > PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, >> > etc. > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Thu May 22 13:58:10 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 22 May 2014 11:58:10 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 Message-ID: <537E48C2.3060105@gmail.com> Hi All, I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. The whole domain has 10 nodes in z direction. The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. The following is used to set the rhs value. 
call VecGetArrayF90(x_vec_loc, vecpointer, ierr) vecpointer = (calculate the rhs value here) call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) *Vecview Correct * *Vecview Wrong* dof local node Process [0] _Process [0] _ /_Process [0] _/ 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 1 2 0.000000000000000E+000 0 0 1 3 0.000000000000000E+000 0 0 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 1 5 0.000000000000000E+000 0 0 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A 2 1 0.000000000000000E+000 7.52316e-037 0 2 2 0.000000000000000E+000 0 0 2 3 0.000000000000000E+000 1.68459e-016 0 2 4 4.814824860968090E-035 0.1296 4.81482e-035 2 5 0.000000000000000E+000 _/Process [1]/_ Line B 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C 0 0 Process [1] 0 1.68459e-016 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 1 5 1.684590875336239E-016 0 0 1 6 0.129600000000000 128623 128623 2 1 1.371273884908092E-019 0 0 Line F 2 2 -7.222237291452134E-035 2 3 7.222237291452134E-035 2 4 0.000000000000000E+000 2 5 128623.169844761 2 6 0.000000000000000E+000 The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. *How can I handle this kind of local vector to global vector assembly?* *In fact, the codes can work if the dof and local node is as follows.* dof local node 1 1 2 1 1 2 2 2 1 3 2 3 Thanks and regards, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 22 13:59:09 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 13:59:09 -0500 Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: On Thu, May 22, 2014 at 1:55 PM, Likun Tan wrote: > Hi Matt, > > I am not using PetscBinaryRead. I wrote a binary file from Petsc and use > Matlab's function to read it, I.e. > > fileID = fopen('result.bin', 'w'); > data = fread(fileID, 'double'); > > But this gives me unreasonable values of data. I checked this example > > > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex54f.F.html > > which is exactly what I need for my problem. Do you have a C version of it > ? Many thanks. > Why would you rewrite this? https://bitbucket.org/petsc/petsc/src/2c43c009db31f079231059c9efed501d4deca8bf/bin/matlab/PetscBinaryRead.m?at=master Matt > On May 22, 2014, at 12:26 PM, Matthew Knepley wrote: > > On Thu, May 22, 2014 at 11:20 AM, Likun Tan wrote: > >> I am using VecView to output the vec in a binary file and tried to open >> it in Matlab. I define the precision to be double, but Matlab does not give >> reasonable values of my vec (almost extremely large or small or NaN >> values). Here is my code >> > > Are you using PetscBinaryRead.m in Matlab? If so, send the code snippet > for a small vector, all the output, and the binary file. 
> > Matt > > >> ======================================== >> PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); >> for(step=0; step> { >> //compute M at current step >> VecView(M, view); >> } >> PetscViewerDestroy(&view); >> ======================================= >> >> I am not sure if there is any problem of my Petsc code. Your comment is >> well appreciated. >> >> > On May 22, 2014, at 11:07 AM, Jed Brown wrote: >> > >> > Likun Tan writes: >> > >> >> Thanks for your suggestion. >> >> Using VecView or PetscViewerBinaryWrite will print the vec vertically, >> i.e. >> >> m1 >> >> m2 >> >> m3 >> >> m4 >> >> m5 >> >> m6 >> > >> > The binary viewer writes a *binary* file. No formatting or line breaks. >> > >> >> But I prefer the form >> >> >> >> m1 m2 m3 >> >> m4 m5 m6 >> >> >> >> Since in the end I will have about 1e+7 elements in the vec. If there >> is no way to output the vec in the second form, I will simply use VecView. >> Thanks. >> > >> > Use VecView to write a binary (not ASCII) file. See >> > PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, >> > etc. >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 22 14:01:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 14:01:05 -0500 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E48C2.3060105@gmail.com> References: <537E48C2.3060105@gmail.com> Message-ID: On Thu, May 22, 2014 at 1:58 PM, Danyang Su wrote: > Hi All, > > I have a 1D transient flow problem (1 dof) coupled with energy balance (1 > dof), so the total dof per node is 2. > > The whole domain has 10 nodes in z direction. > > The program runs well with 1 processor but failed in 2 processors. The > matrix is the same for 1 processor and 2 processor but the rhs are > different. > > The following is used to set the rhs value. 
> > call VecGetArrayF90(x_vec_loc, vecpointer, ierr) > vecpointer = (calculate the rhs value here) > call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) > call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > > > *Vecview Correct * *Vecview Wrong* > dof local node Process [0] *Process > [0] * * Process [0] * > 1 1 1.395982780116148E-021 > 1.39598e-021 1.39598e-021 > 1 2 0.000000000000000E+000 > 0 0 > 1 3 0.000000000000000E+000 > 0 0 > 1 4 5.642372883946980E-037 > 5.64237e-037 5.64237e-037 > 1 5 0.000000000000000E+000 > 0 0 > 1 6 -1.395982780116148E-021 -7.52316e-037 > -1.39598e-021 Line A > 2 1 0.000000000000000E+000 > 7.52316e-037 0 > 2 2 0.000000000000000E+000 0 > 0 > 2 3 0.000000000000000E+000 > 1.68459e-016 0 > 2 4 4.814824860968090E-035 0.1296 > 4.81482e-035 > 2 5 0.000000000000000E+000 > *Process [1]* Line B > 2 6 -1.371273884908092E-019 0 > 7.52316e-037 Line C > > 0 0 > Process [1] > 0 1.68459e-016 > 1 1 1.395982780116148E-021 > 4.81482e-035 0.1296 > Line D > 1 2 -7.523163845262640E-037 0 > 1.37127e-019 Line E > 1 3 7.523163845262640E-037 > -7.22224e-035 -7.22224e-035 > 1 4 0.000000000000000E+000 > 7.22224e-035 7.22224e-035 > 1 5 1.684590875336239E-016 > 0 0 > 1 6 0.129600000000000 > 128623 128623 > 2 1 1.371273884908092E-019 > 0 > 0 Line F > 2 2 -7.222237291452134E-035 > 2 3 7.222237291452134E-035 > 2 4 0.000000000000000E+000 > 2 5 128623.169844761 > 2 6 0.000000000000000E+000 > > The red line (Line A, C, D and F) is the ghost values for 2 subdomains, > but when run with 2 processor, the program treates Line B, C, D, and E as > ghost values. > *How can I handle this kind of local vector to global vector assembly?* > Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. Matt > > *In fact, the codes can work if the dof and local node is as follows.* > dof local node > 1 1 > 2 1 > 1 2 > 2 2 > 1 3 > 2 3 > > Thanks and regards, > > Danyang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 22 15:22:30 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 14:22:30 -0600 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <537E47C0.5010205@epcc.ed.ac.uk> References: <537E2D0C.5050606@epcc.ed.ac.uk> <87vbsxhf1q.fsf@jedbrown.org> <537E47C0.5010205@epcc.ed.ac.uk> Message-ID: <87ha4hh8c9.fsf@jedbrown.org> Fiona Reid writes: > Apologies everyone, I have another somewhat related question. > > If I actually want to use a variable time step with RosW is there any > way to output the results at regular 0.05 seconds intervals? I realise > this will interpolate between two points but it would be good to be able > to plot all my data for the same time values. > > Using "-ts_adapt_type none" doesn't quite give me a "good enough" solution. I recommend writing a monitor (TSMonitorSet) that checks whether an "interesting" time has been passed on the step that just completed, then use TSInterpolate() to obtain a solution at that "interesting" time. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From danyang.su at gmail.com Thu May 22 16:44:47 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 22 May 2014 14:44:47 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: References: <537E48C2.3060105@gmail.com> Message-ID: <537E6FCF.4030408@gmail.com> On 22/05/2014 12:01 PM, Matthew Knepley wrote: > On Thu, May 22, 2014 at 1:58 PM, Danyang Su > wrote: > > Hi All, > > I have a 1D transient flow problem (1 dof) coupled with energy > balance (1 dof), so the total dof per node is 2. > > The whole domain has 10 nodes in z direction. > > The program runs well with 1 processor but failed in 2 processors. > The matrix is the same for 1 processor and 2 processor but the rhs > are different. > > The following is used to set the rhs value. > > call VecGetArrayF90(x_vec_loc, vecpointer, ierr) > vecpointer = (calculate the rhs value here) > call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) > call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > > *Vecview Correct * *Vecview Wrong* > dof local node Process [0] _Process [0] _ /_Process > [0] _/ > 1 1 1.395982780116148E-021 1.39598e-021 > 1.39598e-021 > 1 2 0.000000000000000E+000 > 0 0 > 1 3 0.000000000000000E+000 0 > 0 > 1 4 5.642372883946980E-037 5.64237e-037 > 5.64237e-037 > 1 5 0.000000000000000E+000 > 0 0 > 1 6 -1.395982780116148E-021 -7.52316e-037 > -1.39598e-021 Line A > 2 1 0.000000000000000E+000 7.52316e-037 0 > 2 2 0.000000000000000E+000 0 0 > 2 3 0.000000000000000E+000 1.68459e-016 0 > 2 4 4.814824860968090E-035 0.1296 4.81482e-035 > 2 5 0.000000000000000E+000 _/Process [1]/_ Line B > 2 6 -1.371273884908092E-019 0 > 7.52316e-037 Line C > 0 0 > Process [1] 0 1.68459e-016 > 1 1 1.395982780116148E-021 4.81482e-035 > 0.1296 Line D > 1 2 -7.523163845262640E-037 0 > 1.37127e-019 Line E > 1 3 7.523163845262640E-037 -7.22224e-035 > -7.22224e-035 > 1 4 0.000000000000000E+000 7.22224e-035 > 7.22224e-035 > 1 5 1.684590875336239E-016 > 0 0 > 1 6 0.129600000000000 > 128623 128623 > 2 1 1.371273884908092E-019 0 > 0 Line F > 2 2 -7.222237291452134E-035 > 2 3 7.222237291452134E-035 > 2 4 0.000000000000000E+000 > 2 5 128623.169844761 > 2 6 0.000000000000000E+000 > > The red line (Line A, C, D and F) is the ghost values for 2 > subdomains, but when run with 2 processor, the program treates > Line B, C, D, and E as ghost values. > *How can I handle this kind of local vector to global vector > assembly?* > > > Why are you not using DMDAVecGetArrayF90()? This is exactly what it is > for. Thanks, Matthew. I tried the following codes but still cannot get the correct global rhs vector call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) do i = 1,nvz !nvz is local node amount, here is 6 vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) vecpointer1d(1,i-1) = x_array_loc(i+nvz) end do call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) Now the rhs for 1 processor is as follows. It is not what I want. 
1.39598e-021 0 -0 -0 -0 -0 5.64237e-037 4.81482e-035 -0 -0 -7.52316e-037 -7.22224e-035 7.52316e-037 7.22224e-035 -0 -0 1.68459e-016 128623 0.1296 0 > > Matt > > > *In fact, the codes can work if the dof and local node is as follows.* > dof local node > 1 1 > 2 1 > 1 2 > 2 2 > 1 3 > 2 3 > > Thanks and regards, > > Danyang > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From likunt at caltech.edu Thu May 22 18:01:06 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Thu, 22 May 2014 16:01:06 -0700 (PDT) Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: <58214.131.215.248.200.1400799666.squirrel@webmail.caltech.edu> Thanks for your suggestion. I've successfully read some binary files from petsc examples using PetscBinaryRead, but I still have problem when reading the binary file from my code. The issue is that only the first 8 elements are read correctly and the rest are either extremely large or small numbers. I am using the following commands for data output: PetscViewerBinaryOpen(PETSC_COMM_WORLD, 'result.txt', FILE_MODE_WRITE, &view); VecView(field.M, view); PetscViewerDestroy(&view); Would you please give comments on the possible reason for this? Thank you. > On Thu, May 22, 2014 at 1:55 PM, Likun Tan wrote: > >> Hi Matt, >> I am not using PetscBinaryRead. I wrote a binary file from Petsc and use >> Matlab's function to read it, I.e. >> fileID = fopen('result.bin', 'w'); >> data = fread(fileID, 'double'); >> But this gives me unreasonable values of data. I checked this example http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex54f.F.html which is exactly what I need for my problem. Do you have a C version of it >> ? Many thanks. > > Why would you rewrite this? > > > https://bitbucket.org/petsc/petsc/src/2c43c009db31f079231059c9efed501d4deca8bf/bin/matlab/PetscBinaryRead.m?at=master > > Matt > > >> On May 22, 2014, at 12:26 PM, Matthew Knepley wrote: >> On Thu, May 22, 2014 at 11:20 AM, Likun Tan wrote: >>> I am using VecView to output the vec in a binary file and tried to open >>> it in Matlab. I define the precision to be double, but Matlab does not give >>> reasonable values of my vec (almost extremely large or small or NaN values). Here is my code >> Are you using PetscBinaryRead.m in Matlab? If so, send the code snippet for a small vector, all the output, and the binary file. >> Matt >>> ======================================== >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); >>> for(step=0; step>> { >>> //compute M at current step >>> VecView(M, view); >>> } >>> PetscViewerDestroy(&view); >>> ======================================= >>> I am not sure if there is any problem of my Petsc code. Your comment is >>> well appreciated. >>> > On May 22, 2014, at 11:07 AM, Jed Brown wrote: >>> > >>> > Likun Tan writes: >>> > >>> >> Thanks for your suggestion. >>> >> Using VecView or PetscViewerBinaryWrite will print the vec >>> vertically, >>> i.e. >>> >> m1 >>> >> m2 >>> >> m3 >>> >> m4 >>> >> m5 >>> >> m6 >>> > >>> > The binary viewer writes a *binary* file. No formatting or line >>> breaks. 
>>> > >>> >> But I prefer the form >>> >> >>> >> m1 m2 m3 >>> >> m4 m5 m6 >>> >> >>> >> Since in the end I will have about 1e+7 elements in the vec. If >>> there >>> is no way to output the vec in the second form, I will simply use VecView. >>> Thanks. >>> > >>> > Use VecView to write a binary (not ASCII) file. See >>> > PetscViewerBinaryOpen(). You can look at it with python, >>> matlab/octave, >>> > etc. >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > From bsmith at mcs.anl.gov Thu May 22 18:03:01 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 May 2014 18:03:01 -0500 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E6FCF.4030408@gmail.com> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> Message-ID: Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on x_vec_loc only on the first process. Is it what you expect? Also what is vecpointer1d declared to be? Barry On May 22, 2014, at 4:44 PM, Danyang Su wrote: > On 22/05/2014 12:01 PM, Matthew Knepley wrote: >> On Thu, May 22, 2014 at 1:58 PM, Danyang Su wrote: >> Hi All, >> >> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >> >> The whole domain has 10 nodes in z direction. >> >> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >> >> The following is used to set the rhs value. 
>> >> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >> vecpointer = (calculate the rhs value here) >> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >> >> Vecview Correct Vecview Wrong >> dof local node Process [0] Process [0] Process [0] >> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >> 1 2 0.000000000000000E+000 0 0 >> 1 3 0.000000000000000E+000 0 0 >> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >> 1 5 0.000000000000000E+000 0 0 >> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >> 2 1 0.000000000000000E+000 7.52316e-037 0 >> 2 2 0.000000000000000E+000 0 0 >> 2 3 0.000000000000000E+000 1.68459e-016 0 >> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >> 2 5 0.000000000000000E+000 Process [1] Line B >> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >> 0 0 >> Process [1] 0 1.68459e-016 >> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >> 1 5 1.684590875336239E-016 0 0 >> 1 6 0.129600000000000 128623 128623 >> 2 1 1.371273884908092E-019 0 0 Line F >> 2 2 -7.222237291452134E-035 >> 2 3 7.222237291452134E-035 >> 2 4 0.000000000000000E+000 >> 2 5 128623.169844761 >> 2 6 0.000000000000000E+000 >> >> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >> How can I handle this kind of local vector to global vector assembly? >> >> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. > Thanks, Matthew. > > I tried the following codes but still cannot get the correct global rhs vector > > call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) > do i = 1,nvz !nvz is local node amount, here is 6 > vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) > vecpointer1d(1,i-1) = x_array_loc(i+nvz) > end do > call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) > call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > > > Now the rhs for 1 processor is as follows. It is not what I want. > > 1.39598e-021 > 0 > -0 > -0 > -0 > -0 > 5.64237e-037 > 4.81482e-035 > -0 > -0 > -7.52316e-037 > -7.22224e-035 > 7.52316e-037 > 7.22224e-035 > -0 > -0 > 1.68459e-016 > 128623 > 0.1296 > 0 >> >> Matt >> >> >> In fact, the codes can work if the dof and local node is as follows. >> dof local node >> 1 1 >> 2 1 >> 1 2 >> 2 2 >> 1 3 >> 2 3 >> >> Thanks and regards, >> >> Danyang >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener > From jed at jedbrown.org Thu May 22 18:04:31 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 17:04:31 -0600 Subject: [petsc-users] output vec In-Reply-To: <58214.131.215.248.200.1400799666.squirrel@webmail.caltech.edu> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> <58214.131.215.248.200.1400799666.squirrel@webmail.caltech.edu> Message-ID: <87ppj5fm9s.fsf@jedbrown.org> likunt at caltech.edu writes: > Thanks for your suggestion. > I've successfully read some binary files from petsc examples using > PetscBinaryRead, but I still have problem when reading the binary file > from my code. The issue is that only the first 8 elements are read > correctly and the rest are either extremely large or small numbers. I am > using the following commands for data output: It sounds like your code for reading the file is incorrect. You can look at the implementation (in C, Matlab, or Python), or you can just call VecLoad(). > PetscViewerBinaryOpen(PETSC_COMM_WORLD, 'result.txt', FILE_MODE_WRITE, > &view); > VecView(field.M, view); > PetscViewerDestroy(&view); -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From danyang.su at gmail.com Thu May 22 18:33:01 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 22 May 2014 16:33:01 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> Message-ID: <537E892D.9030808@gmail.com> Hi Barry, I use the following routine to reorder from the local rhs to global rhs. PetscScalar, pointer :: vecpointer1d(:,:) call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector do i = nvzls,nvzle !local node number without ghost node vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example end do call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. x_vec_gbl x_vec_gbl dof node VecView(Current) dof node VecView (Expected) 1 1 1.39598e-021 1 1 1.39598e-021 2 1 0 1 2 0 1 2 -0 1 3 0 2 2 -0 1 4 5.64237e-037 1 3 -0 1 5 0 2 3 -0 1 6 -7.52316e-037 1 4 5.64237e-037 1 7 7.52316e-037 2 4 4.81482e-035 1 8 0 1 5 -0 1 9 1.68459e-016 2 5 -0 1 10 0.1296 1 6 -7.52316e-037 2 1 0 2 6 -7.22224e-035 2 2 0 1 7 7.52316e-037 2 3 0 2 7 7.22224e-035 2 4 4.81482e-035 1 8 -0 2 5 0 2 8 -0 2 6 -7.22224e-035 1 9 1.68459e-016 2 7 7.22224e-035 2 9 128623 2 8 0 1 10 0.1296 2 9 128623 2 10 0 2 10 0 Thanks and regards, Danyang On 22/05/2014 4:03 PM, Barry Smith wrote: > Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on > x_vec_loc only on the first process. Is it what you expect? > > Also what is vecpointer1d declared to be? 
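For reference, a minimal self-contained C sketch of the VecLoad() route suggested above. This is not from the original thread: the file name "result.txt" is taken from the quoted code, while the variable names and the final VecView() to stdout are made up for illustration.

#include <petscvec.h>

int main(int argc,char **argv)
{
  Vec            u;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* open the file that VecView() wrote, this time for reading */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"result.txt",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
  ierr = VecCreate(PETSC_COMM_WORLD,&u);CHKERRQ(ierr);
  ierr = VecLoad(u,viewer);CHKERRQ(ierr);        /* global size and values are read from the file */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = VecView(u,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = VecDestroy(&u);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

If several vectors were written to the same viewer inside a loop, as in the code quoted earlier, repeated VecLoad() calls on the read viewer should return them in the order they were written.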
> > > Barry > > On May 22, 2014, at 4:44 PM, Danyang Su wrote: > >> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su wrote: >>> Hi All, >>> >>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>> >>> The whole domain has 10 nodes in z direction. >>> >>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>> >>> The following is used to set the rhs value. >>> >>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>> vecpointer = (calculate the rhs value here) >>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>> >>> Vecview Correct Vecview Wrong >>> dof local node Process [0] Process [0] Process [0] >>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>> 1 2 0.000000000000000E+000 0 0 >>> 1 3 0.000000000000000E+000 0 0 >>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>> 1 5 0.000000000000000E+000 0 0 >>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>> 2 2 0.000000000000000E+000 0 0 >>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>> 2 5 0.000000000000000E+000 Process [1] Line B >>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>> 0 0 >>> Process [1] 0 1.68459e-016 >>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>> 1 5 1.684590875336239E-016 0 0 >>> 1 6 0.129600000000000 128623 128623 >>> 2 1 1.371273884908092E-019 0 0 Line F >>> 2 2 -7.222237291452134E-035 >>> 2 3 7.222237291452134E-035 >>> 2 4 0.000000000000000E+000 >>> 2 5 128623.169844761 >>> 2 6 0.000000000000000E+000 >>> >>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >>> How can I handle this kind of local vector to global vector assembly? >>> >>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >> Thanks, Matthew. >> >> I tried the following codes but still cannot get the correct global rhs vector >> >> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >> do i = 1,nvz !nvz is local node amount, here is 6 >> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >> end do >> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >> >> >> Now the rhs for 1 processor is as follows. It is not what I want. >> >> 1.39598e-021 >> 0 >> -0 >> -0 >> -0 >> -0 >> 5.64237e-037 >> 4.81482e-035 >> -0 >> -0 >> -7.52316e-037 >> -7.22224e-035 >> 7.52316e-037 >> 7.22224e-035 >> -0 >> -0 >> 1.68459e-016 >> 128623 >> 0.1296 >> 0 >>> Matt >>> >>> >>> In fact, the codes can work if the dof and local node is as follows. 
>>> dof local node >>> 1 1 >>> 2 1 >>> 1 2 >>> 2 2 >>> 1 3 >>> 2 3 >>> >>> Thanks and regards, >>> >>> Danyang >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 22 19:34:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 May 2014 19:34:52 -0500 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E892D.9030808@gmail.com> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> Message-ID: <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. Barry On May 22, 2014, at 6:33 PM, Danyang Su wrote: > Hi Barry, > > I use the following routine to reorder from the local rhs to global rhs. > > PetscScalar, pointer :: vecpointer1d(:,:) > > call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector > do i = nvzls,nvzle !local node number without ghost node > vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs > vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example > end do > call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) > > Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. > > x_vec_gbl x_vec_gbl > dof node VecView(Current) dof node VecView (Expected) > 1 1 1.39598e-021 1 1 1.39598e-021 > 2 1 0 1 2 0 > 1 2 -0 1 3 0 > 2 2 -0 1 4 5.64237e-037 > 1 3 -0 1 5 0 > 2 3 -0 1 6 -7.52316e-037 > 1 4 5.64237e-037 1 7 7.52316e-037 > 2 4 4.81482e-035 1 8 0 > 1 5 -0 1 9 1.68459e-016 > 2 5 -0 1 10 0.1296 > > 1 6 -7.52316e-037 2 1 0 > 2 6 -7.22224e-035 2 2 0 > 1 7 7.52316e-037 2 3 0 > 2 7 7.22224e-035 2 4 4.81482e-035 > 1 8 -0 2 5 0 > 2 8 -0 2 6 -7.22224e-035 > 1 9 1.68459e-016 2 7 7.22224e-035 > 2 9 128623 2 8 0 > 1 10 0.1296 2 9 128623 > 2 10 0 2 10 0 > > Thanks and regards, > > Danyang > > > On 22/05/2014 4:03 PM, Barry Smith wrote: >> Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on >> x_vec_loc only on the first process. Is it what you expect? >> >> Also what is vecpointer1d declared to be? >> >> >> Barry >> >> On May 22, 2014, at 4:44 PM, Danyang Su >> >> wrote: >> >> >>> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>> >>>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su >>>> wrote: >>>> Hi All, >>>> >>>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>>> >>>> The whole domain has 10 nodes in z direction. >>>> >>>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>>> >>>> The following is used to set the rhs value. 
>>>> >>>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>>> vecpointer = (calculate the rhs value here) >>>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>> >>>> Vecview Correct Vecview Wrong >>>> dof local node Process [0] Process [0] Process [0] >>>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>>> 1 2 0.000000000000000E+000 0 0 >>>> 1 3 0.000000000000000E+000 0 0 >>>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>>> 1 5 0.000000000000000E+000 0 0 >>>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>>> 2 2 0.000000000000000E+000 0 0 >>>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>>> 2 5 0.000000000000000E+000 Process [1] Line B >>>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>>> 0 0 >>>> Process [1] 0 1.68459e-016 >>>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>>> 1 5 1.684590875336239E-016 0 0 >>>> 1 6 0.129600000000000 128623 128623 >>>> 2 1 1.371273884908092E-019 0 0 Line F >>>> 2 2 -7.222237291452134E-035 >>>> 2 3 7.222237291452134E-035 >>>> 2 4 0.000000000000000E+000 >>>> 2 5 128623.169844761 >>>> 2 6 0.000000000000000E+000 >>>> >>>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >>>> How can I handle this kind of local vector to global vector assembly? >>>> >>>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >>>> >>> Thanks, Matthew. >>> >>> I tried the following codes but still cannot get the correct global rhs vector >>> >>> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>> do i = 1,nvz !nvz is local node amount, here is 6 >>> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >>> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >>> end do >>> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>> >>> >>> Now the rhs for 1 processor is as follows. It is not what I want. >>> >>> 1.39598e-021 >>> 0 >>> -0 >>> -0 >>> -0 >>> -0 >>> 5.64237e-037 >>> 4.81482e-035 >>> -0 >>> -0 >>> -7.52316e-037 >>> -7.22224e-035 >>> 7.52316e-037 >>> 7.22224e-035 >>> -0 >>> -0 >>> 1.68459e-016 >>> 128623 >>> 0.1296 >>> 0 >>> >>>> Matt >>>> >>>> >>>> In fact, the codes can work if the dof and local node is as follows. >>>> dof local node >>>> 1 1 >>>> 2 1 >>>> 1 2 >>>> 2 2 >>>> 1 3 >>>> 2 3 >>>> >>>> Thanks and regards, >>>> >>>> Danyang >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> > From danyang.su at gmail.com Thu May 22 19:42:53 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 22 May 2014 17:42:53 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> Message-ID: <537E998D.8040101@gmail.com> On 22/05/2014 5:34 PM, Barry Smith wrote: > DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. Then, is there any routine convert matrix to be "interlaced"? Thanks, Danyang > > Barry > > On May 22, 2014, at 6:33 PM, Danyang Su wrote: > >> Hi Barry, >> >> I use the following routine to reorder from the local rhs to global rhs. >> >> PetscScalar, pointer :: vecpointer1d(:,:) >> >> call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector >> do i = nvzls,nvzle !local node number without ghost node >> vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs >> vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example >> end do >> call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) >> >> Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. >> >> x_vec_gbl x_vec_gbl >> dof node VecView(Current) dof node VecView (Expected) >> 1 1 1.39598e-021 1 1 1.39598e-021 >> 2 1 0 1 2 0 >> 1 2 -0 1 3 0 >> 2 2 -0 1 4 5.64237e-037 >> 1 3 -0 1 5 0 >> 2 3 -0 1 6 -7.52316e-037 >> 1 4 5.64237e-037 1 7 7.52316e-037 >> 2 4 4.81482e-035 1 8 0 >> 1 5 -0 1 9 1.68459e-016 >> 2 5 -0 1 10 0.1296 >> >> 1 6 -7.52316e-037 2 1 0 >> 2 6 -7.22224e-035 2 2 0 >> 1 7 7.52316e-037 2 3 0 >> 2 7 7.22224e-035 2 4 4.81482e-035 >> 1 8 -0 2 5 0 >> 2 8 -0 2 6 -7.22224e-035 >> 1 9 1.68459e-016 2 7 7.22224e-035 >> 2 9 128623 2 8 0 >> 1 10 0.1296 2 9 128623 >> 2 10 0 2 10 0 >> >> Thanks and regards, >> >> Danyang >> >> >> On 22/05/2014 4:03 PM, Barry Smith wrote: >>> Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on >>> x_vec_loc only on the first process. Is it what you expect? >>> >>> Also what is vecpointer1d declared to be? >>> >>> >>> Barry >>> >>> On May 22, 2014, at 4:44 PM, Danyang Su >>> >>> wrote: >>> >>> >>>> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>>> >>>>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su >>>>> wrote: >>>>> Hi All, >>>>> >>>>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>>>> >>>>> The whole domain has 10 nodes in z direction. >>>>> >>>>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>>>> >>>>> The following is used to set the rhs value. 
>>>>> >>>>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>>>> vecpointer = (calculate the rhs value here) >>>>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>> >>>>> Vecview Correct Vecview Wrong >>>>> dof local node Process [0] Process [0] Process [0] >>>>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>>>> 1 2 0.000000000000000E+000 0 0 >>>>> 1 3 0.000000000000000E+000 0 0 >>>>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>>>> 1 5 0.000000000000000E+000 0 0 >>>>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>>>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>>>> 2 2 0.000000000000000E+000 0 0 >>>>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>>>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>>>> 2 5 0.000000000000000E+000 Process [1] Line B >>>>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>>>> 0 0 >>>>> Process [1] 0 1.68459e-016 >>>>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>>>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>>>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>>>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>>>> 1 5 1.684590875336239E-016 0 0 >>>>> 1 6 0.129600000000000 128623 128623 >>>>> 2 1 1.371273884908092E-019 0 0 Line F >>>>> 2 2 -7.222237291452134E-035 >>>>> 2 3 7.222237291452134E-035 >>>>> 2 4 0.000000000000000E+000 >>>>> 2 5 128623.169844761 >>>>> 2 6 0.000000000000000E+000 >>>>> >>>>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >>>>> How can I handle this kind of local vector to global vector assembly? >>>>> >>>>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >>>>> >>>> Thanks, Matthew. >>>> >>>> I tried the following codes but still cannot get the correct global rhs vector >>>> >>>> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>> do i = 1,nvz !nvz is local node amount, here is 6 >>>> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >>>> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >>>> end do >>>> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>> >>>> >>>> Now the rhs for 1 processor is as follows. It is not what I want. >>>> >>>> 1.39598e-021 >>>> 0 >>>> -0 >>>> -0 >>>> -0 >>>> -0 >>>> 5.64237e-037 >>>> 4.81482e-035 >>>> -0 >>>> -0 >>>> -7.52316e-037 >>>> -7.22224e-035 >>>> 7.52316e-037 >>>> 7.22224e-035 >>>> -0 >>>> -0 >>>> 1.68459e-016 >>>> 128623 >>>> 0.1296 >>>> 0 >>>> >>>>> Matt >>>>> >>>>> >>>>> In fact, the codes can work if the dof and local node is as follows. >>>>> dof local node >>>>> 1 1 >>>>> 2 1 >>>>> 1 2 >>>>> 2 2 >>>>> 1 3 >>>>> 2 3 >>>>> >>>>> Thanks and regards, >>>>> >>>>> Danyang >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>>> -- Norbert Wiener >>>>> From jed at jedbrown.org Thu May 22 19:51:12 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 18:51:12 -0600 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E998D.8040101@gmail.com> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> <537E998D.8040101@gmail.com> Message-ID: <87egzlfhbz.fsf@jedbrown.org> Danyang Su writes: > On 22/05/2014 5:34 PM, Barry Smith wrote: >> DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. > Then, is there any routine convert matrix to be "interlaced"? I don't know what you mean. DMCreateMatrix() will give you a Mat preallocated for interlaced and that's how you should assemble it (e.g., with MatSetValuesBlockedStencil()). There are several examples that use this interface. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Thu May 22 19:51:38 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 May 2014 19:51:38 -0500 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E998D.8040101@gmail.com> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> <537E998D.8040101@gmail.com> Message-ID: VecStrideScatter,Gather can be used to go from an interlaced vector to a separate vector for each component. You can also write code as below where you have a non-interlated ARRAY and you put/take the values into the array obtained with the DMDAVecGetArrayF90. In other words PETSc vectors remained interlaced but you work with other arrays that are not interlaced. Barry On May 22, 2014, at 7:42 PM, Danyang Su wrote: > On 22/05/2014 5:34 PM, Barry Smith wrote: >> DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. > Then, is there any routine convert matrix to be "interlaced"? > > Thanks, > > Danyang >> >> Barry >> >> On May 22, 2014, at 6:33 PM, Danyang Su wrote: >> >>> Hi Barry, >>> >>> I use the following routine to reorder from the local rhs to global rhs. >>> >>> PetscScalar, pointer :: vecpointer1d(:,:) >>> >>> call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector >>> do i = nvzls,nvzle !local node number without ghost node >>> vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs >>> vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example >>> end do >>> call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) >>> >>> Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. 
>>> >>> x_vec_gbl x_vec_gbl >>> dof node VecView(Current) dof node VecView (Expected) >>> 1 1 1.39598e-021 1 1 1.39598e-021 >>> 2 1 0 1 2 0 >>> 1 2 -0 1 3 0 >>> 2 2 -0 1 4 5.64237e-037 >>> 1 3 -0 1 5 0 >>> 2 3 -0 1 6 -7.52316e-037 >>> 1 4 5.64237e-037 1 7 7.52316e-037 >>> 2 4 4.81482e-035 1 8 0 >>> 1 5 -0 1 9 1.68459e-016 >>> 2 5 -0 1 10 0.1296 >>> 1 6 -7.52316e-037 2 1 0 >>> 2 6 -7.22224e-035 2 2 0 >>> 1 7 7.52316e-037 2 3 0 >>> 2 7 7.22224e-035 2 4 4.81482e-035 >>> 1 8 -0 2 5 0 >>> 2 8 -0 2 6 -7.22224e-035 >>> 1 9 1.68459e-016 2 7 7.22224e-035 >>> 2 9 128623 2 8 0 >>> 1 10 0.1296 2 9 128623 >>> 2 10 0 2 10 0 >>> >>> Thanks and regards, >>> >>> Danyang >>> >>> >>> On 22/05/2014 4:03 PM, Barry Smith wrote: >>>> Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on >>>> x_vec_loc only on the first process. Is it what you expect? >>>> >>>> Also what is vecpointer1d declared to be? >>>> >>>> >>>> Barry >>>> >>>> On May 22, 2014, at 4:44 PM, Danyang Su >>>> >>>> wrote: >>>> >>>> >>>>> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>>>> >>>>>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su >>>>>> wrote: >>>>>> Hi All, >>>>>> >>>>>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>>>>> >>>>>> The whole domain has 10 nodes in z direction. >>>>>> >>>>>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>>>>> >>>>>> The following is used to set the rhs value. >>>>>> >>>>>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>>>>> vecpointer = (calculate the rhs value here) >>>>>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>> >>>>>> Vecview Correct Vecview Wrong >>>>>> dof local node Process [0] Process [0] Process [0] >>>>>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>>>>> 1 2 0.000000000000000E+000 0 0 >>>>>> 1 3 0.000000000000000E+000 0 0 >>>>>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>>>>> 1 5 0.000000000000000E+000 0 0 >>>>>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>>>>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>>>>> 2 2 0.000000000000000E+000 0 0 >>>>>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>>>>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>>>>> 2 5 0.000000000000000E+000 Process [1] Line B >>>>>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>>>>> 0 0 >>>>>> Process [1] 0 1.68459e-016 >>>>>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>>>>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>>>>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>>>>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>>>>> 1 5 1.684590875336239E-016 0 0 >>>>>> 1 6 0.129600000000000 128623 128623 >>>>>> 2 1 1.371273884908092E-019 0 0 Line F >>>>>> 2 2 -7.222237291452134E-035 >>>>>> 2 3 7.222237291452134E-035 >>>>>> 2 4 0.000000000000000E+000 >>>>>> 2 5 128623.169844761 >>>>>> 2 6 0.000000000000000E+000 >>>>>> >>>>>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >>>>>> How can I handle this kind of local vector to global vector assembly? 
>>>>>> >>>>>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >>>>>> >>>>> Thanks, Matthew. >>>>> >>>>> I tried the following codes but still cannot get the correct global rhs vector >>>>> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>>> do i = 1,nvz !nvz is local node amount, here is 6 >>>>> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >>>>> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >>>>> end do >>>>> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>> >>>>> >>>>> Now the rhs for 1 processor is as follows. It is not what I want. >>>>> >>>>> 1.39598e-021 >>>>> 0 >>>>> -0 >>>>> -0 >>>>> -0 >>>>> -0 >>>>> 5.64237e-037 >>>>> 4.81482e-035 >>>>> -0 >>>>> -0 >>>>> -7.52316e-037 >>>>> -7.22224e-035 >>>>> 7.52316e-037 >>>>> 7.22224e-035 >>>>> -0 >>>>> -0 >>>>> 1.68459e-016 >>>>> 128623 >>>>> 0.1296 >>>>> 0 >>>>> >>>>>> Matt >>>>>> >>>>>> In fact, the codes can work if the dof and local node is as follows. >>>>>> dof local node >>>>>> 1 1 >>>>>> 2 1 >>>>>> 1 2 >>>>>> 2 2 >>>>>> 1 3 >>>>>> 2 3 >>>>>> >>>>>> Thanks and regards, >>>>>> >>>>>> Danyang >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> > From avi.mosher at wyss.harvard.edu Thu May 22 23:56:08 2014 From: avi.mosher at wyss.harvard.edu (Robinson-Mosher, Avram Lev) Date: Fri, 23 May 2014 00:56:08 -0400 Subject: [petsc-users] Is it possible to create a BAIJ matrix with non-square blocks? Message-ID: Hi all, I'm interested in using PETSc's sparse matrices with block elements, but I would like the elements to be small non-square matrices (e.g., 4 by 3). Is this possible? I see that the general construction functions assume that the elements will be square. Regards, -Avi From jed at jedbrown.org Fri May 23 00:14:32 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 23:14:32 -0600 Subject: [petsc-users] Is it possible to create a BAIJ matrix with non-square blocks? In-Reply-To: References: Message-ID: <8738g1dqkn.fsf@jedbrown.org> "Robinson-Mosher, Avram Lev" writes: > Hi all, I'm interested in using PETSc's sparse matrices with block > elements, but I would like the elements to be small non-square > matrices (e.g., 4 by 3). Is this possible? I see that the general > construction functions assume that the elements will be square. The constant block-size matrices are only for square blocks. But you can often get some benefits by using AIJ matrices with Inodes (default). So just create an AIJ matrix with fields interlaced so that 4x3 blocks exist, then PETSc will coalesce the consecutive rows with identical sparsity pattern, making the result similar to 4x1 blocks. This already provides most of the bandwidth benefit of blocked matrices. -------------- next part -------------- A non-text attachment was scrubbed... 
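To make the interlacing suggestion concrete, here is a rough self-contained C sketch (not from the original mail). The 4 by 3 block size matches the question, but the matrix dimensions, preallocation counts, and numerical values are placeholders. Rows are grouped in fours and columns in threes, so one 4 by 3 element block is inserted with a single MatSetValues() call, and rows 4*i..4*i+3 end up with identical sparsity patterns, which is what lets the Inode code coalesce them.

#include <petscmat.h>

int main(int argc,char **argv)
{
  Mat            A;
  PetscInt       mb = 8,nb = 8;                  /* number of 4-row and 3-column groups (placeholders) */
  PetscInt       i = 2,j = 5,k,idxm[4],idxn[3];
  PetscScalar    vals[12] = {1,2,3,4,5,6,7,8,9,10,11,12};  /* one dense 4x3 block, row-major */
  PetscMPIInt    rank;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
  /* AIJ matrix with 4*mb rows and 3*nb columns; preallocation numbers are placeholders */
  ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,4*mb,3*nb,
                      12,NULL,12,NULL,&A);CHKERRQ(ierr);
  if (!rank) {
    /* global rows 4*i..4*i+3 and columns 3*j..3*j+2 form the (i,j) 4x3 "block" */
    for (k=0; k<4; k++) idxm[k] = 4*i+k;
    for (k=0; k<3; k++) idxn[k] = 3*j+k;
    ierr = MatSetValues(A,4,idxm,3,idxn,vals,ADD_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

Viewing the matrix information (for example with MatView() in the info format) reports whether the I-node routines are being used; the option -mat_no_inode turns them off if one wants to compare.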
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From fiona at epcc.ed.ac.uk Fri May 23 08:14:04 2014 From: fiona at epcc.ed.ac.uk (Fiona Reid) Date: Fri, 23 May 2014 14:14:04 +0100 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <87ha4hh8c9.fsf@jedbrown.org> References: <537E2D0C.5050606@epcc.ed.ac.uk> <87vbsxhf1q.fsf@jedbrown.org> <537E47C0.5010205@epcc.ed.ac.uk> <87ha4hh8c9.fsf@jedbrown.org> Message-ID: <537F499C.2000803@epcc.ed.ac.uk> On 22/05/2014 21:22, Jed Brown wrote: > I recommend writing a monitor (TSMonitorSet) that checks whether an > "interesting" time has been passed on the step that just completed, then > use TSInterpolate() to obtain a solution at that "interesting" time. Many thanks Jed. I have that all working now. Cheers, Fiona -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From info at jubileedvds.com Fri May 23 10:01:46 2014 From: info at jubileedvds.com (Jubilee DVDs) Date: Fri, 23 May 2014 17:01:46 +0200 (SAST) Subject: [petsc-users] Jubilee DVDs Newsletter Message-ID: <1195896-1400857192269-133838-250313049-1-0@b.ss40.shsend.com> An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Fri May 23 13:16:04 2014 From: danyang.su at gmail.com (Danyang Su) Date: Fri, 23 May 2014 11:16:04 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> <537E998D.8040101@gmail.com> Message-ID: <537F9064.90202@gmail.com> Hi All, Thanks for all your kindly reply. I convert the my codes from non-interlaced structure to interlaced structured and it can work now. Thanks and regards, Danyang On 22/05/2014 5:51 PM, Barry Smith wrote: > VecStrideScatter,Gather can be used to go from an interlaced vector to a separate vector for each component. You can also write code as below where you have a non-interlated ARRAY and you put/take the values into the array obtained with the DMDAVecGetArrayF90. In other words PETSc vectors remained interlaced but you work with other arrays that are not interlaced. > > Barry > > On May 22, 2014, at 7:42 PM, Danyang Su wrote: > >> On 22/05/2014 5:34 PM, Barry Smith wrote: >>> DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. >> Then, is there any routine convert matrix to be "interlaced"? >> >> Thanks, >> >> Danyang >>> Barry >>> >>> On May 22, 2014, at 6:33 PM, Danyang Su wrote: >>> >>>> Hi Barry, >>>> >>>> I use the following routine to reorder from the local rhs to global rhs. >>>> >>>> PetscScalar, pointer :: vecpointer1d(:,:) >>>> >>>> call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector >>>> do i = nvzls,nvzle !local node number without ghost node >>>> vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs >>>> vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example >>>> end do >>>> call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) >>>> >>>> Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. 
>>>> >>>> x_vec_gbl x_vec_gbl >>>> dof node VecView(Current) dof node VecView (Expected) >>>> 1 1 1.39598e-021 1 1 1.39598e-021 >>>> 2 1 0 1 2 0 >>>> 1 2 -0 1 3 0 >>>> 2 2 -0 1 4 5.64237e-037 >>>> 1 3 -0 1 5 0 >>>> 2 3 -0 1 6 -7.52316e-037 >>>> 1 4 5.64237e-037 1 7 7.52316e-037 >>>> 2 4 4.81482e-035 1 8 0 >>>> 1 5 -0 1 9 1.68459e-016 >>>> 2 5 -0 1 10 0.1296 >>>> 1 6 -7.52316e-037 2 1 0 >>>> 2 6 -7.22224e-035 2 2 0 >>>> 1 7 7.52316e-037 2 3 0 >>>> 2 7 7.22224e-035 2 4 4.81482e-035 >>>> 1 8 -0 2 5 0 >>>> 2 8 -0 2 6 -7.22224e-035 >>>> 1 9 1.68459e-016 2 7 7.22224e-035 >>>> 2 9 128623 2 8 0 >>>> 1 10 0.1296 2 9 128623 >>>> 2 10 0 2 10 0 >>>> >>>> Thanks and regards, >>>> >>>> Danyang >>>> >>>> >>>> On 22/05/2014 4:03 PM, Barry Smith wrote: >>>>> Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on >>>>> x_vec_loc only on the first process. Is it what you expect? >>>>> >>>>> Also what is vecpointer1d declared to be? >>>>> >>>>> >>>>> Barry >>>>> >>>>> On May 22, 2014, at 4:44 PM, Danyang Su >>>>> >>>>> wrote: >>>>> >>>>> >>>>>> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>>>>> >>>>>>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su >>>>>>> wrote: >>>>>>> Hi All, >>>>>>> >>>>>>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>>>>>> >>>>>>> The whole domain has 10 nodes in z direction. >>>>>>> >>>>>>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>>>>>> >>>>>>> The following is used to set the rhs value. >>>>>>> >>>>>>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>>>>>> vecpointer = (calculate the rhs value here) >>>>>>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>>>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>>> >>>>>>> Vecview Correct Vecview Wrong >>>>>>> dof local node Process [0] Process [0] Process [0] >>>>>>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>>>>>> 1 2 0.000000000000000E+000 0 0 >>>>>>> 1 3 0.000000000000000E+000 0 0 >>>>>>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>>>>>> 1 5 0.000000000000000E+000 0 0 >>>>>>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>>>>>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>>>>>> 2 2 0.000000000000000E+000 0 0 >>>>>>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>>>>>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>>>>>> 2 5 0.000000000000000E+000 Process [1] Line B >>>>>>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>>>>>> 0 0 >>>>>>> Process [1] 0 1.68459e-016 >>>>>>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>>>>>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>>>>>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>>>>>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>>>>>> 1 5 1.684590875336239E-016 0 0 >>>>>>> 1 6 0.129600000000000 128623 128623 >>>>>>> 2 1 1.371273884908092E-019 0 0 Line F >>>>>>> 2 2 -7.222237291452134E-035 >>>>>>> 2 3 7.222237291452134E-035 >>>>>>> 2 4 0.000000000000000E+000 >>>>>>> 2 5 128623.169844761 >>>>>>> 2 6 0.000000000000000E+000 >>>>>>> >>>>>>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. 
>>>>>>> How can I handle this kind of local vector to global vector assembly? >>>>>>> >>>>>>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >>>>>>> >>>>>> Thanks, Matthew. >>>>>> >>>>>> I tried the following codes but still cannot get the correct global rhs vector >>>>>> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>>>> do i = 1,nvz !nvz is local node amount, here is 6 >>>>>> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >>>>>> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >>>>>> end do >>>>>> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>> >>>>>> >>>>>> Now the rhs for 1 processor is as follows. It is not what I want. >>>>>> >>>>>> 1.39598e-021 >>>>>> 0 >>>>>> -0 >>>>>> -0 >>>>>> -0 >>>>>> -0 >>>>>> 5.64237e-037 >>>>>> 4.81482e-035 >>>>>> -0 >>>>>> -0 >>>>>> -7.52316e-037 >>>>>> -7.22224e-035 >>>>>> 7.52316e-037 >>>>>> 7.22224e-035 >>>>>> -0 >>>>>> -0 >>>>>> 1.68459e-016 >>>>>> 128623 >>>>>> 0.1296 >>>>>> 0 >>>>>> >>>>>>> Matt >>>>>>> >>>>>>> In fact, the codes can work if the dof and local node is as follows. >>>>>>> dof local node >>>>>>> 1 1 >>>>>>> 2 1 >>>>>>> 1 2 >>>>>>> 2 2 >>>>>>> 1 3 >>>>>>> 2 3 >>>>>>> >>>>>>> Thanks and regards, >>>>>>> >>>>>>> Danyang >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> From spk at ldeo.columbia.edu Sat May 24 07:04:07 2014 From: spk at ldeo.columbia.edu (Samar Khatiwala) Date: Sat, 24 May 2014 08:04:07 -0400 Subject: [petsc-users] Installation problems on IBM machine Message-ID: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> Hello, I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: ========================================== Building PETSc using CMake with 21 build threads ========================================== make: 1254-002 Cannot find a rule to create target 21 from dependencies. Stop. make: 1254-004 The error code from the last command is 2. ? Please see attached logs. I configured with: config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no avail. Any help would be appreciated. Thanks very much! Samar -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log.gz Type: application/x-gzip Size: 3162 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log.gz Type: application/x-gzip Size: 140308 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat May 24 08:22:40 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 24 May 2014 08:22:40 -0500 Subject: [petsc-users] Installation problems on IBM machine In-Reply-To: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> References: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> Message-ID: <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> Try using make all-legacy Barry That is some cmake problem; fortunately we are abandoning cmake for the future. The reason the problem persists is likely because cmake has cached something somewhere that doesn?t get rebuilt. On May 24, 2014, at 7:04 AM, Samar Khatiwala wrote: > Hello, > > I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: > > ========================================== > Building PETSc using CMake with 21 build threads > ========================================== > make: 1254-002 Cannot find a rule to create target 21 from dependencies. > Stop. > make: 1254-004 The error code from the last command is 2. > ? > > Please see attached logs. I configured with: > > config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" > > What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with > --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now > get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no > avail. > > Any help would be appreciated. Thanks very much! > > Samar > > > > From s_g at berkeley.edu Sat May 24 11:12:09 2014 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Sat, 24 May 2014 09:12:09 -0700 Subject: [petsc-users] Installation problems on IBM machine In-Reply-To: <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> References: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> Message-ID: <5380C4D9.7060602@berkeley.edu> > fortunately we are abandoning cmake for the future. Barry, Are you going to go back to standard make? or have you selected a new make system? -sanjay On 5/24/14 6:22 AM, Barry Smith wrote: > > Try using make all-legacy > > Barry > > That is some cmake problem; fortunately we are abandoning cmake for the future. The reason the problem persists is likely because cmake has cached something somewhere that doesn?t get rebuilt. > > > On May 24, 2014, at 7:04 AM, Samar Khatiwala wrote: > >> Hello, >> >> I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: >> >> ========================================== >> Building PETSc using CMake with 21 build threads >> ========================================== >> make: 1254-002 Cannot find a rule to create target 21 from dependencies. >> Stop. >> make: 1254-004 The error code from the last command is 2. >> ? >> >> Please see attached logs. 
I configured with: >> >> config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" >> >> What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with >> --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now >> get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no >> avail. >> >> Any help would be appreciated. Thanks very much! >> >> Samar >> >> >> >> From spk at ldeo.columbia.edu Sat May 24 11:13:56 2014 From: spk at ldeo.columbia.edu (Samar Khatiwala) Date: Sat, 24 May 2014 12:13:56 -0400 Subject: [petsc-users] Installation problems on IBM machine In-Reply-To: <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> References: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> Message-ID: <30DC2CFB-6950-469D-B6A4-4B1DB1D40C96@ldeo.columbia.edu> Hi Barry, That solved the problem! Thanks so much for the fast and helpful reply! Best, Samar On May 24, 2014, at 9:22 AM, Barry Smith wrote: > > > Try using make all-legacy > > Barry > > That is some cmake problem; fortunately we are abandoning cmake for the future. The reason the problem persists is likely because cmake has cached something somewhere that doesn?t get rebuilt. > > > On May 24, 2014, at 7:04 AM, Samar Khatiwala wrote: > >> Hello, >> >> I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: >> >> ========================================== >> Building PETSc using CMake with 21 build threads >> ========================================== >> make: 1254-002 Cannot find a rule to create target 21 from dependencies. >> Stop. >> make: 1254-004 The error code from the last command is 2. >> ? >> >> Please see attached logs. I configured with: >> >> config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" >> >> What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with >> --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now >> get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no >> avail. >> >> Any help would be appreciated. Thanks very much! >> >> Samar >> >> >> >> > From bsmith at mcs.anl.gov Sat May 24 11:33:35 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 24 May 2014 11:33:35 -0500 Subject: [petsc-users] Installation problems on IBM machine In-Reply-To: <5380C4D9.7060602@berkeley.edu> References: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> <5380C4D9.7060602@berkeley.edu> Message-ID: <1EC2EB94-D76B-4058-92C7-0C4DF65B70E7@mcs.anl.gov> In the development version we have the default now be very simple code that uses gnumake (much simpler than the cmake stuff). 
We?ve found most machines have gnumake installed and if not, gnumake is very portable and ./configure has ?download-gnumake so the user doesn?t even need to deal with installing it. In our next release we will support legacy, cmake, gnumake for compiling. After that if all goes well with gnumake we may remove the legacy and cmake stuff. Barry On May 24, 2014, at 11:12 AM, Sanjay Govindjee wrote: >> fortunately we are abandoning cmake for the future. > > Barry, > Are you going to go back to standard make? > or have you selected a new make system? > -sanjay > > On 5/24/14 6:22 AM, Barry Smith wrote: >> >> Try using make all-legacy >> >> Barry >> >> That is some cmake problem; fortunately we are abandoning cmake for the future. The reason the problem persists is likely because cmake has cached something somewhere that doesn?t get rebuilt. >> >> >> On May 24, 2014, at 7:04 AM, Samar Khatiwala wrote: >> >>> Hello, >>> >>> I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: >>> >>> ========================================== >>> Building PETSc using CMake with 21 build threads >>> ========================================== >>> make: 1254-002 Cannot find a rule to create target 21 from dependencies. >>> Stop. >>> make: 1254-004 The error code from the last command is 2. >>> ? >>> >>> Please see attached logs. I configured with: >>> >>> config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" >>> >>> What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with >>> --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now >>> get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no >>> avail. >>> >>> Any help would be appreciated. Thanks very much! >>> >>> Samar >>> >>> >>> >>> > From qince168 at gmail.com Sat May 24 22:20:44 2014 From: qince168 at gmail.com (Ce Qin) Date: Sun, 25 May 2014 11:20:44 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. Message-ID: Dear all, Now I'm using TAO to solve an optimization problem. I want to output some internal data at each iteration in the monitor routine. In the cg solver I found that TaoMonitor is called after TaoLineSearchApply. So I'm wondering that is the objective and gradient TaoLineSearchApply returns lastest computed? Does TAO guarantee this? Sorry for my poor English. If you have any question please let me know. Any help will be appreciated. Thanks in advance. Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 24 22:37:15 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 24 May 2014 22:37:15 -0500 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: Message-ID: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> On May 24, 2014, at 10:20 PM, Ce Qin wrote: > Dear all, > > Now I'm using TAO to solve an optimization problem. I want to output some internal data at each iteration in the monitor routine. In the cg solver I found that TaoMonitor is called after TaoLineSearchApply. 
So I'm wondering that is the objective and gradient TaoLineSearchApply returns lastest computed? Does TAO guarantee this? According to the documentation and code yes it is suppose to always return the object function and gradient at the new solution value. Of course it is possible there is an error in one of our line search routines that results in not computing the final values. If you think think there is an error please point to the exact line search you are using and parameters etc (a test code is best and we will investigate if there is an error). Barry > > Sorry for my poor English. If you have any question please let me know. Any help will be appreciated. Thanks in advance. > > Best regards, > Ce Qin > From qince168 at gmail.com Sun May 25 07:11:41 2014 From: qince168 at gmail.com (Ce Qin) Date: Sun, 25 May 2014 20:11:41 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> Message-ID: Hi Barry, I think there is confusion about my question. The purpose of line search is to find an \alpha_{k} that minimize f(x + \alpha p). This procedure usually choose several \alpha and returns a proper \alpha_{k}. For each \alpha, we need to compute the objective function and gradient of (x + \alpha p). In my FormFunctionGradient function, I save some internal data(model response) which will be printed in the monitor routine. If the objective function and gradient returned by TaoLineSearchApply isn't the latest computed, my internal data becomes invalid in the monitor routine. My question is the order of the objective function and gradient corresponding to \alpha_{k} in this line search procedure. If it is not clear, please let me know. Thanks. Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.sarich at gmail.com Sun May 25 07:52:09 2014 From: jason.sarich at gmail.com (Jason Sarich) Date: Sun, 25 May 2014 07:52:09 -0500 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: Hi Ce Qin, What you are doing will work fine. There is no back-tracking in a linesearch to a previously computed x, the last x computed is the current solution. Jason Sarich On Sun, May 25, 2014 at 7:11 AM, Ce Qin wrote: > Hi Barry, > > I think there is confusion about my question. > > The purpose of line search is to find an \alpha_{k} that minimize f(x + > \alpha p). This procedure usually choose several \alpha and returns a > proper \alpha_{k}. For each \alpha, we need to compute the objective > function and gradient of (x + \alpha p). In my FormFunctionGradient > function, I save some internal data(model response) which will be printed > in the monitor routine. If the objective function and gradient returned by > TaoLineSearchApply isn't the latest computed, my internal data becomes > invalid in the monitor routine. My question is the order of the objective > function and gradient corresponding to \alpha_{k} in this line search > procedure. > > If it is not clear, please let me know. Thanks. > > Best regards, > Ce Qin > -------------- next part -------------- An HTML attachment was scrubbed... 
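Below is a bare-bones sketch of the monitor-plus-context pattern being discussed, illustrative only and not taken from the original thread. The AppCtx field is invented, and it assumes a TAO version that provides TaoSetMonitor() and TaoGetSolutionStatus() with the signatures shown (in the older standalone TAO releases the object type is TaoSolver rather than Tao and the header differs).

#include <petsctao.h>   /* assumes TAO bundled with PETSc; standalone TAO uses its own header */

typedef struct {
  PetscReal last_response;   /* stashed by FormFunctionGradient() at its most recent evaluation */
} AppCtx;

/* Called once per TAO iteration, after the line search has accepted a point,
   so the stashed data corresponds to the current solution, as described above. */
PetscErrorCode MyMonitor(Tao tao,void *ctx)
{
  AppCtx             *user = (AppCtx*)ctx;
  PetscInt            its;
  PetscReal           f,gnorm,cnorm,xdiff;
  TaoConvergedReason  reason;
  PetscErrorCode      ierr;

  ierr = TaoGetSolutionStatus(tao,&its,&f,&gnorm,&cnorm,&xdiff,&reason);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"iter %D: f = %g, ||g|| = %g, response = %g\n",
                     its,(double)f,(double)gnorm,(double)user->last_response);CHKERRQ(ierr);
  return 0;
}

/* registration, somewhere after TaoCreate():
   ierr = TaoSetMonitor(tao,MyMonitor,&user,NULL);CHKERRQ(ierr);                     */

Because the accepted point is the last one at which the function and gradient were evaluated, whatever FormFunctionGradient() stored in the context on its final call of the iteration is consistent with the f and gnorm reported here.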
URL: From qince168 at gmail.com Sun May 25 08:12:50 2014 From: qince168 at gmail.com (Ce Qin) Date: Sun, 25 May 2014 21:12:50 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: Thanks, Jason. That's what I need. One more question, How many function and gradient evaluations one MT line search need? I'm doing geophysical inversion, function and gradient evaluation is very expensive, so I want to minimize the function calls. Do you have any suggestions? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.sarich at gmail.com Sun May 25 09:10:36 2014 From: jason.sarich at gmail.com (Jason Sarich) Date: Sun, 25 May 2014 09:10:36 -0500 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: Hi Ce Qin, Typically the MT line search accepts the first trial point, it is mostly used to throw out a few terrible guesses and to help prevent stalling. You should get a general idea of how much time is spent line searching just be checking the number of TAO iterations versus the number of function evaluations in -tao_view, and there are simple ways to access this information directly if you need to. There are some parameters for the line search that make it more or less selective, but saving a few function evaluations here in the line search will probably cost more evaluations in the broader view of the optimization in general. If you haven't done so yet, try the lmvm algorithm as a substitute for cg, it works on the same information and usually performs a little better. Jason On Sun, May 25, 2014 at 8:12 AM, Ce Qin wrote: > Thanks, Jason. That's what I need. > > One more question, How many function and gradient evaluations one MT line > search need? > I'm doing geophysical inversion, function and gradient evaluation is very > expensive, so I want to minimize the function calls. Do you have any > suggestions? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From qince168 at gmail.com Sun May 25 09:22:26 2014 From: qince168 at gmail.com (Ce Qin) Date: Sun, 25 May 2014 22:22:26 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: Thanks, Jason. I will try it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun May 25 11:16:34 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 May 2014 11:16:34 -0500 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: <54A85E1A-3579-43AB-94C1-FDED6D6FBBBA@mcs.anl.gov> Also run with -log_summary and you?ll see the percentage of time in the line search and also in the various function evaluations. The time for the line search includes the time of the function and gradient evaluations IT does, while the time for the functions and gradients includes the time for ALL function and gradient evaluations. Barry On May 25, 2014, at 8:12 AM, Ce Qin wrote: > Thanks, Jason. That's what I need. 
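A side note on the lmvm suggestion above: switching the solver type programmatically is a one-line change. The fragment below uses the Tao API bundled with PETSc; the standalone TAO releases of this period spell the calls a little differently.

Tao            tao;
PetscErrorCode ierr;

ierr = TaoCreate(PETSC_COMM_WORLD,&tao);CHKERRQ(ierr);
ierr = TaoSetType(tao,TAOLMVM);CHKERRQ(ierr);    /* default to lmvm instead of cg */
ierr = TaoSetFromOptions(tao);CHKERRQ(ierr);     /* still overridable with -tao_type cg; -tao_view
                                                    reports iteration and function-evaluation counts */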
> > One more question, How many function and gradient evaluations one MT line search need? > I'm doing geophysical inversion, function and gradient evaluation is very expensive, so I want to minimize the function calls. Do you have any suggestions? From Lukasz.Kaczmarczyk at glasgow.ac.uk Sun May 25 12:34:56 2014 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Sun, 25 May 2014 17:34:56 +0000 Subject: [petsc-users] PetscLayou Message-ID: Hello, I use PetscLayout to set the ranges for matrix partitioning (petsc-3.4.3), however, for one of the problems, I get from warring from valgrind. This warning is only for particular problem, for other code executions with different input data, valgrind do not shows any errors at that point. Could you tell me if this could lead to segmentation fault, for example for code compiled with Intel compiler? Where is my mistake? Kind regards, Lukasz 533 PetscLayout layout; 534 ierr = PetscLayoutCreate(PETSC_COMM_WORLD,&layout); CHKERRQ(ierr); 535 ierr = PetscLayoutSetBlockSize(layout,1); CHKERRQ(ierr); 536 ierr = PetscLayoutSetSize(layout,nb_dofs_row); CHKERRQ(ierr); 537 ierr = PetscLayoutSetUp(layout); CHKERRQ(ierr); 538 PetscInt rstart,rend; 539 ierr = PetscLayoutGetRange(layout,&rstart,&rend); CHKERRQ Partition problem COUPLED_PROBLEM create_Mat: row lower 0 row upper 837 create_Mat: row lower 837 row upper 1674 create_Mat: row lower 1674 row upper 2511 create_Mat: row lower 2511 row upper 3347 create_Mat: row lower 3347 row upper 4183 create_Mat: row lower 4183 row upper 5019 create_Mat: row lower 5019 row upper 5855 create_Mat: row lower 5855 row upper 6691 create_Mat: row lower 6691 row upper 7527 create_Mat: row lower 7527 row upper 8363 create_Mat: row lower 8363 row upper 9199 create_Mat: row lower 9199 row upper 10035 ==80351== Source and destination overlap in memcpy(0x1e971f94, 0x1e971fa8, 28) ==80351== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==80351== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) ==80351== by 0x159EE48A: ??? (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) ==80351== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) ==80351== by 0x565C4C8: PetscLayoutSetUp (pmap.c:158) ==80355== Source and destination overlap in memcpy(0x1e97e784, 0x1e97e788, 44) ==80355== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==80355== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) ==80355== by 0x159EE48A: ??? (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) ==80355== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) ==80355== by 0x565C4C8: PetscLayoutSetUp (pmap.c:158) ==80355== by 0xDEFB18: int MoFEM::FieldCore::create_Mat(std::string const&, _p_Mat**, char const*, int**, int**, double**, bool, int) (FieldCore.hpp:537) Full code source can be viewed from here, https://bitbucket.org/likask/mofem-joseph/src/3185671a406bc3f02336b3886775dff037dbd4fe/mofem_v0.1/do_not_blink/FieldCore.hpp?at=release From bsmith at mcs.anl.gov Sun May 25 13:03:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 May 2014 13:03:52 -0500 Subject: [petsc-users] PetscLayou In-Reply-To: References: Message-ID: <57628F62-A4CC-4DEB-A538-3B3B20D8126F@mcs.anl.gov> Looking at your code and the PETSc source code I see nothing wrong. 
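For reference, the complete PetscLayout pattern in question, written out as a self-contained sketch that mirrors the snippet quoted in this thread (the global size 10035 is the one from the run shown above, and the final destroy is easy to forget):

PetscLayout    layout;
PetscInt       rstart,rend,N = 10035;
PetscErrorCode ierr;

ierr = PetscLayoutCreate(PETSC_COMM_WORLD,&layout);CHKERRQ(ierr);
ierr = PetscLayoutSetBlockSize(layout,1);CHKERRQ(ierr);
ierr = PetscLayoutSetSize(layout,N);CHKERRQ(ierr);     /* global size; local sizes chosen by PETSc */
ierr = PetscLayoutSetUp(layout);CHKERRQ(ierr);         /* the MPI_Allgather flagged by valgrind happens here */
ierr = PetscLayoutGetRange(layout,&rstart,&rend);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_SELF,"row lower %D row upper %D\n",rstart,rend);CHKERRQ(ierr);
ierr = PetscLayoutDestroy(&layout);CHKERRQ(ierr);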
> Source and destination overlap in memcpy(0x1e97e784, 0x1e97e788, 44) > ==80355== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==80355== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80355== by 0x159EE48A: ??? (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) > ==80355== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) I can only guess an error in the MPI implementation at this point. Maybe more with the MPI then the compiler? But that is guessing Barry On May 25, 2014, at 12:34 PM, Lukasz Kaczmarczyk wrote: > Hello, > > I use PetscLayout to set the ranges for matrix partitioning (petsc-3.4.3), however, for one of the problems, I get from warring from valgrind. This warning is only for particular problem, for other code executions with different input data, valgrind do not shows any errors at that point. > > Could you tell me if this could lead to segmentation fault, for example for code compiled with Intel compiler? Where is my mistake? > > Kind regards, > Lukasz > > 533 PetscLayout layout; > 534 ierr = PetscLayoutCreate(PETSC_COMM_WORLD,&layout); CHKERRQ(ierr); > 535 ierr = PetscLayoutSetBlockSize(layout,1); CHKERRQ(ierr); > 536 ierr = PetscLayoutSetSize(layout,nb_dofs_row); CHKERRQ(ierr); > 537 ierr = PetscLayoutSetUp(layout); CHKERRQ(ierr); > 538 PetscInt rstart,rend; > 539 ierr = PetscLayoutGetRange(layout,&rstart,&rend); CHKERRQ > > > Partition problem COUPLED_PROBLEM > create_Mat: row lower 0 row upper 837 > create_Mat: row lower 837 row upper 1674 > create_Mat: row lower 1674 row upper 2511 > create_Mat: row lower 2511 row upper 3347 > create_Mat: row lower 3347 row upper 4183 > create_Mat: row lower 4183 row upper 5019 > create_Mat: row lower 5019 row upper 5855 > create_Mat: row lower 5855 row upper 6691 > create_Mat: row lower 6691 row upper 7527 > create_Mat: row lower 7527 row upper 8363 > create_Mat: row lower 8363 row upper 9199 > create_Mat: row lower 9199 row upper 10035 > ==80351== Source and destination overlap in memcpy(0x1e971f94, 0x1e971fa8, 28) > ==80351== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==80351== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80351== by 0x159EE48A: ??? (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) > ==80351== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80351== by 0x565C4C8: PetscLayoutSetUp (pmap.c:158) > ==80355== Source and destination overlap in memcpy(0x1e97e784, 0x1e97e788, 44) > ==80355== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==80355== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80355== by 0x159EE48A: ??? 
(in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) > ==80355== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80355== by 0x565C4C8: PetscLayoutSetUp (pmap.c:158) > ==80355== by 0xDEFB18: int MoFEM::FieldCore::create_Mat(std::string const&, _p_Mat**, char const*, int**, int**, double**, bool, int) (FieldCore.hpp:537) > > > Full code source can be viewed from here, > https://bitbucket.org/likask/mofem-joseph/src/3185671a406bc3f02336b3886775dff037dbd4fe/mofem_v0.1/do_not_blink/FieldCore.hpp?at=release From epscodes at gmail.com Sun May 25 19:23:27 2014 From: epscodes at gmail.com (Xiangdong) Date: Sun, 25 May 2014 20:23:27 -0400 Subject: [petsc-users] DM vector restriction questions Message-ID: Hello everyone, I have a questions about vectors in an DMDA with DOF>1. For example, in 1d with number of grid N and DOF=2 (two fields u and v), the length of the global vector is 2*N. What is the best way to restrict this vector (length 2*N) to a vector (length N) corresponding to the field u only? This will help me obtain the properties of field u by using VecSum and other vec functions. With DMDAVecGetArray and looping over only u field can do the job. However, I am just wondering whether any petsc function can provide either the restriction matrix or the vector restricted to a single field. Thank you. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun May 25 19:40:48 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 25 May 2014 18:40:48 -0600 Subject: [petsc-users] DM vector restriction questions In-Reply-To: References: Message-ID: <87zji574of.fsf@jedbrown.org> Xiangdong writes: > Hello everyone, > > I have a questions about vectors in an DMDA with DOF>1. For example, in 1d > with number of grid N and DOF=2 (two fields u and v), the length of the > global vector is 2*N. > > What is the best way to restrict this vector (length 2*N) to a vector > (length N) corresponding to the field u only? This will help me obtain the > properties of field u by using VecSum and other vec functions. http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecStrideGather.html -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sun May 25 19:51:08 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 May 2014 19:51:08 -0500 Subject: [petsc-users] DM vector restriction questions In-Reply-To: <87zji574of.fsf@jedbrown.org> References: <87zji574of.fsf@jedbrown.org> Message-ID: <0619379E-275E-418F-A91C-78A9AED1384E@mcs.anl.gov> And VecStrideNorm(), VecStrideScale(), VecStrideNormAll() etc. Let us know what is missing and what you need? Barry On May 25, 2014, at 7:40 PM, Jed Brown wrote: > Xiangdong writes: > >> Hello everyone, >> >> I have a questions about vectors in an DMDA with DOF>1. For example, in 1d >> with number of grid N and DOF=2 (two fields u and v), the length of the >> global vector is 2*N. >> >> What is the best way to restrict this vector (length 2*N) to a vector >> (length N) corresponding to the field u only? This will help me obtain the >> properties of field u by using VecSum and other vec functions. 
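Concretely, the stride routines pointed to here can be used as in the sketch below, where U is assumed to be the dof=2 global vector obtained from DMCreateGlobalVector() and component 0 is the field u (names are illustrative, not from the poster's code):

Vec            u_only;
PetscInt       nlocal;
PetscReal      unorm;
PetscScalar    usum;
PetscErrorCode ierr;

ierr = VecGetLocalSize(U,&nlocal);CHKERRQ(ierr);
ierr = VecCreateMPI(PETSC_COMM_WORLD,nlocal/2,PETSC_DETERMINE,&u_only);CHKERRQ(ierr);  /* one entry per grid point */
ierr = VecStrideGather(U,0,u_only,INSERT_VALUES);CHKERRQ(ierr);                        /* component 0 = field u */
ierr = VecSum(u_only,&usum);CHKERRQ(ierr);

/* norms of a single component do not even need the copy */
ierr = VecStrideNorm(U,0,NORM_2,&unorm);CHKERRQ(ierr);

ierr = VecDestroy(&u_only);CHKERRQ(ierr);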
> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecStrideGather.html From qince168 at gmail.com Sun May 25 20:31:39 2014 From: qince168 at gmail.com (Ce Qin) Date: Mon, 26 May 2014 09:31:39 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: <54A85E1A-3579-43AB-94C1-FDED6D6FBBBA@mcs.anl.gov> References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> <54A85E1A-3579-43AB-94C1-FDED6D6FBBBA@mcs.anl.gov> Message-ID: Thanks for all your kind help! Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjsgr100 at gmail.com Mon May 26 04:32:29 2014 From: pjsgr100 at gmail.com (Pedro Rodrigues) Date: Mon, 26 May 2014 10:32:29 +0100 Subject: [petsc-users] ExodusII Message-ID: Hi I successfully built EXODUSII under Windows using netCDF, HDF5 (ZLIB and SZIP also). I ran an example to mesh generation with success. I would like to ask you to add support to EXODUSII under this platform. Fortran bindings are not there (or don't work) but I made that with Fortran interfaces (that can also be done with 'c' directives). VS2012 does not contain inttypes.h but I modified a cygwin file to make that available. regards -- Pedro Rodrigues -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 26 08:17:17 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 26 May 2014 07:17:17 -0600 Subject: [petsc-users] ExodusII In-Reply-To: References: Message-ID: <87egzg7k82.fsf@jedbrown.org> Pedro Rodrigues writes: > Hi > > I successfully built EXODUSII under Windows using netCDF, HDF5 > (ZLIB and SZIP also). I ran an example to mesh generation with success. I > would like to ask you to add support to EXODUSII under this > platform. Fortran bindings are not there (or don't work) but I made that > with Fortran interfaces (that can also be done with 'c' directives). The best way to add support is to have upstream accept your patch to make ExodusII compatible. The second best way is to submit a patch to PETSc. > VS2012 does not contain inttypes.h but I modified a cygwin file to > make that available. This sounds problematic because installing packages should not involve modifying system files. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From hgbk2008 at gmail.com Mon May 26 11:02:10 2014 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Mon, 26 May 2014 18:02:10 +0200 Subject: [petsc-users] row scale the matrix Message-ID: <53836582.6010804@gmail.com> Hi My matrix contains some overshoot entries in the diagonal and I want to row scale by a factor that I defined. How can I do that with petsc ? (I don't want to use MatDiagonalScale instead, I also don't want to create a diagonal matrix and left multiply to the system.) BR Bui From mrosso at uci.edu Mon May 26 11:20:25 2014 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 26 May 2014 09:20:25 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> Message-ID: <538369C9.6010209@uci.edu> Mark, thank you for your input and sorry my late reply: I saw your email only now. 
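As an aside on the "set up the solver each time step" idea being discussed in this thread, the bluntest version is simply to create and destroy the KSP inside the time loop. A sketch, assuming A, b, x and the loop variables already exist, and not taken from the actual code:

for (step = 0; step < nsteps; step++) {
  KSP            ksp;
  PetscErrorCode ierr;

  /* ... reassemble A and b for this step ... */

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);   /* 3-argument form; petsc 3.4 and earlier take a fourth MatStructure flag */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);     /* e.g. -pc_type gamg or -pc_type hypre -pc_hypre_type boomeramg */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);           /* forces a full preconditioner setup next step */
}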
By setting up the solver each time step you mean re-defining the KSP context every time? Why should this help? I will definitely try that as well as the hypre solution and report back. Again, thank you. Michele On 05/22/2014 09:34 AM, Mark Adams wrote: > If the solver is degrading as the coefficients change, and I would > assume get more nasty, you can try deleting the solver at each time > step. This will be about 2x more expensive, because it does the setup > each solve, but it might fix your problem. > > You also might try: > > -pc_type hypre > -pc_hypre_type boomeramg > > > > > On Mon, May 19, 2014 at 6:49 PM, Jed Brown > wrote: > > Michele Rosso > writes: > > > Jed, > > > > thank you very much! > > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type > > sor/ and report back. > > Yes, I removed the nullspace from both the system matrix and the > rhs. > > Is there a way to have something similar to Dendy's multigrid or the > > deflated conjugate gradient method with PETSc? > > Dendy's MG needs geometry. The algorithm to produce the interpolation > operators is not terribly complicated so it could be done, though DMDA > support for cell-centered is a somewhat awkward. "Deflated CG" > can mean > lots of things so you'll have to be more precise. (Most everything in > the "deflation" world has a clear analogue in the MG world, but the > deflation community doesn't have a precise language to talk about > their > methods so you always have to read the paper carefully to find out if > it's completely standard or if there is something new.) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 26 13:44:13 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 26 May 2014 13:44:13 -0500 Subject: [petsc-users] row scale the matrix In-Reply-To: <53836582.6010804@gmail.com> References: <53836582.6010804@gmail.com> Message-ID: <78644A10-9A8F-452F-8306-B445B5C3D60E@mcs.anl.gov> Why not MatDiagonalScale()? The left diagonal matrix l scales each row i of the matrix by l[i,i] so it seems to do exactly what you want. Barry On May 26, 2014, at 11:02 AM, Hoang Giang Bui wrote: > Hi > > My matrix contains some overshoot entries in the diagonal and I want to row scale by a factor that I defined. How can I do that with petsc ? (I don't want to use MatDiagonalScale instead, I also don't want to create a diagonal matrix and left multiply to the system.) > > BR > Bui > From vbaros at hsr.ch Mon May 26 14:56:39 2014 From: vbaros at hsr.ch (Baros Vladimir) Date: Mon, 26 May 2014 19:56:39 +0000 Subject: [petsc-users] ExodusII Message-ID: Necessary header files, can be found here: https://code.google.com/p/msinttypes/ It contains necessary inttypes.h and stdint.h headers I successfully used them to build exodus lib with Visual Studio. Can anyone enable the support for exodus in Windows? From C.Klaij at marin.nl Tue May 27 08:47:55 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 27 May 2014 13:47:55 +0000 Subject: [petsc-users] MatNestGetISs in fortran Message-ID: <64c4658aeb7441abbe20e4aa252554a2@MAR190N1.marin.local> I'm trying to use MatNestGetISs in a fortran program but it seems to be missing from the fortran include file (PETSc 3.4). dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. 
Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From tlk0812 at hotmail.com Tue May 27 15:17:46 2014 From: tlk0812 at hotmail.com (Likun Tan) Date: Tue, 27 May 2014 13:17:46 -0700 Subject: [petsc-users] Set the directory of output file Message-ID: Hello, I want to create and write my simulation result in a binary file called result.dat, but I want to set my file in a different folder, say /home/username/output I am using PetscViewerBinaryOpen(PETSC_COMM_WORLD, result.dat, FILE_MODE_WRITE, & view) But this will create the file in the current folder by default. How could I modify the command to set a new path? Thank you. From balay at mcs.anl.gov Tue May 27 15:40:27 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 27 May 2014 15:40:27 -0500 Subject: [petsc-users] Set the directory of output file In-Reply-To: References: Message-ID: On Tue, 27 May 2014, Likun Tan wrote: > Hello, > > I want to create and write my simulation result in a binary file called result.dat, but I want to set my file in a different folder, say /home/username/output > > I am using > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, result.dat, FILE_MODE_WRITE, & view) > > But this will create the file in the current folder by default. How could I modify the command to set a new path? Thank you. You should be able to do: PetscViewerBinaryOpen(PETSC_COMM_WORLD, "/home/username/output/result.dat", FILE_MODE_WRITE, & view) [or specify the path to the output file at runtime and use PetscOptionsGetString() to extract this string and use with PetscViewerBinaryOpen(). For ex: check '-f0' usage in src/ksp/ksp/examples/tutorials/ex10.c] Satish From tlk0812 at hotmail.com Tue May 27 15:57:44 2014 From: tlk0812 at hotmail.com (Likun Tan) Date: Tue, 27 May 2014 13:57:44 -0700 Subject: [petsc-users] Set the directory of output file In-Reply-To: References: Message-ID: It works. Thank you very much. > On May 27, 2014, at 1:40 PM, Satish Balay wrote: > >> On Tue, 27 May 2014, Likun Tan wrote: >> >> Hello, >> >> I want to create and write my simulation result in a binary file called result.dat, but I want to set my file in a different folder, say /home/username/output >> >> I am using >> >> PetscViewerBinaryOpen(PETSC_COMM_WORLD, result.dat, FILE_MODE_WRITE, & view) >> >> But this will create the file in the current folder by default. How could I modify the command to set a new path? Thank you. > > You should be able to do: > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "/home/username/output/result.dat", FILE_MODE_WRITE, & view) > > [or specify the path to the output file at runtime and use > PetscOptionsGetString() to extract this string and use with > PetscViewerBinaryOpen(). 
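Put together, the runtime-path variant described here looks roughly like the fragment below. The option name -output and the default path are invented for the example, and the PetscOptionsGetString signature shown is the petsc-3.4-era one; later releases add a leading options-object argument.

char           fname[PETSC_MAX_PATH_LEN] = "/home/username/output/result.dat";
PetscBool      flg;
PetscViewer    view;
PetscErrorCode ierr;

/* allow "-output /some/other/path.dat" on the command line */
ierr = PetscOptionsGetString(NULL,"-output",fname,sizeof(fname),&flg);CHKERRQ(ierr);
ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,fname,FILE_MODE_WRITE,&view);CHKERRQ(ierr);
/* ... VecView / MatView into view ... */
ierr = PetscViewerDestroy(&view);CHKERRQ(ierr);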
For ex: check '-f0' usage in > src/ksp/ksp/examples/tutorials/ex10.c] > > Satish > > From zonexo at gmail.com Tue May 27 20:09:09 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 28 May 2014 09:09:09 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> <5379A433.5000401@gmail.com> Message-ID: <53853735.5080500@gmail.com> On 20/5/2014 1:43 AM, Barry Smith wrote: > On May 19, 2014, at 1:26 AM, TAY wee-beng wrote: > >> On 19/5/2014 11:36 AM, Barry Smith wrote: >>> On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: >>> >>>> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>>>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>>>> Hi Barry, >>>>> >>>>> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >>>>> >>>>> Yes it works with Intel. Is this using optimization? >>>> Hi Matt, >>>> >>>> I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? >>> No. Does it run clean under valgrind? >> Hi, >> >> Do you mean the debug or optimized version? > Both. Hi, has anyone tested the code I sent? I am still not able to pinpoint the error. Thanks. > >> Thanks. >>>>> Matt >>>>> >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>>>> Please send you current code. So we may compile and run it. >>>>> >>>>> Barry >>>>> >>>>> >>>>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>>>> >>>>> Hi, >>>>> >>>>> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >>>>> >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>>>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. 
>>>>> >>>>> Barry >>>>> >>>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>>> >>>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>> >>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>> >>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>> Hmm, >>>>> >>>>> Interface DMDAVecGetArrayF90 >>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>> USE_DM_HIDE >>>>> DM_HIDE da1 >>>>> VEC_HIDE v >>>>> PetscScalar,pointer :: d1(:,:,:) >>>>> PetscErrorCode ierr >>>>> End Subroutine >>>>> >>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>> Hi, >>>>> >>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>> >>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>> >>>>> Also, supposed I call: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> u_array .... >>>>> >>>>> v_array .... etc >>>>> >>>>> Now to restore the array, does it matter the sequence they are restored? >>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>> >>>>> Hi, >>>>> >>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>> >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> u_array = 0.d0 >>>>> >>>>> v_array = 0.d0 >>>>> >>>>> w_array = 0.d0 >>>>> >>>>> p_array = 0.d0 >>>>> >>>>> >>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>> >>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? 
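Since this thread mixes several issues (local vectors, loop bounds, the order of the restore calls), here is the C analogue of the pattern being debugged; the F90 interface mirrors it with DMDAVecGetArrayF90/DMDAVecRestoreArrayF90. The DMDA da_u and global vector u_global are assumed to exist (3D, dof=1), and none of this is taken from the code attached to the thread.

Vec            u_local;
PetscScalar    ***u;
PetscInt       xs,ys,zs,xm,ym,zm,i,j,k;
PetscErrorCode ierr;

ierr = DMGetLocalVector(da_u,&u_local);CHKERRQ(ierr);
ierr = DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local);CHKERRQ(ierr);
ierr = DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local);CHKERRQ(ierr);

ierr = DMDAVecGetArray(da_u,u_local,&u);CHKERRQ(ierr);
ierr = DMDAGetGhostCorners(da_u,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr);  /* owned plus ghost range of a local vector */
for (k=zs; k<zs+zm; k++)
  for (j=ys; j<ys+ym; j++)
    for (i=xs; i<xs+xm; i++)
      u[k][j][i] = 0.0;   /* global indices, [k][j][i] ordering, no manual offset */
ierr = DMDAVecRestoreArray(da_u,u_local,&u);CHKERRQ(ierr);   /* restore with exactly the same da/Vec/array triple used in the get */

ierr = DMRestoreLocalVector(da_u,&u_local);CHKERRQ(ierr);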
>>>>> Hi Matt, >>>>> >>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>> >>>>> It already has DMDAVecGetArray(). Just run it. >>>>> Hi, >>>>> >>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>> >>>>> No the global/local difference should not matter. >>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>> >>>>> DMGetLocalVector() >>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>>> >>>>> If so, when should I call them? >>>>> >>>>> You just need a local vector from somewhere. >>>>> Hi, >>>>> >>>>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>>>> >>>>> Thanks. >>>>> Hi, >>>>> >>>>> I insert part of my error region code into ex11f90: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> u_array = 0.d0 >>>>> v_array = 0.d0 >>>>> w_array = 0.d0 >>>>> p_array = 0.d0 >>>>> >>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>>> >>>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>>> >>>>> module solve >>>>> <- add include file? >>>>> subroutine RRK >>>>> <- add include file? >>>>> end subroutine RRK >>>>> >>>>> end module solve >>>>> >>>>> So where should the include files (#include ) be placed? >>>>> >>>>> After the module or inside the subroutine? >>>>> >>>>> Thanks. >>>>> Matt >>>>> Thanks. >>>>> Matt >>>>> Thanks. >>>>> Matt >>>>> Thanks >>>>> >>>>> Regards. >>>>> Matt >>>>> As in w, then v and u? >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> thanks >>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>> Hi, >>>>> >>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>> >>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>> >>>>> >>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? 
>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>> >>>>> Barry >>>>> >>>>> Thanks. >>>>> Barry >>>>> >>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>> >>>>> Hi, >>>>> >>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>> >>>>> However, by re-writing my code, I found out a few things: >>>>> >>>>> 1. if I write my code this way: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> u_array = .... >>>>> >>>>> v_array = .... >>>>> >>>>> w_array = .... >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> The code runs fine. >>>>> >>>>> 2. if I write my code this way: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>> >>>>> where the subroutine is: >>>>> >>>>> subroutine uvw_array_change(u,v,w) >>>>> >>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>> >>>>> u ... >>>>> v... >>>>> w ... >>>>> >>>>> end subroutine uvw_array_change. >>>>> >>>>> The above will give an error at : >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> So they are now in reversed order. Now it works. >>>>> >>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>> >>>>> subroutine uvw_array_change(u,v,w) >>>>> >>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> u ... >>>>> v... >>>>> w ... >>>>> >>>>> end subroutine uvw_array_change. >>>>> >>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". 
>>>>> >>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>> >>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>> >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>> >>>>> >>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>> >>>>> Hi Barry, >>>>> >>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>> >>>>> I have attached my code. >>>>> >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>> Please send the code that creates da_w and the declarations of w_array >>>>> >>>>> Barry >>>>> >>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>> >>>>> wrote: >>>>> >>>>> >>>>> Hi Barry, >>>>> >>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>> >>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>> >>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>> >>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>> >>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>> -------------------------------------------------------------------------- >>>>> An MPI process has executed an operation involving a call to the >>>>> "fork()" system call to create a child process. Open MPI is currently >>>>> operating in a condition that could result in memory corruption or >>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>> data corruption. The use of fork() (or system() or other calls that >>>>> create child processes) is strongly discouraged. >>>>> >>>>> The process that invoked fork was: >>>>> >>>>> Local host: n12-76 (PID 20235) >>>>> MPI_COMM_WORLD rank: 2 >>>>> >>>>> If you are *absolutely sure* that your application will successfully >>>>> and correctly survive a call to fork(), you may disable this warning >>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>> -------------------------------------------------------------------------- >>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>> >>>>> .... 
>>>>> >>>>> 1 >>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>> [1]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>> [1]PETSC ERROR: to get more information on the crash. >>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>> [3]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>> [3]PETSC ERROR: to get more information on the crash. >>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>> >>>>> ... >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>> >>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>> >>>>> Barry >>>>> >>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>> >>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>> >>>>> >>>>> >>>>> wrote: >>>>> >>>>> >>>>> >>>>> Hi, >>>>> >>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> -- >>>>> Thank you. 
>>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener From gmulas at oa-cagliari.inaf.it Wed May 28 12:27:39 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Wed, 28 May 2014 19:27:39 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC Message-ID: Hello. After some time stuck doing less exciting stuff, I got back to trying to use slepc for science. I am trying to use the relatively new functionality for arbitrary selection of the eigenpairs to be found, and I would want some support to understand if I am doing things correctly. I apologise with Jose Roman for disappearing when I should have helped testing this, it was not my choice. But now I am back at it. In particular, I want to obtain the eigenvectors with the maximum projection in a given subspace (defined by a number of normalised vectors, let's call them targets). I don't know in advance how many eigenpairs must be determined: I want to obtain enough eigenvectors that the projection of all targets in the space of these eigenvectors is very nearly identical to the targets themselves. So my strategy, so far, is the following: 1) create and set up the matrix H to be diagonalised ierr = MatCreate(mixpars->slepc_comm, &H); CHKERRQ(ierr); ierr = MatSetSizes(H, PETSC_DECIDE, PETSC_DECIDE, statesinlist, statesinlist); CHKERRQ(ierr); ierr = MatSetFromOptions(H);CHKERRQ(ierr); ierr = MatSetUp(H);CHKERRQ(ierr); ... ... some MatSetValue(H,...); ... 
ierr = MatAssemblyBegin(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatAssemblyEnd(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatGetVecs(H, &xr, PETSC_NULL);CHKERRQ(ierr); ierr = MatGetVecs(H, &xi, PETSC_NULL);CHKERRQ(ierr); 2) create the eps ierr = EPSCreate(mixpars->slepc_comm,&eps);CHKERRQ(ierr); ierr = EPSSetOperators(eps,H,PETSC_NULL);CHKERRQ(ierr); ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); ierr = EPSSetTolerances(eps, tol, PETSC_DECIDE); ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); 3) set the eps to use an arbitrary selection function /* every time I solve, I want to find one eigenvector, and it must be the one with the largest component along the target state */ ierr = EPSSetWhichEigenpairs(eps, EPS_LARGEST_MAGNITUDE); ierr = EPSSetArbitrarySelection(eps, computeprojection(), (void *) &targetcompindex); ierr = EPSSetDimensions(eps, 1, PETSC_IGNORE, PETSC_IGNORE); CHKERRQ(ierr); 4) run a loop over the target vectors, and iteratively call EPSSolve until each target vector is completely contained in the space of the eigenvectors found. Before each call to EPSSolve, I set the initial guess equal to the target vector, and set the deflation space to be the set of eigenvectors found so far. After each call to EPSSolve, I add the new eigenvectors to the deflation space one by one, and check if the target state is (nearly) fully contained in the eigenvectors space. If yes, I move on to the next target state and so on. 5) free everything, destroy eps, matrices, vectors etc. I have some questions about the above: 1) should it work in principle, or am I getting it all wrong? 2) should I destroy and recreate the eps after each call to EPSSolve and before next call? Or, since the underlying matrix is always the same, can I just call EPSSetInitialSpace(), EPSSetDeflationSpace(), update the internal parameter to be passed to the arbitrary selection function and I can call again EPSSolve? 3) Since what I want is going on to find one eigenpair at a time of the same problem until some condition is fulfilled, is there a way in which I can achieve this without setting it up again and again every time? Can I specify an arbitrary function that is called by EPSSolve to decide whether enough eigenpairs were computed or not, instead of doing it in this somewhat awkward manner? 4) more technical: since I add vectors one by one to the deflation space, to begin with I allocate a Vec *Cv with PetscMalloc(statesinlist*sizeof(Vec *), &Cv); where statesinlist is the size of the problem, hence the maximum hypothetical size of the deflation space. I would prefer to allocate this dynamically, enlarging it as needed. Is there something like realloc() in PETSC/SLEPC? Thanks in advance, bye Giacomo Mulas -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. 
: +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From knepley at gmail.com Wed May 28 13:08:19 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 28 May 2014 13:08:19 -0500 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: Message-ID: On Wed, May 28, 2014 at 12:27 PM, Giacomo Mulas wrote: > Hello. > > After some time stuck doing less exciting stuff, I got back to trying to > use > slepc for science. I am trying to use the relatively new functionality for > arbitrary selection of the eigenpairs to be found, and I would want some > support to understand if I am doing things correctly. I apologise with > Jose > Roman for disappearing when I should have helped testing this, it was not > my > choice. But now I am back at it. > > In particular, I want to obtain the eigenvectors with the maximum > projection > in a given subspace (defined by a number of normalised vectors, let's call > them targets). > > I don't know in advance how many eigenpairs must be determined: I want to > obtain enough eigenvectors that the projection of all targets in the space > of these eigenvectors is very nearly identical to the targets themselves. > > So my strategy, so far, is the following: > > 1) create and set up the matrix H to be diagonalised > > ierr = MatCreate(mixpars->slepc_comm, &H); CHKERRQ(ierr); > ierr = MatSetSizes(H, PETSC_DECIDE, PETSC_DECIDE, statesinlist, > statesinlist); > CHKERRQ(ierr); > ierr = MatSetFromOptions(H);CHKERRQ(ierr); > ierr = MatSetUp(H);CHKERRQ(ierr); > ... > ... some MatSetValue(H,...); > ... > ierr = MatAssemblyBegin(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatGetVecs(H, &xr, PETSC_NULL);CHKERRQ(ierr); > ierr = MatGetVecs(H, &xi, PETSC_NULL);CHKERRQ(ierr); > > 2) create the eps > > ierr = EPSCreate(mixpars->slepc_comm,&eps);CHKERRQ(ierr); > ierr = EPSSetOperators(eps,H,PETSC_NULL);CHKERRQ(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > ierr = EPSSetTolerances(eps, tol, PETSC_DECIDE); > ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); > ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); > ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); > ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); > > 3) set the eps to use an arbitrary selection function > > /* every time I solve, I want to find one eigenvector, and it must be > the one with the largest component along the target state */ > ierr = EPSSetWhichEigenpairs(eps, EPS_LARGEST_MAGNITUDE); > ierr = EPSSetArbitrarySelection(eps, computeprojection(), > (void *) &targetcompindex); > ierr = EPSSetDimensions(eps, 1, PETSC_IGNORE, PETSC_IGNORE); > CHKERRQ(ierr); > > 4) run a loop over the target vectors, and iteratively call EPSSolve until > each target vector is completely contained in the space of the > eigenvectors found. Before each call to EPSSolve, I set the initial > guess > equal to the target vector, and set the deflation space to be the set of > eigenvectors found so far. After each call to EPSSolve, I add the new > eigenvectors to the deflation space one by one, and check if the target > state is (nearly) fully contained in the eigenvectors space. If yes, I > move on to the next target state and so on. > > 5) free everything, destroy eps, matrices, vectors etc. 
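To make the arbitrary-selection part concrete, a callback of the shape SLEPc expects could look like the sketch below. The SelectCtx structure and the use of the squared projection onto a single target vector are guesses at what computeprojection() does, not the actual code from this thread; real arithmetic is assumed, so xi and *ri are ignored.

typedef struct {
  Vec target;   /* current target vector, normalised */
} SelectCtx;

/* Called by EPS for each eigenpair approximation (eigr,eigi,xr,xi); the value
   returned in *rr,*ri replaces the eigenvalue in the sorting criterion, so
   EPS_LARGEST_MAGNITUDE then selects the eigenvector with the largest
   projection onto the target. */
PetscErrorCode computeprojection(PetscScalar eigr,PetscScalar eigi,Vec xr,Vec xi,
                                 PetscScalar *rr,PetscScalar *ri,void *ctx)
{
  SelectCtx      *sel = (SelectCtx*)ctx;
  PetscScalar    p;
  PetscErrorCode ierr;

  ierr = VecDot(xr,sel->target,&p);CHKERRQ(ierr);
  *rr  = p*p;    /* squared projection onto the target */
  *ri  = 0.0;
  return 0;
}

/* registration:  ierr = EPSSetArbitrarySelection(eps,computeprojection,(void*)&sel);CHKERRQ(ierr); */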
> > > I have some questions about the above: > > 1) should it work in principle, or am I getting it all wrong? > > 2) should I destroy and recreate the eps after each call to EPSSolve and > before next call? Or, since the underlying matrix is always the same, can I > just call EPSSetInitialSpace(), EPSSetDeflationSpace(), update the internal > parameter to be passed to the arbitrary selection function and I can call > again EPSSolve? > > 3) Since what I want is going on to find one eigenpair at a time of the > same > problem until some condition is fulfilled, is there a way in which I can > achieve this without setting it up again and again every time? Can I > specify > an arbitrary function that is called by EPSSolve to decide whether enough > eigenpairs were computed or not, instead of doing it in this somewhat > awkward manner? > > 4) more technical: since I add vectors one by one to the deflation space, > to > begin with I allocate a Vec *Cv with PetscMalloc(statesinlist*sizeof(Vec > *), &Cv); > where statesinlist is the size of the problem, hence the maximum > hypothetical size of the deflation space. I would prefer to allocate this > dynamically, enlarging it as needed. Is there something like realloc() in > PETSC/SLEPC? > There is not. However, since this is just a set of Vec pointers, allocating and copying should be fine. The amount of memory taken up by the pointers is very very small. Thanks, Matt > Thanks in advance, bye > Giacomo Mulas > > -- > _________________________________________________________________ > > Giacomo Mulas > _________________________________________________________________ > > INAF - Osservatorio Astronomico di Cagliari > via della scienza 5 - 09047 Selargius (CA) > > tel. +39 070 71180244 > mob. : +39 329 6603810 > _________________________________________________________________ > > "When the storms are raging around you, stay right where you are" > (Freddy Mercury) > _________________________________________________________________ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hus003 at ucsd.edu Wed May 28 13:21:53 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Wed, 28 May 2014 18:21:53 +0000 Subject: [petsc-users] Question about dm_view Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 28 13:25:14 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 May 2014 13:25:14 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> Run as./ex5 -help | grep view to see the possibilities. 
It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. Barry On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) From hus003 at ucsd.edu Wed May 28 13:28:33 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Wed, 28 May 2014 18:28:33 +0000 Subject: [petsc-users] Question about dm_view In-Reply-To: <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU>, <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Wednesday, May 28, 2014 11:25 AM To: Sun, Hui Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. Barry On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) From jroman at dsic.upv.es Wed May 28 13:59:27 2014 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 28 May 2014 20:59:27 +0200 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: Message-ID: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> El 28/05/2014, a las 19:27, Giacomo Mulas escribi?: > Hello. > > After some time stuck doing less exciting stuff, I got back to trying to use > slepc for science. I am trying to use the relatively new functionality for > arbitrary selection of the eigenpairs to be found, and I would want some > support to understand if I am doing things correctly. I apologise with Jose > Roman for disappearing when I should have helped testing this, it was not my > choice. But now I am back at it. > > In particular, I want to obtain the eigenvectors with the maximum projection > in a given subspace (defined by a number of normalised vectors, let's call > them targets). > > I don't know in advance how many eigenpairs must be determined: I want to > obtain enough eigenvectors that the projection of all targets in the space > of these eigenvectors is very nearly identical to the targets themselves. 
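As a footnote to this thread, the same views can also be requested from code instead of through the option database; a sketch, assuming da is the DMDA in question and ierr is declared:

/* text description of the decomposition */
ierr = DMView(da,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
/* graphical view; needs an X-capable PETSc build, and -draw_pause -1 keeps the window open */
ierr = DMView(da,PETSC_VIEWER_DRAW_WORLD);CHKERRQ(ierr);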
> > So my strategy, so far, is the following: > > 1) create and set up the matrix H to be diagonalised > > ierr = MatCreate(mixpars->slepc_comm, &H); CHKERRQ(ierr); > ierr = MatSetSizes(H, PETSC_DECIDE, PETSC_DECIDE, statesinlist, > statesinlist); > CHKERRQ(ierr); > ierr = MatSetFromOptions(H);CHKERRQ(ierr); > ierr = MatSetUp(H);CHKERRQ(ierr); > ... > ... some MatSetValue(H,...); > ... > ierr = MatAssemblyBegin(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatGetVecs(H, &xr, PETSC_NULL);CHKERRQ(ierr); > ierr = MatGetVecs(H, &xi, PETSC_NULL);CHKERRQ(ierr); > > 2) create the eps > > ierr = EPSCreate(mixpars->slepc_comm,&eps);CHKERRQ(ierr); > ierr = EPSSetOperators(eps,H,PETSC_NULL);CHKERRQ(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > ierr = EPSSetTolerances(eps, tol, PETSC_DECIDE); > ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); > ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); > ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); > ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); > > 3) set the eps to use an arbitrary selection function > > /* every time I solve, I want to find one eigenvector, and it must be > the one with the largest component along the target state */ > ierr = EPSSetWhichEigenpairs(eps, EPS_LARGEST_MAGNITUDE); > ierr = EPSSetArbitrarySelection(eps, computeprojection(), > (void *) &targetcompindex); > ierr = EPSSetDimensions(eps, 1, PETSC_IGNORE, PETSC_IGNORE); > CHKERRQ(ierr); > > 4) run a loop over the target vectors, and iteratively call EPSSolve until > each target vector is completely contained in the space of the > eigenvectors found. Before each call to EPSSolve, I set the initial guess > equal to the target vector, and set the deflation space to be the set of > eigenvectors found so far. After each call to EPSSolve, I add the new > eigenvectors to the deflation space one by one, and check if the target > state is (nearly) fully contained in the eigenvectors space. If yes, I > move on to the next target state and so on. > > 5) free everything, destroy eps, matrices, vectors etc. > > > I have some questions about the above: > > 1) should it work in principle, or am I getting it all wrong? I don't see much problem. > > 2) should I destroy and recreate the eps after each call to EPSSolve and > before next call? Or, since the underlying matrix is always the same, can I > just call EPSSetInitialSpace(), EPSSetDeflationSpace(), update the internal > parameter to be passed to the arbitrary selection function and I can call > again EPSSolve? No need to recreate the solver. The only thing is EPSSetDeflationSpace() - I would suggest calling EPSRemoveDeflationSpace() and then EPSSetDeflationSpace() again with the extended set of vectors. Do not call EPSSetDeflationSpace() with a single vector every time. > > 3) Since what I want is going on to find one eigenpair at a time of the same > problem until some condition is fulfilled, is there a way in which I can > achieve this without setting it up again and again every time? Can I specify > an arbitrary function that is called by EPSSolve to decide whether enough > eigenpairs were computed or not, instead of doing it in this somewhat > awkward manner? No. > > 4) more technical: since I add vectors one by one to the deflation space, to > begin with I allocate a Vec *Cv with PetscMalloc(statesinlist*sizeof(Vec *), &Cv); > where statesinlist is the size of the problem, hence the maximum > hypothetical size of the deflation space. 
I would prefer to allocate this > dynamically, enlarging it as needed. Is there something like realloc() in > PETSC/SLEPC? > > Thanks in advance, bye > Giacomo Mulas > > -- > _________________________________________________________________ > > Giacomo Mulas > _________________________________________________________________ > > INAF - Osservatorio Astronomico di Cagliari > via della scienza 5 - 09047 Selargius (CA) > > tel. +39 070 71180244 > mob. : +39 329 6603810 > _________________________________________________________________ > > "When the storms are raging around you, stay right where you are" > (Freddy Mercury) > _________________________________________________________________ From knepley at gmail.com Wed May 28 14:10:25 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 28 May 2014 14:10:25 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it > comes out a list of options related to _view, all of which have the tag > , what does this mean? > The is the current value. They are all false because you have not turned them on. IF you are using the release version, the viewing option is -da_view. The -dm_view is the new version which we are about to release. Thanks, Matt > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 11:25 AM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > Run as./ex5 -help | grep view to see the possibilities. It depends on > PETSc version number. When using the graphics want you generally want a > -draw_pause -1 to stop that program at the graphic otherwise it pops up and > disappears immediately. > > Barry > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial > College from this site: Slides. In slide page 28, there is description of > viewing the DA. I'm testing from my MAC the same commands listed on that > page, for example, ex5 -dm_view, nothing interesting happen except the > Number of Newton iterations is outputted. I'm expecting that the PETSc > numbering would show up as a graphic window or something. Can anyone tell > me what's missing here? Thank you! ( Hui ) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hus003 at ucsd.edu Wed May 28 14:13:32 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Wed, 28 May 2014 19:13:32 +0000 Subject: [petsc-users] Question about dm_view In-Reply-To: References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU>, Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> Do I have to turn it on thru ./configure and then make everything again? 
________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Wednesday, May 28, 2014 12:10 PM To: Sun, Hui Cc: Barry Smith; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view On Wed, May 28, 2014 at 1:28 PM, Sun, Hui > wrote: Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? The is the current value. They are all false because you have not turned them on. IF you are using the release version, the viewing option is -da_view. The -dm_view is the new version which we are about to release. Thanks, Matt ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Wednesday, May 28, 2014 11:25 AM To: Sun, Hui Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. Barry On May 28, 2014, at 1:21 PM, Sun, Hui > wrote: > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 28 14:18:58 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 28 May 2014 14:18:58 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On Wed, May 28, 2014 at 2:13 PM, Sun, Hui wrote: > Do I have to turn it on thru ./configure and then make everything again? > No. You should see that option in the output of -help. By "turned on" I meant that the value of the option is FALSE. Thanks, Matt > ------------------------------ > *From:* Matthew Knepley [knepley at gmail.com] > *Sent:* Wednesday, May 28, 2014 12:10 PM > *To:* Sun, Hui > *Cc:* Barry Smith; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Question about dm_view > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > >> Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it >> comes out a list of options related to _view, all of which have the tag >> , what does this mean? >> > > The is the current value. They are all false because you have > not turned them on. IF you are using the release version, > the viewing option is -da_view. The -dm_view is the new version which we > are about to release. 
> > Thanks, > > Matt > > >> ________________________________________ >> From: Barry Smith [bsmith at mcs.anl.gov] >> Sent: Wednesday, May 28, 2014 11:25 AM >> To: Sun, Hui >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] Question about dm_view >> >> Run as./ex5 -help | grep view to see the possibilities. It depends on >> PETSc version number. When using the graphics want you generally want a >> -draw_pause -1 to stop that program at the graphic otherwise it pops up and >> disappears immediately. >> >> Barry >> >> >> On May 28, 2014, at 1:21 PM, Sun, Hui wrote: >> >> > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial >> College from this site: Slides. In slide page 28, there is description of >> viewing the DA. I'm testing from my MAC the same commands listed on that >> page, for example, ex5 -dm_view, nothing interesting happen except the >> Number of Newton iterations is outputted. I'm expecting that the PETSc >> numbering would show up as a graphic window or something. Can anyone tell >> me what's missing here? Thank you! ( Hui ) >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 28 15:12:44 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 May 2014 15:12:44 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU>, <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov> On May 28, 2014, at 2:13 PM, Sun, Hui wrote: > Do I have to turn it on thru ./configure and then make everything again? No, just run the program with the option. For example if there is printed -dm_view_draw then run the program with -dm_view_draw true Barry > > From: Matthew Knepley [knepley at gmail.com] > Sent: Wednesday, May 28, 2014 12:10 PM > To: Sun, Hui > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? > > The is the current value. They are all false because you have not turned them on. IF you are using the release version, > the viewing option is -da_view. The -dm_view is the new version which we are about to release. > > Thanks, > > Matt > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 11:25 AM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. 
> > Barry > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From hus003 at ucsd.edu Wed May 28 15:16:48 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Wed, 28 May 2014 20:16:48 +0000 Subject: [petsc-users] Question about dm_view In-Reply-To: <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU>, <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU>, <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B70E6@XMAIL-MBX-BH1.AD.UCSD.EDU> Thanks, now I get it working. -Hui ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Wednesday, May 28, 2014 1:12 PM To: Sun, Hui Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view On May 28, 2014, at 2:13 PM, Sun, Hui wrote: > Do I have to turn it on thru ./configure and then make everything again? No, just run the program with the option. For example if there is printed -dm_view_draw then run the program with -dm_view_draw true Barry > > From: Matthew Knepley [knepley at gmail.com] > Sent: Wednesday, May 28, 2014 12:10 PM > To: Sun, Hui > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? > > The is the current value. They are all false because you have not turned them on. IF you are using the release version, > the viewing option is -da_view. The -dm_view is the new version which we are about to release. > > Thanks, > > Matt > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 11:25 AM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. > > Barry > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. 
Can anyone tell me what's missing here? Thank you! ( Hui ) > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From danyang.su at gmail.com Wed May 28 16:57:54 2014 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 28 May 2014 14:57:54 -0700 Subject: [petsc-users] Running problem with pc_type hypre Message-ID: <53865BE2.2060807@gmail.com> Hi All, I am testing my codes under windows with PETSc V3.4.4. When running with option -pc_type hypre using 1 processor, the program exactly uses 6 processors (my computer is 6 processors 12 threads) and the program crashed after many timesteps. The error information is as follows: job aborted: [ranks] message [0] fatal error Fatal error in MPI_Comm_create: Internal MPI error!, error stack: MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, group=0xc80300f2, new_comm=0x000000001EA6DD30) failed MPI_Comm_create(524).......: MPIR_Comm_create_intra(209): MPIR_Get_contextid(253)....: Too many communicators When running with option -pc_type hypre using 2 processors or more, the program exactly uses all the threads, making the system seriously overburden and the program runs very slowly. When running without -pc_type hypre, the program works fine without any problem. Does anybody have the same problem in windows. Thanks and regards, Danyang From bsmith at mcs.anl.gov Wed May 28 18:01:02 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 May 2014 18:01:02 -0500 Subject: [petsc-users] Running problem with pc_type hypre In-Reply-To: <53865BE2.2060807@gmail.com> References: <53865BE2.2060807@gmail.com> Message-ID: Some possibilities: Are you sure that the hypre was compiled with exactly the same MPI as the that used to build PETSc? On May 28, 2014, at 4:57 PM, Danyang Su wrote: > Hi All, > > I am testing my codes under windows with PETSc V3.4.4. > > When running with option -pc_type hypre using 1 processor, the program exactly uses 6 processors (my computer is 6 processors 12 threads) 6 threads? or 6 processes? It should not be possible for it to use more processes then what you start the program with. hypre can be configured to use OpenMP thread parallelism PLUS MPI parallelism. Was it configured/compiled for that? If so you want to turn that off, configure and compile hypre before linking to PETSc so it does not use OpenMP. Are you sure you don?t have a bunch of zombie MPI processes running from previous jobs that crashed. They suck up CPU but are not involved in the current MPI run. Reboot the machine to get rid of them all. Barry > and the program crashed after many timesteps. The error information is as follows: > > job aborted: > [ranks] message > > [0] fatal error > Fatal error in MPI_Comm_create: Internal MPI error!, error stack: > MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, group=0xc80300f2, new_comm=0x000000001EA6DD30) failed > MPI_Comm_create(524).......: > MPIR_Comm_create_intra(209): > MPIR_Get_contextid(253)....: Too many communicators > > When running with option -pc_type hypre using 2 processors or more, the program exactly uses all the threads, making the system seriously overburden and the program runs very slowly. > > When running without -pc_type hypre, the program works fine without any problem. > > Does anybody have the same problem in windows. 
> > Thanks and regards, > > Danyang From danyang.su at gmail.com Wed May 28 18:10:52 2014 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 28 May 2014 16:10:52 -0700 Subject: [petsc-users] Running problem with pc_type hypre In-Reply-To: References: <53865BE2.2060807@gmail.com> Message-ID: <53866CFC.1080401@gmail.com> Hi Barry, I need further check on it. Running this executable file on another machine results into mkl_intel_thread.dll missing error. I am not sure at present if the mkl_intel_thread.dll version causes this problem. Thanks, Danyang On 28/05/2014 4:01 PM, Barry Smith wrote: > Some possibilities: > > Are you sure that the hypre was compiled with exactly the same MPI as the that used to build PETSc? > > On May 28, 2014, at 4:57 PM, Danyang Su wrote: > >> Hi All, >> >> I am testing my codes under windows with PETSc V3.4.4. >> >> When running with option -pc_type hypre using 1 processor, the program exactly uses 6 processors (my computer is 6 processors 12 threads) > 6 threads? or 6 processes? It should not be possible for it to use more processes then what you start the program with. > > hypre can be configured to use OpenMP thread parallelism PLUS MPI parallelism. Was it configured/compiled for that? If so you want to turn that off, > configure and compile hypre before linking to PETSc so it does not use OpenMP. > > Are you sure you don?t have a bunch of zombie MPI processes running from previous jobs that crashed. They suck up CPU but are not involved in the current MPI run. Reboot the machine to get rid of them all. > > Barry > >> and the program crashed after many timesteps. The error information is as follows: >> >> job aborted: >> [ranks] message >> >> [0] fatal error >> Fatal error in MPI_Comm_create: Internal MPI error!, error stack: >> MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, group=0xc80300f2, new_comm=0x000000001EA6DD30) failed >> MPI_Comm_create(524).......: >> MPIR_Comm_create_intra(209): >> MPIR_Get_contextid(253)....: Too many communicators >> >> When running with option -pc_type hypre using 2 processors or more, the program exactly uses all the threads, making the system seriously overburden and the program runs very slowly. >> >> When running without -pc_type hypre, the program works fine without any problem. >> >> Does anybody have the same problem in windows. >> >> Thanks and regards, >> >> Danyang From bsmith at mcs.anl.gov Wed May 28 18:27:46 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 May 2014 18:27:46 -0500 Subject: [petsc-users] Running problem with pc_type hypre In-Reply-To: <53866CFC.1080401@gmail.com> References: <53865BE2.2060807@gmail.com> <53866CFC.1080401@gmail.com> Message-ID: <3B72A454-9960-4AA5-BB85-E5598EB973E8@mcs.anl.gov> This could be an issue. In general with PETSc you want to link against MKL libraries that DO NOT use threading. Otherwise you get oversubscription to the threads. Barry On May 28, 2014, at 6:10 PM, Danyang Su wrote: > Hi Barry, > > I need further check on it. Running this executable file on another machine results into mkl_intel_thread.dll missing error. I am not sure at present if the mkl_intel_thread.dll version causes this problem. > > Thanks, > > Danyang > > On 28/05/2014 4:01 PM, Barry Smith wrote: >> Some possibilities: >> >> Are you sure that the hypre was compiled with exactly the same MPI as the that used to build PETSc? >> >> On May 28, 2014, at 4:57 PM, Danyang Su wrote: >> >>> Hi All, >>> >>> I am testing my codes under windows with PETSc V3.4.4. 
>>> >>> When running with option -pc_type hypre using 1 processor, the program exactly uses 6 processors (my computer is 6 processors 12 threads) >> 6 threads? or 6 processes? It should not be possible for it to use more processes then what you start the program with. >> >> hypre can be configured to use OpenMP thread parallelism PLUS MPI parallelism. Was it configured/compiled for that? If so you want to turn that off, >> configure and compile hypre before linking to PETSc so it does not use OpenMP. >> >> Are you sure you don?t have a bunch of zombie MPI processes running from previous jobs that crashed. They suck up CPU but are not involved in the current MPI run. Reboot the machine to get rid of them all. >> >> Barry >> >>> and the program crashed after many timesteps. The error information is as follows: >>> >>> job aborted: >>> [ranks] message >>> >>> [0] fatal error >>> Fatal error in MPI_Comm_create: Internal MPI error!, error stack: >>> MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, group=0xc80300f2, new_comm=0x000000001EA6DD30) failed >>> MPI_Comm_create(524).......: >>> MPIR_Comm_create_intra(209): >>> MPIR_Get_contextid(253)....: Too many communicators >>> >>> When running with option -pc_type hypre using 2 processors or more, the program exactly uses all the threads, making the system seriously overburden and the program runs very slowly. >>> >>> When running without -pc_type hypre, the program works fine without any problem. >>> >>> Does anybody have the same problem in windows. >>> >>> Thanks and regards, >>> >>> Danyang > From mfadams at lbl.gov Wed May 28 21:54:28 2014 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 28 May 2014 22:54:28 -0400 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <538369C9.6010209@uci.edu> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> Message-ID: On Mon, May 26, 2014 at 12:20 PM, Michele Rosso wrote: > Mark, > > thank you for your input and sorry my late reply: I saw your email only > now. > By setting up the solver each time step you mean re-defining the KSP > context every time? > THe simplest thing is to just delete the object and create it again. THere are "reset" methods that do the same thing semantically but it is probably just easier to destroy the KSP object and recreate it and redo your setup code. > Why should this help? > AMG methods optimized for a particular operator but "stale" setup data often work well on problems that evolve, at least for a while, and it saves a lot of time to not redo the "setup" every time. How often you should "refresh" the setup data is problem dependant and the application needs to control that. There are some hooks to fine tune how much setup data is recomputed each solve, but we are just trying to see if redoing the setup every time helps. If this fixes the problem then we can think about cost. If it does not fix the problem then it is more serious. > I will definitely try that as well as the hypre solution and report back. > Again, thank you. > > Michele > > > On 05/22/2014 09:34 AM, Mark Adams wrote: > > If the solver is degrading as the coefficients change, and I would assume > get more nasty, you can try deleting the solver at each time step. This > will be about 2x more expensive, because it does the setup each solve, but > it might fix your problem. 
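A minimal sketch of what deleting the solver at each time step could look like inside the time loop (hedged: A, b, x and nsteps are the application's own objects, and KSPSetOperators is shown with the 3.4-style MatStructure argument, which later versions drop):

    for (step = 0; step < nsteps; step++) {
      /* ... update the matrix A and right-hand side b for this step ... */
      ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
      ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
      ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* picks up -pc_type gamg / hypre etc. */
      ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
      ierr = KSPDestroy(&ksp);CHKERRQ(ierr);         /* throws away all preconditioner setup data */
    }

The "reset" route mentioned above (KSPReset() on a solver object that is kept around) achieves much the same thing without recreating the object.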
> > You also might try: > > -pc_type hypre > -pc_hypre_type boomeramg > > > > > On Mon, May 19, 2014 at 6:49 PM, Jed Brown wrote: > >> Michele Rosso writes: >> >> > Jed, >> > >> > thank you very much! >> > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type >> > sor/ and report back. >> > Yes, I removed the nullspace from both the system matrix and the rhs. >> > Is there a way to have something similar to Dendy's multigrid or the >> > deflated conjugate gradient method with PETSc? >> >> Dendy's MG needs geometry. The algorithm to produce the interpolation >> operators is not terribly complicated so it could be done, though DMDA >> support for cell-centered is a somewhat awkward. "Deflated CG" can mean >> lots of things so you'll have to be more precise. (Most everything in >> the "deflation" world has a clear analogue in the MG world, but the >> deflation community doesn't have a precise language to talk about their >> methods so you always have to read the paper carefully to find out if >> it's completely standard or if there is something new.) >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From qince168 at gmail.com Thu May 29 01:08:01 2014 From: qince168 at gmail.com (Ce Qin) Date: Thu, 29 May 2014 14:08:01 +0800 Subject: [petsc-users] How to use cmake to get external libraries? Message-ID: Dear all, I'n now using cmake to build my project. I had tried Jed's FindPETSc module, it works pretty fine. However, the variable PETSC_LIBRARIES it generates only have libpetsc.so. And I want to use the external libraries directly, so I need to get the whole libraries like PETSC_LIB in petscvariables. Can anyone provide some hints? Thanks in advance. Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 29 01:15:03 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 29 May 2014 00:15:03 -0600 Subject: [petsc-users] How to use cmake to get external libraries? In-Reply-To: References: Message-ID: <87vbspyuu0.fsf@jedbrown.org> Ce Qin writes: > Dear all, > > I'n now using cmake to build my project. I had tried Jed's FindPETSc > module, it works pretty fine. However, the variable PETSC_LIBRARIES it > generates only have libpetsc.so. And I want to use the external libraries > directly, so I need to get the whole libraries like PETSC_LIB in > petscvariables. Can anyone provide some hints? Best to have Find${OtherLibrary}.cmake for those libraries. It tangles dependencies and will be harder to maintain in the long run if you modify FindPETSc.cmake to provide access to optional third-party libraries. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From qince168 at gmail.com Thu May 29 01:32:25 2014 From: qince168 at gmail.com (Ce Qin) Date: Thu, 29 May 2014 14:32:25 +0800 Subject: [petsc-users] How to use cmake to get external libraries? In-Reply-To: <87vbspyuu0.fsf@jedbrown.org> References: <87vbspyuu0.fsf@jedbrown.org> Message-ID: Thanks for your quick reply, Jed. The find modules of many scientific libraries are not available. I also considered using Makefile to build my project, but writing a extensible Makefile (supporting out-of-source build) is not easy to do. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mrosso at uci.edu Thu May 29 01:44:34 2014 From: mrosso at uci.edu (Michele Rosso) Date: Wed, 28 May 2014 23:44:34 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> Message-ID: <5386D752.6080103@uci.edu> Thanks Mark! I will try and let you know. On 05/28/2014 07:54 PM, Mark Adams wrote: > > > > On Mon, May 26, 2014 at 12:20 PM, Michele Rosso > wrote: > > Mark, > > thank you for your input and sorry my late reply: I saw your email > only now. > By setting up the solver each time step you mean re-defining the > KSP context every time? > > > THe simplest thing is to just delete the object and create it again. > THere are "reset" methods that do the same thing semantically but it > is probably just easier to destroy the KSP object and recreate it and > redo your setup code. > > Why should this help? > > > AMG methods optimized for a particular operator but "stale" setup data > often work well on problems that evolve, at least for a while, and it > saves a lot of time to not redo the "setup" every time. How often you > should "refresh" the setup data is problem dependant and the > application needs to control that. There are some hooks to fine tune > how much setup data is recomputed each solve, but we are just trying > to see if redoing the setup every time helps. If this fixes the > problem then we can think about cost. If it does not fix the problem > then it is more serious. > > I will definitely try that as well as the hypre solution and > report back. > Again, thank you. > > Michele > > > On 05/22/2014 09:34 AM, Mark Adams wrote: >> If the solver is degrading as the coefficients change, and I >> would assume get more nasty, you can try deleting the solver at >> each time step. This will be about 2x more expensive, because it >> does the setup each solve, but it might fix your problem. >> >> You also might try: >> >> -pc_type hypre >> -pc_hypre_type boomeramg >> >> >> >> >> On Mon, May 19, 2014 at 6:49 PM, Jed Brown > > wrote: >> >> Michele Rosso > writes: >> >> > Jed, >> > >> > thank you very much! >> > I will try with ///-mg_levels_ksp_type chebyshev >> -mg_levels_pc_type >> > sor/ and report back. >> > Yes, I removed the nullspace from both the system matrix >> and the rhs. >> > Is there a way to have something similar to Dendy's >> multigrid or the >> > deflated conjugate gradient method with PETSc? >> >> Dendy's MG needs geometry. The algorithm to produce the >> interpolation >> operators is not terribly complicated so it could be done, >> though DMDA >> support for cell-centered is a somewhat awkward. "Deflated >> CG" can mean >> lots of things so you'll have to be more precise. (Most >> everything in >> the "deflation" world has a clear analogue in the MG world, >> but the >> deflation community doesn't have a precise language to talk >> about their >> methods so you always have to read the paper carefully to >> find out if >> it's completely standard or if there is something new.) >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 29 01:49:26 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 29 May 2014 00:49:26 -0600 Subject: [petsc-users] How to use cmake to get external libraries? 
In-Reply-To: References: <87vbspyuu0.fsf@jedbrown.org> Message-ID: <87sintyt8p.fsf@jedbrown.org> Ce Qin writes: > The find modules of many scientific libraries are not available. I'm sorry, but this is their problem (or your problem, or CMake's problem). PETSc provides interfaces to many other packages, but we can't support every aspect of direct use of those packages. It's easy to make my FindPETSc.cmake use the full list of libraries, but I won't make that change and I won't support that use. I.e., if you are going to tangle dependencies like that, I'm not the one responsible for supporting it. > I also considered using Makefile to build my project, but writing a > extensible Makefile (supporting out-of-source build) is not easy to > do. Makefiles with out-of-source support are not difficult, at least if you use gnumake. The PETSc build (in 'master') is one example. Here is one for a simpler project: https://bitbucket.org/hpgmg/hpgmg/src Look at base.mk and local.mk. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From qince168 at gmail.com Thu May 29 02:01:20 2014 From: qince168 at gmail.com (Ce Qin) Date: Thu, 29 May 2014 15:01:20 +0800 Subject: [petsc-users] How to use cmake to get external libraries? In-Reply-To: <87sintyt8p.fsf@jedbrown.org> References: <87vbspyuu0.fsf@jedbrown.org> <87sintyt8p.fsf@jedbrown.org> Message-ID: Thanks, I will look at it. Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmulas at oa-cagliari.inaf.it Thu May 29 04:45:18 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Thu, 29 May 2014 11:45:18 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: Hi Jose, nice hearing from you. Thanks for your help. On Wed, 28 May 2014, Jose E. Roman wrote: > I don't see much problem. > >> >> 2) should I destroy and recreate the eps after each call to EPSSolve and >> before next call? Or, since the underlying matrix is always the same, can I >> just call EPSSetInitialSpace(), EPSSetDeflationSpace(), update the internal >> parameter to be passed to the arbitrary selection function and I can call >> again EPSSolve? > > No need to recreate the solver. The only thing is EPSSetDeflationSpace() - > I would suggest calling EPSRemoveDeflationSpace() and then > EPSSetDeflationSpace() again with the extended set of vectors. Do not > call EPSSetDeflationSpace() with a single vector every time. Yes, indeed. I did not detail it in my description but what I do is keep an array Vec[] (allocated large enough at the beginning) and I attach eigenvectors to it as I find them. After every call to EPSSolve I do ierr = EPSGetConverged(eps, &nconverged); CHKERRQ(ierr); if (nconverged > 0) { for (petsck=0; petsck<=nconverged-1; petsck++) { EPSGetEigenpair(eps, petsck, &lambdar, &lambdai, xr, xi); newtotconverged = totconverged+petsk; ierr = VecDuplicate(xr, Cv+newtotconverged); ierr = VecCopy(xr, Cv[newtotconverged]); } totconverged += nconverged; ierr = EPSSetDeflationSpace(eps, totconverged, Cv); } So every time I call EPSSetDeflationSpace() I do give it the complete set of eigenvectors found so far, and their number, including the previously found ones. 
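A cleaned-up sketch of that accumulation step, with EPSRemoveDeflationSpace() folded in as suggested in the reply above (error checking abbreviated, variable names as in the snippet):

    ierr = EPSGetConverged(eps,&nconverged);CHKERRQ(ierr);
    if (nconverged > 0) {
      for (petsck = 0; petsck < nconverged; petsck++) {
        ierr = EPSGetEigenpair(eps,petsck,&lambdar,&lambdai,xr,xi);CHKERRQ(ierr);
        ierr = VecDuplicate(xr,&Cv[totconverged+petsck]);CHKERRQ(ierr);
        ierr = VecCopy(xr,Cv[totconverged+petsck]);CHKERRQ(ierr);
      }
      totconverged += nconverged;
      ierr = EPSRemoveDeflationSpace(eps);CHKERRQ(ierr);              /* drop the previous space */
      ierr = EPSSetDeflationSpace(eps,totconverged,Cv);CHKERRQ(ierr); /* set the full extended set */
    }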
In the man page of EPSSetDeflationSpace it says that if another deflation space was previously defined, it is killed and replaced by the new one, that's why I did not explicitly call EPSRemoveDeflationSpace() until I'm done. Should I instead call EPSRemoveDeflationSpace every time before calling EPSSetDeflationSpace? i.e. add a call to EPSRemoveDeflationSpace() before the one to EPSSetDeflationSpace() in the snippet above? >> 3) Since what I want is going on to find one eigenpair at a time of the same >> problem until some condition is fulfilled, is there a way in which I can >> achieve this without setting it up again and again every time? Can I specify >> an arbitrary function that is called by EPSSolve to decide whether enough >> eigenpairs were computed or not, instead of doing it in this somewhat >> awkward manner? > > No. ok. Then another question comes after this: how much overhead is involved in setting up the deflation space etc. for every eigenvector that is computed? I guess that EPSSolve, when more than one eigenpair is needed, actually does the internal book-keeping of adding the already converged eigenvectors to the deflation space, so doing it by hand out of EPSSolve looks inefficient (even if maybe it's not so bad, I don't know the internals of the code). If I have a reasonable guess of how many eigenvectors I will need, would it be much more efficient to determine more eigenvectors than one with each EPSSolve call, with the risk of wasting time finding a few more than I need? How expensive is finding one eigenpair compared to the overhead of setting up a new deflation space and calling EPSSolve repeatedly? How do the costs scale with the size of the matrix, and the size of the deflation space? After I get at least one version of my code working properly, I will try doing some tests with this, if it's useful. Bye, thanks Giacomo -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. : +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From gmulas at oa-cagliari.inaf.it Thu May 29 10:03:31 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Thu, 29 May 2014 17:03:31 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: Hi Jose, and list. I am in the process of writing the function to use with EPSSetArbitrarySelection(). Inside it, I will need to take some given component (which one is included in the info passed via ctx) of the eigenvector and square it. To do this, since the eigenvector is not necessarily local, I will need to first do a scatter to a local 1-component vector. So this would be like: ... some omitted machinery to cast the info from *ctx to more easily accessible form... 
ierr = ISCreateStride(PETXC_COMM_WORLD,1,myindex,1,&is1_from);CHKERRQ(ierr); ierr = ISCreateStride(PETSC_COMM_WORLD,1,0,1,&is1_to);CHKERRQ(ierr); ierr = VecCreateSeq(PETSC_COMM_SELF, 1, &localx1);CHKERRQ(ierr); ierr = VecScatterCreate(xr,is1_from,localx1,is1_to,&scatter1); CHKERRQ(ierr); ierr = VecScatterBegin(scatter1,xr,localx1,INSERT_VALUES, SCATTER_FORWARD); ierr = VecScatterEnd(scatter1,xr,localx1,INSERT_VALUES, SCATTER_FORWARD); ierr = VecGetArray(localx1,&comp); *rr = comp*comp; ierr = VecRestoreArray(localx1, &comp); ierr = VecDestroy(localx1); ierr = VecScatterDestroy(&scatter1); ierr = ISDestroy(&is1_from); ierr = ISDestroy(&is1_to); *ri = 0; ... some internal housekeeping omitted return 0; The questions are: 1) when the arbitrary function is called, is it called on all nodes simultaneously, so that collective functions can be expected to work properly, being called on all involved nodes at the same time? Should all processes compute the *rr and *ri to be returned, and return the same value? would it be more efficient to create a unit vector uv containing only one nonzero component, and then use VecDot(xr, uv, &comp), instead of pulling the component I need and squaring it as I did above? 2) since the stride, the 1-component vector, the scatter are presumably the same through all calls within one EPSSolve, can I take them out of the arbitrary function, and make them available to it through *ctx? For this to work, the structure of xr, the eigenvector passed to the arbitrary function, must be known outside of EPSSolve. Thanks, bye Giacomo -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. : +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From knepley at gmail.com Thu May 29 10:29:54 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 10:29:54 -0500 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: On Thu, May 29, 2014 at 10:03 AM, Giacomo Mulas wrote: > Hi Jose, and list. > > I am in the process of writing the function to use with > EPSSetArbitrarySelection(). > > Inside it, I will need to take some given component (which one is included > in the info passed via ctx) of the eigenvector and square it. To do this, > since the eigenvector is not necessarily local, I will need to first do a > scatter to a local 1-component vector. So this would be like: > > ... some omitted machinery to cast the info from *ctx to more easily > accessible form... > There might be an easier way to do this: PetscScalar val = 0.0, gval; VecGetOwnershipRange(xr, &low, &high); if ((myindex >= low) && (myindex < high)) { VecGetArray(localx1,&a); val = a[myindex-low]; VecRestoreArray(localx1, &a); } MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); Now everyone has the value at myindex. 
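Folded into the arbitrary-selection callback, a sketch might look like the following (assuming the (er,ei,xr,xi,rr,ri,ctx) callback signature that EPSSetArbitrarySelection expects, and that ctx simply carries the target component index):

    PetscErrorCode computeprojection(PetscScalar er,PetscScalar ei,Vec xr,Vec xi,
                                     PetscScalar *rr,PetscScalar *ri,void *ctx)
    {
      PetscInt       myindex = *(PetscInt*)ctx;   /* target component, set by the caller */
      PetscInt       low,high;
      PetscScalar    val = 0.0,gval,*a;
      PetscErrorCode ierr;

      ierr = VecGetOwnershipRange(xr,&low,&high);CHKERRQ(ierr);
      if (myindex >= low && myindex < high) {     /* only the owner reads the entry */
        ierr = VecGetArray(xr,&a);CHKERRQ(ierr);
        val  = a[myindex-low];
        ierr = VecRestoreArray(xr,&a);CHKERRQ(ierr);
      }
      ierr = MPI_Allreduce(&val,&gval,1,MPIU_SCALAR,MPI_SUM,PETSC_COMM_WORLD);CHKERRQ(ierr);
      *rr = gval*gval;   /* every process returns the same squared component */
      *ri = 0.0;
      return 0;
    }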
Matt > ierr = ISCreateStride(PETXC_COMM_WORLD,1,myindex,1,&is1_from); > CHKERRQ(ierr); > ierr = ISCreateStride(PETSC_COMM_WORLD,1,0,1,&is1_to);CHKERRQ(ierr); > ierr = VecCreateSeq(PETSC_COMM_SELF, 1, &localx1);CHKERRQ(ierr); > ierr = VecScatterCreate(xr,is1_from,localx1,is1_to,&scatter1); > CHKERRQ(ierr); > ierr = VecScatterBegin(scatter1,xr,localx1,INSERT_VALUES, > SCATTER_FORWARD); > ierr = VecScatterEnd(scatter1,xr,localx1,INSERT_VALUES, > SCATTER_FORWARD); > ierr = VecGetArray(localx1,&comp); > *rr = comp*comp; > ierr = VecRestoreArray(localx1, &comp); > ierr = VecDestroy(localx1); > ierr = VecScatterDestroy(&scatter1); > ierr = ISDestroy(&is1_from); > ierr = ISDestroy(&is1_to); > *ri = 0; > > ... some internal housekeeping omitted > > return 0; > > The questions are: > > 1) when the arbitrary function is called, is it called on all nodes > simultaneously, so that collective functions can be expected to work > properly, being called on all involved nodes at the same time? Should all > processes compute the *rr and *ri to be returned, and return the same > value? > would it be more efficient to create a unit vector uv containing only one > nonzero component, and then use VecDot(xr, uv, &comp), instead of pulling > the component I need and squaring it as I did above? > > > 2) since the stride, the 1-component vector, the scatter are presumably > the same through all calls within one EPSSolve, can I take them out of the > arbitrary function, and make them available to it through *ctx? For this > to work, the structure of xr, the eigenvector passed to the arbitrary > function, must be known outside of EPSSolve. > > > Thanks, bye > Giacomo > > -- > _________________________________________________________________ > > Giacomo Mulas > _________________________________________________________________ > > INAF - Osservatorio Astronomico di Cagliari > via della scienza 5 - 09047 Selargius (CA) > > tel. +39 070 71180244 > mob. : +39 329 6603810 > _________________________________________________________________ > > "When the storms are raging around you, stay right where you are" > (Freddy Mercury) > _________________________________________________________________ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From masghar1397 at gmail.com Thu May 29 10:38:13 2014 From: masghar1397 at gmail.com (M Asghar) Date: Thu, 29 May 2014 16:38:13 +0100 Subject: [petsc-users] Accessing MUMPS INFOG values Message-ID: Hi, Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. Many thanks in advance. M Asghar -------------- next part -------------- An HTML attachment was scrubbed... 
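One possible route, sketched under the assumption that the EPS spectral transform uses a KSP/PC backed by a MUMPS LU factorization and that the PETSc build provides the MatMumpsGetInfog() accessor (INFOG(1) is MUMPS' global error flag, negative on failure):

    ST             st;
    KSP            ksp;
    PC             pc;
    Mat            F;
    PetscInt       infog1;
    PetscErrorCode ierr;

    ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
    ierr = STGetKSP(st,&ksp);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr);       /* the factored (MUMPS) matrix */
    ierr = MatMumpsGetInfog(F,1,&infog1);CHKERRQ(ierr);  /* INFOG(1): 0 = success, < 0 = error */
    if (infog1 < 0) {
      /* take remedial action, e.g. raise the ICNTL(14) workspace relaxation and refactor */
    }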
URL: From hzhang at mcs.anl.gov Thu May 29 10:43:19 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 29 May 2014 10:43:19 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: <6ebee9bb5c5942d69d1f546025e9660f@LUCKMAN.anl.gov> References: <6ebee9bb5c5942d69d1f546025e9660f@LUCKMAN.anl.gov> Message-ID: Asghar: > Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG > etc) via the PETSc interface? Yes. Use the latest petsc (master branch). See petsc/src/ksp/ksp/examples/tutorials/ex52.c Hong > > I am working with SLEPc and am using MUMPS for the factorisation. I would > like to access the contents of their INFOG array within our code > particularly when an error occurs in order to determine whether any remedial > action can be taken. The error code returned from PETSc is useful; any > additional information from MUMPS that can be accessed from within ones code > would be very helpful also. > > Many thanks in advance. > > M Asghar > From gmulas at oa-cagliari.inaf.it Thu May 29 10:58:19 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Thu, 29 May 2014 17:58:19 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: On Thu, 29 May 2014, Matthew Knepley wrote: > There might be an easier way to do this: > ? PetscScalar val = 0.0, gval; > > ??VecGetOwnershipRange(xr, &low, &high); > ? if ((myindex >= low) && (myindex < high)) { > ? ? VecGetArray(localx1,&a); > ? ? val = a[myindex-low]; > ? ? VecRestoreArray(localx1, &a); > ? } > ? MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > > Now everyone has the value at myindex. brilliant, why didn't I think of this? Only, I guess you were copying/pasting and some variable names slipped, namely localx instead of xr. Should it be ? PetscScalar val = 0.0, gval; PetscScalar *a; ??VecGetOwnershipRange(xr, &low, &high); ? if ((myindex >= low) && (myindex < high)) { ? ? VecGetArray(xr,&a); ? ? val = a[myindex-low]; ? ? VecRestoreArray(xr, &a); ? } ? MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); *rr = gval*gval; *ri = 0; ? Thanks! Giacomo -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. : +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From hus003 at ucsd.edu Thu May 29 11:04:39 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Thu, 29 May 2014 16:04:39 +0000 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B70E6@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU>, <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU>, <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov>, <7501CC2B7BBCC44A92ECEEC316170ECB6B70E6@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B738B@XMAIL-MBX-BH1.AD.UCSD.EDU> A continuing problem: While I was running ./ex5, the output is normal. 
When I was running ./ex5 -help | grep whatever, the output is still normal. However, when I tried ./ex5 -help | head -20, it output the first 20 lines from help, then it output the some error message. I'm curious why there is such an error message. The error message is pasted below. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading or writing to a socket [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./ex5 on a darwin-op named blablabla by blablabla Thu May 29 08:57:47 2014 [0]PETSC ERROR: Libraries linked from /usr/local/petsc-3.1-p8/darwin-opt/lib [0]PETSC ERROR: Configure run at Tue Mar 11 16:25:14 2014 [0]PETSC ERROR: Configure options --CC=/usr/local/openmpi-1.4.3/bin/mpicc --CXX=/usr/local/openmpi-1.4.3/bin/mpicxx --FC=/usr/local/openmpi-1.4.3/bin/mpif90 --LDFLAGS="-L/usr/local/openmpi-1.4.3/lib -Wl,-rpath,/usr/local/openmpi-1.4.3/lib" --PETSC_ARCH=darwin-opt --with-debugging=0 --with-hypre=1 --with-blas-lapack-lib --with-c++-support --download-hypre --download-f-blas-lapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. ________________________________________ From: Sun, Hui Sent: Wednesday, May 28, 2014 1:16 PM To: Barry Smith Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Question about dm_view Thanks, now I get it working. -Hui ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Wednesday, May 28, 2014 1:12 PM To: Sun, Hui Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view On May 28, 2014, at 2:13 PM, Sun, Hui wrote: > Do I have to turn it on thru ./configure and then make everything again? No, just run the program with the option. 
For example if there is printed -dm_view_draw then run the program with -dm_view_draw true Barry > > From: Matthew Knepley [knepley at gmail.com] > Sent: Wednesday, May 28, 2014 12:10 PM > To: Sun, Hui > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? > > The is the current value. They are all false because you have not turned them on. IF you are using the release version, > the viewing option is -da_view. The -dm_view is the new version which we are about to release. > > Thanks, > > Matt > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 11:25 AM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. > > Barry > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From masghar1397 at gmail.com Thu May 29 11:09:14 2014 From: masghar1397 at gmail.com (M Asghar) Date: Thu, 29 May 2014 17:09:14 +0100 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: References: <6ebee9bb5c5942d69d1f546025e9660f@LUCKMAN.anl.gov> Message-ID: Hi, Many thanks for the quick reply! I can see calls to MatMumpsGetInfog in ex52.c. * This is in PETSc's dev copy if I'm not mistaken - will this make it into the next PETSc release? * Will/does this have a Fortran equivalent? Many thanks, M Asghar On Thu, May 29, 2014 at 4:43 PM, Hong Zhang wrote: > Asghar: > > Is it possible to access the contents of MUMPS array INFOG (and INFO, > RINFOG > > etc) via the PETSc interface? > > Yes. Use the latest petsc (master branch). > See petsc/src/ksp/ksp/examples/tutorials/ex52.c > > Hong > > > > I am working with SLEPc and am using MUMPS for the factorisation. I would > > like to access the contents of their INFOG array within our code > > particularly when an error occurs in order to determine whether any > remedial > > action can be taken. The error code returned from PETSc is useful; any > > additional information from MUMPS that can be accessed from within ones > code > > would be very helpful also. > > > > Many thanks in advance. > > > > M Asghar > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From D.Lathouwers at tudelft.nl Thu May 29 11:31:00 2014 From: D.Lathouwers at tudelft.nl (Danny Lathouwers - TNW) Date: Thu, 29 May 2014 16:31:00 +0000 Subject: [petsc-users] rtol meaning Message-ID: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> Dear users, I have a problem where I step time and repeatedly solve a system with differing rhs. At some time step petsc solver returns that initial solution is good enough (converged reason = 2 with 0 iterations done). I do not expect this behaviour. I use rtol = 0.001, atol=0 and dtol =large number. The manual seems to suggest the criterion is: rnorm < MAX (rtol * rnorm_0, abstol) (probably based on preconditioned residual). How could this lead to zero iterations being done? Or is the criterion based on rnorm/bnorm instead (which I found in some reference on the internet concerning petsc and would explain the observed behaviour). Thanks, Danny. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 29 11:58:51 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 11:58:51 -0500 Subject: [petsc-users] rtol meaning In-Reply-To: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> References: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> Message-ID: On Thu, May 29, 2014 at 11:31 AM, Danny Lathouwers - TNW < D.Lathouwers at tudelft.nl> wrote: > Dear users, > > > > I have a problem where I step time and repeatedly solve a system with > differing rhs. > > At some time step petsc solver returns that initial solution is good > enough (converged reason = 2 with 0 iterations done). > > I do not expect this behaviour. I use rtol = 0.001, atol=0 and dtol =large > number. > > > > The manual seems to suggest the criterion is: rnorm < MAX (rtol * rnorm_0, abstol) (probably based on preconditioned residual). > > How could this lead to zero iterations being done? Or is the criterion based on rnorm/bnorm instead (which I found in some reference on the internet concerning petsc and would explain the observed behaviour). > > Its ||b||, not ||r_0||. You can change it to get the other behavior, as detailed here http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/interface/iterativ.c.html#KSPConvergedDefault Matt > Thanks, > > Danny. > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 29 12:00:35 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 12:00:35 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B738B@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B70E6@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B738B@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On Thu, May 29, 2014 at 11:04 AM, Sun, Hui wrote: > A continuing problem: While I was running ./ex5, the output is normal. > When I was running ./ex5 -help | grep whatever, the output is still normal. 
> However, when I tried ./ex5 -help | head -20, it output the first 20 lines > from help, then it output the some error message. I'm curious why there is > such an error message. The error message is pasted below. > You will notice that the signal is "Broken Pipe". When 'head' is done, it sends a SIGPIPE to the process producing output. This is the standard behavior. Matt > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading > or writing to a socket > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 > CDT 2011 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./ex5 on a darwin-op named blablabla by blablabla Thu May > 29 08:57:47 2014 > [0]PETSC ERROR: Libraries linked from > /usr/local/petsc-3.1-p8/darwin-opt/lib > [0]PETSC ERROR: Configure run at Tue Mar 11 16:25:14 2014 > [0]PETSC ERROR: Configure options --CC=/usr/local/openmpi-1.4.3/bin/mpicc > --CXX=/usr/local/openmpi-1.4.3/bin/mpicxx > --FC=/usr/local/openmpi-1.4.3/bin/mpif90 > --LDFLAGS="-L/usr/local/openmpi-1.4.3/lib > -Wl,-rpath,/usr/local/openmpi-1.4.3/lib" --PETSC_ARCH=darwin-opt > --with-debugging=0 --with-hypre=1 --with-blas-lapack-lib --with-c++-support > --download-hypre --download-f-blas-lapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 > --FOPTFLAGS=-O3 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > > > > ________________________________________ > From: Sun, Hui > Sent: Wednesday, May 28, 2014 1:16 PM > To: Barry Smith > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] Question about dm_view > > Thanks, now I get it working. -Hui > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 1:12 PM > To: Sun, Hui > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > On May 28, 2014, at 2:13 PM, Sun, Hui wrote: > > > Do I have to turn it on thru ./configure and then make everything again? > > No, just run the program with the option. 
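To make the Broken Pipe explanation above concrete: once head has read its 20 lines and exited, the next write by the PETSc program goes to a pipe with no reader, and the kernel raises SIGPIPE (signal 13), which PETSc's error handler then reports. A generic POSIX sketch (plain C, not PETSc code) showing the mechanism, and how ignoring SIGPIPE turns the fatal signal into an ordinary EPIPE write error:

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
  int fd[2];

  if (pipe(fd)) return 1;
  close(fd[0]);               /* simulate the reader (head) going away */
  signal(SIGPIPE, SIG_IGN);   /* without this, the write below terminates the
                                 process with signal 13 (Broken Pipe) */
  if (write(fd[1], "hello\n", 6) < 0) {
    fprintf(stderr, "write failed: %s\n", strerror(errno));  /* EPIPE */
  }
  close(fd[1]);
  return 0;
}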
For example if there is > printed -dm_view_draw then run the program with -dm_view_draw true > > Barry > > > > > From: Matthew Knepley [knepley at gmail.com] > > Sent: Wednesday, May 28, 2014 12:10 PM > > To: Sun, Hui > > Cc: Barry Smith; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Question about dm_view > > > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it > comes out a list of options related to _view, all of which have the tag > , what does this mean? > > > > The is the current value. They are all false because you have > not turned them on. IF you are using the release version, > > the viewing option is -da_view. The -dm_view is the new version which we > are about to release. > > > > Thanks, > > > > Matt > > > > ________________________________________ > > From: Barry Smith [bsmith at mcs.anl.gov] > > Sent: Wednesday, May 28, 2014 11:25 AM > > To: Sun, Hui > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Question about dm_view > > > > Run as./ex5 -help | grep view to see the possibilities. It depends on > PETSc version number. When using the graphics want you generally want a > -draw_pause -1 to stop that program at the graphic otherwise it pops up and > disappears immediately. > > > > Barry > > > > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in > Imperial College from this site: Slides. In slide page 28, there is > description of viewing the DA. I'm testing from my MAC the same commands > listed on that page, for example, ex5 -dm_view, nothing interesting happen > except the Number of Newton iterations is outputted. I'm expecting that the > PETSc numbering would show up as a graphic window or something. Can anyone > tell me what's missing here? Thank you! ( Hui ) > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 29 12:01:30 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 12:01:30 -0500 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: On Thu, May 29, 2014 at 10:58 AM, Giacomo Mulas wrote: > On Thu, 29 May 2014, Matthew Knepley wrote: > > There might be an easier way to do this: >> PetscScalar val = 0.0, gval; >> >> VecGetOwnershipRange(xr, &low, &high); >> if ((myindex >= low) && (myindex < high)) { >> VecGetArray(localx1,&a); >> val = a[myindex-low]; >> VecRestoreArray(localx1, &a); >> } >> MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); >> >> Now everyone has the value at myindex. >> > > brilliant, why didn't I think of this? Only, I guess you were > copying/pasting and some variable names slipped, namely localx instead of > xr. 
Should it be >

Yes

   Matt

> PetscScalar val = 0.0, gval;
> PetscScalar *a;
>
> VecGetOwnershipRange(xr, &low, &high);
> if ((myindex >= low) && (myindex < high)) {
>   VecGetArray(xr,&a);
>   val = a[myindex-low];
>   VecRestoreArray(xr, &a);
> }
> MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD);
> *rr = gval*gval;
> *ri = 0;
>
> ?
>
> Thanks!
> Giacomo
>
> --
> _________________________________________________________________
>
> Giacomo Mulas
> _________________________________________________________________
>
> INAF - Osservatorio Astronomico di Cagliari
> via della scienza 5 - 09047 Selargius (CA)
>
> tel. +39 070 71180244
> mob. : +39 329 6603810
> _________________________________________________________________
>
> "When the storms are raging around you, stay right where you are"
> (Freddy Mercury)
> _________________________________________________________________

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lu_qin_2000 at yahoo.com Thu May 29 13:23:53 2014
From: lu_qin_2000 at yahoo.com (Qin Lu)
Date: Thu, 29 May 2014 11:23:53 -0700 (PDT)
Subject: [petsc-users] About parallel performance
Message-ID: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com>

Hello,

I implemented the PETSc parallel linear solver in a program; the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests show that the parallel solver is always a little slower than the serial solver (I have excluded the matrix-generation CPU).

For the serial run I used PCILU as the preconditioner; for the parallel run, I used ASM with ILU(0) on each subblock (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns is around 200,000.

I have used -log_summary to print out the performance summary as attached (log_summary_p1 for the serial run and log_summary_p2 for the run with 2 processes). It seems KSPSolve accounts for less than 20% of Global %T.
My questions are:

1. What is the bottleneck of the parallel run according to the summary?
2. Do you have any suggestions to improve the parallel performance?

Thanks a lot for your suggestions!

Regards,
Qin
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log_summary_p1.txt
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log_summary_p2.txt
URL:

From bsmith at mcs.anl.gov Thu May 29 13:43:54 2014
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 29 May 2014 13:43:54 -0500
Subject: [petsc-users] Accessing MUMPS INFOG values
In-Reply-To:
References:
Message-ID: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov>

We should add direct support for this. These beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C, which is defined in dmumps_c.h (versions also for single precision and complex).
So what PETSc should provide in mumps.c is a function something like #undef __FUNCT__ #define __FUNCT__ "MatMUMPSGetStruc" PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) { Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; PetscFunctionBegin; *struc = (void *) mumps->id PetscFunctionReturn(0); } so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled Also add a prototype for this function in petscmat.h Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. Let us know how it goes and we?ll get this stuff into the development version of PETSc. Barry On May 29, 2014, at 10:38 AM, M Asghar wrote: > Hi, > > Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? > > I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. > > Many thanks in advance. > > M Asghar > From bsmith at mcs.anl.gov Thu May 29 13:45:49 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 13:45:49 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> References: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> Message-ID: <613FC909-426E-48E1-AB49-3E0252809887@mcs.anl.gov> Ignore this email. I see Hong already did it a different way so you already have access to all this information. Barry On May 29, 2014, at 1:43 PM, Barry Smith wrote: > > We should add direct support for this. Thus beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C which is defined in dmumps_c.h > (versions also for single precision and complex). So what PETSc should provide in mumps.c is a function something like > > #undef __FUNCT__ > #define __FUNCT__ "MatMUMPSGetStruc" > PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) > { > Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; > > PetscFunctionBegin; > *struc = (void *) mumps->id > PetscFunctionReturn(0); > } > so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled > Also add a prototype for this function in petscmat.h > > > Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. > > Let us know how it goes and we?ll get this stuff into the development version of PETSc. > > Barry > > > > > On May 29, 2014, at 10:38 AM, M Asghar wrote: > >> Hi, >> >> Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? >> >> I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. >> >> Many thanks in advance. >> >> M Asghar >> > From jroman at dsic.upv.es Thu May 29 13:51:32 2014 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Thu, 29 May 2014 20:51:32 +0200 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: <6680CB15-AAA4-4404-9567-5AFF0F651D57@dsic.upv.es> El 29/05/2014, a las 17:03, Giacomo Mulas escribi?: > Hi Jose, and list. > > I am in the process of writing the function to use with > EPSSetArbitrarySelection(). > > Inside it, I will need to take some given component (which one is included > in the info passed via ctx) of the eigenvector and square it. To do this, > since the eigenvector is not necessarily local, I will need to first do a > scatter to a local 1-component vector. So this would be like: > > ... some omitted machinery to cast the info from *ctx to more easily > accessible form... > > ierr = ISCreateStride(PETXC_COMM_WORLD,1,myindex,1,&is1_from);CHKERRQ(ierr); > ierr = ISCreateStride(PETSC_COMM_WORLD,1,0,1,&is1_to);CHKERRQ(ierr); > ierr = VecCreateSeq(PETSC_COMM_SELF, 1, &localx1);CHKERRQ(ierr); > ierr = VecScatterCreate(xr,is1_from,localx1,is1_to,&scatter1); CHKERRQ(ierr); > ierr = VecScatterBegin(scatter1,xr,localx1,INSERT_VALUES, > SCATTER_FORWARD); > ierr = VecScatterEnd(scatter1,xr,localx1,INSERT_VALUES, > SCATTER_FORWARD); > ierr = VecGetArray(localx1,&comp); > *rr = comp*comp; > ierr = VecRestoreArray(localx1, &comp); > ierr = VecDestroy(localx1); > ierr = VecScatterDestroy(&scatter1); > ierr = ISDestroy(&is1_from); > ierr = ISDestroy(&is1_to); > *ri = 0; > > ... some internal housekeeping omitted > > return 0; > > The questions are: > > 1) when the arbitrary function is called, is it called on all nodes > simultaneously, so that collective functions can be expected to work > properly, being called on all involved nodes at the same time? Should all > processes compute the *rr and *ri to be returned, and return the same value? > would it be more efficient to create a unit vector uv containing only one > nonzero component, and then use VecDot(xr, uv, &comp), instead of pulling > the component I need and squaring it as I did above? > Yes, all processes must have the same values. Use the code snippet proposed by Matt. > > 2) since the stride, the 1-component vector, the scatter are presumably the same through all calls within one EPSSolve, can I take them out of the arbitrary function, and make them available to it through *ctx? For this > to work, the structure of xr, the eigenvector passed to the arbitrary > function, must be known outside of EPSSolve. Yes. Internally, all vectors are basically cloned from a template vector created with MatGetVecs(A,xr,NULL) so you can do the same outside EPSSolve() to determine local sizes. Jose > > > Thanks, bye > Giacomo > > -- > _________________________________________________________________ > > Giacomo Mulas > _________________________________________________________________ > > INAF - Osservatorio Astronomico di Cagliari > via della scienza 5 - 09047 Selargius (CA) > > tel. +39 070 71180244 > mob. 
: +39 329 6603810 > _________________________________________________________________ > > "When the storms are raging around you, stay right where you are" > (Freddy Mercury) > _________________________________________________________________ From bsmith at mcs.anl.gov Thu May 29 14:12:00 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 14:12:00 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> Message-ID: You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. Barry On May 29, 2014, at 1:23 PM, Qin Lu wrote: > Hello, > > I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). > > For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. > > I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. > My questions are: > > 1. what is the bottle neck of the parallel run according to the summary? > 2. Do you have any suggestions to improve the parallel performance? > > Thanks a lot for your suggestions! > > Regards, > Qin From D.Lathouwers at tudelft.nl Thu May 29 14:49:41 2014 From: D.Lathouwers at tudelft.nl (Danny Lathouwers - TNW) Date: Thu, 29 May 2014 19:49:41 +0000 Subject: [petsc-users] rtol meaning In-Reply-To: References: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> Message-ID: <4E6B33F4128CED4DB307BA83146E9A64259081DE@SRV362.tudelft.net> Thanks Matt for your quick response. I got to believe that it was the relative ratio of the residual from the following petsc links: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetTolerances.html and http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPDefaultConverged.html#KSPDefaultConverged Perhaps these pages are outdated? Cheers, Danny. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu May 29 15:03:37 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 15:03:37 -0500 Subject: [petsc-users] rtol meaning In-Reply-To: <4E6B33F4128CED4DB307BA83146E9A64259081DE@SRV362.tudelft.net> References: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> <4E6B33F4128CED4DB307BA83146E9A64259081DE@SRV362.tudelft.net> Message-ID: <91B5A9C1-52F4-4E35-9841-29D396EDFCA4@mcs.anl.gov> Danny, The manual pages are a little sloppy and inconsistent. By default it uses ||b|| or || preconditioned b|| as the starting point. At the bottom of the badly formatted page you?ll see "- - rnorm_0 is the two norm of the right hand side. When initial guess is non-zero you can call KSPDefaultConvergedSetUIRNorm() to use the norm of (b - A*(initial guess)) as the starting point for relative norm convergence testing." Likely you want to call KSPDefaultConvergedSetUIRNorm if that is how you want to detect convergence. We?ll cleanup the manual pages, thanks for pointing out the confusion. Barry You can see the source code at http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/interface/iterativ.c.html#KSPDefaultConverged and confirm that what Matt said is correct. On May 29, 2014, at 2:49 PM, Danny Lathouwers - TNW wrote: > Thanks Matt for your quick response. > > I got to believe that it was the relative ratio of the residual from the following petsc links: > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetTolerances.html > and > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPDefaultConverged.html#KSPDefaultConverged > > Perhaps these pages are outdated? > > Cheers, > Danny. From lu_qin_2000 at yahoo.com Thu May 29 16:06:19 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 14:06:19 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> Message-ID: <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. ? The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of?p2 (143 sec) is a little?faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). It seems I?need a more efficient parallel preconditioner. Do you have any suggestions for that? Many thanks, Qin ----- Original Message ----- From: Barry Smith To: Qin Lu Cc: "petsc-users at mcs.anl.gov" Sent: Thursday, May 29, 2014 2:12 PM Subject: Re: [petsc-users] About parallel performance ? You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). ? Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? 
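Returning briefly to the rtol thread above: Barry's advice maps to a short code sketch. Assumptions: a KSP named ksp already exists, and petsc-3.4 function names are used (later releases call the routine KSPConvergedDefaultSetUIRNorm):

#include <petscksp.h>

PetscErrorCode UseInitialResidualForRtol(KSP ksp)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  /* rtol = 1e-3 as in the original question; leave atol/dtol/maxits at defaults */
  ierr = KSPSetTolerances(ksp, 1.e-3, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
  /* declare the initial guess nonzero and base the relative test on
     ||b - A*x0|| rather than ||b|| */
  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPDefaultConvergedSetUIRNorm(ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}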
If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. ? Barry On May 29, 2014, at 1:23 PM, Qin Lu wrote: > Hello, > > I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). > > For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >? > I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. > My questions are: >? > 1. what is the bottle neck of the parallel run according to the summary? > 2. Do you have any suggestions to improve the parallel performance? >? > Thanks a lot for your suggestions! >? > Regards, > Qin? ? From bsmith at mcs.anl.gov Thu May 29 16:17:28 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 16:17:28 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> Message-ID: <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can cd src/benchmarks/streams/ make MPIVersion mpiexec -n 1 ./MPIVersion mpiexec -n 2 ./MPIVersion and send all the results Barry On May 29, 2014, at 4:06 PM, Qin Lu wrote: > For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. > > The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). > > It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? > > Many thanks, > Qin > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu > Cc: "petsc-users at mcs.anl.gov" > Sent: Thursday, May 29, 2014 2:12 PM > Subject: Re: [petsc-users] About parallel performance > > > You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). 
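A sketch of the event logging suggested in the quoted text just above. The event name "MatFill" and the routine FillMatrix() are made-up placeholders; the event is registered once (for example right after PetscInitialize()) and then appears as its own row in the -log_summary output:

#include <petscmat.h>

static PetscLogEvent MAT_FILL_EVENT;   /* user-defined event, hypothetical name */

PetscErrorCode RegisterUserEvents(void)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscLogEventRegister("MatFill", MAT_CLASSID, &MAT_FILL_EVENT);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

PetscErrorCode FillMatrix(Mat A)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscLogEventBegin(MAT_FILL_EVENT, 0, 0, 0, 0);CHKERRQ(ierr);
  /* ... compute all the entries and call MatSetValues() here ... */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = PetscLogEventEnd(MAT_FILL_EVENT, 0, 0, 0, 0);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}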
> > Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. > > Barry > > > > > > On May 29, 2014, at 1:23 PM, Qin Lu wrote: > >> Hello, >> >> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >> >> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >> >> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >> My questions are: >> >> 1. what is the bottle neck of the parallel run according to the summary? >> 2. Do you have any suggestions to improve the parallel performance? >> >> Thanks a lot for your suggestions! >> >> Regards, >> Qin From hzhang at mcs.anl.gov Thu May 29 16:29:31 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 29 May 2014 16:29:31 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: <4e4cfc6e38dd40cf9e93cb3c5aa39258@NAGURSKI.anl.gov> References: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> <4e4cfc6e38dd40cf9e93cb3c5aa39258@NAGURSKI.anl.gov> Message-ID: Barry : > Ignore this email. I see Hong already did it a different way so you already have access to all this information. He asks * Will/does this have a Fortran equivalent? I'm not sure if the needed Fortran stubs are created automatically or we must create them manually? Hong > > On May 29, 2014, at 1:43 PM, Barry Smith wrote: > >> >> We should add direct support for this. Thus beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C which is defined in dmumps_c.h >> (versions also for single precision and complex). So what PETSc should provide in mumps.c is a function something like >> >> #undef __FUNCT__ >> #define __FUNCT__ "MatMUMPSGetStruc" >> PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) >> { >> Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; >> >> PetscFunctionBegin; >> *struc = (void *) mumps->id >> PetscFunctionReturn(0); >> } >> so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled >> Also add a prototype for this function in petscmat.h >> >> >> Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. >> >> Let us know how it goes and we?ll get this stuff into the development version of PETSc. >> >> Barry >> >> >> >> >> On May 29, 2014, at 10:38 AM, M Asghar wrote: >> >>> Hi, >>> >>> Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? >>> >>> I am working with SLEPc and am using MUMPS for the factorisation. 
I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. >>> >>> Many thanks in advance. >>> >>> M Asghar >>> >> > From bsmith at mcs.anl.gov Thu May 29 16:47:40 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 16:47:40 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: References: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> <4e4cfc6e38dd40cf9e93cb3c5aa39258@NAGURSKI.anl.gov> Message-ID: <644168A2-5E32-44AA-A402-F09B5851E395@mcs.anl.gov> On May 29, 2014, at 4:29 PM, Hong Zhang wrote: > Barry : >> Ignore this email. I see Hong already did it a different way so you already have access to all this information. > > He asks > * Will/does this have a Fortran equivalent? > > I'm not sure if the needed Fortran stubs are created automatically or > we must create them manually? You need to write manual pages for each of these functions and make sure they start with /*@ and end with @*/ then run make allfortranstubs and make sure they get generated. Barry > Hong > >> >> On May 29, 2014, at 1:43 PM, Barry Smith wrote: >> >>> >>> We should add direct support for this. Thus beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C which is defined in dmumps_c.h >>> (versions also for single precision and complex). So what PETSc should provide in mumps.c is a function something like >>> >>> #undef __FUNCT__ >>> #define __FUNCT__ "MatMUMPSGetStruc" >>> PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) >>> { >>> Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; >>> >>> PetscFunctionBegin; >>> *struc = (void *) mumps->id >>> PetscFunctionReturn(0); >>> } >>> so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled >>> Also add a prototype for this function in petscmat.h >>> >>> >>> Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. >>> >>> Let us know how it goes and we?ll get this stuff into the development version of PETSc. >>> >>> Barry >>> >>> >>> >>> >>> On May 29, 2014, at 10:38 AM, M Asghar wrote: >>> >>>> Hi, >>>> >>>> Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? >>>> >>>> I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. >>>> >>>> Many thanks in advance. 
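Pulling the MUMPS thread together: a hedged sketch of what querying INFOG from C can look like once a PETSc version providing MatMumpsGetInfog() is used (the master branch Hong points to, exercised in ex52.c). The factorization sequence and the choice of INFOG(1) are illustrative assumptions, not code from the thread:

#include <petscmat.h>

PetscErrorCode FactorWithMumpsAndCheck(Mat A)
{
  Mat            F;
  IS             isrow, iscol;
  MatFactorInfo  info;
  PetscInt       infog1;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A, MATORDERINGNATURAL, &isrow, &iscol);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A, isrow, iscol, &info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F, A, &info);CHKERRQ(ierr);

  /* MUMPS flags errors through a negative INFOG(1) */
  ierr = MatMumpsGetInfog(F, 1, &infog1);CHKERRQ(ierr);
  if (infog1 < 0) {
    ierr = PetscPrintf(PETSC_COMM_WORLD, "MUMPS returned INFOG(1) = %D\n", infog1);CHKERRQ(ierr);
  }
  ierr = ISDestroy(&isrow);CHKERRQ(ierr);
  ierr = ISDestroy(&iscol);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}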
>>>> >>>> M Asghar >>>> >>> >> From bsmith at mcs.anl.gov Thu May 29 16:54:45 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 16:54:45 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> Message-ID: <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. Barry On May 29, 2014, at 4:37 PM, Qin Lu wrote: > Barry, > > I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): > > ================= > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion > Number of MPI processes 1 > Function Rate (MB/s) > Copy: 21682.9932 > Scale: 21637.5509 > Add: 21583.0395 > Triad: 21504.6563 > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion > Number of MPI processes 2 > Function Rate (MB/s) > Copy: 21369.6976 > Scale: 21632.3203 > Add: 22203.7107 > Triad: 22305.1841 > ======================= > > Thanks a lot, > Qin > > From: Barry Smith > To: Qin Lu > Cc: "petsc-users at mcs.anl.gov" > Sent: Thursday, May 29, 2014 4:17 PM > Subject: Re: [petsc-users] About parallel performance > > > > You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can > > cd src/benchmarks/streams/ > > make MPIVersion > > mpiexec -n 1 ./MPIVersion > > mpiexec -n 2 ./MPIVersion > > and send all the results > > Barry > > > > On May 29, 2014, at 4:06 PM, Qin Lu wrote: > >> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >> >> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >> >> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >> >> Many thanks, >> Qin >> >> ----- Original Message ----- >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 2:12 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). 
>> >> Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >> >> Barry >> >> >> >> >> >> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >> >>> Hello, >>> >>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>> >>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>> >>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>> My questions are: >>> >>> 1. what is the bottle neck of the parallel run according to the summary? >>> 2. Do you have any suggestions to improve the parallel performance? >>> >>> Thanks a lot for your suggestions! >>> >>> Regards, >>> Qin From lu_qin_2000 at yahoo.com Thu May 29 17:15:47 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 15:15:47 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> Message-ID: <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> Barry, ? How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? ? The machine has?very new?Intel chips and is very for serial run.?What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2)?that was not built correctly? Many thanks, Qin ? ----- Original Message ----- From: Barry Smith To: Qin Lu ; petsc-users Cc: Sent: Thursday, May 29, 2014 4:54 PM Subject: Re: [petsc-users] About parallel performance ? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. ? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. ? Barry On May 29, 2014, at 4:37 PM, Qin Lu wrote: > Barry, > > I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? 
I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): > > ================= > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion > Number of MPI processes 1 > Function? ? ? Rate (MB/s) > Copy:? ? ? 21682.9932 > Scale:? ? ? 21637.5509 > Add:? ? ? ? 21583.0395 > Triad:? ? ? 21504.6563 > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion > Number of MPI processes 2 > Function? ? ? Rate (MB/s) > Copy:? ? ? 21369.6976 > Scale:? ? ? 21632.3203 > Add:? ? ? ? 22203.7107 > Triad:? ? ? 22305.1841 > ======================= > > Thanks a lot, > Qin > > From: Barry Smith > To: Qin Lu > Cc: "petsc-users at mcs.anl.gov" > Sent: Thursday, May 29, 2014 4:17 PM > Subject: Re: [petsc-users] About parallel performance > > > >? You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can > > cd? src/benchmarks/streams/ > > make MPIVersion > > mpiexec -n 1 ./MPIVersion > > mpiexec -n 2 ./MPIVersion > >? ? and send all the results > >? ? Barry > > > > On May 29, 2014, at 4:06 PM, Qin Lu wrote: > >> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>? >> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >> >> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >> >> Many thanks, >> Qin >> >> ----- Original Message ----- >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 2:12 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >>? ? You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >> >>? ? Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >> >>? ? Barry >> >> >> >> >> >> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >> >>> Hello, >>> >>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>> >>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>? 
>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T.
>>> My questions are:
>>>
>>> 1. what is the bottle neck of the parallel run according to the summary?
>>> 2. Do you have any suggestions to improve the parallel performance?
>>>
>>> Thanks a lot for your suggestions!
>>>
>>> Regards,
>>> Qin

From lu_qin_2000 at yahoo.com Thu May 29 17:15:47 2014
From: lu_qin_2000 at yahoo.com (Qin Lu)
Date: Thu, 29 May 2014 15:15:47 -0700 (PDT)
Subject: [petsc-users] About parallel performance
In-Reply-To: <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov>
References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov>
Message-ID: <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com>

Barry,

How did you read the test results? For a machine good for parallelism, should the data of np=2 be about half of those of np=1?

The machine has very new Intel chips and is very fast for serial runs. What may cause the bad parallelism - the configuration of the machine, or the MPI lib (MPICH2) I am using not being built correctly?

Many thanks,
Qin

----- Original Message -----
From: Barry Smith
To: Qin Lu ; petsc-users
Cc:
Sent: Thursday, May 29, 2014 4:54 PM
Subject: Re: [petsc-users] About parallel performance

In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark.

But the single core memory bandwidth is pretty good so for problems that don't need parallelism you should get good performance.

Barry

On May 29, 2014, at 4:37 PM, Qin Lu wrote:

> Barry,
>
> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion?
If you are using petsc-3.4 you can > > > > cd src/benchmarks/streams/ > > > > make MPIVersion > > > > mpiexec -n 1 ./MPIVersion > > > > mpiexec -n 2 ./MPIVersion > > > > and send all the results > > > > Barry > > > > > > > > On May 29, 2014, at 4:06 PM, Qin Lu wrote: > > > >> For now I only care about the CPU of PETSc subroutines. I tried to add > PetscLogEventBegin/End and the results are consistent with the log_summary > attached in my first email. > >> > >> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs > are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between > p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little > faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 > sec). So the total CPU of PETSc subtroutines are about the same between p1 > and p2 (502 sec vs. 488 sec). > >> > >> It seems I need a more efficient parallel preconditioner. Do you have > any suggestions for that? > >> > >> Many thanks, > >> Qin > >> > >> ----- Original Message ----- > >> From: Barry Smith > >> To: Qin Lu > >> Cc: "petsc-users at mcs.anl.gov" > >> Sent: Thursday, May 29, 2014 2:12 PM > >> Subject: Re: [petsc-users] About parallel performance > >> > >> > >> You need to determine where the other 80% of the time is. My guess > it is in setting the values into the matrix each time. Use > PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code > that computes all the entries in the matrix and calls MatSetValues() and > MatAssemblyBegin/End(). > >> > >> Likely the reason the linear solver does not scale better is that > you have a machine with multiple cores that share the same memory bandwidth > and the first core is already using well over half the memory bandwidth so > the second core cannot be fully utilized since both cores have to wait for > data to arrive from memory. If you are using the development version of > PETSc you can run make streams NPMAX=2 from the PETSc root directory and > send this to us to confirm this. > >> > >> Barry > >> > >> > >> > >> > >> > >> On May 29, 2014, at 1:23 PM, Qin Lu wrote: > >> > >>> Hello, > >>> > >>> I implemented PETSc parallel linear solver in a program, the > implementation is basically the same as > /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, > and let PETSc partition the matrix through MatGetOwnershipRange. However, a > few tests shows the parallel solver is always a little slower the serial > solver (I have excluded the matrix generation CPU). > >>> > >>> For serial run I used PCILU as preconditioner; for parallel run, I > used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type > preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around > 200,000. > >>> > >>> I have used -log_summary to print out the performance summary as > attached (log_summary_p1 for serial run and log_summary_p2 for the run with > 2 processes). It seems the KSPSolve counts only for less than 20% of Global > %T. > >>> My questions are: > >>> > >>> 1. what is the bottle neck of the parallel run according to the > summary? > >>> 2. Do you have any suggestions to improve the parallel performance? > >>> > >>> Thanks a lot for your suggestions! > >>> > >>> Regards, > >>> Qin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lu_qin_2000 at yahoo.com Thu May 29 17:40:25 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 15:40:25 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> Message-ID: <1401403225.85082.YahooMailNeo@web160202.mail.bf1.yahoo.com> Is this determined by how the machine was built (which I can not do anything), or by how the MPI/meassge-passing?is configured at the cluster (which I can ask IT?people to modify)? - this machine is actually a node of a linux cluster. ? Thanks, Qin? ________________________________ From: Matthew Knepley To: Qin Lu Cc: Barry Smith ; petsc-users Sent: Thursday, May 29, 2014 5:27 PM Subject: Re: [petsc-users] About parallel performance On Thu, May 29, 2014 at 5:15 PM, Qin Lu wrote: Barry, >? >How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? Ideally, the numbers should be about twice as big for np = 2. ? >The machine has?very new?Intel chips and is very for serial run.?What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2)?that was not built correctly? > The cause is machine architecture. The memory bandwidth is only sufficient for one core. ? Thanks, ? ? ?Matt Many thanks, >Qin >? >----- Original Message ----- >From: Barry Smith >To: Qin Lu ; petsc-users >Cc: >Sent: Thursday, May 29, 2014 4:54 PM >Subject: Re: [petsc-users] About parallel performance > > >? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. > >? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. > >? ?Barry > > > > >On May 29, 2014, at 4:37 PM, Qin Lu wrote: > >> Barry, >> >> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >> >> ================= >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >> Number of MPI processes 1 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? ?21682.9932 >> Scale:? ? ? 21637.5509 >> Add:? ? ? ? 21583.0395 >> Triad:? ? ? 21504.6563 >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >> Number of MPI processes 2 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? ?21369.6976 >> Scale:? ? ? 21632.3203 >> Add:? ? ? ? 22203.7107 >> Triad:? ? ? 22305.1841 >> ======================= >> >> Thanks a lot, >> Qin >> >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 4:17 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> >>? ?You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >> >> cd? src/benchmarks/streams/ >> >> make MPIVersion >> >> mpiexec -n 1 ./MPIVersion >> >> mpiexec -n 2 ./MPIVersion >> >>? ? 
and send all the results >> >>? ? Barry >> >> >> >> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >> >>> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>>? >>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>> >>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >>> >>> Many thanks, >>> Qin >>> >>> ----- Original Message ----- >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 2:12 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>>? ? ?You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>> >>>? ? ?Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>> >>>? ? ?Barry >>> >>> >>> >>> >>> >>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>> >>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>>? >>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>> My questions are: >>>>? >>>> 1. what is the bottle neck of the parallel run according to the summary? >>>> 2. Do you have any suggestions to improve the parallel performance? >>>>? >>>> Thanks a lot for your suggestions! >>>>? >>>> Regards, >>>> Qin? ? ? ? ? > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu May 29 17:45:34 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 17:45:34 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401403225.85082.YahooMailNeo@web160202.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> <1401403225.85082.YahooMailNeo@web160202.mail.bf1.yahoo.com> Message-ID: On Thu, May 29, 2014 at 5:40 PM, Qin Lu wrote: > Is this determined by how the machine was built (which I can not do > anything), or by how the MPI/meassge-passing is configured at the cluster > (which I can ask IT people to modify)? - this machine is actually a node of > a linux cluster. > It is determined by how the machine was built. Your best bet for scalability is to use one process per node. Thanks, Matt > > Thanks, > Qin > > *From:* Matthew Knepley > *To:* Qin Lu > *Cc:* Barry Smith ; petsc-users < > petsc-users at mcs.anl.gov> > *Sent:* Thursday, May 29, 2014 5:27 PM > *Subject:* Re: [petsc-users] About parallel performance > > On Thu, May 29, 2014 at 5:15 PM, Qin Lu wrote: > > Barry, > > How did you read the test results? For a machine good for parallism, > should the data of np=2 be about half of the those of np=1? > > > Ideally, the numbers should be about twice as big for np = 2. > > > > The machine has very new Intel chips and is very for serial run. What may > cause the bad parallism? - the configurations of the machine, or I am using > a MPI lib (MPICH2) that was not built correctly? > > > The cause is machine architecture. The memory bandwidth is only sufficient > for one core. > > Thanks, > > Matt > > > > > Many thanks, > Qin > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu ; petsc-users > Cc: > Sent: Thursday, May 29, 2014 4:54 PM > Subject: Re: [petsc-users] About parallel performance > > > In that PETSc version BasicVersion is actually the MPI streams benchmark > so you ran the right thing. Your machine is totally worthless for sparse > linear algebra parallelism. The entire memory bandwidth is used by the > first core so adding the second core to the computation gives you no > improvement at all in the streams benchmark. > > But the single core memory bandwidth is pretty good so for problems that > don?t need parallelism you should get good performance. > > Barry > > > > > On May 29, 2014, at 4:37 PM, Qin Lu wrote: > > > Barry, > > > > I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean > BasicVersion? 
I built and ran it (if you did mean MPIVersion, I will get > PETSc-3.4 later): > > > > ================= > > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion > > Number of MPI processes 1 > > Function Rate (MB/s) > > Copy: 21682.9932 > > Scale: 21637.5509 > > Add: 21583.0395 > > Triad: 21504.6563 > > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion > > Number of MPI processes 2 > > Function Rate (MB/s) > > Copy: 21369.6976 > > Scale: 21632.3203 > > Add: 22203.7107 > > Triad: 22305.1841 > > ======================= > > > > Thanks a lot, > > Qin > > > > From: Barry Smith > > To: Qin Lu > > Cc: "petsc-users at mcs.anl.gov" > > Sent: Thursday, May 29, 2014 4:17 PM > > Subject: Re: [petsc-users] About parallel performance > > > > > > > > You need to run the streams benchmarks are one and two processes to > see how the memory bandwidth changes. If you are using petsc-3.4 you can > > > > cd src/benchmarks/streams/ > > > > make MPIVersion > > > > mpiexec -n 1 ./MPIVersion > > > > mpiexec -n 2 ./MPIVersion > > > > and send all the results > > > > Barry > > > > > > > > On May 29, 2014, at 4:06 PM, Qin Lu wrote: > > > >> For now I only care about the CPU of PETSc subroutines. I tried to add > PetscLogEventBegin/End and the results are consistent with the log_summary > attached in my first email. > >> > >> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs > are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between > p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little > faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 > sec). So the total CPU of PETSc subtroutines are about the same between p1 > and p2 (502 sec vs. 488 sec). > >> > >> It seems I need a more efficient parallel preconditioner. Do you have > any suggestions for that? > >> > >> Many thanks, > >> Qin > >> > >> ----- Original Message ----- > >> From: Barry Smith > >> To: Qin Lu > >> Cc: "petsc-users at mcs.anl.gov" > >> Sent: Thursday, May 29, 2014 2:12 PM > >> Subject: Re: [petsc-users] About parallel performance > >> > >> > >> You need to determine where the other 80% of the time is. My guess > it is in setting the values into the matrix each time. Use > PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code > that computes all the entries in the matrix and calls MatSetValues() and > MatAssemblyBegin/End(). > >> > >> Likely the reason the linear solver does not scale better is that > you have a machine with multiple cores that share the same memory bandwidth > and the first core is already using well over half the memory bandwidth so > the second core cannot be fully utilized since both cores have to wait for > data to arrive from memory. If you are using the development version of > PETSc you can run make streams NPMAX=2 from the PETSc root directory and > send this to us to confirm this. > >> > >> Barry > >> > >> > >> > >> > >> > >> On May 29, 2014, at 1:23 PM, Qin Lu wrote: > >> > >>> Hello, > >>> > >>> I implemented PETSc parallel linear solver in a program, the > implementation is basically the same as > /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, > and let PETSc partition the matrix through MatGetOwnershipRange. However, a > few tests shows the parallel solver is always a little slower the serial > solver (I have excluded the matrix generation CPU). 
> >>> > >>> For serial run I used PCILU as preconditioner; for parallel run, I > used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type > preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around > 200,000. > >>> > >>> I have used -log_summary to print out the performance summary as > attached (log_summary_p1 for serial run and log_summary_p2 for the run with > 2 processes). It seems the KSPSolve counts only for less than 20% of Global > %T. > >>> My questions are: > >>> > >>> 1. what is the bottle neck of the parallel run according to the > summary? > >>> 2. Do you have any suggestions to improve the parallel performance? > >>> > >>> Thanks a lot for your suggestions! > >>> > >>> Regards, > >>> Qin > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 29 17:46:08 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 17:46:08 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> Message-ID: <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> For the parallel case a perfect machine would have twice the memory bandwidth when using 2 cores as opposed to 1 core. For yours it is almost exactly the same. The issue is not with the MPI or software. It depends on how many memory sockets there are and how they are shared by the various cores. As I said the initial memory bandwidth for one core 21,682. gigabytes per second is good so it is a very good sequential machine. Here are the results on my laptop Number of MPI processes 1 Process 0 Barrys-MacBook-Pro.local Function Rate (MB/s) Copy: 7928.7346 Scale: 8271.5103 Add: 11017.0430 Triad: 10843.9018 Number of MPI processes 2 Process 0 Barrys-MacBook-Pro.local Process 1 Barrys-MacBook-Pro.local Function Rate (MB/s) Copy: 13513.0365 Scale: 13516.7086 Add: 15455.3952 Triad: 15562.0822 ------------------------------------------------ np speedup 1 1.0 2 1.44 Note that the memory bandwidth is much lower than your machine but there is an increase in speedup from one to two cores because one core cannot utilize all the memory bandwidth. But even with two cores my laptop will be slower on PETSc then one core on your machine. 
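As a back-of-envelope illustration (a sketch, not part of the original exchange) of what the Triad rates quoted in this thread imply: the 12-bytes-per-nonzero figure below is a common rough model for an AIJ matrix-vector product (8-byte value plus 4-byte column index), an assumption rather than a measurement.

#include <stdio.h>

int main(void)
{
  /* Triad rates (MB/s) reported earlier in this thread for 1 and 2 processes */
  double triad_np1 = 21504.6563, triad_np2 = 22305.1841;
  /* rough model (assumption): an AIJ MatMult streams about 12 bytes per nonzero */
  double bytes_per_nonzero = 12.0;

  printf("STREAMS speedup, 1 -> 2 cores: %.2f\n", triad_np2/triad_np1);
  printf("bandwidth-limited MatMult estimate: roughly %.0f million nonzeros/s\n",
         triad_np1/bytes_per_nonzero);
  return 0;
}

With these numbers the speedup is about 1.04, consistent with the point that the second core adds essentially no usable memory bandwidth.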
Here is the performance on a workstation we have that has multiple CPUs and multiple memory sockets Number of MPI processes 1 Process 0 es Function Rate (MB/s) Copy: 13077.8260 Scale: 12867.1966 Add: 14637.6757 Triad: 14414.4478 Number of MPI processes 2 Process 0 es Process 1 es Function Rate (MB/s) Copy: 22663.3116 Scale: 22102.5495 Add: 25768.1550 Triad: 26076.0410 Number of MPI processes 3 Process 0 es Process 1 es Process 2 es Function Rate (MB/s) Copy: 27501.7610 Scale: 26971.2183 Add: 30433.3276 Triad: 31302.9396 Number of MPI processes 4 Process 0 es Process 1 es Process 2 es Process 3 es Function Rate (MB/s) Copy: 29302.3183 Scale: 30165.5295 Add: 34577.3458 Triad: 35195.8067 ------------------------------------------------ np speedup 1 1.0 2 1.81 3 2.17 4 2.44 Note that one core has a lower memory bandwidth than your machine but as I add more cores the memory bandwidth increases by a factor of 2.4 There is nothing wrong with your machine, it is just not suitable to run sparse linear algebra on multiple cores for it. Barry On May 29, 2014, at 5:15 PM, Qin Lu wrote: > Barry, > > How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? > > The machine has very new Intel chips and is very for serial run. What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2) that was not built correctly? > Many thanks, > Qin > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu ; petsc-users > Cc: > Sent: Thursday, May 29, 2014 4:54 PM > Subject: Re: [petsc-users] About parallel performance > > > In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. > > But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. > > Barry > > > > > On May 29, 2014, at 4:37 PM, Qin Lu wrote: > >> Barry, >> >> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >> >> ================= >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >> Number of MPI processes 1 >> Function Rate (MB/s) >> Copy: 21682.9932 >> Scale: 21637.5509 >> Add: 21583.0395 >> Triad: 21504.6563 >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >> Number of MPI processes 2 >> Function Rate (MB/s) >> Copy: 21369.6976 >> Scale: 21632.3203 >> Add: 22203.7107 >> Triad: 22305.1841 >> ======================= >> >> Thanks a lot, >> Qin >> >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 4:17 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> >> You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >> >> cd src/benchmarks/streams/ >> >> make MPIVersion >> >> mpiexec -n 1 ./MPIVersion >> >> mpiexec -n 2 ./MPIVersion >> >> and send all the results >> >> Barry >> >> >> >> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >> >>> For now I only care about the CPU of PETSc subroutines. 
I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>> >>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>> >>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >>> >>> Many thanks, >>> Qin >>> >>> ----- Original Message ----- >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 2:12 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>> You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>> >>> Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>> >>> Barry >>> >>> >>> >>> >>> >>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>> >>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>> >>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>> My questions are: >>>> >>>> 1. what is the bottle neck of the parallel run according to the summary? >>>> 2. Do you have any suggestions to improve the parallel performance? >>>> >>>> Thanks a lot for your suggestions! 
>>>> >>>> Regards, >>>> Qin From lu_qin_2000 at yahoo.com Thu May 29 17:46:23 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 15:46:23 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> <1401403225.85082.YahooMailNeo@web160202.mail.bf1.yahoo.com> Message-ID: <1401403583.88868.YahooMailNeo@web160206.mail.bf1.yahoo.com> Thanks a lot! I will try that. ? Qin? ________________________________ From: Matthew Knepley To: Qin Lu Cc: Barry Smith ; petsc-users Sent: Thursday, May 29, 2014 5:45 PM Subject: Re: [petsc-users] About parallel performance On Thu, May 29, 2014 at 5:40 PM, Qin Lu wrote: Is this determined by how the machine was built (which I can not do anything), or by how the MPI/meassge-passing?is configured at the cluster (which I can ask IT?people to modify)? - this machine is actually a node of a linux cluster. It is determined by how the machine was built. Your best bet for scalability is to use one process per node. ? Thanks, ? ? ?Matt ? >Thanks, >Qin? > > > From: Matthew Knepley >To: Qin Lu >Cc: Barry Smith ; petsc-users >Sent: Thursday, May 29, 2014 5:27 PM >Subject: Re: [petsc-users] About parallel performance > > > >On Thu, May 29, 2014 at 5:15 PM, Qin Lu wrote: > >Barry, >>? >>How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? > > >Ideally, the numbers should be about twice as big for np = 2. > >? >>The machine has?very new?Intel chips and is very for serial run.?What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2)?that was not built correctly? >> > > >The cause is machine architecture. The memory bandwidth is only sufficient for one core. > > >? Thanks, > > >? ? ?Matt > > > > > >Many thanks, >>Qin >>? >>----- Original Message ----- >>From: Barry Smith >>To: Qin Lu ; petsc-users >>Cc: >>Sent: Thursday, May 29, 2014 4:54 PM >>Subject: Re: [petsc-users] About parallel performance >> >> >>? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. >> >>? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. >> >>? ?Barry >> >> >> >> >>On May 29, 2014, at 4:37 PM, Qin Lu wrote: >> >>> Barry, >>> >>> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >>> >>> ================= >>> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >>> Number of MPI processes 1 >>> Function? ? ? Rate (MB/s) >>> Copy:? ? ? ?21682.9932 >>> Scale:? ? ? 21637.5509 >>> Add:? ? ? ? 21583.0395 >>> Triad:? ? ? 21504.6563 >>> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >>> Number of MPI processes 2 >>> Function? ? ? Rate (MB/s) >>> Copy:? ? ? ?21369.6976 >>> Scale:? ? ? 21632.3203 >>> Add:? ? ? ? 
22203.7107 >>> Triad:? ? ? 22305.1841 >>> ======================= >>> >>> Thanks a lot, >>> Qin >>> >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 4:17 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>> >>>? ?You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >>> >>> cd? src/benchmarks/streams/ >>> >>> make MPIVersion >>> >>> mpiexec -n 1 ./MPIVersion >>> >>> mpiexec -n 2 ./MPIVersion >>> >>>? ? and send all the results >>> >>>? ? Barry >>> >>> >>> >>> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >>> >>>> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>>>? >>>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>>> >>>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >>>> >>>> Many thanks, >>>> Qin >>>> >>>> ----- Original Message ----- >>>> From: Barry Smith >>>> To: Qin Lu >>>> Cc: "petsc-users at mcs.anl.gov" >>>> Sent: Thursday, May 29, 2014 2:12 PM >>>> Subject: Re: [petsc-users] About parallel performance >>>> >>>> >>>>? ? ?You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>>> >>>>? ? ?Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>>> >>>>? ? ?Barry >>>> >>>> >>>> >>>> >>>> >>>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>>> >>>>> Hello, >>>>> >>>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>>> >>>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>>>? >>>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>>> My questions are: >>>>>? >>>>> 1. what is the bottle neck of the parallel run according to the summary? >>>>> 2. 
Do you have any suggestions to improve the parallel performance? >>>>>? >>>>> Thanks a lot for your suggestions! >>>>>? >>>>> Regards, >>>>> Qin? ? ? ? ? >> > > > >-- >What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >-- Norbert Wiener > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lu_qin_2000 at yahoo.com Thu May 29 17:49:24 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 15:49:24 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> Message-ID: <1401403764.96816.YahooMailNeo@web160202.mail.bf1.yahoo.com> Barry, ? Thanks a lot for the info! I know now?what?was the problem.? ? Qin ________________________________ From: Barry Smith To: Qin Lu Cc: petsc-users Sent: Thursday, May 29, 2014 5:46 PM Subject: Re: [petsc-users] About parallel performance ? For the parallel case a perfect machine would have twice the memory bandwidth when using 2 cores as opposed to 1 core. For yours it is almost exactly the same. The issue is not with the MPI or software. It depends on how many memory sockets there are and how they are shared by the various cores. As I said the initial memory bandwidth for one core 21,682. gigabytes per second is good so it is a very good sequential machine. ? Here are the results on my laptop Number of MPI processes 1 Process 0 Barrys-MacBook-Pro.local Function? ? ? Rate (MB/s) Copy:? ? ? ? 7928.7346 Scale:? ? ? 8271.5103 Add:? ? ? ? 11017.0430 Triad:? ? ? 10843.9018 Number of MPI processes 2 Process 0 Barrys-MacBook-Pro.local Process 1 Barrys-MacBook-Pro.local Function? ? ? Rate (MB/s) Copy:? ? ? 13513.0365 Scale:? ? ? 13516.7086 Add:? ? ? ? 15455.3952 Triad:? ? ? 15562.0822 ------------------------------------------------ np? speedup 1 1.0 2 1.44 Note that the memory bandwidth is much lower than your machine but there is an increase in speedup from one to two cores because one core cannot utilize all the memory bandwidth. But even with two cores my laptop will be slower on PETSc then one core on your machine. Here is the performance on a workstation we have that has multiple CPUs and multiple memory sockets Number of MPI processes 1 Process 0 es Function? ? ? Rate (MB/s) Copy:? ? ? 13077.8260 Scale:? ? ? 12867.1966 Add:? ? ? ? 14637.6757 Triad:? ? ? 14414.4478 Number of MPI processes 2 Process 0 es Process 1 es Function? ? ? Rate (MB/s) Copy:? ? ? 22663.3116 Scale:? ? ? 22102.5495 Add:? ? ? ? 25768.1550 Triad:? ? ? 26076.0410 Number of MPI processes 3 Process 0 es Process 1 es Process 2 es Function? ? ? Rate (MB/s) Copy:? ? ? 27501.7610 Scale:? ? ? 26971.2183 Add:? ? ? ? 30433.3276 Triad:? ? ? 31302.9396 Number of MPI processes 4 Process 0 es Process 1 es Process 2 es Process 3 es Function? ? ? Rate (MB/s) Copy:? ? ? 29302.3183 Scale:? ? ? 30165.5295 Add:? 
? ? ? 34577.3458 Triad:? ? ? 35195.8067 ------------------------------------------------ np? speedup 1 1.0 2 1.81 3 2.17 4 2.44 Note that one core has a lower memory bandwidth than your machine but as I add more cores the memory bandwidth increases by a factor of 2.4 There is nothing wrong with your machine, it is just not suitable to run sparse linear algebra on multiple cores for it. ? Barry On May 29, 2014, at 5:15 PM, Qin Lu wrote: > Barry, >? > How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? >? > The machine has very new Intel chips and is very for serial run. What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2) that was not built correctly? > Many thanks, > Qin >? > ----- Original Message ----- > From: Barry Smith > To: Qin Lu ; petsc-users > Cc: > Sent: Thursday, May 29, 2014 4:54 PM > Subject: Re: [petsc-users] About parallel performance > > >? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. > >? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. > >? ? Barry > > > > > On May 29, 2014, at 4:37 PM, Qin Lu wrote: > >> Barry, >> >> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >> >> ================= >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >> Number of MPI processes 1 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? 21682.9932 >> Scale:? ? ? 21637.5509 >> Add:? ? ? ? 21583.0395 >> Triad:? ? ? 21504.6563 >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >> Number of MPI processes 2 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? 21369.6976 >> Scale:? ? ? 21632.3203 >> Add:? ? ? ? 22203.7107 >> Triad:? ? ? 22305.1841 >> ======================= >> >> Thanks a lot, >> Qin >> >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 4:17 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> >>? ? You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >> >> cd? src/benchmarks/streams/ >> >> make MPIVersion >> >> mpiexec -n 1 ./MPIVersion >> >> mpiexec -n 2 ./MPIVersion >> >>? ? and send all the results >> >>? ? Barry >> >> >> >> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >> >>> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>>? >>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>> >>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? 
>>> >>> Many thanks, >>> Qin >>> >>> ----- Original Message ----- >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 2:12 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>>? ? ? You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>> >>>? ? ? Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>> >>>? ? ? Barry >>> >>> >>> >>> >>> >>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>> >>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>>? >>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>> My questions are: >>>>? >>>> 1. what is the bottle neck of the parallel run according to the summary? >>>> 2. Do you have any suggestions to improve the parallel performance? >>>>? >>>> Thanks a lot for your suggestions! >>>>? >>>> Regards, >>>> Qin? ? ? ? ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu May 29 21:29:15 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 29 May 2014 21:29:15 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: References: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> <4e4cfc6e38dd40cf9e93cb3c5aa39258@NAGURSKI.anl.gov> Message-ID: >> He asks >> * Will/does this have a Fortran equivalent? >> >> I'm not sure if the needed Fortran stubs are created automatically or >> we must create them manually? > > You need to write manual pages for each of these functions and make sure they start with /*@ and end with @*/ then run make allfortranstubs and make sure they get generated. OK, I'll get this done. Hong >>> >>> On May 29, 2014, at 1:43 PM, Barry Smith wrote: >>> >>>> >>>> We should add direct support for this. Thus beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C which is defined in dmumps_c.h >>>> (versions also for single precision and complex). 
So what PETSc should provide in mumps.c is a function something like >>>> >>>> #undef __FUNCT__ >>>> #define __FUNCT__ "MatMUMPSGetStruc" >>>> PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) >>>> { >>>> Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; >>>> >>>> PetscFunctionBegin; >>>> *struc = (void *) mumps->id >>>> PetscFunctionReturn(0); >>>> } >>>> so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled >>>> Also add a prototype for this function in petscmat.h >>>> >>>> >>>> Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. >>>> >>>> Let us know how it goes and we?ll get this stuff into the development version of PETSc. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> On May 29, 2014, at 10:38 AM, M Asghar wrote: >>>> >>>>> Hi, >>>>> >>>>> Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? >>>>> >>>>> I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. >>>>> >>>>> Many thanks in advance. >>>>> >>>>> M Asghar >>>>> >>>> >>> > From D.Lathouwers at tudelft.nl Fri May 30 03:11:02 2014 From: D.Lathouwers at tudelft.nl (Danny Lathouwers - TNW) Date: Fri, 30 May 2014 08:11:02 +0000 Subject: [petsc-users] rtol meaning In-Reply-To: <91B5A9C1-52F4-4E35-9841-29D396EDFCA4@mcs.anl.gov> References: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> <4E6B33F4128CED4DB307BA83146E9A64259081DE@SRV362.tudelft.net>, <91B5A9C1-52F4-4E35-9841-29D396EDFCA4@mcs.anl.gov> Message-ID: Thanks for the clarification. Petsc saves me a lot of time not having to write the icc etc so i can live with these small issues very well. Petsc behaves as expected now. Danny Sent from my iPad > On 29 mei 2014, at 22:04, "Barry Smith" wrote: > > > Danny, > > The manual pages are a little sloppy and inconsistent. By default it uses ||b|| or || preconditioned b|| as the starting point. At the bottom of the badly formatted page you?ll see "- - rnorm_0 is the two norm of the right hand side. When initial guess is non-zero you can call KSPDefaultConvergedSetUIRNorm() to use the norm of (b - A*(initial guess)) as the starting point for relative norm convergence testing." > > Likely you want to call KSPDefaultConvergedSetUIRNorm if that is how you want to detect convergence. > > We?ll cleanup the manual pages, thanks for pointing out the confusion. > > Barry > > You can see the source code at http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/interface/iterativ.c.html#KSPDefaultConverged and confirm that what Matt said is correct. > > >> On May 29, 2014, at 2:49 PM, Danny Lathouwers - TNW wrote: >> >> Thanks Matt for your quick response. >> >> I got to believe that it was the relative ratio of the residual from the following petsc links: >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetTolerances.html >> and >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPDefaultConverged.html#KSPDefaultConverged >> >> Perhaps these pages are outdated? >> >> Cheers, >> Danny. 
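A minimal sketch of the call Barry points to above, assuming the petsc-3.4-era spelling KSPDefaultConvergedSetUIRNorm (later releases rename it KSPConvergedDefaultSetUIRNorm); it makes the default convergence test measure the residual relative to ||b - A*x0|| instead of ||b|| when a nonzero initial guess is supplied:

#include <petscksp.h>

int main(int argc, char **argv)
{
  KSP            ksp;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPDefaultConvergedSetUIRNorm(ksp);CHKERRQ(ierr);   /* rnorm_0 = ||b - A*x0|| */
  ierr = KSPSetTolerances(ksp, 1.0e-8, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
  /* ... KSPSetOperators(), KSPSetFromOptions(), KSPSolve() as usual ... */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}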
> From jed at jedbrown.org Thu May 29 19:27:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 30 May 2014 02:27:28 +0200 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: <87bnugyutr.fsf@jedbrown.org> Matthew Knepley writes: > There might be an easier way to do this: > PetscScalar val = 0.0, gval; > > VecGetOwnershipRange(xr, &low, &high); > if ((myindex >= low) && (myindex < high)) { > VecGetArray(localx1,&a); > val = a[myindex-low]; > VecRestoreArray(localx1, &a); > } > MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > > Now everyone has the value at myindex. Yes, but VecGetArray is collective so please don't do it quite this way. Instead, write VecGetArray(localx1,&a); if ((myindex >= low) && (myindex < high)) { val = a[myindex-low]; } VecRestoreArray(localx1, &a); MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jed at jedbrown.org Thu May 29 20:07:09 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 30 May 2014 03:07:09 +0200 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> Message-ID: <87zji0xef6.fsf@jedbrown.org> Mark Adams writes: >> thank you for your input and sorry my late reply: I saw your email only >> now. >> By setting up the solver each time step you mean re-defining the KSP >> context every time? >> > > THe simplest thing is to just delete the object and create it again. THere > are "reset" methods that do the same thing semantically but it is probably > just easier to destroy the KSP object and recreate it and redo your setup > code. Mark, if PCReset (via KSPReset) does not produce the same behavior as destroying the KSP and recreating it, it is a bug. I think this is the case, but if it's not, it needs to be fixed. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jed at jedbrown.org Thu May 29 20:51:16 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 30 May 2014 03:51:16 +0200 Subject: [petsc-users] ExodusII In-Reply-To: References: Message-ID: <87ppiwxcdn.fsf@jedbrown.org> Baros Vladimir writes: > Necessary header files, can be found here: > https://code.google.com/p/msinttypes/ > > It contains necessary inttypes.h and stdint.h headers > I successfully used them to build exodus lib with Visual Studio. > > Can anyone enable the support for exodus in Windows? As I said in my reply to Pedro, stdint.h is system functionality that is none of PETSc's business to be installing. PETSc could add tests for those headers and if so, attempt to build Exodus.II. But upstream really has to at least claim to support it. (We are not Exodus.II developers. The --download-* functionality in PETSc configure is supposed to be a convenience only, but it generates a disproportionate support workload. Attempting to support a configuration that upstream does not support and that is not used by any active PETSc developer is inviting increased support workload and a poor user experience. 
You're always welcome to install Exodus.II or any other library yourself.) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From knepley at gmail.com Fri May 30 06:35:30 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 30 May 2014 06:35:30 -0500 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: <87bnugyutr.fsf@jedbrown.org> References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> <87bnugyutr.fsf@jedbrown.org> Message-ID: On Thu, May 29, 2014 at 7:27 PM, Jed Brown wrote: > Matthew Knepley writes: > > There might be an easier way to do this: > > PetscScalar val = 0.0, gval; > > > > VecGetOwnershipRange(xr, &low, &high); > > if ((myindex >= low) && (myindex < high)) { > > VecGetArray(localx1,&a); > > val = a[myindex-low]; > > VecRestoreArray(localx1, &a); > > } > > MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > > > > Now everyone has the value at myindex. > > Yes, but VecGetArray is collective so please don't do it quite this way. > Instead, write > > VecGetArray(localx1,&a); > if ((myindex >= low) && (myindex < high)) { > val = a[myindex-low]; > } > VecRestoreArray(localx1, &a); > MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > I think its better to use the non-collective version: VecGetOwnershipRange(xr, &low, &high); if ((myindex >= low) && (myindex < high)) { VecGetArrayRead(xr,&a); val = a[myindex-low]; VecRestoreArrayRead(xr, &a); } MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); Thanks Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmulas at oa-cagliari.inaf.it Fri May 30 07:42:07 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Fri, 30 May 2014 14:42:07 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> <87bnugyutr.fsf@jedbrown.org> Message-ID: On Fri, 30 May 2014, Matthew Knepley wrote: > I think its better to use the non-collective version: > > VecGetOwnershipRange(xr, &low, &high); > if ((myindex >= low) && (myindex < high)) { > ? ?VecGetArrayRead(xr,&a); > ? ?val = a[myindex-low]; > ? ?VecRestoreArrayRead(xr, &a); > } > MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); I agree, and I used the above now. In any case, as it came out of this discussion, may I suggest that the man page of EPSSetArbitrarySelection() should document that the arbitrary selection user-defined function is collective, i.e. it is called on all nodes in PETSC_COMM_WORLD (and is thus an implicit MPI syncronisation point)? As it is now, if one just looks at the docs this is unclear, and it is consequently unclear also if one may use collective calls inside that user-defined function (the answer is yes, from this discussion). One may argue that it must be so, since the user-defined function can use the eigenvectors which by definition may be nonlocal, but making this explicit would not hurt. 
Giacomo -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. : +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From hus003 at ucsd.edu Sat May 31 00:55:14 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 05:55:14 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> I'm looking at snes example ex19.c, on "nonlinear driven cavity multigrid 2d. You can also access it via the website ( http://acts.nersc.gov/petsc/example3/ex19.c.html ) There are three user defined local functions ( FormFunctionLocal, FormFunctionLocali, FormFunctionLocali4 ) that serves as discretized PDE operators declared before main, and is defined right after main. In the middle of the main, there are these four lines: 1. ierr = DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal);CHKERRQ(ierr); 2. ierr = DMMGSetFromOptions(dmmg);CHKERRQ(ierr); 3. ierr = DMMGSetSNESLocali(dmmg,FormFunctionLocali,0,admf_FormFunctionLocali);CHKERRQ(ierr); 4. ierr = DMMGSetSNESLocalib(dmmg,FormFunctionLocali4,0,admfb_FormFunctionLocali4);CHKERRQ(ierr); I have the following questions: 1. What are ad_FormFunctionLocal, admf_FormFunctionLocal from line 1? They are not defined anywhere in ex19.c. Other terms such as admf_FormFunctionLocali and admfb_FormFunctionLocali4 are also not defined anywhere in the file. 2. To me it seems like DMMGSetSNESLocal, DMMGSetSNESLocali, and DMMGSetSNESLocalib evaluates the function for all grid points, for a single grid point and for a single degree of freedom, respectively. But how does the process choose which one to use? Thanks! - Hui -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat May 31 03:43:27 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 31 May 2014 10:43:27 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ppiutk28.fsf@jedbrown.org> "Sun, Hui" writes: > I'm looking at snes example ex19.c, on "nonlinear driven cavity multigrid 2d. You can also access it via the website ( http://acts.nersc.gov/petsc/example3/ex19.c.html ) These are not the same version. The acts.nersc.gov link is a very old version of that example. The source-transformation algorithmic differentiation tool ADIC is being used to compute derivatives (the ad_* and admf_* functions). ADIC is not maintained, has an unfortunate license, and was not widely used so the "automatic" support has been removed from PETSc. Please look at the current version, which has neither DMMG (removed/merged into SNES some years ago) or ADIC. > There are three user defined local functions ( FormFunctionLocal, FormFunctionLocali, FormFunctionLocali4 ) that serves as discretized PDE operators declared before main, and is defined right after main. In the middle of the main, there are these four lines: > > 1. 
ierr = DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal);CHKERRQ(ierr); > 2. ierr = DMMGSetFromOptions(dmmg);CHKERRQ(ierr); > 3. ierr = DMMGSetSNESLocali(dmmg,FormFunctionLocali,0,admf_FormFunctionLocali);CHKERRQ(ierr); > 4. ierr = DMMGSetSNESLocalib(dmmg,FormFunctionLocali4,0,admfb_FormFunctionLocali4);CHKERRQ(ierr); > > I have the following questions: > > 1. What are ad_FormFunctionLocal, admf_FormFunctionLocal from line 1? They are not defined anywhere in ex19.c. Other terms such as admf_FormFunctionLocali and admfb_FormFunctionLocali4 are also not defined anywhere in the file. > > 2. To me it seems like DMMGSetSNESLocal, DMMGSetSNESLocali, and DMMGSetSNESLocalib evaluates the function for all grid points, for a single grid point and for a single degree of freedom, respectively. But how does the process choose which one to use? > > Thanks! - Hui -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jroman at dsic.upv.es Sat May 31 04:46:50 2014 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 31 May 2014 11:46:50 +0200 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> <87bnugyutr.fsf@jedbrown.org> Message-ID: El 30/05/2014, a las 14:42, Giacomo Mulas escribi?: > On Fri, 30 May 2014, Matthew Knepley wrote: > >> I think its better to use the non-collective version: >> VecGetOwnershipRange(xr, &low, &high); >> if ((myindex >= low) && (myindex < high)) { >> VecGetArrayRead(xr,&a); >> val = a[myindex-low]; >> VecRestoreArrayRead(xr, &a); >> } >> MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > > I agree, and I used the above now. In any case, as it came out of this > discussion, may I suggest that the man page of EPSSetArbitrarySelection() > should document that the arbitrary selection user-defined function is > collective, i.e. it is called on all nodes in PETSC_COMM_WORLD (and is thus > an implicit MPI syncronisation point)? As it is now, if one just looks at > the docs this is unclear, and it is consequently unclear also if one may use > collective calls inside that user-defined function (the answer is yes, from > this discussion). One may argue that it must be so, since the user-defined > function can use the eigenvectors which by definition may be nonlocal, but > making this explicit would not hurt. > > Giacomo > Done. https://bitbucket.org/slepc/slepc/commits/26293bc Thanks. Jose From hus003 at ucsd.edu Sat May 31 08:27:38 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 13:27:38 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <87ppiutk28.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87ppiutk28.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> Thank you Jed. The version I was using is 3.1, it is too old. ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, May 31, 2014 1:43 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c "Sun, Hui" writes: > I'm looking at snes example ex19.c, on "nonlinear driven cavity multigrid 2d. 
You can also access it via the website ( http://acts.nersc.gov/petsc/example3/ex19.c.html ) These are not the same version. The acts.nersc.gov link is a very old version of that example. The source-transformation algorithmic differentiation tool ADIC is being used to compute derivatives (the ad_* and admf_* functions). ADIC is not maintained, has an unfortunate license, and was not widely used so the "automatic" support has been removed from PETSc. Please look at the current version, which has neither DMMG (removed/merged into SNES some years ago) or ADIC. > There are three user defined local functions ( FormFunctionLocal, FormFunctionLocali, FormFunctionLocali4 ) that serves as discretized PDE operators declared before main, and is defined right after main. In the middle of the main, there are these four lines: > > 1. ierr = DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal);CHKERRQ(ierr); > 2. ierr = DMMGSetFromOptions(dmmg);CHKERRQ(ierr); > 3. ierr = DMMGSetSNESLocali(dmmg,FormFunctionLocali,0,admf_FormFunctionLocali);CHKERRQ(ierr); > 4. ierr = DMMGSetSNESLocalib(dmmg,FormFunctionLocali4,0,admfb_FormFunctionLocali4);CHKERRQ(ierr); > > I have the following questions: > > 1. What are ad_FormFunctionLocal, admf_FormFunctionLocal from line 1? They are not defined anywhere in ex19.c. Other terms such as admf_FormFunctionLocali and admfb_FormFunctionLocali4 are also not defined anywhere in the file. > > 2. To me it seems like DMMGSetSNESLocal, DMMGSetSNESLocali, and DMMGSetSNESLocalib evaluates the function for all grid points, for a single grid point and for a single degree of freedom, respectively. But how does the process choose which one to use? > > Thanks! - Hui From mairhofer at itt.uni-stuttgart.de Sat May 31 10:02:59 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Sat, 31 May 2014 17:02:59 +0200 Subject: [petsc-users] Customized Jacobi-Vector action approximation Message-ID: <5389EF23.30603@itt.uni-stuttgart.de> Hi all, I am using PETSc to solve a system of nonlinear equations arising from Density Functional Theory. Depending on the actual problem setup the residulas of the matrix-free linear solver (GMRES) stagnate and the nonlinear system converges only slowly. Besides preconditioning my second idea to improve the performance of the linear solver was to use a higher order approximation of the Jacobi-vector product. Therefore, I am trying to write a user defined subroutine that calculates the approximation of the matrix-free Jacobi-Vector product, i.e. I would like to have a routine which can replace the default 1st order approximation J(x)*v = (F(x+eps*v) - F(x) ) / eps for instance by a 2nd order approximation such as J(x)*v = (F(x+eps*v) - F(x-eps*v) ) / 2eps So assuming that I have a subroutine which claculates the approximation of J(x)*v, how do I get PETSc to use this result in the SNES solver? Thank you very much, Jonas From jed at jedbrown.org Sat May 31 10:11:31 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 31 May 2014 17:11:31 +0200 Subject: [petsc-users] Customized Jacobi-Vector action approximation In-Reply-To: <5389EF23.30603@itt.uni-stuttgart.de> References: <5389EF23.30603@itt.uni-stuttgart.de> Message-ID: <87mwdyt23g.fsf@jedbrown.org> Jonas Mairhofer writes: > Hi all, > > I am using PETSc to solve a system of nonlinear equations arising from > Density Functional Theory. 
Depending on the actual problem setup the > residuals of the matrix-free linear solver (GMRES) > stagnate and the nonlinear system converges only slowly. > Besides preconditioning my second idea to improve the performance of the > linear solver was to use a higher order approximation of the > Jacobi-vector product. Therefore, I am trying to write a user defined > subroutine that calculates the approximation of the matrix-free > Jacobi-Vector product, i.e. I would like to have a routine which can > replace the default 1st order approximation > > J(x)*v = (F(x+eps*v) - F(x) ) / eps > > for instance by a 2nd order approximation such as > > J(x)*v = (F(x+eps*v) - F(x-eps*v) ) / 2eps > > So assuming that I have a subroutine which calculates the approximation > of J(x)*v, how do I get PETSc to use this result in the SNES solver? Unless you are trying to add the centered difference code to the PETSc library, you should create a MatShell that computes the action by your formula. Note that the centered difference does not help with rounding error, so you'll likely want to use a larger step size (eps) and rely on the function having sufficient smoothness if you hope to achieve better accuracy than the one-sided difference. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat May 31 11:07:15 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 31 May 2014 11:07:15 -0500 Subject: [petsc-users] Customized Jacobi-Vector action approximation In-Reply-To: <5389EF23.30603@itt.uni-stuttgart.de> References: <5389EF23.30603@itt.uni-stuttgart.de> Message-ID: You might consider trying some of the non-Newton based nonlinear solvers now available in the development version of PETSc http://www.mcs.anl.gov/petsc/developers/index.html Here is a list of them; see their manual pages for more details -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- On May 31, 2014, at 10:02 AM, Jonas Mairhofer wrote: > Hi all, > > I am using PETSc to solve a system of nonlinear equations arising from Density Functional Theory. Depending on the actual problem setup the residuals of the matrix-free linear solver (GMRES) > stagnate and the nonlinear system converges only slowly. > Besides preconditioning my second idea to improve the performance of the linear solver was to use a higher order approximation of the Jacobi-vector product. Therefore, I am trying to write a user defined subroutine that calculates the approximation of the matrix-free Jacobi-Vector product, i.e. I would like to have a routine which can replace the default 1st order approximation > > J(x)*v = (F(x+eps*v) - F(x) ) / eps > > for instance by a 2nd order approximation such as > > J(x)*v = (F(x+eps*v) - F(x-eps*v) ) / 2eps > > So assuming that I have a subroutine which calculates the approximation of J(x)*v, how do I get PETSc to use this result in the SNES solver?
> > Thank you very much, > Jonas From hus003 at ucsd.edu Sat May 31 12:46:48 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 17:46:48 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87ppiutk28.fsf@jedbrown.org>, <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> Continue this topic. Right now I'm looking at ex19.c from PETSc v3.3 and v3.4, both containing a user defined function NonlinearGS(SNES, Vec, Vec, void*), I'm wondering why the arguments are not passed by reference or pointers? Will a copy been made for the first three arguments once NonlinearGS is called? -Hui ________________________________________ From: Sun, Hui Sent: Saturday, May 31, 2014 6:27 AM To: Jed Brown; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c Thank you Jed. The version I was using is 3.1, it is too old. ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, May 31, 2014 1:43 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c "Sun, Hui" writes: > I'm looking at snes example ex19.c, on "nonlinear driven cavity multigrid 2d. You can also access it via the website ( http://acts.nersc.gov/petsc/example3/ex19.c.html ) These are not the same version. The acts.nersc.gov link is a very old version of that example. The source-transformation algorithmic differentiation tool ADIC is being used to compute derivatives (the ad_* and admf_* functions). ADIC is not maintained, has an unfortunate license, and was not widely used so the "automatic" support has been removed from PETSc. Please look at the current version, which has neither DMMG (removed/merged into SNES some years ago) or ADIC. > There are three user defined local functions ( FormFunctionLocal, FormFunctionLocali, FormFunctionLocali4 ) that serves as discretized PDE operators declared before main, and is defined right after main. In the middle of the main, there are these four lines: > > 1. ierr = DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal);CHKERRQ(ierr); > 2. ierr = DMMGSetFromOptions(dmmg);CHKERRQ(ierr); > 3. ierr = DMMGSetSNESLocali(dmmg,FormFunctionLocali,0,admf_FormFunctionLocali);CHKERRQ(ierr); > 4. ierr = DMMGSetSNESLocalib(dmmg,FormFunctionLocali4,0,admfb_FormFunctionLocali4);CHKERRQ(ierr); > > I have the following questions: > > 1. What are ad_FormFunctionLocal, admf_FormFunctionLocal from line 1? They are not defined anywhere in ex19.c. Other terms such as admf_FormFunctionLocali and admfb_FormFunctionLocali4 are also not defined anywhere in the file. > > 2. To me it seems like DMMGSetSNESLocal, DMMGSetSNESLocali, and DMMGSetSNESLocalib evaluates the function for all grid points, for a single grid point and for a single degree of freedom, respectively. But how does the process choose which one to use? > > Thanks! 
- Hui From jed at jedbrown.org Sat May 31 12:52:06 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 31 May 2014 19:52:06 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ha45u989.fsf@jedbrown.org> "Sun, Hui" writes: > Continue this topic. Right now I'm looking at ex19.c from PETSc v3.3 > and v3.4, both containing a user defined function NonlinearGS(SNES, > Vec, Vec, void*), I'm wondering why the arguments are not passed by > reference or pointers? All PETSc objects (like SNES, Vec, etc.) are pointers to private structures. typedef struct _p_Vec *Vec; You cannot dereference the pointer because the implementation is private, but it is passed around as a pointer. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jonasmairhofer86 at gmail.com Sat May 31 14:02:47 2014 From: jonasmairhofer86 at gmail.com (Jonas Mairhofer) Date: Sat, 31 May 2014 21:02:47 +0200 Subject: [petsc-users] Customized Jacobi-Vector action approximation In-Reply-To: References: <5389EF23.30603@itt.uni-stuttgart.de> Message-ID: Thank you both for your fast answers! I agree, that it might not make a big difference using the centered difference formula, but just to get it from my list of things that could help I will try and implement it. I don't understand how I could miss this forum discussion when I was looking for a way to implement this all day yesterday, but the second link I got now from google typing "petsc MatShell" is a long discussion you had with another user on exactly what I want to do :) Just in case anyone else is looking for the same thing : http://lists.mcs.anl.gov/pipermail/petsc-users/2010-August/006821.html On Sat, May 31, 2014 at 6:07 PM, Barry Smith wrote: > > You might consider trying some of the non-Newton based nonlinear solvers > now available in the development version of PETSc > http://www.mcs.anl.gov/petsc/developers/index.html Here is a list of > them see their manual pages for more details > > > > > > On May 31, 2014, at 10:02 AM, Jonas Mairhofer < > mairhofer at itt.uni-stuttgart.de> wrote: > > > Hi all, > > > > I am using PETSc to solve a system of nonlinear equations arising from > Density Functional Theory. Depending on the actual problem setup the > residulas of the matrix-free linear solver (GMRES) > > stagnate and the nonlinear system converges only slowly. > > Besides preconditioning my second idea to improve the performance of the > linear solver was to use a higher order approximation of the Jacobi-vector > product. Therefore, I am trying to write a user defined subroutine that > calculates the approximation of the matrix-free Jacobi-Vector product, i.e. > I would like to have a routine which can replace the default 1st order > approximation > > > > J(x)*v = (F(x+eps*v) - F(x) ) / eps > > > > for instance by a 2nd order approximation such as > > > > J(x)*v = (F(x+eps*v) - F(x-eps*v) ) / 2eps > > > > So assuming that I have a subroutine which claculates the approximation > of J(x)*v, how do I get PETSc to use this result in the SNES solver? 
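A minimal sketch of the MatShell route suggested above, for later readers of this thread; it is not from the original posts. The residual routine pointer, the MFCtx context struct and the fixed step eps are assumptions, error handling is abbreviated, and the commented setup calls use 3.4-era names.

#include <petscsnes.h>

typedef struct {
  Vec       x, w, Fminus;   /* current Newton point and two work vectors            */
  PetscReal eps;            /* differencing step, chosen by the user                */
  SNES      snes;
  void      *userctx;
  PetscErrorCode (*F)(SNES,Vec,Vec,void*);  /* the user's residual routine (assumed) */
} MFCtx;

/* y = J(x)*v approximated by the centered difference (F(x+eps*v) - F(x-eps*v))/(2*eps) */
static PetscErrorCode MatMult_CenteredMF(Mat A,Vec v,Vec y)
{
  MFCtx          *ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(A,(void**)&ctx);CHKERRQ(ierr);
  ierr = VecWAXPY(ctx->w,ctx->eps,v,ctx->x);CHKERRQ(ierr);               /* w = x + eps*v    */
  ierr = (*ctx->F)(ctx->snes,ctx->w,y,ctx->userctx);CHKERRQ(ierr);       /* y = F(x + eps*v) */
  ierr = VecWAXPY(ctx->w,-ctx->eps,v,ctx->x);CHKERRQ(ierr);              /* w = x - eps*v    */
  ierr = (*ctx->F)(ctx->snes,ctx->w,ctx->Fminus,ctx->userctx);CHKERRQ(ierr);
  ierr = VecAXPY(y,-1.0,ctx->Fminus);CHKERRQ(ierr);                      /* y = F(x+eps*v) - F(x-eps*v) */
  ierr = VecScale(y,1.0/(2.0*ctx->eps));CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Hook-up sketch: create the shell, register the multiply, and supply a Jacobian
   "compute" callback whose only job is to refresh ctx->x with the current Newton
   iterate (VecCopy); the exact SNESSetJacobian calling sequence differs between
   PETSc 3.4 and later releases, so it is only indicated here.

   ierr = MatCreateShell(PETSC_COMM_WORLD,nlocal,nlocal,N,N,&mfctx,&J);CHKERRQ(ierr);
   ierr = MatShellSetOperation(J,MATOP_MULT,(void(*)(void))MatMult_CenteredMF);CHKERRQ(ierr);
   ierr = SNESSetJacobian(snes,J,J,MyRefreshJacobianPoint,&mfctx);CHKERRQ(ierr);   (MyRefreshJacobianPoint is hypothetical) */

As far as I know, PETSc's built-in matrix-free operator (MatCreateSNESMF) only implements one-sided differencing with different strategies for choosing the step (-mat_mffd_type ds or wp), so a shell like this is the natural place for a centered or higher-order formula.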
> > > > Thank you very much, > > Jonas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hus003 at ucsd.edu Sat May 31 16:38:50 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 21:38:50 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <87ha45u989.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87ha45u989.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> Thank you Jed for explaining this to me. I tried to compile and run with the following options: ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason 1). I use 2 cores and get the following output: lid velocity = 100, prandtl # = 1, grashof # = 10000 0 SNES Function norm 1111.93 1 SNES Function norm 829.129 2 SNES Function norm 532.66 3 SNES Function norm 302.926 4 SNES Function norm 3.64014 5 SNES Function norm 0.0410053 6 SNES Function norm 4.57951e-06 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 Number of SNES iterations = 6 Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. 2). I use 8 cores and get the following output: lid velocity = 100, prandtl # = 1, grashof # = 10000 0 SNES Function norm 1111.93 1 SNES Function norm 829.049 2 SNES Function norm 532.616 3 SNES Function norm 303.165 4 SNES Function norm 3.93436 Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 Number of SNES iterations = 4 Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. First of all, the two runs yields different results. Secondly, the time cost comparison doesn't seem to be scaling correctly. ( I have used petsctime.h to calculate the time cost. ) Do you have any insight of what might be missing? -Hui ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, May 31, 2014 10:52 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c "Sun, Hui" writes: > Continue this topic. Right now I'm looking at ex19.c from PETSc v3.3 > and v3.4, both containing a user defined function NonlinearGS(SNES, > Vec, Vec, void*), I'm wondering why the arguments are not passed by > reference or pointers? All PETSc objects (like SNES, Vec, etc.) are pointers to private structures. typedef struct _p_Vec *Vec; You cannot dereference the pointer because the implementation is private, but it is passed around as a pointer. 
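A side note on the hand timing with petsctime.h mentioned in the message above; this is a sketch with assumed variable names, not code from the thread. A per-event breakdown from -log_summary (renamed -log_view in later releases) is usually more telling than a single wall-clock number, and a default debugging build can easily be several times slower than one configured with --with-debugging=0.

#include <petscsnes.h>
#include <petsctime.h>

PetscLogDouble t0,t1;
ierr = PetscTime(&t0);CHKERRQ(ierr);   /* PETSc 3.4 makes PetscTime a function taking a pointer; older releases provided a macro instead */
ierr = SNESSolve(snes,NULL,x);CHKERRQ(ierr);
ierr = PetscTime(&t1);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD,"SNESSolve wall time: %g s\n",(double)(t1-t0));CHKERRQ(ierr);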
From jed at jedbrown.org Sat May 31 16:48:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 31 May 2014 23:48:28 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87zjhxsjpv.fsf@jedbrown.org> "Sun, Hui" writes: > Thank you Jed for explaining this to me. I tried to compile and run with the following options: > ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason > > 1). I use 2 cores and get the following output: > lid velocity = 100, prandtl # = 1, grashof # = 10000 > 0 SNES Function norm 1111.93 > 1 SNES Function norm 829.129 > 2 SNES Function norm 532.66 > 3 SNES Function norm 302.926 > 4 SNES Function norm 3.64014 > 5 SNES Function norm 0.0410053 > 6 SNES Function norm 4.57951e-06 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 > Number of SNES iterations = 6 > Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. > > 2). I use 8 cores and get the following output: > lid velocity = 100, prandtl # = 1, grashof # = 10000 > 0 SNES Function norm 1111.93 > 1 SNES Function norm 829.049 > 2 SNES Function norm 532.616 > 3 SNES Function norm 303.165 > 4 SNES Function norm 3.93436 > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 > Number of SNES iterations = 4 > Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. > > First of all, the two runs yields different results. The linear solve did not converge in the second case. Run a more robust linear solver. These problems can get difficult, but I think -pc_type asm -sub_pc_type lu should be sufficient. > Secondly, the time cost comparison doesn't seem to be scaling correctly. > ( I have used petsctime.h to calculate the time cost. ) 1. Run in optimized mode. 2. Don't use more processes than you have cores (I don't know if this affects you). 3. This problem is too small to take advantage of much (if any) parallelism. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From hus003 at ucsd.edu Sat May 31 18:06:37 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 23:06:37 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <87zjhxsjpv.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87zjhxsjpv.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> Thanks Jed. It converges now. With a 32 by 32 grid, it takes 3.1076 seconds on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 18.1767s and 55.0017s respectively. That seems quite reasonable. 
By the way, how do I know which matrix solver and which preconditioner is being called? Besides, I have another question: I try to program finite difference for 2D Stokes flow with Dirichlet or Neumann bdry conditions, using staggered MAC grid. I looked up all the examples in snes, there are three stokes flow examples, all of which are finite element. I was thinking about naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), then define u, v, p as three petscscalers on (i,j), but in that case u will have one more column than p and v will have one more row than p. If there is already something there in PETSc about MAC grid, then I don't have to worry about those details. Do you know any examples or references doing that? Hui ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, May 31, 2014 2:48 PM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c "Sun, Hui" writes: > Thank you Jed for explaining this to me. I tried to compile and run with the following options: > ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason > > 1). I use 2 cores and get the following output: > lid velocity = 100, prandtl # = 1, grashof # = 10000 > 0 SNES Function norm 1111.93 > 1 SNES Function norm 829.129 > 2 SNES Function norm 532.66 > 3 SNES Function norm 302.926 > 4 SNES Function norm 3.64014 > 5 SNES Function norm 0.0410053 > 6 SNES Function norm 4.57951e-06 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 > Number of SNES iterations = 6 > Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. > > 2). I use 8 cores and get the following output: > lid velocity = 100, prandtl # = 1, grashof # = 10000 > 0 SNES Function norm 1111.93 > 1 SNES Function norm 829.049 > 2 SNES Function norm 532.616 > 3 SNES Function norm 303.165 > 4 SNES Function norm 3.93436 > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 > Number of SNES iterations = 4 > Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. > > First of all, the two runs yields different results. The linear solve did not converge in the second case. Run a more robust linear solver. These problems can get difficult, but I think -pc_type asm -sub_pc_type lu should be sufficient. > Secondly, the time cost comparison doesn't seem to be scaling correctly. > ( I have used petsctime.h to calculate the time cost. ) 1. Run in optimized mode. 2. Don't use more processes than you have cores (I don't know if this affects you). 3. This problem is too small to take advantage of much (if any) parallelism. From jed at jedbrown.org Sat May 31 18:13:34 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 01 Jun 2014 01:13:34 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87sinpsfs1.fsf@jedbrown.org> "Sun, Hui" writes: > Thanks Jed. 
It converges now. With a 32 by 32 grid, it takes 3.1076 seconds on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 18.1767s and 55.0017s respectively. That seems quite reasonable. > > By the way, how do I know which matrix solver and which preconditioner is being called? -ksp_view (or -snes_view, which includes the same information once per nonlinear solve). > Besides, I have another question: I try to program finite difference > for 2D Stokes flow with Dirichlet or Neumann bdry conditions, using > staggered MAC grid. I looked up all the examples in snes, there are > three stokes flow examples, all of which are finite element. I was > thinking about naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), > then define u, v, p as three petscscalers on (i,j), but in that case u > will have one more column than p and v will have one more row than > p. If there is already something there in PETSc about MAC grid, then I > don't have to worry about those details. Do you know any examples or > references doing that? What you describe is a common approach. You set trivial "boundary conditions" for those silent dofs and otherwise ignore them. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat May 31 18:16:04 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 31 May 2014 18:16:04 -0500 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On May 31, 2014, at 6:06 PM, Sun, Hui wrote: > Thanks Jed. It converges now. With a 32 by 32 grid, it takes 3.1076 seconds on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 18.1767s and 55.0017s respectively. That seems quite reasonable. > > By the way, how do I know which matrix solver and which preconditioner is being called? Run with -snes_view (or -ts_view if using the ODE integrators). > > Besides, I have another question: I try to program finite difference for 2D Stokes flow with Dirichlet or Neumann bdry conditions, using staggered MAC grid. I looked up all the examples in snes, there are three stokes flow examples, all of which are finite element. I was thinking about naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), then define u, v, p as three petscscalers on (i,j), but in that case u will have one more column than p and v will have one more row than p. If there is already something there in PETSc about MAC grid, then I don't have to worry about those details. Do you know any examples or references doing that? Unfortunately the DMDA is not ideal for this since it only supports the same number of dof at each grid point. You need to decouple the extra ?variables? and not use their values to do a MAC grid. 
For example, in two dimensions with u (velocity in the x direction), v (velocity in the y direction) and p (pressure at cell centers), and pure Dirichlet boundary conditions, create a DMDA with a dof of three and for each cell treat the first component of the cell as u (on the lower side of the cell), the second as v (on the left side of the cell) and the third as p (at the center of the cell). For the final row of cells across the top there is no v or p, just the u along the bottoms of the cells, and for the final row of cells along the right there is only a v. So make all the "extra" equations be simply f.v[i][j] = x.v[i][j] (or x.p or x.u depending on where) and put a 1 on the diagonal of that row/column of the Jacobian. Yes, it is a little annoyingly cumbersome. Barry > > Hui > > > > ________________________________________ > From: Jed Brown [jed at jedbrown.org] > Sent: Saturday, May 31, 2014 2:48 PM > To: Sun, Hui; petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c > > "Sun, Hui" writes: > >> Thank you Jed for explaining this to me. I tried to compile and run with the following options: >> ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason >> >> 1). I use 2 cores and get the following output: >> lid velocity = 100, prandtl # = 1, grashof # = 10000 >> 0 SNES Function norm 1111.93 >> 1 SNES Function norm 829.129 >> 2 SNES Function norm 532.66 >> 3 SNES Function norm 302.926 >> 4 SNES Function norm 3.64014 >> 5 SNES Function norm 0.0410053 >> 6 SNES Function norm 4.57951e-06 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 >> Number of SNES iterations = 6 >> Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. >> >> 2). I use 8 cores and get the following output: >> lid velocity = 100, prandtl # = 1, grashof # = 10000 >> 0 SNES Function norm 1111.93 >> 1 SNES Function norm 829.049 >> 2 SNES Function norm 532.616 >> 3 SNES Function norm 303.165 >> 4 SNES Function norm 3.93436 >> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 >> Number of SNES iterations = 4 >> Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. >> >> First of all, the two runs yields different results. > > The linear solve did not converge in the second case. > > Run a more robust linear solver. These problems can get difficult, but > I think -pc_type asm -sub_pc_type lu should be sufficient. > >> Secondly, the time cost comparison doesn't seem to be scaling correctly. >> ( I have used petsctime.h to calculate the time cost. ) > > 1. Run in optimized mode. > > 2. Don't use more processes than you have cores (I don't know if this > affects you). > > 3. This problem is too small to take advantage of much (if any) > parallelism.
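To make the layout Barry describes above concrete, here is a minimal sketch (not from the thread) of a residual over a dof-3 DMDA with the unused staggered components in the last row/column pinned by identity equations. The Field struct, the FormFunctionLocalMAC name and the elided interior stencil are placeholders, and the indexing already follows the f[j][i].v convention noted a little further down in the thread.

#include <petscdmda.h>

typedef struct { PetscScalar u,v,p; } Field;   /* one cell: u and v on two of its edges, p at its center */

/* Local residual in the style of the DMDA/SNES "local" callbacks; only the handling
   of the extra staggered unknowns is shown, the interior MAC stencil is elided. */
PetscErrorCode FormFunctionLocalMAC(DMDALocalInfo *info,Field **x,Field **f,void *ctx)
{
  PetscInt i,j;

  PetscFunctionBeginUser;
  for (j=info->ys; j<info->ys+info->ym; j++) {
    for (i=info->xs; i<info->xs+info->xm; i++) {
      if (j == info->my-1 && i == info->mx-1) {      /* top-right corner cell: no real unknowns live here */
        f[j][i].u = x[j][i].u; f[j][i].v = x[j][i].v; f[j][i].p = x[j][i].p;
      } else if (j == info->my-1) {                  /* top row: only the u on the cell bottom is real */
        f[j][i].u = 0.0;  /* momentum residual or boundary condition for that u, elided */
        f[j][i].v = x[j][i].v;                       /* silent dof: identity equation */
        f[j][i].p = x[j][i].p;                       /* silent dof: identity equation */
      } else if (i == info->mx-1) {                  /* right column: only the v on the cell left edge is real */
        f[j][i].u = x[j][i].u;
        f[j][i].v = 0.0;  /* momentum residual or boundary condition for that v, elided */
        f[j][i].p = x[j][i].p;
      } else {
        /* interior cell: standard MAC momentum and continuity residuals, elided */
        f[j][i].u = 0.0; f[j][i].v = 0.0; f[j][i].p = 0.0;
      }
    }
  }
  PetscFunctionReturn(0);
}

If an analytic Jacobian is assembled, the rows and columns belonging to the pinned components simply carry a 1 on the diagonal, exactly as described above.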
From hus003 at ucsd.edu Sat May 31 18:28:20 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 23:28:20 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU>, Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B760C@XMAIL-MBX-BH1.AD.UCSD.EDU> Thank you Jed and Barry for being very helpful answering all my questions! Right now, I have GMRES as the solver, how do I change it to BiCGStab? Best, Hui ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Saturday, May 31, 2014 4:16 PM To: Sun, Hui Cc: Jed Brown; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c On May 31, 2014, at 6:06 PM, Sun, Hui wrote: > Thanks Jed. It converges now. With a 32 by 32 grid, it takes 3.1076 seconds on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 18.1767s and 55.0017s respectively. That seems quite reasonable. > > By the way, how do I know which matrix solver and which preconditioner is being called? Run with -snes_view (or -ts_view if using the ODE integrators). > > Besides, I have another question: I try to program finite difference for 2D Stokes flow with Dirichlet or Neumann bdry conditions, using staggered MAC grid. I looked up all the examples in snes, there are three stokes flow examples, all of which are finite element. I was thinking about naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), then define u, v, p as three petscscalers on (i,j), but in that case u will have one more column than p and v will have one more row than p. If there is already something there in PETSc about MAC grid, then I don't have to worry about those details. Do you know any examples or references doing that? Unfortunately the DMDA is not ideal for this since it only supports the same number of dof at each grid point. You need to decouple the extra ?variables? and not use their values to do a MAC grid. For example in two dimensions with u (velocity in x direction), v (velocity in y direction) and p (pressure at cell centers), and pure Dirichlet boundary conditions then create a DMDA with a dof of three and for each cell treat the first component of the cell as u (on the lower side of cell) , the second as v (on left side of cell) and the third as p (on center of cell). For the final row of cells across the top there is no v or p, just the u along the bottoms of the cells and for the final row of cells along the right there is only a v. So make all the ?extra? equations be simply f.v[i][j] = x.v[i][j] (or x.p or x.u depending on where) and put a 1 on the diagonal of that row/column of the Jacobian). Yes it is a little annoyingly cumbersome. Barry > > Hui > > > > ________________________________________ > From: Jed Brown [jed at jedbrown.org] > Sent: Saturday, May 31, 2014 2:48 PM > To: Sun, Hui; petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c > > "Sun, Hui" writes: > >> Thank you Jed for explaining this to me. 
I tried to compile and run with the following options: >> ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason >> >> 1). I use 2 cores and get the following output: >> lid velocity = 100, prandtl # = 1, grashof # = 10000 >> 0 SNES Function norm 1111.93 >> 1 SNES Function norm 829.129 >> 2 SNES Function norm 532.66 >> 3 SNES Function norm 302.926 >> 4 SNES Function norm 3.64014 >> 5 SNES Function norm 0.0410053 >> 6 SNES Function norm 4.57951e-06 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 >> Number of SNES iterations = 6 >> Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. >> >> 2). I use 8 cores and get the following output: >> lid velocity = 100, prandtl # = 1, grashof # = 10000 >> 0 SNES Function norm 1111.93 >> 1 SNES Function norm 829.049 >> 2 SNES Function norm 532.616 >> 3 SNES Function norm 303.165 >> 4 SNES Function norm 3.93436 >> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 >> Number of SNES iterations = 4 >> Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. >> >> First of all, the two runs yields different results. > > The linear solve did not converge in the second case. > > Run a more robust linear solver. These problems can get difficult, but > I think -pc_type asm -sub_pc_type lu should be sufficient. > >> Secondly, the time cost comparison doesn't seem to be scaling correctly. >> ( I have used petsctime.h to calculate the time cost. ) > > 1. Run in optimized mode. > > 2. Don't use more processes than you have cores (I don't know if this > affects you). > > 3. This problem is too small to take advantage of much (if any) > parallelism. From jed at jedbrown.org Sat May 31 18:29:18 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 01 Jun 2014 01:29:18 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ppitsf1t.fsf@jedbrown.org> Barry Smith writes: >So make all the ?extra? equations be simply f.v[i][j] = x.v[i][j] (or >x.p or x.u depending on where) and put a 1 on the diagonal of that >row/column of the Jacobian). This would typically be f[j][i].v = x[j][i].v, etc. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jed at jedbrown.org Sat May 31 18:46:50 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 01 Jun 2014 01:46:50 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B760C@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B760C@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ha45se8l.fsf@jedbrown.org> "Sun, Hui" writes: > Thank you Jed and Barry for being very helpful answering all my > questions! Right now, I have GMRES as the solver, how do I change it > to BiCGStab? -ksp_type bcgs -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL:
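For completeness, the solver choices discussed in this thread can also be fixed in code rather than on the command line. A short sketch follows (the snes handle is assumed to exist already; the LU sub-solver for the ASM blocks is left to the options database):

KSP ksp;
PC  pc;
ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr);
ierr = KSPSetType(ksp,KSPBCGS);CHKERRQ(ierr);    /* BiCGStab, equivalent to -ksp_type bcgs */
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);        /* equivalent to -pc_type asm */
/* The LU sub-solver is easiest to request at run time, e.g.
   ./ex19 -ksp_type bcgs -pc_type asm -sub_pc_type lu -snes_monitor_short -snes_converged_reason */
ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);   /* call last so command-line options can still override */

Running with -snes_view (or -ksp_view) then confirms which Krylov method and preconditioner were actually used, as noted earlier in the thread.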