From D.Lathouwers at tudelft.nl Thu May 1 02:35:35 2014 From: D.Lathouwers at tudelft.nl (Danny Lathouwers - TNW) Date: Thu, 1 May 2014 07:35:35 +0000 Subject: [petsc-users] singular matrix solve using MAT_SHIFT_POSITIVE_DEFINITE option In-Reply-To: References: <4E6B33F4128CED4DB307BA83146E9A64258E28C4@SRV362.tudelft.net> Message-ID: <4E6B33F4128CED4DB307BA83146E9A64258E2E72@SRV362.tudelft.net> Thank you Matt. I?ll try that soon. Will let you know if this works for me. From: Matthew Knepley [mailto:knepley at gmail.com] Sent: donderdag 1 mei 2014 0:09 To: Danny Lathouwers - TNW Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] singular matrix solve using MAT_SHIFT_POSITIVE_DEFINITE option On Wed, Apr 30, 2014 at 2:53 PM, Danny Lathouwers - TNW > wrote: Dear users, I encountered a strange problem. I have a singular matrix P (Poisson, Neumann boundary conditions, N=4). The rhs b sums to 0. If I hand-fill the matrix with the right entries (non-zeroes only) things work with KSPCG and ICC preconditioning and using the MAT_SHIFT_POSITIVE_DEFINITE option. Convergence in 2 iterations to (a) correct solution. So far for the debugging problem. That option changes the preconditioner matrix to (alpha I + P). I don't know of a theoretical reason that this should be a good preconditioner, but perhaps it exists. Certainly ICC is exquisitely sensitive (you can easily write down matrices where an epsilon change destroys convergence). Yes, you should use null space, and here it is really easy -ksp_constant_null_space Its possible that this fixes your convergence, if the ICC perturbation was introducing components in the null space to your solution. Matt My real problem computes P from D * M * D^T. If I do this I get the same matrix (on std out I do not see the difference to all digits). The system P * x = b now does NOT converge. More strange is that is if I remove the zeroes from D then things do work again. Either things are overly sensitive or I am misusing petsc. It does work when using e.g. the AMG preconditioner (again it is a correct but different solution). So system really seems OK. Should I also use the Null space commands as I have seen in some of the examples as well? But, I recall from many years ago when using MICCG (alpha) preconditioning that no such tricks were needed for CG with Poisson-Neumann. I am supposing the MAT_SHIFT_POSITIVE_DEFINITE option does something similar as MICCG. For clarity I have included the code (unfortunately this is the smallest I could get it; it?s quite straightforward though). By setting the value of option to 1 in main.f90 the code use P = D * M * D^T otherwise it will use the hand-filled matrix. The code prints the matrix P and solution etc. Anyone any hints on this? What other preconditioners (serial) are suitable for this problem besides ICC/AMG? Thanks very much. Danny Lathouwers -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Thu May 1 10:32:44 2014 From: epscodes at gmail.com (Xiangdong) Date: Thu, 1 May 2014 11:32:44 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: Under what condition, SNESGetFunctionNorm() will output different results from SENEGetFunction + VecNorm (with NORM_2)? 
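A minimal C sketch (against the petsc-3.4-era API used in this thread, with snes assumed to be an already configured SNES object) of the two quantities being compared in this question: the norm cached by the solver via SNESGetFunctionNorm() versus the 2-norm recomputed from the residual vector returned by SNESGetFunction(). The helper name is made up for illustration.

#include <petscsnes.h>

/* Sketch: compare the solver's cached residual norm with the norm
   recomputed from the residual vector itself. Assumes `snes` has been
   set up and solved. SNESGetFunctionNorm() is the convenience routine
   that, as noted later in this thread, is slated for removal. */
PetscErrorCode CompareResidualNorms(SNES snes)
{
  Vec            r;
  PetscReal      cached, recomputed;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetFunctionNorm(snes, &cached);CHKERRQ(ierr);
  ierr = SNESGetFunction(snes, &r, NULL, NULL);CHKERRQ(ierr);  /* residual vector F(x) */
  ierr = VecNorm(r, NORM_2, &recomputed);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "cached norm %g, recomputed norm %g\n",
                     (double)cached, (double)recomputed);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

As the replies later in this thread explain, the two values agree for ordinary Newton solves but can differ when the solver has worked with a reduced (active-set) system, e.g. with the SNES VI solvers.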
For most of my test cases, it is the same. However, when I have some special (trivial) initial guess to the SNES problem, I see different norms. Another phenomenon I noticed with this is that KSP in SNES squeeze my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, and the rhs and solution is with length 25. Do you have any clue on what triggered this? To my surprise, when I output the Jacobian inside the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct numerical entries. Why does the operator obtained from KSP is different and got rows eliminated? These rows got eliminated have only one entries per row, but the rhs in that row is not zero. Eliminating these rows would give wrong solutions. Thank you. Xiangdong On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley wrote: > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > >> It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo >> *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize >> the array f. Zero the array f solved the problem and gave consistent result. >> >> Just curious, why does not petsc initialize the array f to zero by >> default inside petsc when passing the f array to FormFunctionLocal? >> > > If you directly set entires, you might not want us to spend the time > writing those zeros. > > >> I have another quick question about the array x passed to >> FormFunctionLocal. If I want to know the which x is evaluated, how can I >> output x in a vector format? Currently, I created a global vector vecx and >> a local vector vecx_local, get the array of vecx_local_array, copy the x to >> vecx_local_array, scatter to global vecx and output vecx. Is there a quick >> way to restore the array x to a vector and output? >> > > I cannot think of a better way than that. > > Matt > > >> Thank you. >> >> Best, >> Xiangdong >> >> >> >> On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith wrote: >> >>> >>> On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: >>> >>> > Hello everyone, >>> > >>> > When I run snes program, >>> >>> ^^^^ what SNES program?? >>> >>> > it outputs "SNES Function norm 1.23456789e+10". It seems that this >>> norm is different from residue norm (even if solving F(x)=0) >>> >>> Please send the full output where you see this. >>> >>> > and also differ from norm of the Jacobian. What is the definition of >>> this "SNES Function Norm?? >>> >>> The SNES Function Norm as printed by PETSc is suppose to the 2-norm >>> of F(x) - b (where b is usually zero) and this is also the same thing as >>> the ?residue norm? >>> >>> Barry >>> >>> > >>> > Thank you. >>> > >>> > Best, >>> > Xiangdong >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hemak at asu.edu Thu May 1 12:51:44 2014 From: hemak at asu.edu (Hema Krishnamurthy) Date: Thu, 1 May 2014 17:51:44 +0000 Subject: [petsc-users] SVD Implementation Message-ID: <842702CC6788EE46BFA492B17777501B0C66FE17@exmbw01.asurite.ad.asu.edu> Hi, Could someone please explain as to why the input data to SVD is being scaled in PETSc? HANDLER(MatScale(Y,1./sqrt(nColsGlobal-1))); // data'T / sqrt(N-1) is being done before the call to SVDCreate() -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu May 1 12:58:27 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 May 2014 12:58:27 -0500 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: On May 1, 2014, at 10:32 AM, Xiangdong wrote: > Under what condition, SNESGetFunctionNorm() will output different results from SENEGetFunction + VecNorm (with NORM_2)? > > For most of my test cases, it is the same. However, when I have some special (trivial) initial guess to the SNES problem, I see different norms. Please send more details on your ?trivial? case where the values are different. It could be that we are not setting the function norm properly on early exit from the solvers. > > Another phenomenon I noticed with this is that KSP in SNES squeeze my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, and the rhs and solution is with length 25. Do you have any clue on what triggered this? To my surprise, when I output the Jacobian inside the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct numerical entries. Why does the operator obtained from KSP is different and got rows eliminated? These rows got eliminated have only one entries per row, but the rhs in that row is not zero. Eliminating these rows would give wrong solutions. Hmm, we never squeeze out rows/columns from the Jacobian. The size of the Jacobian set with SNESSetJacobian() should always match that obtained with KSPGetOperators() on the linear system. Please send more details on how you get this. Are you calling the KSPGetOperators() inside a preconditioner where the the preconditioner has chopped up the operator? Barry > > Thank you. > > Xiangdong > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley wrote: > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize the array f. Zero the array f solved the problem and gave consistent result. > > Just curious, why does not petsc initialize the array f to zero by default inside petsc when passing the f array to FormFunctionLocal? > > If you directly set entires, you might not want us to spend the time writing those zeros. > > I have another quick question about the array x passed to FormFunctionLocal. If I want to know the which x is evaluated, how can I output x in a vector format? Currently, I created a global vector vecx and a local vector vecx_local, get the array of vecx_local_array, copy the x to vecx_local_array, scatter to global vecx and output vecx. Is there a quick way to restore the array x to a vector and output? > > I cannot think of a better way than that. > > Matt > > Thank you. > > Best, > Xiangdong > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith wrote: > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > Hello everyone, > > > > When I run snes program, > > ^^^^ what SNES program?? > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this norm is different from residue norm (even if solving F(x)=0) > > Please send the full output where you see this. > > > and also differ from norm of the Jacobian. What is the definition of this "SNES Function Norm?? 
> > The SNES Function Norm as printed by PETSc is suppose to the 2-norm of F(x) - b (where b is usually zero) and this is also the same thing as the ?residue norm? > > Barry > > > > > Thank you. > > > > Best, > > Xiangdong > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From epscodes at gmail.com Thu May 1 14:43:09 2014 From: epscodes at gmail.com (Xiangdong) Date: Thu, 1 May 2014 15:43:09 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: Here is the order of functions I called: DMDACreate3d(); SNESCreate(); SNESSetDM(); (DM with dof=2); DMSetApplicationContext(); DMDASNESSetFunctionLocal(); SNESVISetVariableBounds(); DMDASNESetJacobianLocal(); SNESSetFromOptions(); SNESSolve(); SNESGetKSP(); KSPGetSolution(); KSPGetRhs(); KSPGetOperators(); //get operator kspA, kspx, kspb; SNESGetFunctionNorm(); ==> get norm fnorma; SNESGetFunction(); VecNorm(); ==> get norm fnormb; SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the solution x and get norm fnormc; Inside the FormJacobianLocal(), I output the matrix jac and preB; I found that fnorma matches the default SNES monitor output "SNES Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx obtained by snescomputefunction, mat jac and preB are length 50 or 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; x(2:2:end)=0; It seems that it completely ignores the second degree of freedom (setting it to zero). I saw this for (close to) constant initial guess, while for heterogeneous initial guess, it works fine and the matrix and vector size are correct, and the solution is correct. So this eliminating row behavior seems to be initial guess dependent. I saw this even if I use snes_fd, so we can rule out the possibility of wrong Jacobian. For the FormFunctionLocal(), I checked via SNESComputeFunction and it output the correct vector of residue. Are the orders of function calls correct? Thank you. Xiangdong On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > > On May 1, 2014, at 10:32 AM, Xiangdong wrote: > > > Under what condition, SNESGetFunctionNorm() will output different > results from SENEGetFunction + VecNorm (with NORM_2)? > > > > For most of my test cases, it is the same. However, when I have some > special (trivial) initial guess to the SNES problem, I see different norms. > > Please send more details on your ?trivial? case where the values are > different. It could be that we are not setting the function norm properly > on early exit from the solvers. > > > > Another phenomenon I noticed with this is that KSP in SNES squeeze my > matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When > I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, > and the rhs and solution is with length 25. Do you have any clue on what > triggered this? To my surprise, when I output the Jacobian inside the > FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct > numerical entries. Why does the operator obtained from KSP is different and > got rows eliminated? These rows got eliminated have only one entries per > row, but the rhs in that row is not zero. Eliminating these rows would give > wrong solutions. 
> > Hmm, we never squeeze out rows/columns from the Jacobian. The size of > the Jacobian set with SNESSetJacobian() should always match that obtained > with KSPGetOperators() on the linear system. Please send more details on > how you get this. Are you calling the KSPGetOperators() inside a > preconditioner where the the preconditioner has chopped up the operator? > > Barry > > > > > Thank you. > > > > Xiangdong > > > > > > > > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley > wrote: > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo > *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize > the array f. Zero the array f solved the problem and gave consistent result. > > > > Just curious, why does not petsc initialize the array f to zero by > default inside petsc when passing the f array to FormFunctionLocal? > > > > If you directly set entires, you might not want us to spend the time > writing those zeros. > > > > I have another quick question about the array x passed to > FormFunctionLocal. If I want to know the which x is evaluated, how can I > output x in a vector format? Currently, I created a global vector vecx and > a local vector vecx_local, get the array of vecx_local_array, copy the x to > vecx_local_array, scatter to global vecx and output vecx. Is there a quick > way to restore the array x to a vector and output? > > > > I cannot think of a better way than that. > > > > Matt > > > > Thank you. > > > > Best, > > Xiangdong > > > > > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith > wrote: > > > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > > > Hello everyone, > > > > > > When I run snes program, > > > > ^^^^ what SNES program?? > > > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this > norm is different from residue norm (even if solving F(x)=0) > > > > Please send the full output where you see this. > > > > > and also differ from norm of the Jacobian. What is the definition of > this "SNES Function Norm?? > > > > The SNES Function Norm as printed by PETSc is suppose to the 2-norm > of F(x) - b (where b is usually zero) and this is also the same thing as > the ?residue norm? > > > > Barry > > > > > > > > Thank you. > > > > > > Best, > > > Xiangdong > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jingyue at gmail.com Thu May 1 15:32:48 2014 From: jingyue at gmail.com (Jingyue Wang) Date: Thu, 01 May 2014 15:32:48 -0500 Subject: [petsc-users] SLEPc configuration problem Message-ID: <5362AF70.9050909@gmail.com> Hi, Can anyone please help me on how to configure SLEPc? I have installed PETSc 3.4.4 (compiled with MKL) and downloaded and extracted the source of SLEPc 3.4.4. I set up export SLEPC_DIR="/home/jwang/opt/slepc-3.4.4" export PETSC_DIR="/home/jwang/opt/petsc-3.4.4" export PETSC_ARCH=linux-amd64-opt However, after I enter the source directory of SLEPc and type ./configure, I got the error messages that I append at the end of the email. 
I tried to read the Python configuration code, and the cause appears to be that self.framework is None. It is None because, in script.py in my petsc-3.4.4/config/BuildSystem directory, the following code in the function loadConfigure(self, argDB = None): ..... if not 'configureCache' in argDB: self.logPrint('No cached configure in RDict at '+str(argDB.saveFilename)) return None ..... returns a None value. It seems that SLEPc cannot find a cached configuration in PETSc, but I don't know how to enable such a cached configuration in PETSc... ***********************Error messages***************************************** Checking environment... Checking PETSc installation... Checking LAPACK library...
> > Traceback (most recent call last): > File "./configure", line 10, in > execfile(os.path.join(os.path.dirname(__file__), 'config', 'configure.py')) > File "./config/configure.py", line 401, in > cmakeok = cmakeboot.main(slepcdir,petscdir,petscarch=petscconf.ARCH,log=log) > File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 172, in main > return PETScMaker(slepcdir,petscdir,petscarch,argDB,framework).cmakeboot(args,log) > File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 87, in cmakeboot > self.setup() > File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 83, in setup > self.setupModules() > File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 51, in setupModules > self.mpi = self.framework.require('config.packages.MPI', None) > AttributeError: 'NoneType' object has no attribute 'require' > This problem has been reported before and it may happen occasionally. Check the file $PETSC_DIR/$PETSC_ARCH/conf/RDict.db - see if it has a smaller size than usual. If this is the case, then the problem is that PETSc's configuration did not write this file completely, I don't know the reason. Suggest to reconfigure PETSc. Jose From popov at uni-mainz.de Thu May 1 17:25:06 2014 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 2 May 2014 00:25:06 +0200 Subject: [petsc-users] Assembling a matrix for a DMComposite vector In-Reply-To: References: <535F6D64.5040700@uni-mainz.de> Message-ID: <5362C9C2.40602@uni-mainz.de> On 5/1/14 10:39 PM, Anush Krishnan wrote: > Hi Anton, > > On 29 April 2014 05:14, Anton Popov > wrote: > > > You can do the whole thing much easier (to my opinion). > Since you created two DMDA anyway, just do: > > - find first index on every processor using MPI_Scan > - create two global vectors (no ghosts) > - put proper global indicies to global vectors > - create two local vectors (with ghosts) and set ALL entries to -1 > (to have what you need in boundary ghosts) > - call global-to-local scatter > > Done! > > > Won't the vectors contain floating point values? Are you storing your > indices as real numbers? YES, exactly. And then I cast them to PetscInt when I compose stencils. Something like this: idx[0] = (PetscInt) ivx[k][j][i]; idx[1] = (PetscInt) ivx[k][j][i+1]; idx[2] = (PetscInt) ivy[k][j][i]; ... and so on, where ivx, ivy, ... are the index arrays in x, y .. directions Then I insert (actually add) stencils using MatSetValues. By the way, you can ideally preallocate in parallel with MatMPIAIJSetPreallocation. To count precisely number entries in the diagonal & off-diagonal blocks use the same mechanism to easily access global indices, and then compare them with the local row range, which is also known: - within the range -> d_nnz[i]++; - outside the range -> o_nnz[i]++; Anton > > > The advantage is that you can access global indices (including > ghosts) in every block using i-j-k indexing scheme. > I personally find this way quite easy to implement with PETSc > > Anton > > > Thank you, > Anush -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Thu May 1 18:14:30 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Thu, 1 May 2014 18:14:30 -0500 Subject: [petsc-users] Adaptive mesh refinement in Petsc Message-ID: Hello everybody I want to implement an adaptive mesh refinement library in a code written in petsc. 
I have checked out some of the available libraries, but I want to work with the latest petsc-dev version and I am sure there will be many incompatibilities. So far I think I'll end up working with one of these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start checking out each of them and learn how to use them I though I would ask you guys which one you would recommend. My code would be a finite element analysis in solid mechanics. I would like to take full advantage of petsc capabilities, but I would not mind start with some restrictions. I hope my question is not too broad. Take care Miguel -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 1 19:19:32 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 1 May 2014 19:19:32 -0500 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: References: Message-ID: On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > Hello everybody > > I want to implement an adaptive mesh refinement library in a code written > in petsc. I have checked out some of the available libraries, but I want to > work with the latest petsc-dev version and I am sure there will be many > incompatibilities. So far I think I'll end up working with one of these > libraries: SAMRAI, Chombo, libMesh and deal II. Before I start checking out > each of them and learn how to use them I though I would ask you guys which > one you would recommend. My code would be a finite element analysis in > solid mechanics. I would like to take full advantage of petsc capabilities, > but I would not mind start with some restrictions. I hope my question is > not too broad. > SAMRAI, Chombo, and Deal II are all structured adaptive refinement codes, whereas LibMesh is unstructured. If you want unstructured, there is really no other game in town. If you use deal II, I would suggest trying out p4est underneath which gives great scalability. My understanding is that Chombo is mostly used for finite volume and SAMRAI and deal II for finite element, but this could be out of date. Matt > Take care > Miguel > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From griffith at cims.nyu.edu Thu May 1 19:25:33 2014 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 1 May 2014 20:25:33 -0400 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: References: Message-ID: > On May 1, 2014, at 8:19 PM, Matthew Knepley wrote: > >> On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya wrote: >> Hello everybody >> >> I want to implement an adaptive mesh refinement library in a code written in petsc. I have checked out some of the available libraries, but I want to work with the latest petsc-dev version and I am sure there will be many incompatibilities. 
So far I think I'll end up working with one of these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start checking out each of them and learn how to use them I though I would ask you guys which one you would recommend. My code would be a finite element analysis in solid mechanics. I would like to take full advantage of petsc capabilities, but I would not mind start with some restrictions. I hope my question is not too broad. > > SAMRAI, Chombo, and Deal II are all structured adaptive refinement codes, whereas LibMesh is unstructured. If you want unstructured, there is > really no other game in town. If you use deal II, I would suggest trying out p4est underneath which gives great scalability. My understanding > is that Chombo is mostly used for finite volume and SAMRAI and deal II for finite element, but this could be out of date. SAMRAI is definitely much better suited to finite volume, although it does have basic features needed for structured-grid FE. -- Boyce > > Matt > >> Take care >> Miguel >> >> -- >> Miguel Angel Salazar de Troya >> Graduate Research Assistant >> Department of Mechanical Science and Engineering >> University of Illinois at Urbana-Champaign >> (217) 550-2360 >> salaza11 at illinois.edu > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 1 19:31:51 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 May 2014 19:31:51 -0500 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: References: Message-ID: <48DFB171-4663-4E30-96A6-80103400ACD2@mcs.anl.gov> You also could likely benefit from Moose http://www.mooseframework.org it sits on top of libMesh which sits on top of PETSc and manages almost all of what you need for finite element analysis. Barry On May 1, 2014, at 7:19 PM, Matthew Knepley wrote: > On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya wrote: > Hello everybody > > I want to implement an adaptive mesh refinement library in a code written in petsc. I have checked out some of the available libraries, but I want to work with the latest petsc-dev version and I am sure there will be many incompatibilities. So far I think I'll end up working with one of these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start checking out each of them and learn how to use them I though I would ask you guys which one you would recommend. My code would be a finite element analysis in solid mechanics. I would like to take full advantage of petsc capabilities, but I would not mind start with some restrictions. I hope my question is not too broad. > > SAMRAI, Chombo, and Deal II are all structured adaptive refinement codes, whereas LibMesh is unstructured. If you want unstructured, there is > really no other game in town. If you use deal II, I would suggest trying out p4est underneath which gives great scalability. My understanding > is that Chombo is mostly used for finite volume and SAMRAI and deal II for finite element, but this could be out of date. 
> > Matt > > Take care > Miguel > > -- > Miguel Angel Salazar de Troya > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From friedmud at gmail.com Thu May 1 20:04:32 2014 From: friedmud at gmail.com (Derek Gaston) Date: Thu, 1 May 2014 19:04:32 -0600 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: <48DFB171-4663-4E30-96A6-80103400ACD2@mcs.anl.gov> References: <48DFB171-4663-4E30-96A6-80103400ACD2@mcs.anl.gov> Message-ID: Miguel, I'm the lead for the MOOSE Framework project Barry spoke of... we would love to help you get up and running with adaptive finite elements for solid mechanics with MOOSE. If you are doing fairly normal solid mechanics using small or large strain formulations with some plasticity... most of what you need is already there. You may need to plug in your particular material model but that's about it. Mesh adaptivity is built-in and should work out of the box. The major benefit of using MOOSE is that you can easily couple in other physics (like heat conduction, chemistry and more) and of course you have full access to all the power of PETSc. I recommend going through the Getting Started material on http://www.mooseframework.org to get set up... and go ahead and create yourself a new Application using these instructions: http://mooseframework.org/create-an-app/ . That Application will already have full access to our solid mechanics capabilities (as well as tons of other stuff like heat conduction, chemistry, etc.). After that - join up on the moose-users mailing list and you can get in touch with everyone else doing solid mechanics with MOOSE who can point you in the right direction depending on your particular application. Let me know if you have any questions... Derek On Thu, May 1, 2014 at 6:31 PM, Barry Smith wrote: > > You also could likely benefit from Moose http://www.mooseframework.orgit sits on top of libMesh which sits on top of PETSc and manages almost all > of what you need for finite element analysis. > > Barry > > On May 1, 2014, at 7:19 PM, Matthew Knepley wrote: > > > On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya < > salazardetroya at gmail.com> wrote: > > Hello everybody > > > > I want to implement an adaptive mesh refinement library in a code > written in petsc. I have checked out some of the available libraries, but I > want to work with the latest petsc-dev version and I am sure there will be > many incompatibilities. So far I think I'll end up working with one of > these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start > checking out each of them and learn how to use them I though I would ask > you guys which one you would recommend. My code would be a finite element > analysis in solid mechanics. I would like to take full advantage of petsc > capabilities, but I would not mind start with some restrictions. I hope my > question is not too broad. > > > > SAMRAI, Chombo, and Deal II are all structured adaptive refinement > codes, whereas LibMesh is unstructured. If you want unstructured, there is > > really no other game in town. If you use deal II, I would suggest trying > out p4est underneath which gives great scalability. 
My understanding > > is that Chombo is mostly used for finite volume and SAMRAI and deal II > for finite element, but this could be out of date. > > > > Matt > > > > Take care > > Miguel > > > > -- > > Miguel Angel Salazar de Troya > > Graduate Research Assistant > > Department of Mechanical Science and Engineering > > University of Illinois at Urbana-Champaign > > (217) 550-2360 > > salaza11 at illinois.edu > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Thu May 1 21:12:38 2014 From: epscodes at gmail.com (Xiangdong) Date: Thu, 1 May 2014 22:12:38 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: I came up with a simple example to demonstrate this "eliminating row" behavior. It happens when the solution x to the linearized equation Ax=b is out of the bound set by SNESVISetVariableBounds(); In the attached example, I use snes to solve a simple function x-b=0. When you run it, it outputs the matrix as 25 rows, while the real Jacobian should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be -inf, it will output 50 rows for the Jacobian. In the first case, the norm given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. In solving the nonlinear equations, it is likely that the solution of the linearized equation is out of bound, but then we can reset the out-of-bound solution to be lower or upper bound instead of eliminating the variables (the rows). Any suggestions on doing this in petsc? Thank you. Best, Xiangdong P.S. If we change the lower bound of field u (line 124) to be zero, then the Jacobian matrix is set to be NULL by petsc. On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: > Here is the order of functions I called: > > DMDACreate3d(); > > SNESCreate(); > > SNESSetDM(); (DM with dof=2); > > DMSetApplicationContext(); > > DMDASNESSetFunctionLocal(); > > SNESVISetVariableBounds(); > > DMDASNESetJacobianLocal(); > > SNESSetFromOptions(); > > SNESSolve(); > > SNESGetKSP(); > KSPGetSolution(); > KSPGetRhs(); > KSPGetOperators(); //get operator kspA, kspx, kspb; > > SNESGetFunctionNorm(); ==> get norm fnorma; > SNESGetFunction(); VecNorm(); ==> get norm fnormb; > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the > solution x and get norm fnormc; > > Inside the FormJacobianLocal(), I output the matrix jac and preB; > > I found that fnorma matches the default SNES monitor output "SNES Function > norm", but fnormb=fnormc != fnorma. The solution x, the residue fx obtained > by snescomputefunction, mat jac and preB are length 50 or 50-by-50, while > the kspA, kspx, kspb are 25-by-25 or length 25. > > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; > x(2:2:end)=0; It seems that it completely ignores the second degree of > freedom (setting it to zero). I saw this for (close to) constant initial > guess, while for heterogeneous initial guess, it works fine and the matrix > and vector size are correct, and the solution is correct. So this > eliminating row behavior seems to be initial guess dependent. > > I saw this even if I use snes_fd, so we can rule out the possibility of > wrong Jacobian. 
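A short C sketch (a hypothetical helper, not code from the attached exdemo.c) of the size check being described here, using the petsc-3.4 calling sequences from this thread: compare the Jacobian held by the SNES with the operator the inner KSP actually used. With the VI solver the KSP may be handed a reduced (active-set) system, so the two sizes can legitimately differ.

#include <petscsnes.h>

/* Hypothetical helper: print the size of the SNES Jacobian and of the
   operator attached to the inner KSP. Uses the petsc-3.4 signature of
   KSPGetOperators(), which still has a trailing MatStructure argument. */
PetscErrorCode CheckOperatorSizes(SNES snes)
{
  KSP            ksp;
  Mat            J, A;
  PetscInt       m, n, km, kn;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetJacobian(snes, &J, NULL, NULL, NULL);CHKERRQ(ierr);
  ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
  ierr = KSPGetOperators(ksp, &A, NULL, NULL);CHKERRQ(ierr);
  ierr = MatGetSize(J, &m, &n);CHKERRQ(ierr);
  ierr = MatGetSize(A, &km, &kn);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "SNES Jacobian is %D x %D, KSP operator is %D x %D\n",
                     m, n, km, kn);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}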
For the FormFunctionLocal(), I checked via > SNESComputeFunction and it output the correct vector of residue. > > Are the orders of function calls correct? > > Thank you. > > Xiangdong > > > > > > > > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > >> >> On May 1, 2014, at 10:32 AM, Xiangdong wrote: >> >> > Under what condition, SNESGetFunctionNorm() will output different >> results from SENEGetFunction + VecNorm (with NORM_2)? >> > >> > For most of my test cases, it is the same. However, when I have some >> special (trivial) initial guess to the SNES problem, I see different norms. >> >> Please send more details on your ?trivial? case where the values are >> different. It could be that we are not setting the function norm properly >> on early exit from the solvers. >> > >> > Another phenomenon I noticed with this is that KSP in SNES squeeze my >> matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When >> I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, >> and the rhs and solution is with length 25. Do you have any clue on what >> triggered this? To my surprise, when I output the Jacobian inside the >> FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct >> numerical entries. Why does the operator obtained from KSP is different and >> got rows eliminated? These rows got eliminated have only one entries per >> row, but the rhs in that row is not zero. Eliminating these rows would give >> wrong solutions. >> >> Hmm, we never squeeze out rows/columns from the Jacobian. The size of >> the Jacobian set with SNESSetJacobian() should always match that obtained >> with KSPGetOperators() on the linear system. Please send more details on >> how you get this. Are you calling the KSPGetOperators() inside a >> preconditioner where the the preconditioner has chopped up the operator? >> >> Barry >> >> > >> > Thank you. >> > >> > Xiangdong >> > >> > >> > >> > >> > >> > >> > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley >> wrote: >> > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: >> > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo >> *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize >> the array f. Zero the array f solved the problem and gave consistent result. >> > >> > Just curious, why does not petsc initialize the array f to zero by >> default inside petsc when passing the f array to FormFunctionLocal? >> > >> > If you directly set entires, you might not want us to spend the time >> writing those zeros. >> > >> > I have another quick question about the array x passed to >> FormFunctionLocal. If I want to know the which x is evaluated, how can I >> output x in a vector format? Currently, I created a global vector vecx and >> a local vector vecx_local, get the array of vecx_local_array, copy the x to >> vecx_local_array, scatter to global vecx and output vecx. Is there a quick >> way to restore the array x to a vector and output? >> > >> > I cannot think of a better way than that. >> > >> > Matt >> > >> > Thank you. >> > >> > Best, >> > Xiangdong >> > >> > >> > >> > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith >> wrote: >> > >> > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: >> > >> > > Hello everyone, >> > > >> > > When I run snes program, >> > >> > ^^^^ what SNES program?? >> > >> > > it outputs "SNES Function norm 1.23456789e+10". It seems that this >> norm is different from residue norm (even if solving F(x)=0) >> > >> > Please send the full output where you see this. 
>> > >> > > and also differ from norm of the Jacobian. What is the definition of >> this "SNES Function Norm?? >> > >> > The SNES Function Norm as printed by PETSc is suppose to the 2-norm >> of F(x) - b (where b is usually zero) and this is also the same thing as >> the ?residue norm? >> > >> > Barry >> > >> > > >> > > Thank you. >> > > >> > > Best, >> > > Xiangdong >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: exdemo.c Type: text/x-csrc Size: 4875 bytes Desc: not available URL: From bsmith at mcs.anl.gov Thu May 1 21:21:45 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 1 May 2014 21:21:45 -0500 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> Message-ID: <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> On May 1, 2014, at 9:12 PM, Xiangdong wrote: > I came up with a simple example to demonstrate this "eliminating row" behavior. It happens when the solution x to the linearized equation Ax=b is out of the bound set by SNESVISetVariableBounds(); > > In the attached example, I use snes to solve a simple function x-b=0. When you run it, it outputs the matrix as 25 rows, while the real Jacobian should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be -inf, it will output 50 rows for the Jacobian. In the first case, the norm given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. > > In solving the nonlinear equations, it is likely that the solution of the linearized equation is out of bound, but then we can reset the out-of-bound solution to be lower or upper bound instead of eliminating the variables (the rows). Any suggestions on doing this in petsc? This is what PETSc is doing. It is using the "active set method". Variables that are at their bounds are ?frozen? and then a smaller system is solved (involving just the variables not a that bounds) to get the next search direction. Based on the next search direction some of the variables on the bounds may be unfrozen and other variables may be frozen. There is a huge literature on this topic. See for example our buddies ? Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, New York: Springer-Verlag. ISBN 978-0-387-30303-1.. The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return different values with the SNES VI solver. If you care about the function value just use SNESGetFunction() and compute the norm that way. We are eliminating SNESGetFunctionNorm() from PETSc because it is problematic. If you think the SNES VI solver is actually not solving the problem, or giving the wrong answer than please send us the entire simple code and we?ll see if we have introduced any bugs into our solver. But note that the linear system being of different sizes is completely normal for the solver. Barry > > Thank you. > > Best, > Xiangdong > > P.S. If we change the lower bound of field u (line 124) to be zero, then the Jacobian matrix is set to be NULL by petsc. 
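To make the bound-setting step in this exchange concrete, here is a minimal, hedged C sketch of attaching constant bounds to a DMDA-based SNES before SNESSolve(). The helper name and the bound values (zero below, effectively unbounded above) are placeholders, not the values used in the attached exdemo.c.

#include <petscsnes.h>

/* Placeholder sketch: switch to a VI-capable SNES type and attach
   constant variable bounds built on the same DMDA as the solution. */
PetscErrorCode SetConstantBounds(SNES snes, DM da)
{
  Vec            xl, xu;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESSetType(snes, SNESVINEWTONRSLS);CHKERRQ(ierr);  /* reduced-space active-set VI solver */
  ierr = DMCreateGlobalVector(da, &xl);CHKERRQ(ierr);
  ierr = VecDuplicate(xl, &xu);CHKERRQ(ierr);
  ierr = VecSet(xl, 0.0);CHKERRQ(ierr);             /* example lower bound on every dof */
  ierr = VecSet(xu, PETSC_MAX_REAL);CHKERRQ(ierr);  /* effectively no upper bound */
  ierr = SNESVISetVariableBounds(snes, xl, xu);CHKERRQ(ierr);
  ierr = VecDestroy(&xl);CHKERRQ(ierr);             /* the SNES keeps its own reference */
  ierr = VecDestroy(&xu);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}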
> > > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: > Here is the order of functions I called: > > DMDACreate3d(); > > SNESCreate(); > > SNESSetDM(); (DM with dof=2); > > DMSetApplicationContext(); > > DMDASNESSetFunctionLocal(); > > SNESVISetVariableBounds(); > > DMDASNESetJacobianLocal(); > > SNESSetFromOptions(); > > SNESSolve(); > > SNESGetKSP(); > KSPGetSolution(); > KSPGetRhs(); > KSPGetOperators(); //get operator kspA, kspx, kspb; > > SNESGetFunctionNorm(); ==> get norm fnorma; > SNESGetFunction(); VecNorm(); ==> get norm fnormb; > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the solution x and get norm fnormc; > > Inside the FormJacobianLocal(), I output the matrix jac and preB; > > I found that fnorma matches the default SNES monitor output "SNES Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx obtained by snescomputefunction, mat jac and preB are length 50 or 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. > > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; x(2:2:end)=0; It seems that it completely ignores the second degree of freedom (setting it to zero). I saw this for (close to) constant initial guess, while for heterogeneous initial guess, it works fine and the matrix and vector size are correct, and the solution is correct. So this eliminating row behavior seems to be initial guess dependent. > > I saw this even if I use snes_fd, so we can rule out the possibility of wrong Jacobian. For the FormFunctionLocal(), I checked via SNESComputeFunction and it output the correct vector of residue. > > Are the orders of function calls correct? > > Thank you. > > Xiangdong > > > > > > > > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > > On May 1, 2014, at 10:32 AM, Xiangdong wrote: > > > Under what condition, SNESGetFunctionNorm() will output different results from SENEGetFunction + VecNorm (with NORM_2)? > > > > For most of my test cases, it is the same. However, when I have some special (trivial) initial guess to the SNES problem, I see different norms. > > Please send more details on your ?trivial? case where the values are different. It could be that we are not setting the function norm properly on early exit from the solvers. > > > > Another phenomenon I noticed with this is that KSP in SNES squeeze my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, and the rhs and solution is with length 25. Do you have any clue on what triggered this? To my surprise, when I output the Jacobian inside the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct numerical entries. Why does the operator obtained from KSP is different and got rows eliminated? These rows got eliminated have only one entries per row, but the rhs in that row is not zero. Eliminating these rows would give wrong solutions. > > Hmm, we never squeeze out rows/columns from the Jacobian. The size of the Jacobian set with SNESSetJacobian() should always match that obtained with KSPGetOperators() on the linear system. Please send more details on how you get this. Are you calling the KSPGetOperators() inside a preconditioner where the the preconditioner has chopped up the operator? > > Barry > > > > > Thank you. 
> > > > Xiangdong > > > > > > > > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley wrote: > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize the array f. Zero the array f solved the problem and gave consistent result. > > > > Just curious, why does not petsc initialize the array f to zero by default inside petsc when passing the f array to FormFunctionLocal? > > > > If you directly set entires, you might not want us to spend the time writing those zeros. > > > > I have another quick question about the array x passed to FormFunctionLocal. If I want to know the which x is evaluated, how can I output x in a vector format? Currently, I created a global vector vecx and a local vector vecx_local, get the array of vecx_local_array, copy the x to vecx_local_array, scatter to global vecx and output vecx. Is there a quick way to restore the array x to a vector and output? > > > > I cannot think of a better way than that. > > > > Matt > > > > Thank you. > > > > Best, > > Xiangdong > > > > > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith wrote: > > > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > > > Hello everyone, > > > > > > When I run snes program, > > > > ^^^^ what SNES program?? > > > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this norm is different from residue norm (even if solving F(x)=0) > > > > Please send the full output where you see this. > > > > > and also differ from norm of the Jacobian. What is the definition of this "SNES Function Norm?? > > > > The SNES Function Norm as printed by PETSc is suppose to the 2-norm of F(x) - b (where b is usually zero) and this is also the same thing as the ?residue norm? > > > > Barry > > > > > > > > Thank you. > > > > > > Best, > > > Xiangdong > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > From jingyue at gmail.com Thu May 1 23:44:24 2014 From: jingyue at gmail.com (Jingyue Wang) Date: Thu, 01 May 2014 23:44:24 -0500 Subject: [petsc-users] SLEPc configuration problem In-Reply-To: <126D95D7-DDFF-4ED4-995C-D20D07DB5E4F@dsic.upv.es> References: <5362AF70.9050909@gmail.com> <126D95D7-DDFF-4ED4-995C-D20D07DB5E4F@dsic.upv.es> Message-ID: <536322A8.4050504@gmail.com> Dear Jose, Thank you for the suggestion and it works after I removed a few external packages. Now the compilation is successful. Best regards, Jingyue On 05/01/2014 03:48 PM, Jose E. Roman wrote: > El 01/05/2014, a las 22:32, Jingyue Wang escribi?: > >> Hi, >> >> Can anyone please help me on how to configure SLEPc? I have installed PETSc 3.4.4 (compiled with MKL) and downloaded and extracted the source of SLEPc 3.4.4. I set up >> >> export SLEPC_DIR="/home/jwang/opt/slepc-3.4.4" >> export PETSC_DIR="/home/jwang/opt/petsc-3.4.4" >> export PETSC_ARCH=linux-amd64-opt >> >> However, after I enter the source directory of SLEPc and type ./configure, I got the error messages that I append at the end of the email. I tried to read the python configuration code and it seems that the reason is self.framework is None and the reason for self.framework is None is in the script.py in my petsc-3.4.4/config/BuildSystem directory, the following code in function loadConfigure(self, argDB = None): >> ..... 
>> if not 'configureCache' in argDB: >> self.logPrint('No cached configure in RDict at '+str(argDB.saveFilename)) >> return None >> ..... >> returns a None value. >> >> It seems the reason is SLEPc can not find cached configuration in PETSc, but I don't know how to enable such cached configuration in PETSc... >> >> >> ***********************Error messages***************************************** >> >> Checking environment... >> Checking PETSc installation... >> Checking LAPACK library... >> >> Traceback (most recent call last): >> File "./configure", line 10, in >> execfile(os.path.join(os.path.dirname(__file__), 'config', 'configure.py')) >> File "./config/configure.py", line 401, in >> cmakeok = cmakeboot.main(slepcdir,petscdir,petscarch=petscconf.ARCH,log=log) >> File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 172, in main >> return PETScMaker(slepcdir,petscdir,petscarch,argDB,framework).cmakeboot(args,log) >> File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 87, in cmakeboot >> self.setup() >> File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 83, in setup >> self.setupModules() >> File "/home/jwang/opt/slepc-3.4.4/config/cmakeboot.py", line 51, in setupModules >> self.mpi = self.framework.require('config.packages.MPI', None) >> AttributeError: 'NoneType' object has no attribute 'require' >> > This problem has been reported before and it may happen occasionally. > Check the file $PETSC_DIR/$PETSC_ARCH/conf/RDict.db - see if it has a smaller size than usual. If this is the case, then the problem is that PETSc's configuration did not write this file completely, I don't know the reason. > > Suggest to reconfigure PETSc. > > Jose > From lfreret at arrow.utias.utoronto.ca Fri May 2 09:38:42 2014 From: lfreret at arrow.utias.utoronto.ca (Lucie Freret) Date: Fri, 02 May 2014 10:38:42 -0400 Subject: [petsc-users] Petsc with ML and ILU Message-ID: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> Hello, I would like to solve linear systems using Gmres preconditioned by ML and use ILU(0) on all levels (mg_coarse and mg_levels_x). As I have MATMPIAIJ matrix, I'm using -ksp_type gmres -pc_type ml -mg_levels_ksp_type preonly (-mg_coarse_ksp_type preonly) -mg_levels_pc_type asm (-mg_coarse_pc_type asm) but I get: "Running KSP of preonly doesn't make sense with nonzero initial guess" I tried different keyword to have a zero initial guess but unfortunately, I can't solve this problem. Should I use an other mg_levels_ksp solver of each level or is there a way to initialize guess on each level? Thanks, Lucie ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. From knepley at gmail.com Fri May 2 09:43:34 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 May 2014 09:43:34 -0500 Subject: [petsc-users] Petsc with ML and ILU In-Reply-To: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> References: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> Message-ID: On Fri, May 2, 2014 at 9:38 AM, Lucie Freret < lfreret at arrow.utias.utoronto.ca> wrote: > Hello, > > I would like to solve linear systems using Gmres preconditioned by ML and > use ILU(0) on all levels (mg_coarse and mg_levels_x). 
> As I have MATMPIAIJ matrix, I'm using > -ksp_type gmres > -pc_type ml > -mg_levels_ksp_type preonly (-mg_coarse_ksp_type preonly) > -mg_levels_pc_type asm (-mg_coarse_pc_type asm) > but I get: > "Running KSP of preonly doesn't make sense with nonzero initial guess" > I tried different keyword to have a zero initial guess but unfortunately, > I can't solve this problem. > Should I use an other mg_levels_ksp solver of each level or is there a way > to initialize guess on each level? > You want "richardson" instead of preonly, since you are doing defect correction in MG. Matt > Thanks, > Lucie > > ---------------------------------------------------------------- > This message was sent using IMP, the Internet Messaging Program. > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri May 2 09:46:35 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 02 May 2014 08:46:35 -0600 Subject: [petsc-users] Petsc with ML and ILU In-Reply-To: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> References: <20140502103842.17142yiys75z69rm@arrow.utias.utoronto.ca> Message-ID: <8761low850.fsf@jedbrown.org> Lucie Freret writes: > Hello, > > I would like to solve linear systems using Gmres preconditioned by ML > and use ILU(0) on all levels (mg_coarse and mg_levels_x). > As I have MATMPIAIJ matrix, I'm using > -ksp_type gmres > -pc_type ml > -mg_levels_ksp_type preonly (-mg_coarse_ksp_type preonly) This should be -mg_levels_ksp_type richardson (the default when using ML), which will compute a residual as necessary before applying the preconditioner. Note that this may need damping, or you could use -mg_levels_ksp_type chebyshev to compute a spectral estimate to combine damping and targeting a range of the spectrum. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From salazardetroya at gmail.com Fri May 2 10:03:35 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Fri, 2 May 2014 10:03:35 -0500 Subject: [petsc-users] Adaptive mesh refinement in Petsc In-Reply-To: References: <48DFB171-4663-4E30-96A6-80103400ACD2@mcs.anl.gov> Message-ID: Thanks a lot for your responses. I will get started with MOOSE. On Thu, May 1, 2014 at 8:04 PM, Derek Gaston wrote: > Miguel, > > I'm the lead for the MOOSE Framework project Barry spoke of... we would > love to help you get up and running with adaptive finite elements for solid > mechanics with MOOSE. If you are doing fairly normal solid mechanics using > small or large strain formulations with some plasticity... most of what you > need is already there. You may need to plug in your particular material > model but that's about it. Mesh adaptivity is built-in and should work out > of the box. The major benefit of using MOOSE is that you can easily couple > in other physics (like heat conduction, chemistry and more) and of course > you have full access to all the power of PETSc. > > I recommend going through the Getting Started material on > http://www.mooseframework.org to get set up... and go ahead and create > yourself a new Application using these instructions: > http://mooseframework.org/create-an-app/ . 
That Application will > already have full access to our solid mechanics capabilities (as well as > tons of other stuff like heat conduction, chemistry, etc.). > > After that - join up on the moose-users mailing list and you can get in > touch with everyone else doing solid mechanics with MOOSE who can point you > in the right direction depending on your particular application. > > Let me know if you have any questions... > > Derek > > > > > On Thu, May 1, 2014 at 6:31 PM, Barry Smith wrote: > >> >> You also could likely benefit from Moose http://www.mooseframework.orgit sits on top of libMesh which sits on top of PETSc and manages almost all >> of what you need for finite element analysis. >> >> Barry >> >> On May 1, 2014, at 7:19 PM, Matthew Knepley wrote: >> >> > On Thu, May 1, 2014 at 6:14 PM, Miguel Angel Salazar de Troya < >> salazardetroya at gmail.com> wrote: >> > Hello everybody >> > >> > I want to implement an adaptive mesh refinement library in a code >> written in petsc. I have checked out some of the available libraries, but I >> want to work with the latest petsc-dev version and I am sure there will be >> many incompatibilities. So far I think I'll end up working with one of >> these libraries: SAMRAI, Chombo, libMesh and deal II. Before I start >> checking out each of them and learn how to use them I though I would ask >> you guys which one you would recommend. My code would be a finite element >> analysis in solid mechanics. I would like to take full advantage of petsc >> capabilities, but I would not mind start with some restrictions. I hope my >> question is not too broad. >> > >> > SAMRAI, Chombo, and Deal II are all structured adaptive refinement >> codes, whereas LibMesh is unstructured. If you want unstructured, there is >> > really no other game in town. If you use deal II, I would suggest >> trying out p4est underneath which gives great scalability. My understanding >> > is that Chombo is mostly used for finite volume and SAMRAI and deal II >> for finite element, but this could be out of date. >> > >> > Matt >> > >> > Take care >> > Miguel >> > >> > -- >> > Miguel Angel Salazar de Troya >> > Graduate Research Assistant >> > Department of Mechanical Science and Engineering >> > University of Illinois at Urbana-Champaign >> > (217) 550-2360 >> > salaza11 at illinois.edu >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> >> > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From lu_qin_2000 at yahoo.com Fri May 2 10:27:13 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Fri, 2 May 2014 08:27:13 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc Message-ID: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> Hello, I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf?that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? Many thanks, Qin?? 
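For what it is worth, the replies to this ILUTP question further down in the digest point at two routes: hypre (through PCHYPRE) and the ILUTP in sequential SuperLU. A rough sketch of the runtime options, with ./yourprogram as a placeholder and assuming PETSc was configured with --download-hypre and/or --download-superlu (the exact option names should be confirmed with -help against the installed versions):

  mpiexec -n 4 ./yourprogram -ksp_type gmres -pc_type hypre -pc_hypre_type pilut
  mpiexec -n 4 ./yourprogram -ksp_type gmres -pc_type hypre -pc_hypre_type euclid
  ./yourprogram -ksp_type gmres -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-4

Adding -help to any of these prints the hypre- or SuperLU-specific options that are actually available in the build.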
From k.anush at gmail.com Fri May 2 11:34:33 2014 From: k.anush at gmail.com (Anush Krishnan) Date: Fri, 2 May 2014 12:34:33 -0400 Subject: [petsc-users] Assembling a matrix for a DMComposite vector In-Reply-To: <5362C9C2.40602@uni-mainz.de> References: <535F6D64.5040700@uni-mainz.de> <5362C9C2.40602@uni-mainz.de> Message-ID: On 1 May 2014 18:25, Anton Popov wrote: > On 5/1/14 10:39 PM, Anush Krishnan wrote: > > Hi Anton, > > On 29 April 2014 05:14, Anton Popov wrote: > >> >> You can do the whole thing much easier (to my opinion). >> Since you created two DMDA anyway, just do: >> >> - find first index on every processor using MPI_Scan >> - create two global vectors (no ghosts) >> - put proper global indicies to global vectors >> - create two local vectors (with ghosts) and set ALL entries to -1 (to >> have what you need in boundary ghosts) >> - call global-to-local scatter >> >> Done! >> > > Won't the vectors contain floating point values? Are you storing your > indices as real numbers? > > YES, exactly. And then I cast them to PetscInt when I compose stencils. > > Something like this: > idx[0] = (PetscInt) ivx[k][j][i]; > idx[1] = (PetscInt) ivx[k][j][i+1]; > idx[2] = (PetscInt) ivy[k][j][i]; > ... and so on, where ivx, ivy, ... are the index arrays in x, y .. > directions > > Then I insert (actually add) stencils using MatSetValues. > > By the way, you can ideally preallocate in parallel with > MatMPIAIJSetPreallocation. To count precisely number entries in the > diagonal & off-diagonal blocks use the same mechanism to easily access > global indices, and then compare them with the local row range, which is > also known: > - within the range -> d_nnz[i]++; > - outside the range -> o_nnz[i]++; > Thanks a lot for the help. I did exactly that and it worked perfectly. But just to clarify: if I was using 32-bit floats, would I start having trouble when my matrix size reaches ~10 million due to the floating point precision? > > Anton > > > >> >> The advantage is that you can access global indices (including ghosts) in >> every block using i-j-k indexing scheme. >> I personally find this way quite easy to implement with PETSc >> >> Anton >> > > Thank you, > Anush > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anush at bu.edu Fri May 2 11:36:25 2014 From: anush at bu.edu (Anush Krishnan) Date: Fri, 2 May 2014 12:36:25 -0400 Subject: [petsc-users] Assembling a matrix for a DMComposite vector In-Reply-To: <5362C9C2.40602@uni-mainz.de> References: <535F6D64.5040700@uni-mainz.de> <5362C9C2.40602@uni-mainz.de> Message-ID: On 1 May 2014 18:25, Anton Popov wrote: > On 5/1/14 10:39 PM, Anush Krishnan wrote: > > Hi Anton, > > On 29 April 2014 05:14, Anton Popov wrote: > >> >> You can do the whole thing much easier (to my opinion). >> Since you created two DMDA anyway, just do: >> >> - find first index on every processor using MPI_Scan >> - create two global vectors (no ghosts) >> - put proper global indicies to global vectors >> - create two local vectors (with ghosts) and set ALL entries to -1 (to >> have what you need in boundary ghosts) >> - call global-to-local scatter >> >> Done! >> > > Won't the vectors contain floating point values? Are you storing your > indices as real numbers? > > YES, exactly. And then I cast them to PetscInt when I compose stencils. > > Something like this: > idx[0] = (PetscInt) ivx[k][j][i]; > idx[1] = (PetscInt) ivx[k][j][i+1]; > idx[2] = (PetscInt) ivy[k][j][i]; > ... and so on, where ivx, ivy, ... are the index arrays in x, y .. 
> directions > > Then I insert (actually add) stencils using MatSetValues. > > By the way, you can ideally preallocate in parallel with > MatMPIAIJSetPreallocation. To count precisely number entries in the > diagonal & off-diagonal blocks use the same mechanism to easily access > global indices, and then compare them with the local row range, which is > also known: > - within the range -> d_nnz[i]++; > - outside the range -> o_nnz[i]++; > Thanks a lot for the help. I did exactly that and it worked perfectly. But just to clarify: if I was using 32-bit floats, would I start having trouble when my matrix size reaches ~10 million due to the floating point precision? > > Anton > > > >> >> The advantage is that you can access global indices (including ghosts) in >> every block using i-j-k indexing scheme. >> I personally find this way quite easy to implement with PETSc >> >> Anton >> > > Thank you, > Anush > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Fri May 2 12:53:52 2014 From: epscodes at gmail.com (Xiangdong) Date: Fri, 2 May 2014 13:53:52 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> Message-ID: On Thu, May 1, 2014 at 10:21 PM, Barry Smith wrote: > > On May 1, 2014, at 9:12 PM, Xiangdong wrote: > > > I came up with a simple example to demonstrate this "eliminating row" > behavior. It happens when the solution x to the linearized equation Ax=b is > out of the bound set by SNESVISetVariableBounds(); > > > > In the attached example, I use snes to solve a simple function x-b=0. > When you run it, it outputs the matrix as 25 rows, while the real Jacobian > should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be > -inf, it will output 50 rows for the Jacobian. In the first case, the norm > given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. > > > > In solving the nonlinear equations, it is likely that the solution of > the linearized equation is out of bound, but then we can reset the > out-of-bound solution to be lower or upper bound instead of eliminating the > variables (the rows). Any suggestions on doing this in petsc? > > This is what PETSc is doing. It is using the "active set method". > Variables that are at their bounds are ?frozen? and then a smaller system > is solved (involving just the variables not a that bounds) to get the next > search direction. Based on the next search direction some of the variables > on the bounds may be unfrozen and other variables may be frozen. There is a > huge literature on this topic. See for example our buddies ? Nocedal, > Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, > New York: Springer-Verlag. ISBN 978-0-387-30303-1.. > > The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return > different values with the SNES VI solver. If you care about the function > value just use SNESGetFunction() and compute the norm that way. We are > eliminating SNESGetFunctionNorm() from PETSc because it is problematic. > > If you think the SNES VI solver is actually not solving the problem, or > giving the wrong answer than please send us the entire simple code and > we?ll see if we have introduced any bugs into our solver. But note that the > linear system being of different sizes is completely normal for the solver. 
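For readers following the thread, here is a minimal C sketch of the two suggestions quoted above: leave unconstrained degrees of freedom at infinite bounds when calling SNESVISetVariableBounds(), and compute the residual norm through SNESGetFunction() plus VecNorm() rather than SNESGetFunctionNorm(). It is only a fragment; snes and the solution vector x are assumed to exist and be fully set up already.

  #include <petscsnes.h>

  Vec            xl, xu, r;
  PetscReal      fnorm;
  PetscErrorCode ierr;

  /* bounds: start fully unconstrained, then tighten only the entries that need it */
  ierr = VecDuplicate(x, &xl);CHKERRQ(ierr);
  ierr = VecDuplicate(x, &xu);CHKERRQ(ierr);
  ierr = VecSet(xl, PETSC_NINFINITY);CHKERRQ(ierr);
  ierr = VecSet(xu, PETSC_INFINITY);CHKERRQ(ierr);
  /* ... overwrite the entries of xl (and/or xu) that really are constrained ... */
  ierr = SNESVISetVariableBounds(snes, xl, xu);CHKERRQ(ierr);

  ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);

  /* 2-norm of F(x) at the final iterate, as recommended above */
  ierr = SNESGetFunction(snes, &r, NULL, NULL);CHKERRQ(ierr);
  ierr = VecNorm(r, NORM_2, &fnorm);CHKERRQ(ierr);

With bounds present, the linear systems seen through KSPGetOperators() are the reduced active-set systems, so they can legitimately be smaller than the full Jacobian, exactly as described above.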
> Here is an example I do not quite understand. I have a simple function F(X) = [x1+x2+100 ; x1-x2; x3+x4; x3-x4+50]. If I solve this F(X)=0 with no constraint, the exact solution is [x1=-50; x2=-50; x3=-25; x4=25]. If I specify the constraint as x2>=0 and x4>=0, I expect the solution from one iteration of SNES is [-50, 0,-25,25], since the constraint on x2 should be active now. However, the petsc outputs the solution [-50, 0, 0, 0]. Since x3 and x4 does not violate the constraint, why does the solution of x3 and x4 change (note that x3 and x3 are decoupled from x1 and x2)? In this case, the matrix obtained from KSPGetOperators is only 2-by-2, so two variables or constraints are eliminated. Another thing I noticed is that constraints x2>-1e-7 and x4>-1e-7 gives solution [-50,-1e-7,-25,25]; however, constraints x2>-1e-8 and x4>-1e-8 gives the solution [-50,0,0,0]. Attached please find the simple 130-line code showing this behavior. Simply commenting the line 37 to remove the constraints and modifying line 92 to change the lower bounds of x2 and x4. Thanks a lot for your time and help. Best, Xiangdong > > > Barry > > > > > > Thank you. > > > > Best, > > Xiangdong > > > > P.S. If we change the lower bound of field u (line 124) to be zero, then > the Jacobian matrix is set to be NULL by petsc. > > > > > > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: > > Here is the order of functions I called: > > > > DMDACreate3d(); > > > > SNESCreate(); > > > > SNESSetDM(); (DM with dof=2); > > > > DMSetApplicationContext(); > > > > DMDASNESSetFunctionLocal(); > > > > SNESVISetVariableBounds(); > > > > DMDASNESetJacobianLocal(); > > > > SNESSetFromOptions(); > > > > SNESSolve(); > > > > SNESGetKSP(); > > KSPGetSolution(); > > KSPGetRhs(); > > KSPGetOperators(); //get operator kspA, kspx, kspb; > > > > SNESGetFunctionNorm(); ==> get norm fnorma; > > SNESGetFunction(); VecNorm(); ==> get norm fnormb; > > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the > solution x and get norm fnormc; > > > > Inside the FormJacobianLocal(), I output the matrix jac and preB; > > > > I found that fnorma matches the default SNES monitor output "SNES > Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx > obtained by snescomputefunction, mat jac and preB are length 50 or > 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. > > > > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; > x(2:2:end)=0; It seems that it completely ignores the second degree of > freedom (setting it to zero). I saw this for (close to) constant initial > guess, while for heterogeneous initial guess, it works fine and the matrix > and vector size are correct, and the solution is correct. So this > eliminating row behavior seems to be initial guess dependent. > > > > I saw this even if I use snes_fd, so we can rule out the possibility of > wrong Jacobian. For the FormFunctionLocal(), I checked via > SNESComputeFunction and it output the correct vector of residue. > > > > Are the orders of function calls correct? > > > > Thank you. > > > > Xiangdong > > > > > > > > > > > > > > > > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > > > > On May 1, 2014, at 10:32 AM, Xiangdong wrote: > > > > > Under what condition, SNESGetFunctionNorm() will output different > results from SENEGetFunction + VecNorm (with NORM_2)? > > > > > > For most of my test cases, it is the same. However, when I have some > special (trivial) initial guess to the SNES problem, I see different norms. 
> > > > Please send more details on your ?trivial? case where the values are > different. It could be that we are not setting the function norm properly > on early exit from the solvers. > > > > > > Another phenomenon I noticed with this is that KSP in SNES squeeze my > matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When > I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, > and the rhs and solution is with length 25. Do you have any clue on what > triggered this? To my surprise, when I output the Jacobian inside the > FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct > numerical entries. Why does the operator obtained from KSP is different and > got rows eliminated? These rows got eliminated have only one entries per > row, but the rhs in that row is not zero. Eliminating these rows would give > wrong solutions. > > > > Hmm, we never squeeze out rows/columns from the Jacobian. The size of > the Jacobian set with SNESSetJacobian() should always match that obtained > with KSPGetOperators() on the linear system. Please send more details on > how you get this. Are you calling the KSPGetOperators() inside a > preconditioner where the the preconditioner has chopped up the operator? > > > > Barry > > > > > > > > Thank you. > > > > > > Xiangdong > > > > > > > > > > > > > > > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley > wrote: > > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo > *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize > the array f. Zero the array f solved the problem and gave consistent result. > > > > > > Just curious, why does not petsc initialize the array f to zero by > default inside petsc when passing the f array to FormFunctionLocal? > > > > > > If you directly set entires, you might not want us to spend the time > writing those zeros. > > > > > > I have another quick question about the array x passed to > FormFunctionLocal. If I want to know the which x is evaluated, how can I > output x in a vector format? Currently, I created a global vector vecx and > a local vector vecx_local, get the array of vecx_local_array, copy the x to > vecx_local_array, scatter to global vecx and output vecx. Is there a quick > way to restore the array x to a vector and output? > > > > > > I cannot think of a better way than that. > > > > > > Matt > > > > > > Thank you. > > > > > > Best, > > > Xiangdong > > > > > > > > > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith > wrote: > > > > > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > > > > > Hello everyone, > > > > > > > > When I run snes program, > > > > > > ^^^^ what SNES program?? > > > > > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this > norm is different from residue norm (even if solving F(x)=0) > > > > > > Please send the full output where you see this. > > > > > > > and also differ from norm of the Jacobian. What is the definition of > this "SNES Function Norm?? > > > > > > The SNES Function Norm as printed by PETSc is suppose to the 2-norm > of F(x) - b (where b is usually zero) and this is also the same thing as > the ?residue norm? > > > > > > Barry > > > > > > > > > > > Thank you. 
> > > > > > > > Best, > > > > Xiangdong > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: exdemosingle.c Type: text/x-csrc Size: 3398 bytes Desc: not available URL: From yuqing.xia at colorado.edu Fri May 2 14:02:07 2014 From: yuqing.xia at colorado.edu (yuqing xia) Date: Fri, 2 May 2014 13:02:07 -0600 Subject: [petsc-users] About left eigenvector for general eigenvalue problem Message-ID: Hello everyone I am trying to solve a general eigenvalue problem. A x=\lambda B x I also need to get the left eigenvector y A=\lambda y B I tested the result for a special case where A and B are real and symmetric. The left and right eigenvector should be the same. However, there are not. Then I tried to solve the problem A x =\lambda x The left and right eigenvectors are the same in such case. So I am wondering what is the reason. Thanks. Best Yuqing Xia -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 2 14:25:53 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 14:25:53 -0500 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> Message-ID: <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid you can also add -help to see what options are available. Both pretty much suck and I can?t image much reason for using them. Barry On May 2, 2014, at 10:27 AM, Qin Lu wrote: > Hello, > > I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? > > Many thanks, > Qin From bsmith at mcs.anl.gov Fri May 2 14:36:09 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 14:36:09 -0500 Subject: [petsc-users] About left eigenvector for general eigenvalue problem In-Reply-To: References: Message-ID: <8F3F5B5B-2565-4409-B3C7-0DD216FC9925@mcs.anl.gov> Please send more information about how you tried to compute this and the matrix you used (if small just send the binary matrix or code that generates it). Vague questions like ?why doesn?t it work as I expect?? are really hard to answer. With specifics about what was done and how the answer was different make the question easier and easier to answer. Barry On May 2, 2014, at 2:02 PM, yuqing xia wrote: > Hello everyone > > I am trying to solve a general eigenvalue problem. > A x=\lambda B x > I also need to get the left eigenvector > y A=\lambda y B > > I tested the result for a special case where A and B are real and symmetric. The left and right eigenvector should be the same. However, there are not. > Then I tried to solve the problem > A x =\lambda x > The left and right eigenvectors are the same in such case. So I am wondering what is the reason. Thanks. 
> > > Best > Yuqing Xia > From bsmith at mcs.anl.gov Fri May 2 14:49:56 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 14:49:56 -0500 Subject: [petsc-users] Assembling a matrix for a DMComposite vector In-Reply-To: References: <535F6D64.5040700@uni-mainz.de> <5362C9C2.40602@uni-mainz.de> Message-ID: On May 2, 2014, at 11:36 AM, Anush Krishnan wrote: > > > > On 1 May 2014 18:25, Anton Popov wrote: > On 5/1/14 10:39 PM, Anush Krishnan wrote: >> Hi Anton, >> >> On 29 April 2014 05:14, Anton Popov wrote: >> >> You can do the whole thing much easier (to my opinion). >> Since you created two DMDA anyway, just do: >> >> - find first index on every processor using MPI_Scan >> - create two global vectors (no ghosts) >> - put proper global indicies to global vectors >> - create two local vectors (with ghosts) and set ALL entries to -1 (to have what you need in boundary ghosts) >> - call global-to-local scatter >> >> Done! >> >> Won't the vectors contain floating point values? Are you storing your indices as real numbers? > YES, exactly. And then I cast them to PetscInt when I compose stencils. > > Something like this: > idx[0] = (PetscInt) ivx[k][j][i]; > idx[1] = (PetscInt) ivx[k][j][i+1]; > idx[2] = (PetscInt) ivy[k][j][i]; > ... and so on, where ivx, ivy, ... are the index arrays in x, y .. directions > > Then I insert (actually add) stencils using MatSetValues. > > By the way, you can ideally preallocate in parallel with MatMPIAIJSetPreallocation. To count precisely number entries in the diagonal & off-diagonal blocks use the same mechanism to easily access global indices, and then compare them with the local row range, which is also known: > - within the range -> d_nnz[i]++; > - outside the range -> o_nnz[i]++; > > Thanks a lot for the help. I did exactly that and it worked perfectly. > > But just to clarify: if I was using 32-bit floats, would I start having trouble when my matrix size reaches ~10 million due to the floating point precision? If you ./configure PETSc with ?with-64-bit-indices=1 then PetscInt will not fit in a float and the code will not work. As soon as you switch you 64 bit indices you would need to use doubles if you hope to store PetscInt in them. Barry > > > Anton > >> >> >> The advantage is that you can access global indices (including ghosts) in every block using i-j-k indexing scheme. >> I personally find this way quite easy to implement with PETSc >> >> Anton >> >> Thank you, >> Anush > > From knepley at gmail.com Fri May 2 15:10:03 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 May 2014 15:10:03 -0500 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> Message-ID: On Fri, May 2, 2014 at 12:53 PM, Xiangdong wrote: > > On Thu, May 1, 2014 at 10:21 PM, Barry Smith wrote: > >> >> On May 1, 2014, at 9:12 PM, Xiangdong wrote: >> >> > I came up with a simple example to demonstrate this "eliminating row" >> behavior. It happens when the solution x to the linearized equation Ax=b is >> out of the bound set by SNESVISetVariableBounds(); >> > >> > In the attached example, I use snes to solve a simple function x-b=0. >> When you run it, it outputs the matrix as 25 rows, while the real Jacobian >> should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be >> -inf, it will output 50 rows for the Jacobian. 
In the first case, the norm >> given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. >> > >> > In solving the nonlinear equations, it is likely that the solution of >> the linearized equation is out of bound, but then we can reset the >> out-of-bound solution to be lower or upper bound instead of eliminating the >> variables (the rows). Any suggestions on doing this in petsc? >> >> This is what PETSc is doing. It is using the "active set method". >> Variables that are at their bounds are ?frozen? and then a smaller system >> is solved (involving just the variables not a that bounds) to get the next >> search direction. Based on the next search direction some of the variables >> on the bounds may be unfrozen and other variables may be frozen. There is a >> huge literature on this topic. See for example our buddies ? Nocedal, >> Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, >> New York: Springer-Verlag. ISBN 978-0-387-30303-1.. >> >> The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return >> different values with the SNES VI solver. If you care about the function >> value just use SNESGetFunction() and compute the norm that way. We are >> eliminating SNESGetFunctionNorm() from PETSc because it is problematic. >> >> If you think the SNES VI solver is actually not solving the problem, >> or giving the wrong answer than please send us the entire simple code and >> we?ll see if we have introduced any bugs into our solver. But note that the >> linear system being of different sizes is completely normal for the solver. >> > > Here is an example I do not quite understand. I have a simple function > F(X) = [x1+x2+100 ; x1-x2; x3+x4; x3-x4+50]. If I solve this F(X)=0 with no > constraint, the exact solution is [x1=-50; x2=-50; x3=-25; x4=25]. > > If I specify the constraint as x2>=0 and x4>=0, I expect the solution from > one iteration of SNES is [-50, 0,-25,25], since the constraint on x2 should > be active now. However, the petsc outputs the solution [-50, 0, 0, 0]. > Since x3 and x4 does not violate the constraint, why does the solution of > x3 and x4 change (note that x3 and x3 are decoupled from x1 and x2)? In > this case, the matrix obtained from KSPGetOperators is only 2-by-2, so two > variables or constraints are eliminated. > This just finds a local solution to the constrained problem, and these need not be unique. Matt > Another thing I noticed is that constraints x2>-1e-7 and x4>-1e-7 gives > solution [-50,-1e-7,-25,25]; however, constraints x2>-1e-8 and x4>-1e-8 > gives the solution [-50,0,0,0]. > > Attached please find the simple 130-line code showing this behavior. > Simply commenting the line 37 to remove the constraints and modifying line > 92 to change the lower bounds of x2 and x4. > > Thanks a lot for your time and help. > > Best, > Xiangdong > > > > >> >> >> Barry >> >> >> > >> > Thank you. >> > >> > Best, >> > Xiangdong >> > >> > P.S. If we change the lower bound of field u (line 124) to be zero, >> then the Jacobian matrix is set to be NULL by petsc. 
>> > >> > >> > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: >> > Here is the order of functions I called: >> > >> > DMDACreate3d(); >> > >> > SNESCreate(); >> > >> > SNESSetDM(); (DM with dof=2); >> > >> > DMSetApplicationContext(); >> > >> > DMDASNESSetFunctionLocal(); >> > >> > SNESVISetVariableBounds(); >> > >> > DMDASNESetJacobianLocal(); >> > >> > SNESSetFromOptions(); >> > >> > SNESSolve(); >> > >> > SNESGetKSP(); >> > KSPGetSolution(); >> > KSPGetRhs(); >> > KSPGetOperators(); //get operator kspA, kspx, kspb; >> > >> > SNESGetFunctionNorm(); ==> get norm fnorma; >> > SNESGetFunction(); VecNorm(); ==> get norm fnormb; >> > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the >> solution x and get norm fnormc; >> > >> > Inside the FormJacobianLocal(), I output the matrix jac and preB; >> > >> > I found that fnorma matches the default SNES monitor output "SNES >> Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx >> obtained by snescomputefunction, mat jac and preB are length 50 or >> 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. >> > >> > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; >> x(2:2:end)=0; It seems that it completely ignores the second degree of >> freedom (setting it to zero). I saw this for (close to) constant initial >> guess, while for heterogeneous initial guess, it works fine and the matrix >> and vector size are correct, and the solution is correct. So this >> eliminating row behavior seems to be initial guess dependent. >> > >> > I saw this even if I use snes_fd, so we can rule out the possibility of >> wrong Jacobian. For the FormFunctionLocal(), I checked via >> SNESComputeFunction and it output the correct vector of residue. >> > >> > Are the orders of function calls correct? >> > >> > Thank you. >> > >> > Xiangdong >> > >> > >> > >> > >> > >> > >> > >> > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: >> > >> > On May 1, 2014, at 10:32 AM, Xiangdong wrote: >> > >> > > Under what condition, SNESGetFunctionNorm() will output different >> results from SENEGetFunction + VecNorm (with NORM_2)? >> > > >> > > For most of my test cases, it is the same. However, when I have some >> special (trivial) initial guess to the SNES problem, I see different norms. >> > >> > Please send more details on your ?trivial? case where the values are >> different. It could be that we are not setting the function norm properly >> on early exit from the solvers. >> > > >> > > Another phenomenon I noticed with this is that KSP in SNES squeeze my >> matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When >> I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, >> and the rhs and solution is with length 25. Do you have any clue on what >> triggered this? To my surprise, when I output the Jacobian inside the >> FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct >> numerical entries. Why does the operator obtained from KSP is different and >> got rows eliminated? These rows got eliminated have only one entries per >> row, but the rhs in that row is not zero. Eliminating these rows would give >> wrong solutions. >> > >> > Hmm, we never squeeze out rows/columns from the Jacobian. The size >> of the Jacobian set with SNESSetJacobian() should always match that >> obtained with KSPGetOperators() on the linear system. Please send more >> details on how you get this. 
Are you calling the KSPGetOperators() inside a >> preconditioner where the the preconditioner has chopped up the operator? >> > >> > Barry >> > >> > > >> > > Thank you. >> > > >> > > Xiangdong >> > > >> > > >> > > >> > > >> > > >> > > >> > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley >> wrote: >> > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong >> wrote: >> > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo >> *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize >> the array f. Zero the array f solved the problem and gave consistent result. >> > > >> > > Just curious, why does not petsc initialize the array f to zero by >> default inside petsc when passing the f array to FormFunctionLocal? >> > > >> > > If you directly set entires, you might not want us to spend the time >> writing those zeros. >> > > >> > > I have another quick question about the array x passed to >> FormFunctionLocal. If I want to know the which x is evaluated, how can I >> output x in a vector format? Currently, I created a global vector vecx and >> a local vector vecx_local, get the array of vecx_local_array, copy the x to >> vecx_local_array, scatter to global vecx and output vecx. Is there a quick >> way to restore the array x to a vector and output? >> > > >> > > I cannot think of a better way than that. >> > > >> > > Matt >> > > >> > > Thank you. >> > > >> > > Best, >> > > Xiangdong >> > > >> > > >> > > >> > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith >> wrote: >> > > >> > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: >> > > >> > > > Hello everyone, >> > > > >> > > > When I run snes program, >> > > >> > > ^^^^ what SNES program?? >> > > >> > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this >> norm is different from residue norm (even if solving F(x)=0) >> > > >> > > Please send the full output where you see this. >> > > >> > > > and also differ from norm of the Jacobian. What is the definition >> of this "SNES Function Norm?? >> > > >> > > The SNES Function Norm as printed by PETSc is suppose to the >> 2-norm of F(x) - b (where b is usually zero) and this is also the same >> thing as the ?residue norm? >> > > >> > > Barry >> > > >> > > > >> > > > Thank you. >> > > > >> > > > Best, >> > > > Xiangdong >> > > >> > > >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > >> > >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xsli at lbl.gov Fri May 2 15:40:19 2014 From: xsli at lbl.gov (Xiaoye S. Li) Date: Fri, 2 May 2014 13:40:19 -0700 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> Message-ID: The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. 
In SuperLU distribution: EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) SRC/zgsitrf.c : the actual ILUTP factorization routine Sherry Li On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: > > At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre > > mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid > > you can also add -help to see what options are available. > > Both pretty much suck and I can?t image much reason for using them. > > Barry > > > On May 2, 2014, at 10:27 AM, Qin Lu wrote: > > > Hello, > > > > I am interested in using ILUTP preconditioner with PETSc linear solver. > There is an online doc > https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthat mentioned it is available in PETSc with other packages (page 62-63). > Is there any instructions or examples on how to use it? > > > > Many thanks, > > Qin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From song.gao2 at mail.mcgill.ca Fri May 2 16:41:26 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Fri, 2 May 2014 17:41:26 -0400 Subject: [petsc-users] Question with setting up KSP solver parameters. Message-ID: Dear PETSc users, I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. My codes looks like call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) call KSPSetType ( pet_solv, 'gmres', ierpetsc ) call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) call PCSetType ( pet_precon, 'asm', ierpetsc ) call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) call KSPSetUp ( pet_solv, ierpetsc ) call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) call KSPSetFromOptions ( pet_solv, ierpetsc ) call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) runing with 1 CPU WITHOUT the line with red color and the codes don't converge runtime options: -ksp_monitor_true_residual -ksp_view 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 ....... 
28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT [0] number of local blocks = 1 Local solve info for each block is in the following KSP and PC objects: - - - - - - - - - - - - - - - - - - [0] local block number 0, size = 22905 KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 - - - - - - - - - - - - - - - - - - linear system matrix followed by preconditioner matrix: Matrix Object: 1 MPI processes type: shell rows=22905, cols=22905 Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 WARNING: zero iteration in iterative solver runing with 1 CPU WITH the line with red color and the codes converge runtime options: -ksp_monitor_true_residual -ksp_view 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 ............ 
24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 KSP Object: 1 MPI processes type: gmres GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT [0] number of local blocks = 1 Local solve info for each block is in the following KSP and PC objects: - - - - - - - - - - - - - - - - - - [0] local block number 0, size = 22905 KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 - - - - - - - - - - - - - - - - - - linear system matrix followed by preconditioner matrix: Matrix Object: 1 MPI processes type: shell rows=22905, cols=22905 Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 WARNING: zero iteration in iterative solver What would be my error here? Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 2 17:03:30 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 17:03:30 -0500 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: Message-ID: Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. Barry On May 2, 2014, at 4:41 PM, Song Gao wrote: > Dear PETSc users, > > I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. > > My codes looks like > > call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) > call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > call PCSetType ( pet_precon, 'asm', ierpetsc ) > call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > call KSPSetUp ( pet_solv, ierpetsc ) > call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one > call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > call KSPSetFromOptions ( pet_solv, ierpetsc ) > call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! 
adding this line, the codes converge > call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > > runing with 1 CPU WITHOUT the line with red color and the codes don't converge > > runtime options: -ksp_monitor_true_residual -ksp_view > 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > ....... > 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > [0] number of local blocks = 1 > Local solve info for each block is in the following KSP and PC objects: > - - - - - - - - - - - - - - - - - - > [0] local block number 0, size = 22905 > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > - - - - - - - - - - - - - - - - - - > linear system matrix followed by preconditioner matrix: > Matrix Object: 1 MPI processes > type: shell > rows=22905, cols=22905 > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > WARNING: zero iteration in iterative solver > > runing with 1 CPU WITH the line with red color and the codes converge > > runtime options: -ksp_monitor_true_residual -ksp_view > 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > 
5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > ............ > 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > [0] number of local blocks = 1 > Local solve info for each block is in the following KSP and PC objects: > - - - - - - - - - - - - - - - - - - > [0] local block number 0, size = 22905 > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > - - - - - - - - - - - - - - - - - - > linear system matrix followed by preconditioner matrix: > Matrix Object: 1 MPI processes > type: shell > rows=22905, cols=22905 > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > WARNING: zero iteration in iterative solver > > > What would be my error here? Thank you. From song.gao2 at mail.mcgill.ca Fri May 2 17:29:24 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Fri, 2 May 2014 22:29:24 +0000 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: , Message-ID: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> Thanks for your quick reply. What confused me is that why would the code works fine if I reset the gmres restart number by recalling kspgmressetrestart just before kspsolve? Sent from my iPhone > On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > > > Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. > > Barry > >> On May 2, 2014, at 4:41 PM, Song Gao wrote: >> >> Dear PETSc users, >> >> I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. 
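For reference, the diagnosis quoted just above concerns the shell operator itself. Below is a minimal C sketch (the thread's code is Fortran, and every name here is made up) of what a shell matrix handed to KSPSetOperators() has to provide; if the MATOP_MULT routine does not write every entry of y, or does not compute the same product for the same input on every call, GMRES tends to show exactly the kind of blow-up at a restart reported above.

  typedef struct {
    void *appdata;   /* whatever the matrix-free product needs */
  } ShellCtx;

  PetscErrorCode MyMatMult(Mat A, Vec x, Vec y)
  {
    ShellCtx       *ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatShellGetContext(A, (void**)&ctx);CHKERRQ(ierr);
    /* compute y = A*x from ctx here; every entry of y must be set on every call */
    PetscFunctionReturn(0);
  }

  /* ... during setup, with local size nlocal and global size N ... */
  ierr = MatCreateShell(PETSC_COMM_WORLD, nlocal, nlocal, N, N, &shellctx, &A);CHKERRQ(ierr);
  ierr = MatShellSetOperation(A, MATOP_MULT, (void (*)(void))MyMatMult);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, P, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);

A quick sanity check is to apply the shell operator twice to the same vector and verify that the two results are identical, since GMRES relies on the operator being a fixed linear map over the whole solve.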
>> >> My codes looks like >> >> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) >> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) >> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) >> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) >> call PCSetType ( pet_precon, 'asm', ierpetsc ) >> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) >> call KSPSetUp ( pet_solv, ierpetsc ) >> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one >> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) >> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) >> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) >> call KSPSetFromOptions ( pet_solv, ierpetsc ) >> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge >> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) >> >> runing with 1 CPU WITHOUT the line with red color and the codes don't converge >> >> runtime options: -ksp_monitor_true_residual -ksp_view >> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 >> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 >> ....... >> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 >> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 >> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 >> >> KSP Object: 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 1 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> [0] local block number 0, size = 22905 >> KSP Object: (sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (sub_) 1 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 >> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> - - - - - - - - - - - - - - - - - - >> linear system matrix followed by preconditioner matrix: >> Matrix Object: 1 MPI processes >> type: shell >> rows=22905, cols=22905 >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 
>> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> WARNING: zero iteration in iterative solver >> >> runing with 1 CPU WITH the line with red color and the codes converge >> >> runtime options: -ksp_monitor_true_residual -ksp_view >> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 >> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 >> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 >> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 >> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 >> ............ >> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 >> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 >> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 >> KSP Object: 1 MPI processes >> type: gmres >> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 1 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> [0] local block number 0, size = 22905 >> KSP Object: (sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (sub_) 1 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 >> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> - - - - - - - - - - - - - - - - - - >> linear system matrix followed by preconditioner matrix: >> Matrix Object: 1 MPI processes >> type: shell >> rows=22905, cols=22905 >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 >> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> WARNING: zero iteration in iterative solver >> >> >> What would be my error here? Thank you. > From bsmith at mcs.anl.gov Fri May 2 18:25:50 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 2 May 2014 18:25:50 -0500 Subject: [petsc-users] Question with setting up KSP solver parameters. 
In-Reply-To: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> References: , <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> Message-ID: <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> On May 2, 2014, at 5:29 PM, Song Gao wrote: > Thanks for your quick reply. What confused me is that why would the code works fine if I reset the gmres restart number by recalling kspgmressetrestart just before kspsolve? It isn?t really working. Something is going wrong (run with valgrind) and setting that restart number and starting the solver just puts it in a ?happier? state so it seems to make more progress. Barry > > Sent from my iPhone > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: >> >> >> Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. >> >> Barry >> >>> On May 2, 2014, at 4:41 PM, Song Gao wrote: >>> >>> Dear PETSc users, >>> >>> I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. >>> >>> My codes looks like >>> >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) >>> call KSPSetUp ( pet_solv, ierpetsc ) >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) >>> >>> runing with 1 CPU WITHOUT the line with red color and the codes don't converge >>> >>> runtime options: -ksp_monitor_true_residual -ksp_view >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 >>> ....... 
>>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 >>> >>> KSP Object: 1 MPI processes >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 >>> Additive Schwarz: restriction/interpolation type - RESTRICT >>> [0] number of local blocks = 1 >>> Local solve info for each block is in the following KSP and PC objects: >>> - - - - - - - - - - - - - - - - - - >>> [0] local block number 0, size = 22905 >>> KSP Object: (sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (sub_) 1 MPI processes >>> type: jacobi >>> linear system matrix = precond matrix: >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> - - - - - - - - - - - - - - - - - - >>> linear system matrix followed by preconditioner matrix: >>> Matrix Object: 1 MPI processes >>> type: shell >>> rows=22905, cols=22905 >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> WARNING: zero iteration in iterative solver >>> >>> runing with 1 CPU WITH the line with red color and the codes converge >>> >>> runtime options: -ksp_monitor_true_residual -ksp_view >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 >>> ............ 
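A minimal way to test whether the shell operator itself is misbehaving, as suggested above: apply it twice to the same random vector (it must be deterministic) and compare its action against the assembled preconditioning matrix (only rough agreement is expected, but Inf/NaN or a wildly different scale points at the shell). The sketch below uses the C API (the Fortran calls are analogous) and illustrative names A_shell and P for the two matrices passed to KSPSetOperators(); it is a heuristic, not a definitive test.

#include <petscksp.h>

/* Heuristic check of a user MATSHELL; A_shell and P are assumed to be
   the two matrices handed to KSPSetOperators().                        */
static PetscErrorCode CheckShellOperator(Mat A_shell, Mat P)
{
  Vec            x, y1, y2;
  PetscReal      d, n;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetVecs(A_shell, &x, &y1);CHKERRQ(ierr);   /* MatCreateVecs() in newer releases */
  ierr = VecDuplicate(y1, &y2);CHKERRQ(ierr);
  ierr = VecSetRandom(x, NULL);CHKERRQ(ierr);

  /* 1) The shell must be deterministic: applying it twice to the same input
        should give essentially the same output.  NaN or an O(1) difference
        usually means uninitialized memory inside the user MatMult routine.  */
  ierr = MatMult(A_shell, x, y1);CHKERRQ(ierr);
  ierr = MatMult(A_shell, x, y2);CHKERRQ(ierr);
  ierr = VecAXPY(y2, -1.0, y1);CHKERRQ(ierr);
  ierr = VecNorm(y2, NORM_2, &d);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "repeatability |A x - A x| = %g (expect ~0)\n", (double)d);CHKERRQ(ierr);

  /* 2) The shell should agree only roughly with the assembled preconditioning
        matrix, but Inf/NaN or a wildly larger difference points at the shell. */
  ierr = MatMult(P, x, y2);CHKERRQ(ierr);
  ierr = VecAXPY(y2, -1.0, y1);CHKERRQ(ierr);
  ierr = VecNorm(y1, NORM_2, &n);CHKERRQ(ierr);
  ierr = VecNorm(y2, NORM_2, &d);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "|A x| = %g, |(A - P) x| = %g\n", (double)n, (double)d);CHKERRQ(ierr);

  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y1);CHKERRQ(ierr);
  ierr = VecDestroy(&y2);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Called once before KSPSolve(), a check like this usually distinguishes a broken user MatMult from a preconditioning problem, and complements the valgrind run recommended above.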
>>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 >>> KSP Object: 1 MPI processes >>> type: gmres >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 >>> Additive Schwarz: restriction/interpolation type - RESTRICT >>> [0] number of local blocks = 1 >>> Local solve info for each block is in the following KSP and PC objects: >>> - - - - - - - - - - - - - - - - - - >>> [0] local block number 0, size = 22905 >>> KSP Object: (sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (sub_) 1 MPI processes >>> type: jacobi >>> linear system matrix = precond matrix: >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> - - - - - - - - - - - - - - - - - - >>> linear system matrix followed by preconditioner matrix: >>> Matrix Object: 1 MPI processes >>> type: shell >>> rows=22905, cols=22905 >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> WARNING: zero iteration in iterative solver >>> >>> >>> What would be my error here? Thank you. >> From danyang.su at gmail.com Sat May 3 14:33:55 2014 From: danyang.su at gmail.com (Danyang Su) Date: Sat, 03 May 2014 12:33:55 -0700 Subject: [petsc-users] Question on ksp examples ex14f.F Message-ID: <536544A3.5080703@gmail.com> Hi All, The codes can run successfully in release mode, but in debug mode, it causes the following error. forrtl: severe (408): fort: (11): Subscript #1 of the array XX has value -665625807 which is less than the lower bound of 1 I can get rid of this error by replacing VecGetArray to VecGetArrayF90 and do not use idx in XX. The same problem exists in ltog from DMDAGetGlobalIndices(). Is there any other way to avoid this kind of error in fortran since the release mode can run without error? Is this caused by the configuration in Fortran? Thanks and regards, Danyang From bsmith at mcs.anl.gov Sat May 3 18:48:49 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 3 May 2014 18:48:49 -0500 Subject: [petsc-users] Question on ksp examples ex14f.F In-Reply-To: <536544A3.5080703@gmail.com> References: <536544A3.5080703@gmail.com> Message-ID: <7EB1B6E2-6574-47A3-894D-D1BA8E34BEAA@mcs.anl.gov> On May 3, 2014, at 2:33 PM, Danyang Su wrote: > Hi All, > > The codes can run successfully in release mode, but in debug mode, it causes the following error. 
> forrtl: severe (408): fort: (11): Subscript #1 of the array XX has value -665625807 which is less than the lower bound of 1 > > I can get rid of this error by replacing VecGetArray to VecGetArrayF90 and do not use idx in XX. The same problem exists in ltog from DMDAGetGlobalIndices(). > > Is there any other way to avoid this kind of error in fortran since the release mode can run without error? > Is this caused by the configuration in Fortran? Certain Fortran compilers add extra code which check for out of array bounds access. Found out how to turn it off your your compiler. For example https://software.intel.com/en-us/forums/topic/271337 and do some googling. Barry > > Thanks and regards, > > Danyang From lu_qin_2000 at yahoo.com Sat May 3 19:24:30 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Sat, 3 May 2014 17:24:30 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc In-Reply-To: References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> Message-ID: <1399163070.39276.YahooMailNeo@web160204.mail.bf1.yahoo.com> Thanks a lot for both of you! Qin ________________________________ From: Xiaoye S. Li To: Barry Smith Cc: Qin Lu ; "petsc-users at mcs.anl.gov" Sent: Friday, May 2, 2014 3:40 PM Subject: Re: [petsc-users] ILUTP in PETSc The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. ? In SuperLU distribution: ? EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) ? SRC/zgsitrf.c : the actual ILUTP factorization routine Sherry Li On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: >At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html ?there are two listed. ./configure ?download-hypre > >mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid > >you can also add -help to see what options are available. > >? Both pretty much suck and I can?t image much reason for using them. > >? ?Barry > > > >On May 2, 2014, at 10:27 AM, Qin Lu wrote: > >> Hello, >> >> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >> >> Many thanks, >> Qin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Sat May 3 20:01:35 2014 From: danyang.su at gmail.com (Danyang Su) Date: Sat, 03 May 2014 18:01:35 -0700 Subject: [petsc-users] Question on ksp examples ex14f.F In-Reply-To: <7EB1B6E2-6574-47A3-894D-D1BA8E34BEAA@mcs.anl.gov> References: <536544A3.5080703@gmail.com> <7EB1B6E2-6574-47A3-894D-D1BA8E34BEAA@mcs.anl.gov> Message-ID: <5365916F.9010500@gmail.com> Thank, Barry. After turning off "check array bound" option, it can work without any problem. Danyang On 03/05/2014 4:48 PM, Barry Smith wrote: > On May 3, 2014, at 2:33 PM, Danyang Su wrote: > >> Hi All, >> >> The codes can run successfully in release mode, but in debug mode, it causes the following error. >> forrtl: severe (408): fort: (11): Subscript #1 of the array XX has value -665625807 which is less than the lower bound of 1 >> >> I can get rid of this error by replacing VecGetArray to VecGetArrayF90 and do not use idx in XX. The same problem exists in ltog from DMDAGetGlobalIndices(). 
>> >> Is there any other way to avoid this kind of error in fortran since the release mode can run without error? >> Is this caused by the configuration in Fortran? > Certain Fortran compilers add extra code which check for out of array bounds access. Found out how to turn it off your your compiler. For example https://software.intel.com/en-us/forums/topic/271337 and do some googling. > > Barry > >> Thanks and regards, >> >> Danyang From jed at jedbrown.org Sun May 4 08:56:44 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 04 May 2014 07:56:44 -0600 Subject: [petsc-users] How to do the point-block ILU in PETSc In-Reply-To: References: Message-ID: <87ha55sl43.fsf@jedbrown.org> Please use the mailing list for questions like this. Lulu Liu writes: > Dear Jed, > > I saw in man-page of PCILU: > For BAIJ matrices this implements a point block ILU > > Take /src/snes/examples/tutorials/ex19.c for examples, I add the following > lines > ierr = MatCreate(PETSC_COMM_WORLD,&J); > ierr = MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,mx*my,mx*my); This example uses DMDA so creating your own layout won't generally be the partition you want. You should use DMCreateMatrix(). The matrix type is set via DMSetMatType() and -dm_mat_type. > ierr = MatSetType(J,MATBAIJ); > ierr = MatSetFromOptions(J); > ierr = MatSetUp(J); > ierr = MatAssemblyBegin(J,MAT_FINAL_ASSEMBLY); > ierr = MatAssemblyEnd(J,MAT_FINAL_ASSEMBLY); This assembly should not exist. > ierr = SNESSetJacobian(snes,J,J,NULL,NULL); > > but I got errors, could you tell me how to do the point-block ILU in ex19.c > ( the small block should be 4x4). Thanks! > > ./ex19 -da_grid_x 64 -da_grid_y 64 -contours -draw_pause 1 -snes_monitor > -snes_rtol 1.e-6 -pc_type ilu Don't modify the source at all. Instead, run this: $ mpiexec -n 4 ./ex19 -da_grid_x 64 -da_grid_y 64 -snes_monitor -snes_view -dm_mat_type baij lid velocity = 0.000244141, prandtl # = 1, grashof # = 1 0 SNES Function norm 1.573890417811e-02 1 SNES Function norm 1.602010905072e-06 2 SNES Function norm 1.580493963868e-11 SNES Object: 4 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=368 total number of function evaluations=3 SNESLineSearch Object: 4 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 4 MPI processes type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat 
Object: 1 MPI processes type: seqbaij rows=4096, cols=4096, bs=4 package used to perform factorization: petsc total: nonzeros=79872, allocated nonzeros=79872 total number of mallocs used during MatSetValues calls =0 block size is 4 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqbaij rows=4096, cols=4096, bs=4 total: nonzeros=79872, allocated nonzeros=79872 total number of mallocs used during MatSetValues calls =0 block size is 4 linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpibaij rows=16384, cols=16384, bs=4 total: nonzeros=323584, allocated nonzeros=323584 total number of mallocs used during MatSetValues calls =0 Number of SNES iterations = 2 > lid velocity = 0.000244141, prandtl # = 1, grashof # = 1 > 0 SNES Function norm 1.573890417811e-02 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: > or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatFDColoringCreate_SeqAIJ line 20 > src/mat/impls/aij/seq/fdaij.c > [0]PETSC ERROR: [0] MatFDColoringCreate line 367 src/mat/matfd/fdmatrix.c > [0]PETSC ERROR: [0] SNESComputeJacobian_DMDA line 165 > src/snes/utils/dmdasnes.c > [0]PETSC ERROR: [0] SNES user Jacobian function line 2151 > src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESComputeJacobian line 2106 src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_NEWTONLS line 144 src/snes/impls/ls/ls.c > [0]PETSC ERROR: [0] SNESSolve line 3589 src/snes/interface/snes.c > [0]PETSC ERROR: [0] main line 106 src/snes/examples/tutorials/ex19.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.3, unknown > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./ex19 on a arch-darwin-c-debug named kl-12681.local by > liul Sun May 4 15:43:58 2014 > [0]PETSC ERROR: Libraries linked from > /Users/liul/soft/petsc-3.4.3/petsc/arch-darwin-c-debug/lib > [0]PETSC ERROR: Configure run at Sun Mar 9 17:02:57 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack --download-mpich > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > -- > > ------------------------------ > This message and its contents, including attachments are intended solely > for the original recipient. If you are not the intended recipient or have > received this message in error, please notify me immediately and delete > this message from your computer system. Any unauthorized use or > distribution is prohibited. Please consider the environment before printing > this email. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From epscodes at gmail.com Sun May 4 15:32:55 2014 From: epscodes at gmail.com (Xiangdong) Date: Sun, 4 May 2014 16:32:55 -0400 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> Message-ID: On Fri, May 2, 2014 at 4:10 PM, Matthew Knepley wrote: > On Fri, May 2, 2014 at 12:53 PM, Xiangdong wrote: >> >> On Thu, May 1, 2014 at 10:21 PM, Barry Smith wrote: >> >>> >>> On May 1, 2014, at 9:12 PM, Xiangdong wrote: >>> >>> > I came up with a simple example to demonstrate this "eliminating row" >>> behavior. It happens when the solution x to the linearized equation Ax=b is >>> out of the bound set by SNESVISetVariableBounds(); >>> > >>> > In the attached example, I use snes to solve a simple function x-b=0. >>> When you run it, it outputs the matrix as 25 rows, while the real Jacobian >>> should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be >>> -inf, it will output 50 rows for the Jacobian. In the first case, the norm >>> given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. >>> > >>> > In solving the nonlinear equations, it is likely that the solution of >>> the linearized equation is out of bound, but then we can reset the >>> out-of-bound solution to be lower or upper bound instead of eliminating the >>> variables (the rows). Any suggestions on doing this in petsc? >>> >>> This is what PETSc is doing. It is using the "active set method". >>> Variables that are at their bounds are ?frozen? and then a smaller system >>> is solved (involving just the variables not a that bounds) to get the next >>> search direction. Based on the next search direction some of the variables >>> on the bounds may be unfrozen and other variables may be frozen. There is a >>> huge literature on this topic. See for example our buddies ? Nocedal, >>> Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, >>> New York: Springer-Verlag. ISBN 978-0-387-30303-1.. 
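To make the freezing and unfreezing described above concrete, here is a minimal sketch of how such a bound-constrained solve is set up with the reduced-space solver. It uses the C API with illustrative names (x, r, xl, xu are Vecs of the full problem size, FormFunction is the usual residual routine, and the Jacobian is set with SNESSetJacobian() exactly as in the unconstrained case, omitted here); it is a sketch, not a drop-in replacement for the code discussed in this thread.

/* Sketch only: bound-constrained solve with the reduced-space active-set
   solver (the implementation in src/snes/impls/vi/rs/virs.c).            */
SNES snes;

SNESCreate(PETSC_COMM_WORLD, &snes);
SNESSetFunction(snes, r, FormFunction, NULL);
SNESSetType(snes, SNESVINEWTONRSLS);

VecSet(xl, PETSC_NINFINITY);             /* lower bounds: -inf everywhere ...            */
VecSetValue(xl, 1, 0.0, INSERT_VALUES);  /* ... except x2 >= 0 (0-based index 1)         */
VecSetValue(xl, 3, 0.0, INSERT_VALUES);  /* ... and    x4 >= 0 (0-based index 3)         */
VecAssemblyBegin(xl); VecAssemblyEnd(xl);
VecSet(xu, PETSC_INFINITY);              /* no upper bounds; some older releases spell
                                            these constants SNES_VI_NINF / SNES_VI_INF   */
SNESVISetVariableBounds(snes, xl, xu);

SNESSetFromOptions(snes);
SNESSolve(snes, NULL, x);

/* Each Newton step freezes the variables sitting on an active bound and
   solves the linear system only for the remaining (inactive) ones, so
   KSPGetOperators() on the inner KSP can legitimately return a matrix
   with fewer rows than the full Jacobian given to SNESSetJacobian().    */

Running with -snes_vi_monitor reports how many constraints are active at each iteration, which is the quickest way to see why the reduced linear system changes size from step to step.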
>>> >>> The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return >>> different values with the SNES VI solver. If you care about the function >>> value just use SNESGetFunction() and compute the norm that way. We are >>> eliminating SNESGetFunctionNorm() from PETSc because it is problematic. >>> >>> If you think the SNES VI solver is actually not solving the problem, >>> or giving the wrong answer than please send us the entire simple code and >>> we?ll see if we have introduced any bugs into our solver. But note that the >>> linear system being of different sizes is completely normal for the solver. >>> >> >> Here is an example I do not quite understand. I have a simple function >> F(X) = [x1+x2+100 ; x1-x2; x3+x4; x3-x4+50]. If I solve this F(X)=0 with no >> constraint, the exact solution is [x1=-50; x2=-50; x3=-25; x4=25]. >> >> If I specify the constraint as x2>=0 and x4>=0, I expect the solution >> from one iteration of SNES is [-50, 0,-25,25], since the constraint on x2 >> should be active now. However, the petsc outputs the solution [-50, 0, 0, >> 0]. Since x3 and x4 does not violate the constraint, why does the solution >> of x3 and x4 change (note that x3 and x3 are decoupled from x1 and x2)? In >> this case, the matrix obtained from KSPGetOperators is only 2-by-2, so two >> variables or constraints are eliminated. >> > > This just finds a local solution to the constrained problem, and these > need not be unique. > This might be trivial, but could you please briefly explain how I can obtain the same answer petsc outputs by hand calculation for this simple four-variable example. What I do not understand is when the constraints get activated and the variables get eliminated (matrix reduced from 4-by-4 to 2-by-2). For example, as I mentioned before, when I added x2>=0 and x4>=0 to the unconstrained problem, why did two of these constraints get eliminated (matrix from KSPGetOperators is 2-by-2)? In particular, the exact solution x4=25 does not violate the newly added x4>=0, but still got changed (x4 is actually decoupled from x1 and x2; changes/constraints on x1 and x2 should not affect x4). > > Matt > > >> Another thing I noticed is that constraints x2>-1e-7 and x4>-1e-7 gives >> solution [-50,-1e-7,-25,25]; however, constraints x2>-1e-8 and x4>-1e-8 >> gives the solution [-50,0,0,0]. >> > Is there a small constant number in petsc that caused the jump of the solution when I simply change the lower bound from -1e-7 to -1e-8? Thanks for your time and help. Best, Xiangdong > Attached please find the simple 130-line code showing this behavior. >> Simply commenting the line 37 to remove the constraints and modifying line >> 92 to change the lower bounds of x2 and x4. >> >> Thanks a lot for your time and help. >> >> Best, >> Xiangdong >> >> >> >> >>> >>> >>> Barry >>> >>> >>> > >>> > Thank you. >>> > >>> > Best, >>> > Xiangdong >>> > >>> > P.S. If we change the lower bound of field u (line 124) to be zero, >>> then the Jacobian matrix is set to be NULL by petsc. 
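On the norm question above: since SNESGetFunctionNorm() is being phased out, the recommended two-call sequence looks like the following in C (a sketch, error checking omitted). With the VI solver the value can legitimately differ from what SNESGetFunctionNorm() used to report.

Vec       F;
PetscReal fnorm;

SNESGetFunction(snes, &F, NULL, NULL);   /* borrow the residual vector SNES already holds */
VecNorm(F, NORM_2, &fnorm);              /* 2-norm of F(x) at the current iterate          */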
>>> > >>> > >>> > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: >>> > Here is the order of functions I called: >>> > >>> > DMDACreate3d(); >>> > >>> > SNESCreate(); >>> > >>> > SNESSetDM(); (DM with dof=2); >>> > >>> > DMSetApplicationContext(); >>> > >>> > DMDASNESSetFunctionLocal(); >>> > >>> > SNESVISetVariableBounds(); >>> > >>> > DMDASNESetJacobianLocal(); >>> > >>> > SNESSetFromOptions(); >>> > >>> > SNESSolve(); >>> > >>> > SNESGetKSP(); >>> > KSPGetSolution(); >>> > KSPGetRhs(); >>> > KSPGetOperators(); //get operator kspA, kspx, kspb; >>> > >>> > SNESGetFunctionNorm(); ==> get norm fnorma; >>> > SNESGetFunction(); VecNorm(); ==> get norm fnormb; >>> > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the >>> solution x and get norm fnormc; >>> > >>> > Inside the FormJacobianLocal(), I output the matrix jac and preB; >>> > >>> > I found that fnorma matches the default SNES monitor output "SNES >>> Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx >>> obtained by snescomputefunction, mat jac and preB are length 50 or >>> 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. >>> > >>> > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; >>> x(2:2:end)=0; It seems that it completely ignores the second degree of >>> freedom (setting it to zero). I saw this for (close to) constant initial >>> guess, while for heterogeneous initial guess, it works fine and the matrix >>> and vector size are correct, and the solution is correct. So this >>> eliminating row behavior seems to be initial guess dependent. >>> > >>> > I saw this even if I use snes_fd, so we can rule out the possibility >>> of wrong Jacobian. For the FormFunctionLocal(), I checked via >>> SNESComputeFunction and it output the correct vector of residue. >>> > >>> > Are the orders of function calls correct? >>> > >>> > Thank you. >>> > >>> > Xiangdong >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > On Thu, May 1, 2014 at 1:58 PM, Barry Smith >>> wrote: >>> > >>> > On May 1, 2014, at 10:32 AM, Xiangdong wrote: >>> > >>> > > Under what condition, SNESGetFunctionNorm() will output different >>> results from SENEGetFunction + VecNorm (with NORM_2)? >>> > > >>> > > For most of my test cases, it is the same. However, when I have some >>> special (trivial) initial guess to the SNES problem, I see different norms. >>> > >>> > Please send more details on your ?trivial? case where the values >>> are different. It could be that we are not setting the function norm >>> properly on early exit from the solvers. >>> > > >>> > > Another phenomenon I noticed with this is that KSP in SNES squeeze >>> my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. >>> When I use KSPGetOperators/rhs/solutions, I found that the operator is >>> 25-by-25, and the rhs and solution is with length 25. Do you have any clue >>> on what triggered this? To my surprise, when I output the Jacobian inside >>> the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct >>> numerical entries. Why does the operator obtained from KSP is different and >>> got rows eliminated? These rows got eliminated have only one entries per >>> row, but the rhs in that row is not zero. Eliminating these rows would give >>> wrong solutions. >>> > >>> > Hmm, we never squeeze out rows/columns from the Jacobian. The size >>> of the Jacobian set with SNESSetJacobian() should always match that >>> obtained with KSPGetOperators() on the linear system. 
Please send more >>> details on how you get this. Are you calling the KSPGetOperators() inside a >>> preconditioner where the the preconditioner has chopped up the operator? >>> > >>> > Barry >>> > >>> > > >>> > > Thank you. >>> > > >>> > > Xiangdong >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley >>> wrote: >>> > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong >>> wrote: >>> > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo >>> *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize >>> the array f. Zero the array f solved the problem and gave consistent result. >>> > > >>> > > Just curious, why does not petsc initialize the array f to zero by >>> default inside petsc when passing the f array to FormFunctionLocal? >>> > > >>> > > If you directly set entires, you might not want us to spend the time >>> writing those zeros. >>> > > >>> > > I have another quick question about the array x passed to >>> FormFunctionLocal. If I want to know the which x is evaluated, how can I >>> output x in a vector format? Currently, I created a global vector vecx and >>> a local vector vecx_local, get the array of vecx_local_array, copy the x to >>> vecx_local_array, scatter to global vecx and output vecx. Is there a quick >>> way to restore the array x to a vector and output? >>> > > >>> > > I cannot think of a better way than that. >>> > > >>> > > Matt >>> > > >>> > > Thank you. >>> > > >>> > > Best, >>> > > Xiangdong >>> > > >>> > > >>> > > >>> > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith >>> wrote: >>> > > >>> > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: >>> > > >>> > > > Hello everyone, >>> > > > >>> > > > When I run snes program, >>> > > >>> > > ^^^^ what SNES program?? >>> > > >>> > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this >>> norm is different from residue norm (even if solving F(x)=0) >>> > > >>> > > Please send the full output where you see this. >>> > > >>> > > > and also differ from norm of the Jacobian. What is the definition >>> of this "SNES Function Norm?? >>> > > >>> > > The SNES Function Norm as printed by PETSc is suppose to the >>> 2-norm of F(x) - b (where b is usually zero) and this is also the same >>> thing as the ?residue norm? >>> > > >>> > > Barry >>> > > >>> > > > >>> > > > Thank you. >>> > > > >>> > > > Best, >>> > > > Xiangdong >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > -- >>> > > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > > -- Norbert Wiener >>> > > >>> > >>> > >>> > >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Sun May 4 15:45:16 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 4 May 2014 15:45:16 -0500 Subject: [petsc-users] questions about the SNES Function Norm In-Reply-To: References: <039D1727-3A9C-4785-914C-0AFE27EA68FF@mcs.anl.gov> <8E9F8FD2-8640-4CAD-9F4B-C8C5FC576399@mcs.anl.gov> Message-ID: You will need to work your way through the code in SNESSolve_VINEWTONRSLS() which is in src/snes/impls/vi/rs http://www.mcs.anl.gov/petsc/petsc-dev/src/snes/impls/vi/rs/virs.c.html It is not a trivial algorithm but it is reasonably straightforward. Barry On May 4, 2014, at 3:32 PM, Xiangdong wrote: > > > > On Fri, May 2, 2014 at 4:10 PM, Matthew Knepley wrote: > On Fri, May 2, 2014 at 12:53 PM, Xiangdong wrote: > On Thu, May 1, 2014 at 10:21 PM, Barry Smith wrote: > > On May 1, 2014, at 9:12 PM, Xiangdong wrote: > > > I came up with a simple example to demonstrate this "eliminating row" behavior. It happens when the solution x to the linearized equation Ax=b is out of the bound set by SNESVISetVariableBounds(); > > > > In the attached example, I use snes to solve a simple function x-b=0. When you run it, it outputs the matrix as 25 rows, while the real Jacobian should be 5*5*2=50 rows. If one changes the lower bound in line 125 to be -inf, it will output 50 rows for the Jacobian. In the first case, the norm given by SNESGetFunctionNorm and SNESGetFunction+VecNorm are also different. > > > > In solving the nonlinear equations, it is likely that the solution of the linearized equation is out of bound, but then we can reset the out-of-bound solution to be lower or upper bound instead of eliminating the variables (the rows). Any suggestions on doing this in petsc? > > This is what PETSc is doing. It is using the "active set method". Variables that are at their bounds are ?frozen? and then a smaller system is solved (involving just the variables not a that bounds) to get the next search direction. Based on the next search direction some of the variables on the bounds may be unfrozen and other variables may be frozen. There is a huge literature on this topic. See for example our buddies ? Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization (2nd ed.). Berlin, New York: Springer-Verlag. ISBN 978-0-387-30303-1.. > > The SNESGetFunctionNorm and SNESGetFunction+VecNorm may return different values with the SNES VI solver. If you care about the function value just use SNESGetFunction() and compute the norm that way. We are eliminating SNESGetFunctionNorm() from PETSc because it is problematic. > > If you think the SNES VI solver is actually not solving the problem, or giving the wrong answer than please send us the entire simple code and we?ll see if we have introduced any bugs into our solver. But note that the linear system being of different sizes is completely normal for the solver. > > Here is an example I do not quite understand. I have a simple function F(X) = [x1+x2+100 ; x1-x2; x3+x4; x3-x4+50]. If I solve this F(X)=0 with no constraint, the exact solution is [x1=-50; x2=-50; x3=-25; x4=25]. > > If I specify the constraint as x2>=0 and x4>=0, I expect the solution from one iteration of SNES is [-50, 0,-25,25], since the constraint on x2 should be active now. However, the petsc outputs the solution [-50, 0, 0, 0]. Since x3 and x4 does not violate the constraint, why does the solution of x3 and x4 change (note that x3 and x3 are decoupled from x1 and x2)? 
In this case, the matrix obtained from KSPGetOperators is only 2-by-2, so two variables or constraints are eliminated. > > This just finds a local solution to the constrained problem, and these need not be unique. > > This might be trivial, but could you please briefly explain how I can obtain the same answer petsc outputs by hand calculation for this simple four-variable example. What I do not understand is when the constraints get activated and the variables get eliminated (matrix reduced from 4-by-4 to 2-by-2). > > For example, as I mentioned before, when I added x2>=0 and x4>=0 to the unconstrained problem, why did two of these constraints get eliminated (matrix from KSPGetOperators is 2-by-2)? In particular, the exact solution x4=25 does not violate the newly added x4>=0, but still got changed (x4 is actually decoupled from x1 and x2; changes/constraints on x1 and x2 should not affect x4). > > > > Matt > > Another thing I noticed is that constraints x2>-1e-7 and x4>-1e-7 gives solution [-50,-1e-7,-25,25]; however, constraints x2>-1e-8 and x4>-1e-8 gives the solution [-50,0,0,0]. > > Is there a small constant number in petsc that caused the jump of the solution when I simply change the lower bound from -1e-7 to -1e-8? > > Thanks for your time and help. > > Best, > Xiangdong > > > Attached please find the simple 130-line code showing this behavior. Simply commenting the line 37 to remove the constraints and modifying line 92 to change the lower bounds of x2 and x4. > > Thanks a lot for your time and help. > > Best, > Xiangdong > > > > > > Barry > > > > > > Thank you. > > > > Best, > > Xiangdong > > > > P.S. If we change the lower bound of field u (line 124) to be zero, then the Jacobian matrix is set to be NULL by petsc. > > > > > > On Thu, May 1, 2014 at 3:43 PM, Xiangdong wrote: > > Here is the order of functions I called: > > > > DMDACreate3d(); > > > > SNESCreate(); > > > > SNESSetDM(); (DM with dof=2); > > > > DMSetApplicationContext(); > > > > DMDASNESSetFunctionLocal(); > > > > SNESVISetVariableBounds(); > > > > DMDASNESetJacobianLocal(); > > > > SNESSetFromOptions(); > > > > SNESSolve(); > > > > SNESGetKSP(); > > KSPGetSolution(); > > KSPGetRhs(); > > KSPGetOperators(); //get operator kspA, kspx, kspb; > > > > SNESGetFunctionNorm(); ==> get norm fnorma; > > SNESGetFunction(); VecNorm(); ==> get norm fnormb; > > SNESComputeFunction(); VecNorm(); ==> function evaluation fx at the solution x and get norm fnormc; > > > > Inside the FormJacobianLocal(), I output the matrix jac and preB; > > > > I found that fnorma matches the default SNES monitor output "SNES Function norm", but fnormb=fnormc != fnorma. The solution x, the residue fx obtained by snescomputefunction, mat jac and preB are length 50 or 50-by-50, while the kspA, kspx, kspb are 25-by-25 or length 25. > > > > I checked that kspA=jac(1:2:end,1:2:end) and x(1:2:end)= kspA\kspb; x(2:2:end)=0; It seems that it completely ignores the second degree of freedom (setting it to zero). I saw this for (close to) constant initial guess, while for heterogeneous initial guess, it works fine and the matrix and vector size are correct, and the solution is correct. So this eliminating row behavior seems to be initial guess dependent. > > > > I saw this even if I use snes_fd, so we can rule out the possibility of wrong Jacobian. For the FormFunctionLocal(), I checked via SNESComputeFunction and it output the correct vector of residue. > > > > Are the orders of function calls correct? > > > > Thank you. 
> > > > Xiangdong > > > > > > > > > > > > > > > > On Thu, May 1, 2014 at 1:58 PM, Barry Smith wrote: > > > > On May 1, 2014, at 10:32 AM, Xiangdong wrote: > > > > > Under what condition, SNESGetFunctionNorm() will output different results from SENEGetFunction + VecNorm (with NORM_2)? > > > > > > For most of my test cases, it is the same. However, when I have some special (trivial) initial guess to the SNES problem, I see different norms. > > > > Please send more details on your ?trivial? case where the values are different. It could be that we are not setting the function norm properly on early exit from the solvers. > > > > > > Another phenomenon I noticed with this is that KSP in SNES squeeze my matrix by eliminating rows. I have a Jacobian supposed to be 50-by-50. When I use KSPGetOperators/rhs/solutions, I found that the operator is 25-by-25, and the rhs and solution is with length 25. Do you have any clue on what triggered this? To my surprise, when I output the Jacobian inside the FormJacobianLocal, it outputs the correct matrix 50-by-50 with correct numerical entries. Why does the operator obtained from KSP is different and got rows eliminated? These rows got eliminated have only one entries per row, but the rhs in that row is not zero. Eliminating these rows would give wrong solutions. > > > > Hmm, we never squeeze out rows/columns from the Jacobian. The size of the Jacobian set with SNESSetJacobian() should always match that obtained with KSPGetOperators() on the linear system. Please send more details on how you get this. Are you calling the KSPGetOperators() inside a preconditioner where the the preconditioner has chopped up the operator? > > > > Barry > > > > > > > > Thank you. > > > > > > Xiangdong > > > > > > > > > > > > > > > > > > > > > On Tue, Apr 29, 2014 at 3:12 PM, Matthew Knepley wrote: > > > On Tue, Apr 29, 2014 at 2:09 PM, Xiangdong wrote: > > > It turns out to a be a bug in my FormFunctionLocal(DMDALocalInfo *info,PetscScalar **x,PetscScalar **f,AppCtx *user). I forgot to initialize the array f. Zero the array f solved the problem and gave consistent result. > > > > > > Just curious, why does not petsc initialize the array f to zero by default inside petsc when passing the f array to FormFunctionLocal? > > > > > > If you directly set entires, you might not want us to spend the time writing those zeros. > > > > > > I have another quick question about the array x passed to FormFunctionLocal. If I want to know the which x is evaluated, how can I output x in a vector format? Currently, I created a global vector vecx and a local vector vecx_local, get the array of vecx_local_array, copy the x to vecx_local_array, scatter to global vecx and output vecx. Is there a quick way to restore the array x to a vector and output? > > > > > > I cannot think of a better way than that. > > > > > > Matt > > > > > > Thank you. > > > > > > Best, > > > Xiangdong > > > > > > > > > > > > On Mon, Apr 28, 2014 at 10:28 PM, Barry Smith wrote: > > > > > > On Apr 28, 2014, at 3:23 PM, Xiangdong wrote: > > > > > > > Hello everyone, > > > > > > > > When I run snes program, > > > > > > ^^^^ what SNES program?? > > > > > > > it outputs "SNES Function norm 1.23456789e+10". It seems that this norm is different from residue norm (even if solving F(x)=0) > > > > > > Please send the full output where you see this. > > > > > > > and also differ from norm of the Jacobian. What is the definition of this "SNES Function Norm?? 
> > > > > > The SNES Function Norm as printed by PETSc is suppose to the 2-norm of F(x) - b (where b is usually zero) and this is also the same thing as the ?residue norm? > > > > > > Barry > > > > > > > > > > > Thank you. > > > > > > > > Best, > > > > Xiangdong > > > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From francium87 at hotmail.com Mon May 5 07:25:22 2014 From: francium87 at hotmail.com (linjing bo) Date: Mon, 5 May 2014 12:25:22 +0000 Subject: [petsc-users] VecValidValues() reports NaN found Message-ID: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. 
The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 5 07:27:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 May 2014 07:27:52 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! > [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > It looks like the vector after preconditioner application is bad. What is the preconditioner? Matt > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. 
The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From francium87 at hotmail.com Mon May 5 07:56:39 2014 From: francium87 at hotmail.com (linjing bo) Date: Mon, 5 May 2014 12:56:39 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , Message-ID: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 5 08:12:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 May 2014 08:12:05 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: > I use JACOBI. The message showed is with JACOBI. > > > Wired situation is that the backtrack information shows the location is > before actually apply PC, so I guess the rhs vec is not changed at this > point. > > Another wired thing is : Because the original code is to complex. I write > out the A matrix in Ax=b, and write a small test code to read in this > matrix and solve it, no error showed. The KSP, PC are all set to be the > same. > > When I try to using ILU, more wired error happens, the backtrack info > shows it died in a Flops logging function: > 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of > range! > [2]PETSC ERROR: Cannot log negative > flops! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [2]PETSC ERROR: See docs/changes/index.html for recent > updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [2]PETSC ERROR: See docs/index.html for manual > pages. 
> [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:51:27 > 2014 > > [2]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [2]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > > [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [2]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > > > ------------------------------ > Date: Mon, 5 May 2014 07:27:52 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! > [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > > > It looks like the vector after preconditioner application is bad. What is > the preconditioner? 
> > Matt > > > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From francium87 at hotmail.com Mon May 5 08:15:38 2014 From: francium87 at hotmail.com (linjing bo) Date: Mon, 5 May 2014 13:15:38 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , Message-ID: Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From song.gao2 at mail.mcgill.ca Mon May 5 08:28:25 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Mon, 5 May 2014 09:28:25 -0400 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> Message-ID: Thanks for reply. What do you mean by a ?happier? state? I check the converged solution (the one which call kspgmressetrestart twice), the solution should be correct. I run with valgrind both codes (one call kspgmressetrestart once and another call kspgmressetrestart twice) Both of them have the errors: what does this mean? Thank you in advance. 
==7858== Conditional jump or move depends on uninitialised value(s) ==7858== at 0xE71DFB: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== ==7858== Conditional jump or move depends on uninitialised value(s) ==7858== at 0xE71E25: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== On Fri, May 2, 2014 at 7:25 PM, Barry Smith wrote: > > On May 2, 2014, at 5:29 PM, Song Gao wrote: > > > Thanks for your quick reply. What confused me is that why would the > code works fine if I reset the gmres restart number by recalling > kspgmressetrestart just before kspsolve? > > It isn?t really working. Something is going wrong (run with valgrind) > and setting that restart number and starting the solver just puts it in a > ?happier? state so it seems to make more progress. > > Barry > > > > > Sent from my iPhone > > > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > >> > >> > >> Your shell matrix is buggy in some way. Whenever the residual norm > jumps like crazy at a restart it means that something is wrong with the > operator. > >> > >> Barry > >> > >>> On May 2, 2014, at 4:41 PM, Song Gao wrote: > >>> > >>> Dear PETSc users, > >>> > >>> I'm solving a linear system in KSP and trying to setup the solver in > codes. But I feel strange because my codes don't converge unless I call > KSPGMRESSetRestart twice. 
> >>> > >>> My codes looks like > >>> > >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, > DIFFERENT_NONZERO_PATTERN, ierpetsc ) > >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) > >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > >>> call KSPSetUp ( pet_solv, ierpetsc ) > >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, > ierpetsc ) ! n_local is one > >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) > >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding > this line, the codes converge > >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > >>> > >>> runing with 1 CPU WITHOUT the line with red color and the codes don't > converge > >>> > >>> runtime options: -ksp_monitor_true_residual -ksp_view > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm > 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm > 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > >>> ....... > >>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm > 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm > 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm > 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > >>> > >>> KSP Object: 1 MPI processes > >>> type: gmres > >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >>> GMRES: happy breakdown tolerance 1e-30 > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using PRECONDITIONED norm type for convergence test > >>> PC Object: 1 MPI processes > >>> type: asm > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > >>> [0] number of local blocks = 1 > >>> Local solve info for each block is in the following KSP and PC > objects: > >>> - - - - - - - - - - - - - - - - - - > >>> [0] local block number 0, size = 22905 > >>> KSP Object: (sub_) 1 MPI processes > >>> type: preonly > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using NONE norm type for convergence test > >>> PC Object: (sub_) 1 MPI processes > >>> type: jacobi > >>> linear system matrix = precond matrix: > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> - - - - - - - - - - - - - - - - - - > >>> linear system matrix followed by preconditioner 
matrix: > >>> Matrix Object: 1 MPI processes > >>> type: shell > >>> rows=22905, cols=22905 > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> WARNING: zero iteration in iterative solver > >>> > >>> runing with 1 CPU WITH the line with red color and the codes converge > >>> > >>> runtime options: -ksp_monitor_true_residual -ksp_view > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm > 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm > 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm > 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm > 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm > 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > >>> ............ > >>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm > 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm > 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm > 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > >>> KSP Object: 1 MPI processes > >>> type: gmres > >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >>> GMRES: happy breakdown tolerance 1e-30 > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using PRECONDITIONED norm type for convergence test > >>> PC Object: 1 MPI processes > >>> type: asm > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > >>> [0] number of local blocks = 1 > >>> Local solve info for each block is in the following KSP and PC > objects: > >>> - - - - - - - - - - - - - - - - - - > >>> [0] local block number 0, size = 22905 > >>> KSP Object: (sub_) 1 MPI processes > >>> type: preonly > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using NONE norm type for convergence test > >>> PC Object: (sub_) 1 MPI processes > >>> type: jacobi > >>> linear system matrix = precond matrix: > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> - - - - - - - - - - - - - - - - - - > >>> linear system matrix followed by preconditioner matrix: > >>> Matrix Object: 1 MPI processes > >>> type: shell > >>> rows=22905, cols=22905 > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> WARNING: zero 
iteration in iterative solver > >>> > >>> > >>> What would be my error here? Thank you. > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 5 09:03:01 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 5 May 2014 09:03:01 -0500 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> Message-ID: <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> If you run valgrind with the debug version of the libraries it will provide more information about the line numbers where the problem occurred, etc. recommend doing that. Either your initial solution or right hand side has garbage in it or the wrong blas may be being linked in. But there is definitely a problem Barry On May 5, 2014, at 8:28 AM, Song Gao wrote: > Thanks for reply. What do you mean by a ?happier? state? I check the converged solution (the one which call kspgmressetrestart twice), the solution should be correct. > > I run with valgrind both codes (one call kspgmressetrestart once and another call kspgmressetrestart twice) > Both of them have the errors: what does this mean? Thank you in advance. > ==7858== Conditional jump or move depends on uninitialised value(s) > ==7858== at 0xE71DFB: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== > ==7858== Conditional jump or move depends on uninitialised value(s) > ==7858== at 0xE71E25: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x4ECD46: flowsol_ng_ (in 
/home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > ==7858== > > > On Fri, May 2, 2014 at 7:25 PM, Barry Smith wrote: > > On May 2, 2014, at 5:29 PM, Song Gao wrote: > > > Thanks for your quick reply. What confused me is that why would the code works fine if I reset the gmres restart number by recalling kspgmressetrestart just before kspsolve? > > It isn?t really working. Something is going wrong (run with valgrind) and setting that restart number and starting the solver just puts it in a ?happier? state so it seems to make more progress. > > Barry > > > > > Sent from my iPhone > > > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > >> > >> > >> Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. > >> > >> Barry > >> > >>> On May 2, 2014, at 4:41 PM, Song Gao wrote: > >>> > >>> Dear PETSc users, > >>> > >>> I'm solving a linear system in KSP and trying to setup the solver in codes. But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. > >>> > >>> My codes looks like > >>> > >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) > >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) > >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > >>> call KSPSetUp ( pet_solv, ierpetsc ) > >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one > >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) > >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge > >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > >>> > >>> runing with 1 CPU WITHOUT the line with red color and the codes don't converge > >>> > >>> runtime options: -ksp_monitor_true_residual -ksp_view > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > >>> ....... 
> >>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > >>> > >>> KSP Object: 1 MPI processes > >>> type: gmres > >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > >>> GMRES: happy breakdown tolerance 1e-30 > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using PRECONDITIONED norm type for convergence test > >>> PC Object: 1 MPI processes > >>> type: asm > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > >>> [0] number of local blocks = 1 > >>> Local solve info for each block is in the following KSP and PC objects: > >>> - - - - - - - - - - - - - - - - - - > >>> [0] local block number 0, size = 22905 > >>> KSP Object: (sub_) 1 MPI processes > >>> type: preonly > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using NONE norm type for convergence test > >>> PC Object: (sub_) 1 MPI processes > >>> type: jacobi > >>> linear system matrix = precond matrix: > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> - - - - - - - - - - - - - - - - - - > >>> linear system matrix followed by preconditioner matrix: > >>> Matrix Object: 1 MPI processes > >>> type: shell > >>> rows=22905, cols=22905 > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> WARNING: zero iteration in iterative solver > >>> > >>> runing with 1 CPU WITH the line with red color and the codes converge > >>> > >>> runtime options: -ksp_monitor_true_residual -ksp_view > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > >>> ............ 
> >>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > >>> KSP Object: 1 MPI processes > >>> type: gmres > >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > >>> GMRES: happy breakdown tolerance 1e-30 > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using PRECONDITIONED norm type for convergence test > >>> PC Object: 1 MPI processes > >>> type: asm > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > >>> [0] number of local blocks = 1 > >>> Local solve info for each block is in the following KSP and PC objects: > >>> - - - - - - - - - - - - - - - - - - > >>> [0] local block number 0, size = 22905 > >>> KSP Object: (sub_) 1 MPI processes > >>> type: preonly > >>> maximum iterations=10000, initial guess is zero > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > >>> left preconditioning > >>> using NONE norm type for convergence test > >>> PC Object: (sub_) 1 MPI processes > >>> type: jacobi > >>> linear system matrix = precond matrix: > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> - - - - - - - - - - - - - - - - - - > >>> linear system matrix followed by preconditioner matrix: > >>> Matrix Object: 1 MPI processes > >>> type: shell > >>> rows=22905, cols=22905 > >>> Matrix Object: 1 MPI processes > >>> type: seqbaij > >>> rows=22905, cols=22905, bs=5 > >>> total: nonzeros=785525, allocated nonzeros=785525 > >>> total number of mallocs used during MatSetValues calls =0 > >>> block size is 5 > >>> WARNING: zero iteration in iterative solver > >>> > >>> > >>> What would be my error here? Thank you. > >> > > From asmund.ervik at ntnu.no Mon May 5 09:24:57 2014 From: asmund.ervik at ntnu.no (=?ISO-8859-1?Q?=C5smund_Ervik?=) Date: Mon, 05 May 2014 16:24:57 +0200 Subject: [petsc-users] Question with setting up KSP solver parameters In-Reply-To: References: Message-ID: <53679F39.8080905@ntnu.no> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I would suggest also running valgrind with the additional option "--track-origins=yes" which will show you where the uninitialized values are coming from. Regards, ?smund On 05. mai 2014 16:03, petsc-users-request at mcs.anl.gov wrote: > From: Song Gao To: Barry Smith > Cc: petsc-users , > Dario Isola Subject: Re: [petsc-users] > Question with setting up KSP solver parameters. Message-ID: > > > Content-Type: text/plain; charset="utf-8" > > Thanks for reply. What do you mean by a ?happier? state? I check > the converged solution (the one which call kspgmressetrestart > twice), the solution should be correct. > > I run with valgrind both codes (one call kspgmressetrestart once > and another call kspgmressetrestart twice) Both of them have the > errors: what does this mean? Thank you in > advance. 
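As a side note on the --track-origins=yes suggestion above: the kind of report quoted in this thread can be reproduced with a few lines of standalone C (purely an illustration, unrelated to the poster's application), which may make the message easier to read.

/* Build with "gcc -g -O0 repro.c" and run under
   "valgrind --track-origins=yes ./a.out".  Memcheck prints
   "Conditional jump or move depends on uninitialised value(s)" at the
   if-statement and, with --track-origins, "Uninitialised value was
   created by a stack allocation" pointing at the declaration of x.    */
#include <stdio.h>

static int positive(void)
{
  int x;               /* never initialised                            */
  if (x > 0) return 1; /* branch on undefined data: this is what is flagged */
  return 0;
}

int main(void)
{
  printf("%d\n", positive());
  return 0;
}

In the traces quoted in this thread the origin appears to lie in the MKL configuration-file lookup (SearchPath, called from mkl_cfg_file inside DDOT) rather than in PETSc or the application's own routines.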
==7858== Conditional jump or move depends on uninitialised > value(s) ==7858== at 0xE71DFB: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0xE71640: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0xE6E068: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x73281A: VecNorm_Seq (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x730BF4: VecNormalize (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x7BC5A8: KSPSolve_GMRES (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0xB8A06E: KSPSolve (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x7B659F: kspsolve_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x5EAE84: petsolv_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x4ECD46: flowsol_ng_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x507E4E: iterprc_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) ==7858== > by 0x51D1B4: solnalg_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTZ585AAoJED+FDAHgGz19680H+wZNRjtbaBMCIAkWTaCjql3N dwMMvBoPezDJFuVBOgBhns+no3FMBFP4lHqcZGEMJasxZSvS4pHXAgXpDZtL+amw WLwK3mEPUMXYq/yT1AW/9HyT9fQx1738jOoKlRIaEL1SR+PfSzL8fnsi/ERpz2Tb hs4wwPczEazRWMzyA3w8jDcWdGamcfO3fXPg6vAXMEG2TTjNUuwivV9tLEBeOy6v GQypVm6hIvgE8fLsmTwYs3fnh8sZrw5QDV67fDnGSe3RrSc3jXbznu/j0JRtj0Rr fRAj4S2kT/NYF07W2I7BeE1kvscgAbupmhAIpkSS8g/vBZRlKir/F7OOanYlP4k= =XUTa -----END PGP SIGNATURE----- From song.gao2 at mail.mcgill.ca Mon May 5 10:00:34 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Mon, 5 May 2014 11:00:34 -0400 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> Message-ID: Thank you. Runing with mpirun -np 1 valgrind --track-origins=yes ~/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG -ksp_monitor_true_residual -ksp_view gives the following information. 
==8222== Conditional jump or move depends on uninitialised value(s) ==8222== at 0x216E9A7: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x216E1EC: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x216AC14: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: VecNormalize (rvector.c:261) ==8222== by 0x1405507: KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised value was created by a stack allocation ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== ==8222== Conditional jump or move depends on uninitialised value(s) ==8222== at 0x216E9D1: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x216E1EC: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x216AC14: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: VecNormalize (rvector.c:261) ==8222== by 0x1405507: KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised value was created by a stack allocation ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==8222== On Mon, May 5, 2014 at 10:03 AM, Barry Smith wrote: > > If you run valgrind with the debug version of the libraries it will > provide more information about the line numbers where the problem occurred, > etc. recommend doing that. > > Either your initial solution or right hand side has garbage in it or > the wrong blas may be being linked in. But there is definitely a problem > > Barry > > On May 5, 2014, at 8:28 AM, Song Gao wrote: > > > Thanks for reply. What do you mean by a ?happier? state? I check the > converged solution (the one which call kspgmressetrestart twice), the > solution should be correct. > > > > I run with valgrind both codes (one call kspgmressetrestart once and > another call kspgmressetrestart twice) > > Both of them have the errors: what does this > mean? Thank you in advance. 
> > ==7858== Conditional jump or move depends on uninitialised value(s) > > ==7858== at 0xE71DFB: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE71640: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE6E068: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x73281A: VecNorm_Seq (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x730BF4: VecNormalize (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xB8A06E: KSPSolve (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7B659F: kspsolve_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x5EAE84: petsolv_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x4ECD46: flowsol_ng_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x507E4E: iterprc_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x51D1B4: solnalg_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== > > ==7858== Conditional jump or move depends on uninitialised value(s) > > ==7858== at 0xE71E25: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE71640: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE6E068: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x73281A: VecNorm_Seq (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x730BF4: VecNormalize (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xB8A06E: KSPSolve (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7B659F: kspsolve_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x5EAE84: petsolv_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x4ECD46: flowsol_ng_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x507E4E: iterprc_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x51D1B4: solnalg_ (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== > > > > > > On Fri, May 2, 2014 at 7:25 PM, Barry Smith wrote: > > > > On May 2, 2014, at 5:29 PM, Song Gao wrote: > > > > > Thanks for your quick reply. What confused me is that why would the > code works fine if I reset the gmres restart number by recalling > kspgmressetrestart just before kspsolve? > > > > It isn?t really working. Something is going wrong (run with valgrind) > and setting that restart number and starting the solver just puts it in a > ?happier? state so it seems to make more progress. > > > > Barry > > > > > > > > Sent from my iPhone > > > > > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > > >> > > >> > > >> Your shell matrix is buggy in some way. Whenever the residual norm > jumps like crazy at a restart it means that something is wrong with the > operator. 
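Alongside Barry's point just above about the shell matrix, here is a minimal sketch of the usual matrix-free MatShell setup. The names MyShellMult and ShellCtx are invented for illustration; MatCreateShell, MatShellSetOperation, MatShellGetContext and MATOP_MULT are the actual PETSc interface. The MATOP_MULT callback has to define every entry of the output vector on every call; an output that is only partially written is one common way for garbage to reach GMRES and to surface at a restart, when the residual is recomputed through the operator.

#include <petscksp.h>

/* Invented context: whatever data the matrix-free product needs.      */
typedef struct {
  Mat P;                        /* e.g. an assembled approximation     */
} ShellCtx;

/* MATOP_MULT callback: y must be completely defined for every x.      */
static PetscErrorCode MyShellMult(Mat A, Vec x, Vec y)
{
  ShellCtx       *ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(A, (void **)&ctx);CHKERRQ(ierr);
  ierr = MatMult(ctx->P, x, y);CHKERRQ(ierr);  /* placeholder action   */
  PetscFunctionReturn(0);
}

/* Wiring it up, in outline:
     MatCreateShell(comm, nlocal, nlocal, N, N, &ctx, &Ashell);
     MatShellSetOperation(Ashell, MATOP_MULT, (void (*)(void))MyShellMult);
     KSPSetOperators(ksp, Ashell, P, DIFFERENT_NONZERO_PATTERN);          */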
> > >> > > >> Barry > > >> > > >>> On May 2, 2014, at 4:41 PM, Song Gao > wrote: > > >>> > > >>> Dear PETSc users, > > >>> > > >>> I'm solving a linear system in KSP and trying to setup the solver in > codes. But I feel strange because my codes don't converge unless I call > KSPGMRESSetRestart twice. > > >>> > > >>> My codes looks like > > >>> > > >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, > DIFFERENT_NONZERO_PATTERN, ierpetsc ) > > >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > > >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > > >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > > >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) > > >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > > >>> call KSPSetUp ( pet_solv, ierpetsc ) > > >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, > pet_solv_sub, ierpetsc ) ! n_local is one > > >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > > >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > > >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > > >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) > > >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! > adding this line, the codes converge > > >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > > >>> > > >>> runing with 1 CPU WITHOUT the line with red color and the codes > don't converge > > >>> > > >>> runtime options: -ksp_monitor_true_residual -ksp_view > > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm > 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > > >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm > 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > > >>> ....... 
> > >>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm > 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > > >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm > 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > > >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm > 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > > >>> > > >>> KSP Object: 1 MPI processes > > >>> type: gmres > > >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > >>> GMRES: happy breakdown tolerance 1e-30 > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using PRECONDITIONED norm type for convergence test > > >>> PC Object: 1 MPI processes > > >>> type: asm > > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > > >>> [0] number of local blocks = 1 > > >>> Local solve info for each block is in the following KSP and PC > objects: > > >>> - - - - - - - - - - - - - - - - - - > > >>> [0] local block number 0, size = 22905 > > >>> KSP Object: (sub_) 1 MPI processes > > >>> type: preonly > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using NONE norm type for convergence test > > >>> PC Object: (sub_) 1 MPI processes > > >>> type: jacobi > > >>> linear system matrix = precond matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> - - - - - - - - - - - - - - - - - - > > >>> linear system matrix followed by preconditioner matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: shell > > >>> rows=22905, cols=22905 > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> WARNING: zero iteration in iterative solver > > >>> > > >>> runing with 1 CPU WITH the line with red color and the codes > converge > > >>> > > >>> runtime options: -ksp_monitor_true_residual -ksp_view > > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm > 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm > 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > > >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm > 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > > >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm > 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > > >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm > 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > > >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm > 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > > >>> ............ 
> > >>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm > 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > > >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm > 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > > >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm > 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > > >>> KSP Object: 1 MPI processes > > >>> type: gmres > > >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > >>> GMRES: happy breakdown tolerance 1e-30 > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using PRECONDITIONED norm type for convergence test > > >>> PC Object: 1 MPI processes > > >>> type: asm > > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > > >>> [0] number of local blocks = 1 > > >>> Local solve info for each block is in the following KSP and PC > objects: > > >>> - - - - - - - - - - - - - - - - - - > > >>> [0] local block number 0, size = 22905 > > >>> KSP Object: (sub_) 1 MPI processes > > >>> type: preonly > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using NONE norm type for convergence test > > >>> PC Object: (sub_) 1 MPI processes > > >>> type: jacobi > > >>> linear system matrix = precond matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> - - - - - - - - - - - - - - - - - - > > >>> linear system matrix followed by preconditioner matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: shell > > >>> rows=22905, cols=22905 > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> WARNING: zero iteration in iterative solver > > >>> > > >>> > > >>> What would be my error here? Thank you. > > >> > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 5 11:27:47 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 5 May 2014 11:27:47 -0500 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> Message-ID: <44FAE78D-1C12-45FD-A4CF-0AFF50F7352F@mcs.anl.gov> Please email configure.log and make.log for this build. Barry On May 5, 2014, at 10:00 AM, Song Gao wrote: > Thank you. > Runing with > mpirun -np 1 valgrind --track-origins=yes ~/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG -ksp_monitor_true_residual -ksp_view > > gives the following information. 
> > ==8222== Conditional jump or move depends on uninitialised value(s) > ==8222== at 0x216E9A7: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216E1EC: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216AC14: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) > ==8222== by 0x126F431: VecNorm (rvector.c:166) > ==8222== by 0x127039D: VecNormalize (rvector.c:261) > ==8222== by 0x1405507: KSPGMRESCycle (gmres.c:127) > ==8222== by 0x140695F: KSPSolve_GMRES (gmres.c:231) > ==8222== by 0x1BEEF66: KSPSolve (itfunc.c:446) > ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) > ==8222== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) > ==8222== Uninitialised value was created by a stack allocation > ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== > ==8222== Conditional jump or move depends on uninitialised value(s) > ==8222== at 0x216E9D1: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216E1EC: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216AC14: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) > ==8222== by 0x126F431: VecNorm (rvector.c:166) > ==8222== by 0x127039D: VecNormalize (rvector.c:261) > ==8222== by 0x1405507: KSPGMRESCycle (gmres.c:127) > ==8222== by 0x140695F: KSPSolve_GMRES (gmres.c:231) > ==8222== by 0x1BEEF66: KSPSolve (itfunc.c:446) > ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) > ==8222== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) > ==8222== Uninitialised value was created by a stack allocation > ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== > > > > On Mon, May 5, 2014 at 10:03 AM, Barry Smith wrote: > > If you run valgrind with the debug version of the libraries it will provide more information about the line numbers where the problem occurred, etc. recommend doing that. > > Either your initial solution or right hand side has garbage in it or the wrong blas may be being linked in. But there is definitely a problem > > Barry > > On May 5, 2014, at 8:28 AM, Song Gao wrote: > > > Thanks for reply. What do you mean by a ?happier? state? I check the converged solution (the one which call kspgmressetrestart twice), the solution should be correct. > > > > I run with valgrind both codes (one call kspgmressetrestart once and another call kspgmressetrestart twice) > > Both of them have the errors: what does this mean? Thank you in advance. 
> > ==7858== Conditional jump or move depends on uninitialised value(s) > > ==7858== at 0xE71DFB: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== > > ==7858== Conditional jump or move depends on uninitialised value(s) > > ==7858== at 0xE71E25: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE71640: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xE6E068: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x73281A: VecNorm_Seq (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x730BF4: VecNormalize (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7BC5A8: KSPSolve_GMRES (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0xB8A06E: KSPSolve (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x7B659F: kspsolve_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x5EAE84: petsolv_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x4ECD46: flowsol_ng_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x507E4E: iterprc_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== by 0x51D1B4: solnalg_ (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > > ==7858== > > > > > > On Fri, May 2, 2014 at 7:25 PM, Barry Smith wrote: > > > > On May 2, 2014, at 5:29 PM, Song Gao wrote: > > > > > Thanks for your quick reply. What confused me is that why would the code works fine if I reset the gmres restart number by recalling kspgmressetrestart just before kspsolve? > > > > It isn?t really working. Something is going wrong (run with valgrind) and setting that restart number and starting the solver just puts it in a ?happier? state so it seems to make more progress. > > > > Barry > > > > > > > > Sent from my iPhone > > > > > >> On May 2, 2014, at 6:03 PM, "Barry Smith" wrote: > > >> > > >> > > >> Your shell matrix is buggy in some way. Whenever the residual norm jumps like crazy at a restart it means that something is wrong with the operator. > > >> > > >> Barry > > >> > > >>> On May 2, 2014, at 4:41 PM, Song Gao wrote: > > >>> > > >>> Dear PETSc users, > > >>> > > >>> I'm solving a linear system in KSP and trying to setup the solver in codes. 
But I feel strange because my codes don't converge unless I call KSPGMRESSetRestart twice. > > >>> > > >>> My codes looks like > > >>> > > >>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, pet_matp, DIFFERENT_NONZERO_PATTERN, ierpetsc ) > > >>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) > > >>> call KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) > > >>> call KSPGetPC ( pet_solv, pet_precon, ierpetsc ) > > >>> call PCSetType ( pet_precon, 'asm', ierpetsc ) > > >>> call PCASMSetOverlap ( pet_precon, 1, ierpetsc ) > > >>> call KSPSetUp ( pet_solv, ierpetsc ) > > >>> call PCASMGetSubKSP ( pet_precon, n_local, first_local, pet_solv_sub, ierpetsc ) ! n_local is one > > >>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc ) > > >>> call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > > >>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > > >>> call KSPSetFromOptions ( pet_solv, ierpetsc ) > > >>> call KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) ! adding this line, the codes converge > > >>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc ) > > >>> > > >>> runing with 1 CPU WITHOUT the line with red color and the codes don't converge > > >>> > > >>> runtime options: -ksp_monitor_true_residual -ksp_view > > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 2 KSP preconditioned resid norm 2.198638170622e+00 true resid norm 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > > >>> 3 KSP preconditioned resid norm 1.599896387215e+00 true resid norm 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > > >>> ....... 
> > >>> 28 KSP preconditioned resid norm 4.478466011191e-01 true resid norm 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > > >>> 29 KSP preconditioned resid norm 4.398129572260e-01 true resid norm 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > > >>> 30 KSP preconditioned resid norm 2.783227613716e+12 true resid norm 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > > >>> > > >>> KSP Object: 1 MPI processes > > >>> type: gmres > > >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > >>> GMRES: happy breakdown tolerance 1e-30 > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using PRECONDITIONED norm type for convergence test > > >>> PC Object: 1 MPI processes > > >>> type: asm > > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > > >>> [0] number of local blocks = 1 > > >>> Local solve info for each block is in the following KSP and PC objects: > > >>> - - - - - - - - - - - - - - - - - - > > >>> [0] local block number 0, size = 22905 > > >>> KSP Object: (sub_) 1 MPI processes > > >>> type: preonly > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using NONE norm type for convergence test > > >>> PC Object: (sub_) 1 MPI processes > > >>> type: jacobi > > >>> linear system matrix = precond matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> - - - - - - - - - - - - - - - - - - > > >>> linear system matrix followed by preconditioner matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: shell > > >>> rows=22905, cols=22905 > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> WARNING: zero iteration in iterative solver > > >>> > > >>> runing with 1 CPU WITH the line with red color and the codes converge > > >>> > > >>> runtime options: -ksp_monitor_true_residual -ksp_view > > >>> 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > > >>> 1 KSP preconditioned resid norm 2.566248171026e+00 true resid norm 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > > >>> 2 KSP preconditioned resid norm 1.410418402651e+00 true resid norm 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > > >>> 3 KSP preconditioned resid norm 9.665409287757e-01 true resid norm 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > > >>> 4 KSP preconditioned resid norm 4.469486152454e-01 true resid norm 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > > >>> 5 KSP preconditioned resid norm 2.474889829653e-01 true resid norm 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > > >>> ............ 
> > >>> 24 KSP preconditioned resid norm 9.518780877620e-05 true resid norm 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > > >>> 25 KSP preconditioned resid norm 6.837876679998e-05 true resid norm 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > > >>> 26 KSP preconditioned resid norm 4.864361942316e-05 true resid norm 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > > >>> KSP Object: 1 MPI processes > > >>> type: gmres > > >>> GMRES: restart=29, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > >>> GMRES: happy breakdown tolerance 1e-30 > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using PRECONDITIONED norm type for convergence test > > >>> PC Object: 1 MPI processes > > >>> type: asm > > >>> Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1 > > >>> Additive Schwarz: restriction/interpolation type - RESTRICT > > >>> [0] number of local blocks = 1 > > >>> Local solve info for each block is in the following KSP and PC objects: > > >>> - - - - - - - - - - - - - - - - - - > > >>> [0] local block number 0, size = 22905 > > >>> KSP Object: (sub_) 1 MPI processes > > >>> type: preonly > > >>> maximum iterations=10000, initial guess is zero > > >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > >>> left preconditioning > > >>> using NONE norm type for convergence test > > >>> PC Object: (sub_) 1 MPI processes > > >>> type: jacobi > > >>> linear system matrix = precond matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> - - - - - - - - - - - - - - - - - - > > >>> linear system matrix followed by preconditioner matrix: > > >>> Matrix Object: 1 MPI processes > > >>> type: shell > > >>> rows=22905, cols=22905 > > >>> Matrix Object: 1 MPI processes > > >>> type: seqbaij > > >>> rows=22905, cols=22905, bs=5 > > >>> total: nonzeros=785525, allocated nonzeros=785525 > > >>> total number of mallocs used during MatSetValues calls =0 > > >>> block size is 5 > > >>> WARNING: zero iteration in iterative solver > > >>> > > >>> > > >>> What would be my error here? Thank you. > > >> > > > > > > From asmund.ervik at ntnu.no Mon May 5 12:33:52 2014 From: asmund.ervik at ntnu.no (=?windows-1252?Q?=C5smund_Ervik?=) Date: Mon, 05 May 2014 19:33:52 +0200 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> Message-ID: <5367CB80.1040203@ntnu.no> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 You should also compile your own code "fensapngnew" with debug flags, specifically "-g" for gcc/gfortran or icc/ifort. This tells the compiler to generate the information necessary for gdb or valgrind to do their job. Then you would get more detailed information than just ''' ==8222== Uninitialised value was created by a stack allocation ==8222== at 0x216E97F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ''' E.g. 
when I have an error, I get a full backtrace with line numbers in my source code, like: ''' ==5277== Uninitialised value was created by a heap allocation ==5277== at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==5277== by 0x6BE3FE: __navier_stokes_MOD_rhs_ns (navier_stokes.f90:59) ==5277== by 0x6C712A: __rhs_MOD_dfdt_1phase (rhs.f90:109) ==5277== by 0x4EF52C: __rk_MOD_forward_euler (rk.f90:2168) ==5277== by 0x642764: __rk_wrapper_MOD_rk_step (rk_wrapper.f90:313) ==5277== by 0x7FA8B8: MAIN__ (meph.F90:179) ==5277== by 0x7FC5B9: main (meph.F90:2) ==5277== ''' On 05. mai 2014 17:00, Song Gao wrote: > Thank you. Runing with mpirun -np 1 valgrind --track-origins=yes > ~/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG > -ksp_monitor_true_residual -ksp_view > > gives the following information. > > ==8222== Conditional jump or move depends on uninitialised > value(s) ==8222== at 0x216E9A7: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216E1EC: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216AC14: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by > 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: > VecNormalize (rvector.c:261) ==8222== by 0x1405507: > KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: > KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve > (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by > 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised > value was created by a stack allocation ==8222== at 0x216E97F: > SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== ==8222== Conditional jump or move depends on uninitialised > value(s) ==8222== at 0x216E9D1: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216E1EC: mkl_cfg_file (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x216AC14: DDOT (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by > 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: > VecNormalize (rvector.c:261) ==8222== by 0x1405507: > KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: > KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve > (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by > 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised > value was created by a stack allocation ==8222== at 0x216E97F: > SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ==8222== > > > > On Mon, May 5, 2014 at 10:03 AM, Barry Smith > wrote: > >> >> If you run valgrind with the debug version of the libraries it >> will provide more information about the line numbers where the >> problem occurred, etc. recommend doing that. >> >> Either your initial solution or right hand side has garbage in it >> or the wrong blas may be being linked in. But there is definitely >> a problem >> >> Barry >> >> On May 5, 2014, at 8:28 AM, Song Gao >> wrote: >> >>> Thanks for reply. What do you mean by a ?happier? state? I >>> check the >> converged solution (the one which call kspgmressetrestart twice), >> the solution should be correct. 
>>> >>> I run with valgrind both codes (one call kspgmressetrestart >>> once and >> another call kspgmressetrestart twice) >>> Both of them have the errors: what >>> does this >> mean? Thank you in advance. >>> ==7858== Conditional jump or move depends on uninitialised >>> value(s) ==7858== at 0xE71DFB: SearchPath (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xE71640: mkl_cfg_file (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xE6E068: DDOT (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x73281A: VecNorm_Seq (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x730BF4: VecNormalize (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x7BC5A8: KSPSolve_GMRES (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xB8A06E: KSPSolve (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x7B659F: kspsolve_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x5EAE84: petsolv_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x4ECD46: flowsol_ng_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x507E4E: iterprc_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x51D1B4: solnalg_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== ==7858== Conditional jump or move depends on >>> uninitialised value(s) ==7858== at 0xE71E25: SearchPath (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xE71640: mkl_cfg_file (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xE6E068: DDOT (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x73281A: VecNorm_Seq (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x730BF4: VecNormalize (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x7BC5A8: KSPSolve_GMRES (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0xB8A06E: KSPSolve (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x7B659F: kspsolve_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x5EAE84: petsolv_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x4ECD46: flowsol_ng_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x507E4E: iterprc_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== by 0x51D1B4: solnalg_ (in >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) >>> ==7858== >>> >>> >>> On Fri, May 2, 2014 at 7:25 PM, Barry Smith >>> wrote: >>> >>> On May 2, 2014, at 5:29 PM, Song Gao >>> wrote: >>> >>>> Thanks for your quick reply. What confused me is that why >>>> would the >> code works fine if I reset the gmres restart number by recalling >> kspgmressetrestart just before kspsolve? >>> >>> It isn?t really working. Something is going wrong (run with >>> valgrind) >> and setting that restart number and starting the solver just puts >> it in a ?happier? state so it seems to make more progress. >>> >>> Barry >>> >>>> >>>> Sent from my iPhone >>>> >>>>> On May 2, 2014, at 6:03 PM, "Barry Smith" >>>>> wrote: >>>>> >>>>> >>>>> Your shell matrix is buggy in some way. 
Whenever the >>>>> residual norm >> jumps like crazy at a restart it means that something is wrong >> with the operator. >>>>> >>>>> Barry >>>>> >>>>>> On May 2, 2014, at 4:41 PM, Song Gao >>>>>> >> wrote: >>>>>> >>>>>> Dear PETSc users, >>>>>> >>>>>> I'm solving a linear system in KSP and trying to setup >>>>>> the solver in >> codes. But I feel strange because my codes don't converge unless >> I call KSPGMRESSetRestart twice. >>>>>> >>>>>> My codes looks like >>>>>> >>>>>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, >>>>>> pet_matp, >> DIFFERENT_NONZERO_PATTERN, ierpetsc ) >>>>>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) call >>>>>> KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) call >>>>>> KSPGetPC ( pet_solv, pet_precon, ierpetsc ) call >>>>>> PCSetType ( pet_precon, 'asm', ierpetsc ) call >>>>>> PCASMSetOverlap ( pet_precon, 1, ierpetsc ) call KSPSetUp >>>>>> ( pet_solv, ierpetsc ) call PCASMGetSubKSP ( pet_precon, >>>>>> n_local, first_local, >> pet_solv_sub, ierpetsc ) ! n_local is one >>>>>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc >>>>>> ) call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) >>>>>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) >>>>>> call KSPSetFromOptions ( pet_solv, ierpetsc ) call >>>>>> KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) >>>>>> ! >> adding this line, the codes converge >>>>>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc >>>>>> ) >>>>>> >>>>>> runing with 1 CPU WITHOUT the line with red color and >>>>>> the codes >> don't converge >>>>>> >>>>>> runtime options: -ksp_monitor_true_residual -ksp_view 0 >>>>>> KSP preconditioned resid norm 6.585278940829e+00 true >>>>>> resid norm >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 1 KSP preconditioned resid norm 6.585278219510e+00 true >>>>>> resid norm >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 2 KSP preconditioned resid norm 2.198638170622e+00 true >>>>>> resid norm >> 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 >>>>>> 3 KSP preconditioned resid norm 1.599896387215e+00 true >>>>>> resid norm >> 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 >>>>>> ....... 
28 KSP preconditioned resid norm >>>>>> 4.478466011191e-01 true resid norm >> 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 >>>>>> 29 KSP preconditioned resid norm 4.398129572260e-01 true >>>>>> resid norm >> 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 >>>>>> 30 KSP preconditioned resid norm 2.783227613716e+12 true >>>>>> resid norm >> 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 >>>>>> >>>>>> KSP Object: 1 MPI processes type: gmres GMRES: >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >>>>>> GMRES: happy breakdown tolerance 1e-30 maximum >>>>>> iterations=10000, initial guess is zero tolerances: >>>>>> relative=1e-05, absolute=1e-50, divergence=10000 left >>>>>> preconditioning using PRECONDITIONED norm type for >>>>>> convergence test PC Object: 1 MPI processes type: asm >>>>>> Additive Schwarz: total subdomain blocks = 1, amount of >>>>>> overlap = 1 Additive Schwarz: restriction/interpolation >>>>>> type - RESTRICT [0] number of local blocks = 1 Local >>>>>> solve info for each block is in the following KSP and PC >> objects: >>>>>> - - - - - - - - - - - - - - - - - - [0] local block >>>>>> number 0, size = 22905 KSP Object: (sub_) 1 MPI >>>>>> processes type: preonly maximum iterations=10000, initial >>>>>> guess is zero tolerances: relative=1e-05, >>>>>> absolute=1e-50, divergence=10000 left preconditioning >>>>>> using NONE norm type for convergence test PC Object: >>>>>> (sub_) 1 MPI processes type: jacobi linear system >>>>>> matrix = precond matrix: Matrix Object: 1 MPI >>>>>> processes type: seqbaij rows=22905, cols=22905, bs=5 >>>>>> total: nonzeros=785525, allocated nonzeros=785525 total >>>>>> number of mallocs used during MatSetValues calls =0 block >>>>>> size is 5 - - - - - - - - - - - - - - - - - - linear >>>>>> system matrix followed by preconditioner matrix: Matrix >>>>>> Object: 1 MPI processes type: shell rows=22905, >>>>>> cols=22905 Matrix Object: 1 MPI processes type: >>>>>> seqbaij rows=22905, cols=22905, bs=5 total: >>>>>> nonzeros=785525, allocated nonzeros=785525 total number >>>>>> of mallocs used during MatSetValues calls =0 block size >>>>>> is 5 WARNING: zero iteration in iterative solver >>>>>> >>>>>> runing with 1 CPU WITH the line with red color and the >>>>>> codes >> converge >>>>>> >>>>>> runtime options: -ksp_monitor_true_residual -ksp_view 0 >>>>>> KSP preconditioned resid norm 6.585278940829e+00 true >>>>>> resid norm >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 1 KSP preconditioned resid norm 2.566248171026e+00 true >>>>>> resid norm >> 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 >>>>>> 2 KSP preconditioned resid norm 1.410418402651e+00 true >>>>>> resid norm >> 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 >>>>>> 3 KSP preconditioned resid norm 9.665409287757e-01 true >>>>>> resid norm >> 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 >>>>>> 4 KSP preconditioned resid norm 4.469486152454e-01 true >>>>>> resid norm >> 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 >>>>>> 5 KSP preconditioned resid norm 2.474889829653e-01 true >>>>>> resid norm >> 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 >>>>>> ............ 
24 KSP preconditioned resid norm >>>>>> 9.518780877620e-05 true resid norm >> 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 >>>>>> 25 KSP preconditioned resid norm 6.837876679998e-05 true >>>>>> resid norm >> 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 >>>>>> 26 KSP preconditioned resid norm 4.864361942316e-05 true >>>>>> resid norm >> 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 >>>>>> KSP Object: 1 MPI processes type: gmres GMRES: >>>>>> restart=29, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >>>>>> GMRES: happy breakdown tolerance 1e-30 maximum >>>>>> iterations=10000, initial guess is zero tolerances: >>>>>> relative=1e-05, absolute=1e-50, divergence=10000 left >>>>>> preconditioning using PRECONDITIONED norm type for >>>>>> convergence test PC Object: 1 MPI processes type: asm >>>>>> Additive Schwarz: total subdomain blocks = 1, amount of >>>>>> overlap = 1 Additive Schwarz: restriction/interpolation >>>>>> type - RESTRICT [0] number of local blocks = 1 Local >>>>>> solve info for each block is in the following KSP and PC >> objects: >>>>>> - - - - - - - - - - - - - - - - - - [0] local block >>>>>> number 0, size = 22905 KSP Object: (sub_) 1 MPI >>>>>> processes type: preonly maximum iterations=10000, initial >>>>>> guess is zero tolerances: relative=1e-05, >>>>>> absolute=1e-50, divergence=10000 left preconditioning >>>>>> using NONE norm type for convergence test PC Object: >>>>>> (sub_) 1 MPI processes type: jacobi linear system >>>>>> matrix = precond matrix: Matrix Object: 1 MPI >>>>>> processes type: seqbaij rows=22905, cols=22905, bs=5 >>>>>> total: nonzeros=785525, allocated nonzeros=785525 total >>>>>> number of mallocs used during MatSetValues calls =0 block >>>>>> size is 5 - - - - - - - - - - - - - - - - - - linear >>>>>> system matrix followed by preconditioner matrix: Matrix >>>>>> Object: 1 MPI processes type: shell rows=22905, >>>>>> cols=22905 Matrix Object: 1 MPI processes type: >>>>>> seqbaij rows=22905, cols=22905, bs=5 total: >>>>>> nonzeros=785525, allocated nonzeros=785525 total number >>>>>> of mallocs used during MatSetValues calls =0 block size >>>>>> is 5 WARNING: zero iteration in iterative solver >>>>>> >>>>>> >>>>>> What would be my error here? Thank you. >>>>> >>> >>> >> >> > -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJTZ8t/AAoJED+FDAHgGz19xocH/i2A2Ccw3BTypkyicy6dqAQE wgVqukXnBI//adXHSe60uQBtL4OmjMiGOSt/Egye6N2QF/29yMzNdwTmHw6DZSRC C8yyPpVMEOPwB2WED0ui+IGSYq6JglOVplT5lCf2T99Y/gZNiqugCNz0ydnA5KnP 9W0O1yO2/2xgE4bMEibVhFIPsaXKGyTLv1ZjZLgdnbnTYFbCZqJk+9lVOOpQlqBZ mrzE+9GjO+0+BucEwI4Ekw4b9PI/Yctl0JW7zx+ZmviRsXRF4L3aO2SeFm1fBSnh XPIreXBNB6vyAmPFBx9TJZHQFucJIsFLHrlrea6onePKBx4Eg3JcpOlX8GdJr5w= =MkKg -----END PGP SIGNATURE----- From song.gao2 at mail.mcgill.ca Mon May 5 14:06:02 2014 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Mon, 5 May 2014 15:06:02 -0400 Subject: [petsc-users] Question with setting up KSP solver parameters. In-Reply-To: <5367CB80.1040203@ntnu.no> References: <4C502F52-D0DD-420C-B76B-828A95935FDE@mail.mcgill.ca> <3097A8AB-A609-44AA-8D08-51F9EE4D4899@mcs.anl.gov> <07ACAD01-CCAC-49BA-9A7D-C83478EAA6BF@mcs.anl.gov> <5367CB80.1040203@ntnu.no> Message-ID: Thank you. Barry, Please see the attached log files. Asmund, my apologize, I forget to make clean and recompile the code. But I still don't see the full backtrace. 
I checked the compilation log and all source files are compiled with -g flag. ==9475== Conditional jump or move depends on uninitialised value(s) ==9475== at 0x216F00F: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x216E854: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x216B27C: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x1291768: VecNorm_Seq (bvec2.c:239) ==9475== by 0x126FA99: VecNorm (rvector.c:166) ==9475== by 0x1270A05: VecNormalize (rvector.c:261) ==9475== by 0x1405B6F: KSPGMRESCycle (gmres.c:127) ==9475== by 0x1406FC7: KSPSolve_GMRES (gmres.c:231) ==9475== by 0x1BEF5CE: KSPSolve (itfunc.c:446) ==9475== by 0x13F9C50: kspsolve_ (itfuncf.c:219) ==9475== by 0xC5EB87: petsolv_ (PETSOLV.F:375) ==9475== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==9475== Uninitialised value was created by a stack allocation ==9475== at 0x216EFE7: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== ==9475== Conditional jump or move depends on uninitialised value(s) ==9475== at 0x216F039: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x216E854: mkl_cfg_file (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x216B27C: DDOT (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== by 0x1291768: VecNorm_Seq (bvec2.c:239) ==9475== by 0x126FA99: VecNorm (rvector.c:166) ==9475== by 0x1270A05: VecNormalize (rvector.c:261) ==9475== by 0x1405B6F: KSPGMRESCycle (gmres.c:127) ==9475== by 0x1406FC7: KSPSolve_GMRES (gmres.c:231) ==9475== by 0x1BEF5CE: KSPSolve (itfunc.c:446) ==9475== by 0x13F9C50: kspsolve_ (itfuncf.c:219) ==9475== by 0xC5EB87: petsolv_ (PETSOLV.F:375) ==9475== by 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==9475== Uninitialised value was created by a stack allocation ==9475== at 0x216EFE7: SearchPath (in /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) ==9475== 0 KSP preconditioned resid norm 6.585278940829e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.585278219510e+00 true resid norm 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 2.198671238042e+00 true resid norm 1.365127786174e-01 ||r(i)||/||b|| 1.419158195200e+01 3 KSP preconditioned resid norm 1.599921867950e+00 true resid norm 1.445986203309e-01 ||r(i)||/||b|| 1.503216908596e+01 ................... On Mon, May 5, 2014 at 1:33 PM, ?smund Ervik wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > You should also compile your own code "fensapngnew" with debug flags, > specifically "-g" for gcc/gfortran or icc/ifort. This tells the > compiler to generate the information necessary for gdb or valgrind to > do their job. Then you would get more detailed information than just > > ''' > ==8222== Uninitialised value was created by a stack allocation > ==8222== at 0x216E97F: SearchPath (in > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > ''' > > E.g. 
when I have an error, I get a full backtrace with line numbers in > my source code, like: > ''' > ==5277== Uninitialised value was created by a heap allocation > ==5277== at 0x4C277AB: malloc (in > /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) > ==5277== by 0x6BE3FE: __navier_stokes_MOD_rhs_ns (navier_stokes.f90:59) > ==5277== by 0x6C712A: __rhs_MOD_dfdt_1phase (rhs.f90:109) > ==5277== by 0x4EF52C: __rk_MOD_forward_euler (rk.f90:2168) > ==5277== by 0x642764: __rk_wrapper_MOD_rk_step (rk_wrapper.f90:313) > ==5277== by 0x7FA8B8: MAIN__ (meph.F90:179) > ==5277== by 0x7FC5B9: main (meph.F90:2) > ==5277== > ''' > > > On 05. mai 2014 17:00, Song Gao wrote: > > Thank you. Runing with mpirun -np 1 valgrind --track-origins=yes > > ~/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG > > -ksp_monitor_true_residual -ksp_view > > > > gives the following information. > > > > ==8222== Conditional jump or move depends on uninitialised > > value(s) ==8222== at 0x216E9A7: SearchPath (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x216E1EC: mkl_cfg_file (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x216AC14: DDOT (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by > > 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: > > VecNormalize (rvector.c:261) ==8222== by 0x1405507: > > KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: > > KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve > > (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by > > 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised > > value was created by a stack allocation ==8222== at 0x216E97F: > > SearchPath (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== ==8222== Conditional jump or move depends on uninitialised > > value(s) ==8222== at 0x216E9D1: SearchPath (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x216E1EC: mkl_cfg_file (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x216AC14: DDOT (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== by 0x1291100: VecNorm_Seq (bvec2.c:239) ==8222== by > > 0x126F431: VecNorm (rvector.c:166) ==8222== by 0x127039D: > > VecNormalize (rvector.c:261) ==8222== by 0x1405507: > > KSPGMRESCycle (gmres.c:127) ==8222== by 0x140695F: > > KSPSolve_GMRES (gmres.c:231) ==8222== by 0x1BEEF66: KSPSolve > > (itfunc.c:446) ==8222== by 0x13F95E8: kspsolve_ (itfuncf.c:219) > > ==8222== by 0xC5E51F: petsolv_ (PETSOLV.F:375) ==8222== by > > 0x612C35: flowsol_ng_ (flowsol_ng.F:275) ==8222== Uninitialised > > value was created by a stack allocation ==8222== at 0x216E97F: > > SearchPath (in > > /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64_DEBUG) > > ==8222== > > > > > > > > On Mon, May 5, 2014 at 10:03 AM, Barry Smith > > wrote: > > > >> > >> If you run valgrind with the debug version of the libraries it > >> will provide more information about the line numbers where the > >> problem occurred, etc. recommend doing that. > >> > >> Either your initial solution or right hand side has garbage in it > >> or the wrong blas may be being linked in. But there is definitely > >> a problem > >> > >> Barry > >> > >> On May 5, 2014, at 8:28 AM, Song Gao > >> wrote: > >> > >>> Thanks for reply. 
What do you mean by a ?happier? state? I > >>> check the > >> converged solution (the one which call kspgmressetrestart twice), > >> the solution should be correct. > >>> > >>> I run with valgrind both codes (one call kspgmressetrestart > >>> once and > >> another call kspgmressetrestart twice) > >>> Both of them have the errors: what > >>> does this > >> mean? Thank you in advance. > >>> ==7858== Conditional jump or move depends on uninitialised > >>> value(s) ==7858== at 0xE71DFB: SearchPath (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xE71640: mkl_cfg_file (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xE6E068: DDOT (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x73281A: VecNorm_Seq (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x730BF4: VecNormalize (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x7BC5A8: KSPSolve_GMRES (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xB8A06E: KSPSolve (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x7B659F: kspsolve_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x5EAE84: petsolv_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x4ECD46: flowsol_ng_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x507E4E: iterprc_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x51D1B4: solnalg_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== ==7858== Conditional jump or move depends on > >>> uninitialised value(s) ==7858== at 0xE71E25: SearchPath (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xE71640: mkl_cfg_file (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xE6E068: DDOT (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x73281A: VecNorm_Seq (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x730BF4: VecNormalize (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x7BC5A8: KSPSolve_GMRES (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0xB8A06E: KSPSolve (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x7B659F: kspsolve_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x5EAE84: petsolv_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x4ECD46: flowsol_ng_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x507E4E: iterprc_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== by 0x51D1B4: solnalg_ (in > >> /home/cfd/sgao/mycodes/fensapngnew/bin/fensapMPI_LINUX64) > >>> ==7858== > >>> > >>> > >>> On Fri, May 2, 2014 at 7:25 PM, Barry Smith > >>> wrote: > >>> > >>> On May 2, 2014, at 5:29 PM, Song Gao > >>> wrote: > >>> > >>>> Thanks for your quick reply. What confused me is that why > >>>> would the > >> code works fine if I reset the gmres restart number by recalling > >> kspgmressetrestart just before kspsolve? > >>> > >>> It isn?t really working. 
Something is going wrong (run with > >>> valgrind) > >> and setting that restart number and starting the solver just puts > >> it in a ?happier? state so it seems to make more progress. > >>> > >>> Barry > >>> > >>>> > >>>> Sent from my iPhone > >>>> > >>>>> On May 2, 2014, at 6:03 PM, "Barry Smith" > >>>>> wrote: > >>>>> > >>>>> > >>>>> Your shell matrix is buggy in some way. Whenever the > >>>>> residual norm > >> jumps like crazy at a restart it means that something is wrong > >> with the operator. > >>>>> > >>>>> Barry > >>>>> > >>>>>> On May 2, 2014, at 4:41 PM, Song Gao > >>>>>> > >> wrote: > >>>>>> > >>>>>> Dear PETSc users, > >>>>>> > >>>>>> I'm solving a linear system in KSP and trying to setup > >>>>>> the solver in > >> codes. But I feel strange because my codes don't converge unless > >> I call KSPGMRESSetRestart twice. > >>>>>> > >>>>>> My codes looks like > >>>>>> > >>>>>> call KSPSetOperators ( pet_solv, pet_mat_mf_shell, > >>>>>> pet_matp, > >> DIFFERENT_NONZERO_PATTERN, ierpetsc ) > >>>>>> call KSPSetType ( pet_solv, 'gmres', ierpetsc ) call > >>>>>> KSPGMRESSetRestart ( pet_solv, 30, ierpetsc ) call > >>>>>> KSPGetPC ( pet_solv, pet_precon, ierpetsc ) call > >>>>>> PCSetType ( pet_precon, 'asm', ierpetsc ) call > >>>>>> PCASMSetOverlap ( pet_precon, 1, ierpetsc ) call KSPSetUp > >>>>>> ( pet_solv, ierpetsc ) call PCASMGetSubKSP ( pet_precon, > >>>>>> n_local, first_local, > >> pet_solv_sub, ierpetsc ) ! n_local is one > >>>>>> call KSPGetPC ( pet_solv_sub(1), pet_precon_sub, ierpetsc > >>>>>> ) call PCSetType ( pet_precon_sub, 'jacobi', ierpetsc ) > >>>>>> call PCJacobiSetUseRowMax ( pet_precon_sub, ierpetsc ) > >>>>>> call KSPSetFromOptions ( pet_solv, ierpetsc ) call > >>>>>> KSPGMRESSetRestart ( pet_solv, 29, ierpetsc ) > >>>>>> ! > >> adding this line, the codes converge > >>>>>> call KSPSolve ( pet_solv, pet_rhsp, pet_solup, ierpetsc > >>>>>> ) > >>>>>> > >>>>>> runing with 1 CPU WITHOUT the line with red color and > >>>>>> the codes > >> don't converge > >>>>>> > >>>>>> runtime options: -ksp_monitor_true_residual -ksp_view 0 > >>>>>> KSP preconditioned resid norm 6.585278940829e+00 true > >>>>>> resid norm > >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>> 1 KSP preconditioned resid norm 6.585278219510e+00 true > >>>>>> resid norm > >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>> 2 KSP preconditioned resid norm 2.198638170622e+00 true > >>>>>> resid norm > >> 1.365132713014e-01 ||r(i)||/||b|| 1.419163317039e+01 > >>>>>> 3 KSP preconditioned resid norm 1.599896387215e+00 true > >>>>>> resid norm > >> 1.445988845022e-01 ||r(i)||/||b|| 1.503219654865e+01 > >>>>>> ....... 
28 KSP preconditioned resid norm > >>>>>> 4.478466011191e-01 true resid norm > >> 1.529879309381e-01 ||r(i)||/||b|| 1.590430420920e+01 > >>>>>> 29 KSP preconditioned resid norm 4.398129572260e-01 true > >>>>>> resid norm > >> 1.530132924055e-01 ||r(i)||/||b|| 1.590694073413e+01 > >>>>>> 30 KSP preconditioned resid norm 2.783227613716e+12 true > >>>>>> resid norm > >> 1.530369123550e-01 ||r(i)||/||b|| 1.590939621450e+01 > >>>>>> > >>>>>> KSP Object: 1 MPI processes type: gmres GMRES: > >>>>>> restart=30, using Classical (unmodified) Gram-Schmidt > >> Orthogonalization with no iterative refinement > >>>>>> GMRES: happy breakdown tolerance 1e-30 maximum > >>>>>> iterations=10000, initial guess is zero tolerances: > >>>>>> relative=1e-05, absolute=1e-50, divergence=10000 left > >>>>>> preconditioning using PRECONDITIONED norm type for > >>>>>> convergence test PC Object: 1 MPI processes type: asm > >>>>>> Additive Schwarz: total subdomain blocks = 1, amount of > >>>>>> overlap = 1 Additive Schwarz: restriction/interpolation > >>>>>> type - RESTRICT [0] number of local blocks = 1 Local > >>>>>> solve info for each block is in the following KSP and PC > >> objects: > >>>>>> - - - - - - - - - - - - - - - - - - [0] local block > >>>>>> number 0, size = 22905 KSP Object: (sub_) 1 MPI > >>>>>> processes type: preonly maximum iterations=10000, initial > >>>>>> guess is zero tolerances: relative=1e-05, > >>>>>> absolute=1e-50, divergence=10000 left preconditioning > >>>>>> using NONE norm type for convergence test PC Object: > >>>>>> (sub_) 1 MPI processes type: jacobi linear system > >>>>>> matrix = precond matrix: Matrix Object: 1 MPI > >>>>>> processes type: seqbaij rows=22905, cols=22905, bs=5 > >>>>>> total: nonzeros=785525, allocated nonzeros=785525 total > >>>>>> number of mallocs used during MatSetValues calls =0 block > >>>>>> size is 5 - - - - - - - - - - - - - - - - - - linear > >>>>>> system matrix followed by preconditioner matrix: Matrix > >>>>>> Object: 1 MPI processes type: shell rows=22905, > >>>>>> cols=22905 Matrix Object: 1 MPI processes type: > >>>>>> seqbaij rows=22905, cols=22905, bs=5 total: > >>>>>> nonzeros=785525, allocated nonzeros=785525 total number > >>>>>> of mallocs used during MatSetValues calls =0 block size > >>>>>> is 5 WARNING: zero iteration in iterative solver > >>>>>> > >>>>>> runing with 1 CPU WITH the line with red color and the > >>>>>> codes > >> converge > >>>>>> > >>>>>> runtime options: -ksp_monitor_true_residual -ksp_view 0 > >>>>>> KSP preconditioned resid norm 6.585278940829e+00 true > >>>>>> resid norm > >> 9.619278462343e-03 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>> 1 KSP preconditioned resid norm 2.566248171026e+00 true > >>>>>> resid norm > >> 4.841043870812e-03 ||r(i)||/||b|| 5.032647604250e-01 > >>>>>> 2 KSP preconditioned resid norm 1.410418402651e+00 true > >>>>>> resid norm > >> 3.347509391208e-03 ||r(i)||/||b|| 3.480000505561e-01 > >>>>>> 3 KSP preconditioned resid norm 9.665409287757e-01 true > >>>>>> resid norm > >> 2.289877121679e-03 ||r(i)||/||b|| 2.380508195748e-01 > >>>>>> 4 KSP preconditioned resid norm 4.469486152454e-01 true > >>>>>> resid norm > >> 1.283813398084e-03 ||r(i)||/||b|| 1.334625463968e-01 > >>>>>> 5 KSP preconditioned resid norm 2.474889829653e-01 true > >>>>>> resid norm > >> 7.956009139680e-04 ||r(i)||/||b|| 8.270900120862e-02 > >>>>>> ............ 
24 KSP preconditioned resid norm > >>>>>> 9.518780877620e-05 true resid norm > >> 6.273993696172e-07 ||r(i)||/||b|| 6.522312167937e-05 > >>>>>> 25 KSP preconditioned resid norm 6.837876679998e-05 true > >>>>>> resid norm > >> 4.612861071815e-07 ||r(i)||/||b|| 4.795433555514e-05 > >>>>>> 26 KSP preconditioned resid norm 4.864361942316e-05 true > >>>>>> resid norm > >> 3.394754589076e-07 ||r(i)||/||b|| 3.529115621682e-05 > >>>>>> KSP Object: 1 MPI processes type: gmres GMRES: > >>>>>> restart=29, using Classical (unmodified) Gram-Schmidt > >> Orthogonalization with no iterative refinement > >>>>>> GMRES: happy breakdown tolerance 1e-30 maximum > >>>>>> iterations=10000, initial guess is zero tolerances: > >>>>>> relative=1e-05, absolute=1e-50, divergence=10000 left > >>>>>> preconditioning using PRECONDITIONED norm type for > >>>>>> convergence test PC Object: 1 MPI processes type: asm > >>>>>> Additive Schwarz: total subdomain blocks = 1, amount of > >>>>>> overlap = 1 Additive Schwarz: restriction/interpolation > >>>>>> type - RESTRICT [0] number of local blocks = 1 Local > >>>>>> solve info for each block is in the following KSP and PC > >> objects: > >>>>>> - - - - - - - - - - - - - - - - - - [0] local block > >>>>>> number 0, size = 22905 KSP Object: (sub_) 1 MPI > >>>>>> processes type: preonly maximum iterations=10000, initial > >>>>>> guess is zero tolerances: relative=1e-05, > >>>>>> absolute=1e-50, divergence=10000 left preconditioning > >>>>>> using NONE norm type for convergence test PC Object: > >>>>>> (sub_) 1 MPI processes type: jacobi linear system > >>>>>> matrix = precond matrix: Matrix Object: 1 MPI > >>>>>> processes type: seqbaij rows=22905, cols=22905, bs=5 > >>>>>> total: nonzeros=785525, allocated nonzeros=785525 total > >>>>>> number of mallocs used during MatSetValues calls =0 block > >>>>>> size is 5 - - - - - - - - - - - - - - - - - - linear > >>>>>> system matrix followed by preconditioner matrix: Matrix > >>>>>> Object: 1 MPI processes type: shell rows=22905, > >>>>>> cols=22905 Matrix Object: 1 MPI processes type: > >>>>>> seqbaij rows=22905, cols=22905, bs=5 total: > >>>>>> nonzeros=785525, allocated nonzeros=785525 total number > >>>>>> of mallocs used during MatSetValues calls =0 block size > >>>>>> is 5 WARNING: zero iteration in iterative solver > >>>>>> > >>>>>> > >>>>>> What would be my error here? Thank you. > >>>>> > >>> > >>> > >> > >> > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.22 (GNU/Linux) > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ > > iQEcBAEBAgAGBQJTZ8t/AAoJED+FDAHgGz19xocH/i2A2Ccw3BTypkyicy6dqAQE > wgVqukXnBI//adXHSe60uQBtL4OmjMiGOSt/Egye6N2QF/29yMzNdwTmHw6DZSRC > C8yyPpVMEOPwB2WED0ui+IGSYq6JglOVplT5lCf2T99Y/gZNiqugCNz0ydnA5KnP > 9W0O1yO2/2xgE4bMEibVhFIPsaXKGyTLv1ZjZLgdnbnTYFbCZqJk+9lVOOpQlqBZ > mrzE+9GjO+0+BucEwI4Ekw4b9PI/Yctl0JW7zx+ZmviRsXRF4L3aO2SeFm1fBSnh > XPIreXBNB6vyAmPFBx9TJZHQFucJIsFLHrlrea6onePKBx4Eg3JcpOlX8GdJr5w= > =MkKg > -----END PGP SIGNATURE----- > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 2520564 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: make.log Type: text/x-log Size: 46399 bytes Desc: not available URL: From paulhuaizhang at gmail.com Mon May 5 20:14:44 2014 From: paulhuaizhang at gmail.com (huaibao zhang) Date: Mon, 5 May 2014 21:14:44 -0400 Subject: [petsc-users] a naive question about assembly Message-ID: <35D8669B-AF79-48EA-AF1D-CC8A21EEC4A0@gmail.com> Hello, I looked up the manual, but still felt quite confused about why have to do assembly. Does it have to do with parallelization? Since all of the processors are loading the data at the same time, they need to a pause before one can use the whole vector? See a piece of code: for (int c=0;c From knepley at gmail.com Mon May 5 20:18:27 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 May 2014 20:18:27 -0500 Subject: [petsc-users] a naive question about assembly In-Reply-To: <35D8669B-AF79-48EA-AF1D-CC8A21EEC4A0@gmail.com> References: <35D8669B-AF79-48EA-AF1D-CC8A21EEC4A0@gmail.com> Message-ID: On Mon, May 5, 2014 at 8:14 PM, huaibao zhang wrote: > > Hello, > > I looked up the manual, but still felt quite confused about why have to > do assembly. Does it have to do with parallelization? Since all of the > processors are loading the data at the same time, they need to a pause > before one can use the whole vector? > If one process sets a value owned by another process, it has to tell it. Matt > See a piece of code: > > for (int c=0;c row=grid[gid].myOffset+c; > value=p; > > VecSetValues(soln_n,1,&row,&value,INSERT_VALUES); > } > VecAssemblyBegin(soln_n); VecAssemblyEnd(soln_n); > > > Thanks, > Paul > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 5 20:24:50 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 May 2014 20:24:50 -0500 Subject: [petsc-users] a naive question about assembly In-Reply-To: References: <35D8669B-AF79-48EA-AF1D-CC8A21EEC4A0@gmail.com> Message-ID: On Mon, May 5, 2014 at 8:22 PM, huaibao zhang wrote: > > Matt, > > THanks for the answer. > I think my question is why have to do assembly? > In the piece of my code, 2 processors are inserting the dada to a public > vector soon_n. > This is explained very well in the book Using MPI: http://www.mcs.anl.gov/research/projects/mpi/usingmpi/ If process 0 inserts a value for process 1, then somehow process 1 must be told. That happens in VecAssembly(). Matt > Paul > > > > On May 5, 2014, at 9:18 PM, Matthew Knepley wrote: > > On Mon, May 5, 2014 at 8:14 PM, huaibao zhang wrote: > >> >> Hello, >> >> I looked up the manual, but still felt quite confused about why have to >> do assembly. Does it have to do with parallelization? Since all of the >> processors are loading the data at the same time, they need to a pause >> before one can use the whole vector? >> > > If one process sets a value owned by another process, it has to tell it. > > Matt > > >> See a piece of code: >> >> for (int c=0;c> row=grid[gid].myOffset+c; >> value=p; >> >> VecSetValues(soln_n,1,&row,&value,INSERT_VALUES); >> } >> VecAssemblyBegin(soln_n); VecAssemblyEnd(soln_n); >> >> >> Thanks, >> Paul >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From puesoek at uni-mainz.de Tue May 6 07:23:40 2014 From: puesoek at uni-mainz.de (=?iso-8859-1?Q?P=FCs=F6k=2C_Adina-Erika?=) Date: Tue, 6 May 2014 12:23:40 +0000 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() Message-ID: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> Hello! I was trying to implement some internal Dirichlet boundary conditions into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to create an internal block (let's say Dirichlet block) that moves with constant velocity within the domain (i.e. check all the dofs within the block and set the values accordingly to the desired motion). Ideally, this means to zero the rows and columns in VV, VP, PV corresponding to the dirichlet dofs and modify the corresponding rhs values. However, since we have submatrices and not a monolithic matrix A, we can choose to modify only VV and PV matrices. The global indices of the velocity points within the Dirichlet block are contained in the arrays rowid_array. What I want to point out is that the function MatZeroRowsColumnsIS() seems to create parallel artefacts, compared to MatZeroRowsIS() when run on more than 1 processor. Moreover, the results on 1 cpu are identical. See below the results of the test (the Dirichlet block is outlined in white) and the piece of the code involved where the 1) - 2) parts are the only difference. Thanks, Adina Pusok // Create an IS required by MatZeroRows() ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array,PETSC_COPY_VALUES,&isx); CHKERRQ(ierr); ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array,PETSC_COPY_VALUES,&isy); CHKERRQ(ierr); ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array,PETSC_COPY_VALUES,&isz); CHKERRQ(ierr); 1) /* ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr);*/ 2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); ierr = ISDestroy(&isx); CHKERRQ(ierr); ierr = ISDestroy(&isy); CHKERRQ(ierr); ierr = ISDestroy(&isz); CHKERRQ(ierr); Results (velocity) with MatZeroRowsColumnsIS(). 1cpu[cid:779A0024-2BBB-4F8D-AB25-114C5B3D111C at Geo.Uni-Mainz.DE] 4cpu[cid:9FAA7278-A3FE-4A5D-B7A1-05AAFCA43181 at Geo.Uni-Mainz.DE] Results (velocity) with MatZeroRowsIS(): 1cpu[cid:C0C73566-0D52-484C-A858-01A184C23597 at Geo.Uni-Mainz.DE] 4cpu[cid:7A9FD8A2-C2FC-41B3-88BF-11F77628E874 at Geo.Uni-Mainz.DE] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_1cpu_rows_columns.png Type: image/png Size: 28089 bytes Desc: r01_1cpu_rows_columns.png URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: r01_rows_columns.png Type: image/png Size: 28325 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_1cpu_rows.png Type: image/png Size: 28089 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_rows.png Type: image/png Size: 28045 bytes Desc: not available URL: From knepley at gmail.com Tue May 6 09:22:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 May 2014 09:22:52 -0500 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> Message-ID: On Tue, May 6, 2014 at 7:23 AM, Püsök, Adina-Erika wrote: > Hello! > > I was trying to implement some internal Dirichlet boundary conditions > into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to > create an internal block (let's say Dirichlet block) that moves with > constant velocity within the domain (i.e. check all the dofs within the > block and set the values accordingly to the desired motion). > > Ideally, this means to zero the rows and columns in VV, VP, PV > corresponding to the dirichlet dofs and modify the corresponding rhs > values. However, since we have submatrices and not a monolithic matrix A, > we can choose to modify only VV and PV matrices. > The global indices of the velocity points within the Dirichlet block are > contained in the arrays rowid_array. > > What I want to point out is that the function MatZeroRowsColumnsIS() > seems to create parallel artefacts, compared to MatZeroRowsIS() when run on > more than 1 processor. Moreover, the results on 1 cpu are identical. > See below the results of the test (the Dirichlet block is outlined in > white) and the piece of the code involved where the 1) - 2) parts are the > only difference. > I am assuming that you are showing the result of solving the equations. It would be more useful, and presumably just as easy, to say: a) Are the correct rows zeroed out? b) Is the diagonal element correct? c) Is the rhs value correct? d) Are the columns zeroed correctly? If we know where the problem is, it's easier to fix. For example, if the rhs values are correct and the rows are zeroed, then something is wrong with the solution procedure. Since ZeroRows() works and ZeroRowsColumns() does not, this is a distinct possibility.
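For example, a rough check of (a), (b), and (c) could look like the sketch below. It is untested and only meant as an illustration: it reuses VV_MAT, rhs, v_vv, x_push, and rowidx_array from your snippet, and it assumes the row r is owned by the calling process:

  PetscInt          r = rowidx_array[0], j, ncols;
  const PetscInt    *cols;
  const PetscScalar *vals;
  PetscScalar       rhsval;

  ierr = MatGetRow(VV_MAT, r, &ncols, &cols, &vals); CHKERRQ(ierr);
  for (j = 0; j < ncols; j++) {
    if (cols[j] == r) {                          /* (b) the diagonal should be the v_vv you passed in */
      if (PetscAbsScalar(vals[j] - v_vv) > 0.0) PetscPrintf(PETSC_COMM_SELF, "bad diagonal in row %D\n", r);
    } else if (PetscAbsScalar(vals[j]) > 0.0) {  /* (a) every other entry in the row should be zero */
      PetscPrintf(PETSC_COMM_SELF, "row %D not zeroed at column %D\n", r, cols[j]);
    }
  }
  ierr = MatRestoreRow(VV_MAT, r, &ncols, &cols, &vals); CHKERRQ(ierr);
  ierr = VecGetValues(rhs, 1, &r, &rhsval); CHKERRQ(ierr);   /* (c) rhs here should be v_vv times the corresponding x_push entry */

For (d), the simplest thing on a small test case is to dump the matrix with MatView() before and after the call and look at the column directly.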
Thanks, Matt > Thanks, > Adina Pusok > > // Create an IS required by MatZeroRows() > ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array, > PETSC_COPY_VALUES,&isx); CHKERRQ(ierr); > ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array, > PETSC_COPY_VALUES,&isy); CHKERRQ(ierr); > ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array, > PETSC_COPY_VALUES,&isz); CHKERRQ(ierr); > > 1) /* ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ( > ierr); > ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr > );*/ > > 2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr); > > ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > > ierr = ISDestroy(&isx); CHKERRQ(ierr); > ierr = ISDestroy(&isy); CHKERRQ(ierr); > ierr = ISDestroy(&isz); CHKERRQ(ierr); > > > Results (velocity) with MatZeroRowsColumnsIS(). > 1cpu 4cpu > > Results (velocity) with MatZeroRowsIS(): > 1cpu 4cpu > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_1cpu_rows_columns.png Type: image/png Size: 28089 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_1cpu_rows.png Type: image/png Size: 28089 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_rows_columns.png Type: image/png Size: 28325 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: r01_rows.png Type: image/png Size: 28045 bytes Desc: not available URL: From francium87 at hotmail.com Tue May 6 22:53:46 2014 From: francium87 at hotmail.com (linjing bo) Date: Wed, 7 May 2014 03:53:46 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , , Message-ID: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... 
==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 OK, I will try it. Thanks for your advice. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI; the message shown is with JACOBI. The weird thing is that the backtrace shows the location is before the PC is actually applied, so I guess the rhs vec is not changed at this point. Another weird thing: because the original code is too complex, I wrote out the A matrix in Ax=b and wrote a small test code that reads in this matrix and solves it, and no error showed. The KSP and PC are all set to be the same. When I try using ILU, an even weirder error appears; the backtrace shows it died in a flops-logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages.
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Zafer.Leylek at student.adfa.edu.au Tue May 6 23:39:38 2014 From: Zafer.Leylek at student.adfa.edu.au (Zafer Leylek) Date: Wed, 7 May 2014 04:39:38 +0000 Subject: [petsc-users] MatGetMumpsRINFOG() Message-ID: <996B4E35EA834745A31426395DB0BBA01DFD960C@ADFAPWEXMBX02.ad.adfa.edu.au> Hi, I am trying to get mumps to return the matrix determinant. I have set the ICNTL option using: MatMumpsSetIcntl(A,33,1); and can view the determinant using PCView(pc, PETSC_VIEWER_STDOUT_WORLD); I need to use the determinant in my code. Is there a way I can get petsc to return this parameter. If not, is it possible to implement the MatGetMumpsRINFOG() as suggested in: http://lists.mcs.anl.gov/pipermail/petsc-users/2011-September/010225.html King Regards ZL -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Wed May 7 04:01:53 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Wed, 7 May 2014 19:01:53 +1000 Subject: [petsc-users] Inserting -nan+iG at matrix entry problem Message-ID: Hi,all I got a "Inserting -nan+iG error" at function MatSetValues. My code like this: I first use code below to change a double into PETScScalar (I am using Complex version). *for(i=0;i From rupp at iue.tuwien.ac.at Wed May 7 05:08:00 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Wed, 7 May 2014 12:08:00 +0200 Subject: [petsc-users] Inserting -nan+iG at matrix entry problem In-Reply-To: References: Message-ID: <536A0600.6020005@iue.tuwien.ac.at> Hi, from your description it sounds like this might be a memory corruption issue. Please run your code through valgrind first. If this doesn't show any errors, please send more context (sources, Makefile, etc.). We can only guess what e.g. the type of 'temp' is or whether the correct header files get picked up. 
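For concreteness, a minimal sketch of the kind of conversion-and-insert helper being described. The original loop is truncated in the post, so the helper name fill_block, the PetscMalloc staging buffer, and the n*n sizing of temp are assumptions rather than the poster's actual code; the point is only that, in a complex-scalar build, the value array handed to MatSetValues() must hold n*n PetscScalar entries laid out row by row, and an undersized or wrongly typed buffer is exactly the kind of corruption valgrind would flag.

--------------------------- sketch (assumed names) ---------------------------
#include <petscmat.h>

/* Stage n*n real values into a PetscScalar buffer before MatSetValues(). */
static PetscErrorCode fill_block(Mat A,PetscInt n,const PetscInt rows[],
                                 const PetscInt cols[],const double val[])
{
  PetscErrorCode ierr;
  PetscScalar    *temp;
  PetscInt       i;

  ierr = PetscMalloc(n*n*sizeof(PetscScalar),&temp);CHKERRQ(ierr);
  for (i=0; i<n*n; i++) temp[i] = val[i];   /* real -> complex, imaginary part 0 */
  /* MatSetValues reads n*n entries from temp, one per (row,col) pair */
  ierr = MatSetValues(A,n,rows,n,cols,temp,INSERT_VALUES);CHKERRQ(ierr);
  ierr = PetscFree(temp);CHKERRQ(ierr);
  return 0;
}
-------------------------------------------------------------------------------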
Best regards, Karli On 05/07/2014 11:01 AM, ??? wrote: > Hi,all > > I got a "Inserting -nan+iG error" at function MatSetValues. > > My code like this: > > I first use code below to change a double into PETScScalar (I am using > Complex version). > *for(i=0;i Then I use code below to insert values into matrix. > *ierr = MatSetValues(A,n,Conlumn_ptr,n,Ai,temp,INSERT_VALUES);* > > Here is how problem happens: > > I compile my PETSc code into a .so lib and test it with a simple matrix > and*it passed*. So I link it with the other part of my program. > > However, it keeps telling me > * > Inserting -nan+iG at matrix entry (2,3)!* > > The (2,3) is zero actually, and I could print it with std::cerr which > tells me it is zero. The other part of my program,with which generates > actual matrix I will deal, is correct.(I could use ARPACK with it.) > > I was confused about why PETSc recognize a zero into -nan. In my simple > test, there is also zero entry, at (0,0) however. For the other part is > compiled itself, I guess there might be some problem with compiling > options. But I have no idea about it. Could anybody help me? > > > Guoxi From knepley at gmail.com Wed May 7 05:52:13 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 May 2014 05:52:13 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: > The Valgrind shows memory leak in memalign() called by KSPSetup and > PCSetup. Is that normal? > Did you call KSPDestroy()? Matt > > > --------------------------------------------------------------------------------------------- > ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) > ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > ... 
> > ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) > ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 > (aijfact.c:1655) > ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) > ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) > ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) > ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) > ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > > ------------------------------------------------------------------------------------------------- > > > ------------------------------ > From: francium87 at hotmail.com > To: knepley at gmail.com > CC: petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] VecValidValues() reports NaN found > Date: Mon, 5 May 2014 13:15:38 +0000 > > Ok, I will try it . Thanks for your advise. > > ------------------------------ > Date: Mon, 5 May 2014 08:12:05 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: > > I use JACOBI. The message showed is with JACOBI. > > > Wired situation is that the backtrack information shows the location is > before actually apply PC, so I guess the rhs vec is not changed at this > point. > > Another wired thing is : Because the original code is to complex. I write > out the A matrix in Ax=b, and write a small test code to read in this > matrix and solve it, no error showed. The KSP, PC are all set to be the > same. > > When I try to using ILU, more wired error happens, the backtrack info > shows it died in a Flops logging function: > > > 1) Run in serial until it works > > 2) It looks like you have memory overwriting problems. Run with valgrind > > Matt > > > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of > range! > [2]PETSC ERROR: Cannot log negative > flops! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [2]PETSC ERROR: See docs/changes/index.html for recent > updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [2]PETSC ERROR: See docs/index.html for manual > pages. 
> [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:51:27 > 2014 > > [2]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [2]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > > [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [2]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > > > ------------------------------ > Date: Mon, 5 May 2014 07:27:52 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! > [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > > > It looks like the vector after preconditioner application is bad. What is > the preconditioner? 
> > Matt > > > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Wed May 7 08:03:07 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Wed, 7 May 2014 23:03:07 +1000 Subject: [petsc-users] Is there any method to call multiple mpi run inside program Message-ID: Hi, all I use SLEPc as part of my program, which means I compile it as .so library. I want my program work like this: execute serially, reach Point A, parallel solving, then serially again. I know I could use mpirun -np 4 in terminal to call the whole program, but this will let the serial part be executed 4 times. What I want is only call mpi at the eigensolving part. Is there any function that could achieve something like that? Thanks a lot. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 7 08:16:52 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 07 May 2014 07:16:52 -0600 Subject: [petsc-users] Is there any method to call multiple mpi run inside program In-Reply-To: References: Message-ID: <87eh05pw3f.fsf@jedbrown.org> ??? writes: > Hi, all > > I use SLEPc as part of my program, which means I compile it as .so library. > > I want my program work like this: execute serially, reach Point A, parallel > solving, then serially again. > > I know I could use mpirun -np 4 in terminal to call the whole program, > but this will let the serial part be executed 4 times. This may not be bad, but you can MPI_Comm_rank(MPI_COMM_WORLD,&rank), if (rank == 0) { ... do the serial stuff ...}. 
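A minimal self-contained sketch of that pattern. The routine do_serial_work() is a placeholder for the poster's serial phases, and the eigensolve itself is omitted; only the rank test and barriers are the point here.

--------------------------- sketch (assumed names) ---------------------------
#include <stdio.h>
#include <petscsys.h>

/* Placeholder for the serial pre-/post-processing in the real application. */
static void do_serial_work(const char *phase)
{
  printf("serial phase '%s' running on rank 0 only\n",phase);
}

int main(int argc,char **argv)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);

  if (!rank) do_serial_work("setup");          /* serial section, rank 0 only */
  ierr = MPI_Barrier(PETSC_COMM_WORLD);CHKERRQ(ierr);

  /* ... parallel eigensolve on all ranks would go here ... */

  ierr = MPI_Barrier(PETSC_COMM_WORLD);CHKERRQ(ierr);
  if (!rank) do_serial_work("postprocessing"); /* serial again */

  ierr = PetscFinalize();
  return ierr;
}
-------------------------------------------------------------------------------

Launched with mpiexec -n 4, every rank runs the binary, but only rank 0 executes the guarded sections while the other ranks wait at the barriers.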
> What I want is only call mpi at the eigensolving part. Is there any
> function that could achieve something like that?

There is MPI_Comm_spawn, but it's sort of a mess for resource
management/portability problems, so I would recommend just using mpiexec.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL:

From hzhang at mcs.anl.gov  Wed May  7 09:53:24 2014
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Wed, 7 May 2014 09:53:24 -0500
Subject: [petsc-users] MatGetMumpsRINFOG()
In-Reply-To: <16ccafbcb0c941b09a063946350a6688@GEORGE.anl.gov>
References: <16ccafbcb0c941b09a063946350a6688@GEORGE.anl.gov>
Message-ID:

Zafer:
Sure, I can add MatGetMumpsxxx(). I'll let you know after I'm done (1-2 days).
Hong

> Hi,
>
> I am trying to get mumps to return the matrix determinant. I have set the
> ICNTL option using:
>
> MatMumpsSetIcntl(A,33,1);
>
> and can view the determinant using
>
> PCView(pc, PETSC_VIEWER_STDOUT_WORLD);
>
> I need to use the determinant in my code. Is there a way I can get petsc to
> return this parameter. If not, is it possible to implement the
> MatGetMumpsRINFOG() as suggested in:
>
> http://lists.mcs.anl.gov/pipermail/petsc-users/2011-September/010225.html
>
> Kind Regards
>
> ZL

From zonexo at gmail.com  Wed May  7 20:35:39 2014
From: zonexo at gmail.com (TAY wee-beng)
Date: Thu, 08 May 2014 09:35:39 +0800
Subject: [petsc-users] Override PETSc compile options
Message-ID: <536ADF6B.9020602@gmail.com>

Hi,

I want to override PETSc compile options. During compile, PETSc
automatically uses -Wall etc

How can I change that?

--
Thank you

Yours sincerely,

TAY wee-beng

From knepley at gmail.com  Wed May  7 20:46:33 2014
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 7 May 2014 20:46:33 -0500
Subject: [petsc-users] Override PETSc compile options
In-Reply-To: <536ADF6B.9020602@gmail.com>
References: <536ADF6B.9020602@gmail.com>
Message-ID:

On Wed, May 7, 2014 at 8:35 PM, TAY wee-beng wrote:

> Hi,
>
> I want to override PETSc compile options.
During compile, PETSc > automatically uses -Wall etc > > How can I change that? > > > All the options are given in -help for configure. That compiler > options can be overridden using --COPTFLAGS > > Matt > > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 7 21:09:33 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 May 2014 21:09:33 -0500 Subject: [petsc-users] Override PETSc compile options In-Reply-To: <536AE396.7020102@gmail.com> References: <536ADF6B.9020602@gmail.com> <536AE396.7020102@gmail.com> Message-ID: On Wed, May 7, 2014 at 8:53 PM, TAY wee-beng wrote: > Hi Matt, > > Sorry, I mean during the compilation of my own codes using my own makefile. > I know, this is how you change the default PETSc compile flags. Matt > Thank you > > Yours sincerely, > > TAY wee-beng > > On 8/5/2014 9:46 AM, Matthew Knepley wrote: > > On Wed, May 7, 2014 at 8:35 PM, TAY wee-beng wrote: > >> Hi, >> >> I want to override PETSc compile options. During compile, PETSc >> automatically uses -Wall etc >> >> How can I change that? > > > All the options are given in -help for configure. That compiler options > can be overridden using --COPTFLAGS > > Matt > > >> >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed May 7 21:13:54 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 7 May 2014 21:13:54 -0500 Subject: [petsc-users] Override PETSc compile options In-Reply-To: References: <536ADF6B.9020602@gmail.com> Message-ID: On Wed, 7 May 2014, Matthew Knepley wrote: > On Wed, May 7, 2014 at 8:35 PM, TAY wee-beng wrote: > > > Hi, > > > > I want to override PETSc compile options. During compile, PETSc > > automatically uses -Wall etc > > > > How can I change that? > > > All the options are given in -help for configure. That compiler options can > be overridden using --COPTFLAGS Actually CFLAGS should be used to ovewride -Wall type options Satish From balay at mcs.anl.gov Wed May 7 21:15:39 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 7 May 2014 21:15:39 -0500 Subject: [petsc-users] Override PETSc compile options In-Reply-To: <536AE396.7020102@gmail.com> References: <536ADF6B.9020602@gmail.com> <536AE396.7020102@gmail.com> Message-ID: On Wed, 7 May 2014, TAY wee-beng wrote: > Hi Matt, > > > Sorry, I mean during the compilation of my own codes using my own makefile. 
> $ grep Wall arch-linux2-c-debug/conf/petscvariables FC_LINKER_FLAGS = -fPIC -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g -O0 CC_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 FC_FLAGS = -fPIC -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g -O0 CXX_FLAGS = -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC PCC_LINKER_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 PCC_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 balay at asterix /home/balay/petsc (master) You can redefine the corresponding variables in your makefile as needed. [after the include directive] Satish From zonexo at gmail.com Wed May 7 21:16:59 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 08 May 2014 10:16:59 +0800 Subject: [petsc-users] Override PETSc compile options In-Reply-To: References: <536ADF6B.9020602@gmail.com> <536AE396.7020102@gmail.com> Message-ID: <536AE91B.3030201@gmail.com> On 8/5/2014 10:15 AM, Satish Balay wrote: > On Wed, 7 May 2014, TAY wee-beng wrote: > >> Hi Matt, >> >> >> Sorry, I mean during the compilation of my own codes using my own makefile. >> > $ grep Wall arch-linux2-c-debug/conf/petscvariables > FC_LINKER_FLAGS = -fPIC -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g -O0 > CC_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > FC_FLAGS = -fPIC -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g -O0 > CXX_FLAGS = -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC > PCC_LINKER_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > PCC_FLAGS = -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > balay at asterix /home/balay/petsc (master) > > > You can redefine the corresponding variables in your makefile as needed. [after the include directive] Ya, that was what I was looking for! Thanks! > > Satish From likunt at caltech.edu Wed May 7 22:11:33 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Wed, 7 May 2014 20:11:33 -0700 (PDT) Subject: [petsc-users] question on ksp Message-ID: <59019.131.215.220.164.1399518693.squirrel@webmail.caltech.edu> Dear Petsc developers, I am solving a linear system Ax=b. The rhs vector b and the matrix A are defined as follows, DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,M,3,1,NULL,&da); DMCreateGlobalVector(da, &b); MatCreate(PETSC_COMM_WORLD, &A); MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, M*3, M*3); MatMPIAIJSetPreallocation(A, 7, NULL, 7, NULL); MatSetUp(A); There is a Memory corruption problem when calling KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); KSPSolve(ksp, x, b); since the partition of A and b are not consistent. Should I use KSPSetDM and KSPSetComputeOperators for sovling this problem? Thanks, From bsmith at mcs.anl.gov Wed May 7 22:27:05 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 7 May 2014 22:27:05 -0500 Subject: [petsc-users] question on ksp In-Reply-To: <59019.131.215.220.164.1399518693.squirrel@webmail.caltech.edu> References: <59019.131.215.220.164.1399518693.squirrel@webmail.caltech.edu> Message-ID: Use DMCreateMatrix() and it will return the correctly sized matrix, with the correct parallel layout and the the correct nonzero preallocation for the given DM. After these changes let us know if you have any problems. 
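A sketch of that change against the 3.4-series API used elsewhere in this thread; the helper name build_and_solve and the assembly placeholder are illustrative rather than the original code, and in PETSc 3.5 and later DMCreateMatrix() drops the MatType argument while KSPSetOperators() drops the pattern flag.

--------------------------- sketch (assumed names) ---------------------------
#include <petscdmda.h>
#include <petscksp.h>

/* dof=3, stencil width 1, as in the DMDACreate1d() call from the question. */
PetscErrorCode build_and_solve(PetscInt M,Vec *x_out)
{
  PetscErrorCode ierr;
  DM             da;
  Mat            A;
  Vec            b,x;
  KSP            ksp;

  ierr = DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,M,3,1,NULL,&da);CHKERRQ(ierr);
  ierr = DMCreateGlobalVector(da,&b);CHKERRQ(ierr);
  ierr = VecDuplicate(b,&x);CHKERRQ(ierr);
  ierr = DMCreateMatrix(da,MATAIJ,&A);CHKERRQ(ierr); /* size, layout and preallocation from the DMDA */

  /* ... fill A (e.g. with MatSetValuesStencil) and b here, exactly as before ... */
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = DMDestroy(&da);CHKERRQ(ierr);
  *x_out = x;
  return 0;
}
-------------------------------------------------------------------------------

Because the matrix comes from the same DMDA as the vectors, its parallel row layout matches DMCreateGlobalVector(), which is what the MatCreate()/MatSetSizes() version failed to guarantee.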
Barry On May 7, 2014, at 10:11 PM, wrote: > Dear Petsc developers, > > I am solving a linear system Ax=b. The rhs vector b and the matrix A are > defined as follows, > > DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,M,3,1,NULL,&da); > DMCreateGlobalVector(da, &b); > > MatCreate(PETSC_COMM_WORLD, &A); > MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, M*3, M*3); > MatMPIAIJSetPreallocation(A, 7, NULL, 7, NULL); > MatSetUp(A); > > There is a Memory corruption problem when calling > KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); > KSPSolve(ksp, x, b); > > since the partition of A and b are not consistent. Should I use > > KSPSetDM and KSPSetComputeOperators > > for sovling this problem? > > Thanks, > > > > > > From francium87 at hotmail.com Thu May 8 02:49:03 2014 From: francium87 at hotmail.com (linjing bo) Date: Thu, 8 May 2014 07:49:03 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , , , , Message-ID: Yes, I called KSPDestroy(). I have reproduce the problem using a small C code, this code with default ilu preconditioner will show an error [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Cannot log negative flops! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.info by jlin Thu May 8 15:43:13 2014 [0]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [0]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c ===================================================Below is the code-------------------------------------------------static char help[] = "Solve"; #include int main(int argc, char **args){ Vec x,b,u; Mat A; KSP ksp; PC pc; PetscViewer fd; PetscErrorCode ierr; PetscReal tol=1.e-4; PetscScalar one = 1.0; PetscInt n=1023; PetscInitialize(&argc,&args,(char*)0,help); ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); ierr = VecSetFromOptions(x);CHKERRQ(ierr); ierr = 
VecDuplicate(x,&b);CHKERRQ(ierr); PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", FILE_MODE_READ, &fd); ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 11,PETSC_NULL,11,PETSC_NULL,&A); ierr = MatLoad(A, fd); PetscViewerDestroy(&fd); VecSet( b, one); VecSet( x, one); VecAssemblyBegin(b); VecAssemblyEnd(b); VecAssemblyBegin(x); VecAssemblyEnd(x); ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); PetscFinalize(); return 0; }----------------------------------------------------------- Date: Wed, 7 May 2014 05:52:13 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? Did you call KSPDestroy()? Matt --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. 
The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. 
[3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 8 05:20:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 05:20:52 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 2:49 AM, linjing bo wrote: > Yes, I called KSPDestroy(). I have reproduce the problem using a small C > code, this code with default ilu preconditioner will show an error > Can you also send your matrix so I can run it? Thanks, Matt > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Cannot log negative flops! 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.infoby jlin Thu May 8 15:43:13 2014 > [0]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 > [0]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c > =================================================== > Below is the code > ------------------------------------------------- > static char help[] = "Solve"; > #include > int main(int argc, char **args){ > Vec x,b,u; > Mat A; > KSP ksp; > PC pc; > PetscViewer fd; > PetscErrorCode ierr; > PetscReal tol=1.e-4; > PetscScalar one = 1.0; > PetscInt n=1023; > PetscInitialize(&argc,&args,(char*)0,help); > ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) x, > "Solution");CHKERRQ(ierr); > ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); > ierr = VecSetFromOptions(x);CHKERRQ(ierr); > ierr = VecDuplicate(x,&b);CHKERRQ(ierr); > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", > FILE_MODE_READ, &fd); > ierr = > MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ > 11,PETSC_NULL,11,PETSC_NULL,&A); > ierr = MatLoad(A, fd); > PetscViewerDestroy(&fd); > VecSet( b, one); > VecSet( x, one); > VecAssemblyBegin(b); > VecAssemblyEnd(b); > VecAssemblyBegin(x); > VecAssemblyEnd(x); > ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); > ierr = > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); > ierr = > KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); > ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); > ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); > ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); > > KSPDestroy(&ksp); > VecDestroy(&x); > VecDestroy(&b); > PetscFinalize(); > return 0; > } > ----------------------------------------------------------- > > ------------------------------ > Date: Wed, 7 May 2014 05:52:13 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: 
francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: > > The Valgrind shows memory leak in memalign() called by KSPSetup and > PCSetup. Is that normal? > > > > Did you call KSPDestroy()? > > Matt > > > > --------------------------------------------------------------------------------------------- > ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) > ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > ... > > ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) > ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 > (aijfact.c:1655) > ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) > ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) > ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) > ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) > ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > > ------------------------------------------------------------------------------------------------- > > > ------------------------------ > From: francium87 at hotmail.com > To: knepley at gmail.com > CC: petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] VecValidValues() reports NaN found > Date: Mon, 5 May 2014 13:15:38 +0000 > > Ok, I will try it . Thanks for your advise. > > ------------------------------ > Date: Mon, 5 May 2014 08:12:05 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: > > I use JACOBI. The message showed is with JACOBI. > > > Wired situation is that the backtrack information shows the location is > before actually apply PC, so I guess the rhs vec is not changed at this > point. > > Another wired thing is : Because the original code is to complex. I write > out the A matrix in Ax=b, and write a small test code to read in this > matrix and solve it, no error showed. The KSP, PC are all set to be the > same. > > When I try to using ILU, more wired error happens, the backtrack info > shows it died in a Flops logging function: > > > 1) Run in serial until it works > > 2) It looks like you have memory overwriting problems. Run with valgrind > > Matt > > > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of > range! > [2]PETSC ERROR: Cannot log negative > flops! 
> [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [2]PETSC ERROR: See docs/changes/index.html for recent > updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [2]PETSC ERROR: See docs/index.html for manual > pages. > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:51:27 > 2014 > > [2]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [2]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > > [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [2]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > > > ------------------------------ > Date: Mon, 5 May 2014 07:27:52 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! > [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. 
> [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > > > It looks like the vector after preconditioner application is bad. What is > the preconditioner? > > Matt > > > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From francium87 at hotmail.com Thu May 8 05:44:00 2014 From: francium87 at hotmail.com (linjing bo) Date: Thu, 8 May 2014 10:44:00 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , , , , , , Message-ID: Sorry , forgot to attach file. Thanks in advance. But does the elements in matrix really matters a lot ? 
Date: Thu, 8 May 2014 05:20:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Thu, May 8, 2014 at 2:49 AM, linjing bo wrote: Yes, I called KSPDestroy(). I have reproduce the problem using a small C code, this code with default ilu preconditioner will show an error Can you also send your matrix so I can run it? Thanks, Matt [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Cannot log negative flops! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.info by jlin Thu May 8 15:43:13 2014 [0]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [0]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c ===================================================Below is the code-------------------------------------------------static char help[] = "Solve"; #include int main(int argc, char **args){ Vec x,b,u; Mat A; KSP ksp; PC pc; PetscViewer fd; PetscErrorCode ierr; PetscReal tol=1.e-4; PetscScalar one = 1.0; PetscInt n=1023; PetscInitialize(&argc,&args,(char*)0,help); ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); ierr = VecSetFromOptions(x);CHKERRQ(ierr); ierr = VecDuplicate(x,&b);CHKERRQ(ierr); PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", FILE_MODE_READ, &fd); ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 11,PETSC_NULL,11,PETSC_NULL,&A); ierr = MatLoad(A, fd); PetscViewerDestroy(&fd); VecSet( b, one); VecSet( x, one); VecAssemblyBegin(b); VecAssemblyEnd(b); VecAssemblyBegin(x); VecAssemblyEnd(x); ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); ierr = 
KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); PetscFinalize(); return 0; }----------------------------------------------------------- Date: Wed, 7 May 2014 05:52:13 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? Did you call KSPDestroy()? Matt --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. 
Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. 
[3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tor0bp.bin Type: application/octet-stream Size: 135976 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tor0bp.bin.info Type: application/octet-stream Size: 22 bytes Desc: not available URL: From scanmail at anl.gov Thu May 8 05:44:31 2014 From: scanmail at anl.gov (Administrator) Date: Thu, 8 May 2014 05:44:31 -0500 Subject: [petsc-users] [MailServer Notification]Argonne Antivirus Quarantine Notification - DO NOT REPLY Message-ID: <6FAB217316DE4B008E79C04251562526@anl.gov> Do not reply to this message. 
The reply address is not monitored. The message below has been quarantined by the Argonne National Laboratory Antivirus filtering system. The message was filtered for having been detected of having malicious content or an attachment that matches the laboratory?s filtering criteria. From: francium87 at hotmail.com; To: knepley at gmail.com;petsc-users at mcs.anl.gov; Subject: Re: [petsc-users] VecValidValues() reports NaN found Attachment: tor0bp.bin Date: 5/8/2014 5:44:08 AM If you have any questions regarding the Argonne's antivirus filtering product, or feel that the attachment was incorrectly identified, please contact the CIS Service Desk at help at anl.gov or x-9999 option 2. From francium87 at hotmail.com Thu May 8 05:48:05 2014 From: francium87 at hotmail.com (linjing bo) Date: Thu, 8 May 2014 10:48:05 +0000 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: , , , , , , , , Message-ID: Sorry, forgot to attatch the matrix file Date: Thu, 8 May 2014 05:20:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Thu, May 8, 2014 at 2:49 AM, linjing bo wrote: Yes, I called KSPDestroy(). I have reproduce the problem using a small C code, this code with default ilu preconditioner will show an error Can you also send your matrix so I can run it? Thanks, Matt [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Argument out of range! [0]PETSC ERROR: Cannot log negative flops! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.info by jlin Thu May 8 15:43:13 2014 [0]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [0]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c ===================================================Below is the code-------------------------------------------------static char help[] = "Solve"; #include int main(int argc, char **args){ Vec x,b,u; Mat A; KSP ksp; PC pc; PetscViewer fd; PetscErrorCode ierr; PetscReal tol=1.e-4; PetscScalar one = 1.0; PetscInt n=1023; PetscInitialize(&argc,&args,(char*)0,help); ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); ierr = VecSetFromOptions(x);CHKERRQ(ierr); ierr = VecDuplicate(x,&b);CHKERRQ(ierr); PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", FILE_MODE_READ, &fd); ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 11,PETSC_NULL,11,PETSC_NULL,&A); ierr = MatLoad(A, fd); PetscViewerDestroy(&fd); VecSet( b, one); VecSet( x, one); VecAssemblyBegin(b); VecAssemblyEnd(b); VecAssemblyBegin(x); VecAssemblyEnd(x); ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); PetscFinalize(); return 0; }----------------------------------------------------------- Date: Wed, 7 May 2014 05:52:13 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? Did you call KSPDestroy()? 
Matt --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix.zip Type: application/zip Size: 50130 bytes Desc: not available URL: From scanmail at anl.gov Thu May 8 05:48:37 2014 From: scanmail at anl.gov (Administrator) Date: Thu, 8 May 2014 05:48:37 -0500 Subject: [petsc-users] [MailServer Notification]Argonne Antivirus Quarantine Notification - DO NOT REPLY Message-ID: <31D15166009D4985AD204832B9E3F228@anl.gov> Do not reply to this message. The reply address is not monitored. The message below has been quarantined by the Argonne National Laboratory Antivirus filtering system. The message was filtered for having been detected of having malicious content or an attachment that matches the laboratory?s filtering criteria. From: francium87 at hotmail.com; To: knepley at gmail.com;petsc-users at mcs.anl.gov; Subject: Re: [petsc-users] VecValidValues() reports NaN found Attachment: matrix.zip Date: 5/8/2014 5:48:12 AM If you have any questions regarding the Argonne's antivirus filtering product, or feel that the attachment was incorrectly identified, please contact the CIS Service Desk at help at anl.gov or x-9999 option 2. From knepley at gmail.com Thu May 8 06:27:26 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 06:27:26 -0500 Subject: [petsc-users] VecValidValues() reports NaN found In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 5:44 AM, linjing bo wrote: > Sorry , forgot to attach file. 
Thanks in advance. > But does the elements in matrix really matters a lot ? > Yes, unfortunately. The problem is that you have no diagonal element in row 0. I do not think our factorization routine can handle this, but I will check with Hong. If you put a 0 there, it should work fine. Thanks, Matt > > ------------------------------ > Date: Thu, 8 May 2014 05:20:52 -0500 > > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Thu, May 8, 2014 at 2:49 AM, linjing bo wrote: > > Yes, I called KSPDestroy(). I have reproduce the problem using a small C > code, this code with default ilu preconditioner will show an error > > > Can you also send your matrix so I can run it? > > Thanks, > > Matt > > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Argument out of range! > [0]PETSC ERROR: Cannot log negative flops! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.infoby jlin Thu May 8 15:43:13 2014 > [0]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 > [0]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c > =================================================== > Below is the code > ------------------------------------------------- > static char help[] = "Solve"; > #include > int main(int argc, char **args){ > Vec x,b,u; > Mat A; > KSP ksp; > PC pc; > PetscViewer fd; > PetscErrorCode ierr; > PetscReal tol=1.e-4; > PetscScalar one = 1.0; > PetscInt n=1023; > PetscInitialize(&argc,&args,(char*)0,help); > ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) x, > "Solution");CHKERRQ(ierr); > ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); > ierr = VecSetFromOptions(x);CHKERRQ(ierr); > ierr = VecDuplicate(x,&b);CHKERRQ(ierr); > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", > FILE_MODE_READ, &fd); > ierr = > MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 
> 11,PETSC_NULL,11,PETSC_NULL,&A); > ierr = MatLoad(A, fd); > PetscViewerDestroy(&fd); > VecSet( b, one); > VecSet( x, one); > VecAssemblyBegin(b); > VecAssemblyEnd(b); > VecAssemblyBegin(x); > VecAssemblyEnd(x); > ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); > ierr = > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); > ierr = > KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); > ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); > ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); > ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); > > KSPDestroy(&ksp); > VecDestroy(&x); > VecDestroy(&b); > PetscFinalize(); > return 0; > } > ----------------------------------------------------------- > > ------------------------------ > Date: Wed, 7 May 2014 05:52:13 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: > > The Valgrind shows memory leak in memalign() called by KSPSetup and > PCSetup. Is that normal? > > > > Did you call KSPDestroy()? > > Matt > > > > --------------------------------------------------------------------------------------------- > ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) > ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > ... > > ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 > of 3,327 > ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) > ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) > ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) > ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 > (aijfact.c:1655) > ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) > ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) > ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) > ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) > ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) > ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) > ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) > ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) > ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) > ==31551== by 0x46CB4E: MAIN__ (main.F90:96) > ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) > > > ------------------------------------------------------------------------------------------------- > > > ------------------------------ > From: francium87 at hotmail.com > To: knepley at gmail.com > CC: petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] VecValidValues() reports NaN found > Date: Mon, 5 May 2014 13:15:38 +0000 > > Ok, I will try it . Thanks for your advise. 
> > ------------------------------ > Date: Mon, 5 May 2014 08:12:05 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: > > I use JACOBI. The message showed is with JACOBI. > > > Wired situation is that the backtrack information shows the location is > before actually apply PC, so I guess the rhs vec is not changed at this > point. > > Another wired thing is : Because the original code is to complex. I write > out the A matrix in Ax=b, and write a small test code to read in this > matrix and solve it, no error showed. The KSP, PC are all set to be the > same. > > When I try to using ILU, more wired error happens, the backtrack info > shows it died in a Flops logging function: > > > 1) Run in serial until it works > > 2) It looks like you have memory overwriting problems. Run with valgrind > > Matt > > > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: Argument out of > range! > [2]PETSC ERROR: Cannot log negative > flops! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [2]PETSC ERROR: See docs/changes/index.html for recent > updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [2]PETSC ERROR: See docs/index.html for manual > pages. > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:51:27 > 2014 > > [2]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [2]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: PetscLogFlops() line 204 in > /tmp/petsc-3.4.4/include/petsclog.h > [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in > /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c > > [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in > /tmp/petsc-3.4.4/src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_ILU() line 232 in > /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c > [2]PETSC ERROR: PCSetUp() line 890 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 278 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 399 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > > > ------------------------------ > Date: Mon, 5 May 2014 07:27:52 -0500 > Subject: Re: [petsc-users] VecValidValues() reports NaN found > From: knepley at gmail.com > To: francium87 at hotmail.com > CC: petsc-users at mcs.anl.gov > > On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: > > Hi, I'm trying to use PETSc's ksp method to solve a linear system. When > running, Error is reported by VecValidValues() that NaN or Inf is found > with error message listed below > > > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: Floating point > exception! 
> [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite > at beginning of function: Parameter number > 2! > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, > 2014 > [3]PETSC ERROR: See docs/changes/index.html for recent > updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [3]PETSC ERROR: See docs/index.html for manual > pages. > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by > jlin Mon May 5 20:03:20 > 2014 > > [3]PETSC ERROR: Libraries linked from > /opt/sfw/petsc/3.4.4/intel/openmpi/lib > [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 > 2014 > [3]PETSC ERROR: Configure options > --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi > --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel > --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 > --with-mpiexec=mpiexec > > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: VecValidValues() line 28 in > /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c > > > It looks like the vector after preconditioner application is bad. What is > the preconditioner? > > Matt > > > > [3]PETSC ERROR: PCApply() line 436 in > /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSP_PCApply() line 227 in > /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: KSPInitialResidual() line 64 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c > [3]PETSC ERROR: KSPSolve_GMRES() line 239 in > /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c > [3]PETSC ERROR: KSPSolve() line 441 in > /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c > > > After read the source code shown by backtrack informations, I realize the > problem is in the right hand side vector. So I make a trial of set right > hand side vector to ONE by VecSet, But the program still shows error > message above, and using VecView or VecGetValue to investigate the first > value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly > describe the problem. The code related is listed below > > ---------------------------Solver section-------------------------- > > call VecSet( pet_bp_b, one, ierr) > > vecidx=[0,1] > call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) > write(*,*) ' first two values ', first(1), first(2) > > call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) > call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) > call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) > CHKERRQ(ierr) > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From francium87 at hotmail.com Thu May 8 06:46:44 2014
From: francium87 at hotmail.com (linjing bo)
Date: Thu, 8 May 2014 11:46:44 +0000
Subject: [petsc-users] VecValidValues() reports NaN found
In-Reply-To: References: , , , , , , , , , ,
Message-ID:

OK, thanks for your attention. I will check the matrix generation part; there should be a diagonal element.
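(Editorial aside, not from the thread.) To make the suggested fix concrete: one way to guarantee that every locally owned row carries an explicit diagonal entry, so that an ILU/LU factorization has a pivot location even when the assembled operator puts nothing there, is to add a zero to each diagonal position during assembly. The helper below is a hypothetical sketch (the name MatForceDiagonal is made up); it assumes the matrix is assembled with ADD_VALUES and that the diagonal positions are covered by the preallocation, as they are in the 11-entries-per-row test code above.

-------------------------------------------------------
#include <petscmat.h>
/* Add an explicit 0.0 to the diagonal of every locally owned row.  This
   does not change the operator; it only forces the diagonal entries into
   the nonzero pattern so a later ILU(0) factorization has pivots. */
PetscErrorCode MatForceDiagonal(Mat A)
{
  PetscInt       i, rstart, rend;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    ierr = MatSetValue(A, i, i, 0.0, ADD_VALUES);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}
-------------------------------------------------------

Call it during the same ADD_VALUES insertion phase as the rest of the matrix generation (mixing ADD_VALUES and INSERT_VALUES without an intervening MatAssemblyBegin/End with MAT_FLUSH_ASSEMBLY is an error), and let the existing code perform the final MAT_FINAL_ASSEMBLY as before.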
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./bptest_c on a arch-linux2-c-debug named node2.indac.info by jlin Thu May 8 15:43:13 2014 [0]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [0]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [0]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [0]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: main() line 46 in "unknowndirectory/"test.c ===================================================Below is the code-------------------------------------------------static char help[] = "Solve"; #include int main(int argc, char **args){ Vec x,b,u; Mat A; KSP ksp; PC pc; PetscViewer fd; PetscErrorCode ierr; PetscReal tol=1.e-4; PetscScalar one = 1.0; PetscInt n=1023; PetscInitialize(&argc,&args,(char*)0,help); ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,n);CHKERRQ(ierr); ierr = VecSetFromOptions(x);CHKERRQ(ierr); ierr = VecDuplicate(x,&b);CHKERRQ(ierr); PetscViewerBinaryOpen(PETSC_COMM_WORLD, "tor0bp.bin", FILE_MODE_READ, &fd); ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n, \ 11,PETSC_NULL,11,PETSC_NULL,&A); ierr = MatLoad(A, fd); PetscViewerDestroy(&fd); VecSet( b, one); VecSet( x, one); VecAssemblyBegin(b); VecAssemblyEnd(b); VecAssemblyBegin(x); VecAssemblyEnd(x); ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr); ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-5,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); KSPDestroy(&ksp); VecDestroy(&x); VecDestroy(&b); PetscFinalize(); return 0; }----------------------------------------------------------- Date: Wed, 7 May 2014 05:52:13 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Tue, May 6, 2014 at 10:53 PM, linjing bo wrote: The Valgrind shows memory leak in memalign() called by KSPSetup and PCSetup. Is that normal? Did you call KSPDestroy()? 
Matt --------------------------------------------------------------------------------------------- ==31551== 136 bytes in 1 blocks are definitely lost in loss record 2,636 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5DE9D5E: KSPSetUp_GMRES (gmres.c:75) ==31551== by 0x5E1C41F: KSPSetUp (itfunc.c:239) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ... ==31551== 4,092 bytes in 1 blocks are definitely lost in loss record 3,113 of 3,327 ==31551== at 0x4A05458: memalign (vg_replace_malloc.c:727) ==31551== by 0x5498E79: PetscMallocAlign (mal.c:27) ==31551== by 0x5A085AE: MatDuplicateNoCreate_SeqAIJ (aij.c:4011) ==31551== by 0x5A2D24F: MatILUFactorSymbolic_SeqAIJ_ilu0 (aijfact.c:1655) ==31551== by 0x5A2B7CB: MatILUFactorSymbolic_SeqAIJ (aijfact.c:1756) ==31551== by 0x573D152: MatILUFactorSymbolic (matrix.c:6240) ==31551== by 0x5CFB843: PCSetUp_ILU (ilu.c:204) ==31551== by 0x5D8BAA7: PCSetUp (precon.c:890) ==31551== by 0x5E1C639: KSPSetUp (itfunc.c:278) ==31551== by 0x5E1821D: KSPSolve (itfunc.c:399) ==31551== by 0x5CAD7F1: kspsolve_ (itfuncf.c:219) ==31551== by 0x8D9842: petsc_solver_defi_ (petsc_defi.F90:434) ==31551== by 0x831618: field_solver_defi_ (field_defi.F90:57) ==31551== by 0x46CB4E: MAIN__ (main.F90:96) ==31551== by 0x41852B: main (in /home/jlin/defi_field/ftest/gtc) ------------------------------------------------------------------------------------------------- From: francium87 at hotmail.com To: knepley at gmail.com CC: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] VecValidValues() reports NaN found Date: Mon, 5 May 2014 13:15:38 +0000 Ok, I will try it . Thanks for your advise. Date: Mon, 5 May 2014 08:12:05 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:56 AM, linjing bo wrote: I use JACOBI. The message showed is with JACOBI. Wired situation is that the backtrack information shows the location is before actually apply PC, so I guess the rhs vec is not changed at this point. Another wired thing is : Because the original code is to complex. I write out the A matrix in Ax=b, and write a small test code to read in this matrix and solve it, no error showed. The KSP, PC are all set to be the same. When I try to using ILU, more wired error happens, the backtrack info shows it died in a Flops logging function: 1) Run in serial until it works 2) It looks like you have memory overwriting problems. Run with valgrind Matt [2]PETSC ERROR: --------------------- Error Message ------------------------------------ [2]PETSC ERROR: Argument out of range! [2]PETSC ERROR: Cannot log negative flops! [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [2]PETSC ERROR: See docs/changes/index.html for recent updates. [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [2]PETSC ERROR: See docs/index.html for manual pages. 
[2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:51:27 2014 [2]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [2]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [2]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: PetscLogFlops() line 204 in /tmp/petsc-3.4.4/include/petsclog.h [2]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 552 in /tmp/petsc-3.4.4/src/mat/impls/aij/seq/aijfact.c [2]PETSC ERROR: MatLUFactorNumeric() line 2889 in /tmp/petsc-3.4.4/src/mat/interface/matrix.c [2]PETSC ERROR: PCSetUp_ILU() line 232 in /tmp/petsc-3.4.4/src/ksp/pc/impls/factor/ilu/ilu.c [2]PETSC ERROR: PCSetUp() line 890 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [2]PETSC ERROR: KSPSetUp() line 278 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: KSPSolve() line 399 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c Date: Mon, 5 May 2014 07:27:52 -0500 Subject: Re: [petsc-users] VecValidValues() reports NaN found From: knepley at gmail.com To: francium87 at hotmail.com CC: petsc-users at mcs.anl.gov On Mon, May 5, 2014 at 7:25 AM, linjing bo wrote: Hi, I'm trying to use PETSc's ksp method to solve a linear system. When running, Error is reported by VecValidValues() that NaN or Inf is found with error message listed below [3]PETSC ERROR: --------------------- Error Message ------------------------------------ [3]PETSC ERROR: Floating point exception! [3]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at beginning of function: Parameter number 2! [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [3]PETSC ERROR: See docs/changes/index.html for recent updates. [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [3]PETSC ERROR: See docs/index.html for manual pages. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: ./gtc on a arch-linux2-c-debug named node2.indac.info by jlin Mon May 5 20:03:20 2014 [3]PETSC ERROR: Libraries linked from /opt/sfw/petsc/3.4.4/intel/openmpi/lib [3]PETSC ERROR: Configure run at Sat Apr 26 20:19:41 2014 [3]PETSC ERROR: Configure options --prefix=/opt/sfw/petsc/3.4.4/intel/openmpi --with-mpi-dir=/opt/sfw/openmpi/1.6.3/intel --with-blas-lapack-dir=/opt/sfw/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 --with-mpiexec=mpiexec [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: VecValidValues() line 28 in /tmp/petsc-3.4.4/src/vec/vec/interface/rvector.c It looks like the vector after preconditioner application is bad. What is the preconditioner? 
Matt [3]PETSC ERROR: PCApply() line 436 in /tmp/petsc-3.4.4/src/ksp/pc/interface/precon.c [3]PETSC ERROR: KSP_PCApply() line 227 in /tmp/petsc-3.4.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: KSPInitialResidual() line 64 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: KSPSolve_GMRES() line 239 in /tmp/petsc-3.4.4/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: KSPSolve() line 441 in /tmp/petsc-3.4.4/src/ksp/ksp/interface/itfunc.c After read the source code shown by backtrack informations, I realize the problem is in the right hand side vector. So I make a trial of set right hand side vector to ONE by VecSet, But the program still shows error message above, and using VecView or VecGetValue to investigate the first value of rhs vec shows the value is 1.0 as I set it to. Hope I clearly describe the problem. The code related is listed below ---------------------------Solver section-------------------------- call VecSet( pet_bp_b, one, ierr) vecidx=[0,1] call VecGetValues( pet_bp_b, 2, vecidx, first, ierr) write(*,*) ' first two values ', first(1), first(2) call KSPSetInitialGuessNonzero(solver_bp,Petsc_True,ierr) call KSPSolve(solver_bp,pet_bp_b,pet_bp_x,ierr) call KSPView(solver_bp, PETSC_VIEWER_STDOUT_SELF,ierr) CHKERRQ(ierr) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Zafer.Leylek at student.adfa.edu.au Thu May 8 08:45:05 2014 From: Zafer.Leylek at student.adfa.edu.au (Zafer Leylek) Date: Thu, 8 May 2014 13:45:05 +0000 Subject: [petsc-users] VecDot usage in parallel Message-ID: <996B4E35EA834745A31426395DB0BBA01DFD9754@ADFAPWEXMBX02.ad.adfa.edu.au> I have recently started using petsc and have little experience with parallel programming. I am having problem with the following section of my code: KSPGetPC(ksp,&pc); PCSetType(pc,PCCHOLESKY); PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); PCFactorSetUpMatSolverPackage(pc); PCFactorGetMatrix(pc,&L); MatMumpsSetIcntl(L,7,2); MatMumpsSetCntl(L,1,0.0); MatMumpsSetIcntl(L,33,1); KSPSetUp(ksp); KSPSolve(ksp, y, alpha); VecDot(y, alpha, &sigma); when I run it using a single processor (mpiexec -np 1 ....) I get the correct answer, when I run using 2 processors I get sigma = 4*sigma and so on. How can I solve this problem?? ZL -------------- next part -------------- An HTML attachment was scrubbed... 
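(Editorial aside, not from the thread.) VecDot() already computes the global dot product: each rank contributes only its locally owned entries and the result is reduced across the communicator, so a correct parallel setup returns the same sigma for any number of processes and no extra scaling is needed. Results that grow with the process count usually mean the problem is being duplicated on every rank rather than distributed. Below is a minimal sketch of the intended usage, with hypothetical sizes and values; none of it comes from Zafer's code.

-------------------------------------------------------
#include <petscksp.h>
/* Each rank sets only the entries it owns; VecDot() then returns the same
   global value regardless of how many MPI ranks are used. */
int main(int argc, char **args)
{
  Vec            y, alpha;
  PetscInt       i, rstart, rend, n = 100;   /* hypothetical global size */
  PetscScalar    sigma;
  PetscErrorCode ierr;

  PetscInitialize(&argc, &args, (char*)0, NULL);
  ierr = VecCreate(PETSC_COMM_WORLD, &y);CHKERRQ(ierr);
  ierr = VecSetSizes(y, PETSC_DECIDE, n);CHKERRQ(ierr);   /* one global vector, split across ranks */
  ierr = VecSetFromOptions(y);CHKERRQ(ierr);
  ierr = VecDuplicate(y, &alpha);CHKERRQ(ierr);

  ierr = VecGetOwnershipRange(y, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {                       /* locally owned entries only */
    ierr = VecSetValue(y, i, 1.0, INSERT_VALUES);CHKERRQ(ierr);
    ierr = VecSetValue(alpha, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = VecAssemblyBegin(y);CHKERRQ(ierr);     ierr = VecAssemblyEnd(y);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(alpha);CHKERRQ(ierr); ierr = VecAssemblyEnd(alpha);CHKERRQ(ierr);

  ierr = VecDot(y, alpha, &sigma);CHKERRQ(ierr);          /* global result, independent of -np */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "sigma = %g\n", (double)PetscRealPart(sigma));CHKERRQ(ierr);

  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = VecDestroy(&alpha);CHKERRQ(ierr);
  PetscFinalize();
  return 0;
}
-------------------------------------------------------

Run with, e.g., mpiexec -n 1 and mpiexec -n 4; the printed sigma should be identical (200 for the values above).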
URL: From jed at jedbrown.org Thu May 8 09:11:11 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 08 May 2014 08:11:11 -0600 Subject: [petsc-users] VecDot usage in parallel In-Reply-To: <996B4E35EA834745A31426395DB0BBA01DFD9754@ADFAPWEXMBX02.ad.adfa.edu.au> References: <996B4E35EA834745A31426395DB0BBA01DFD9754@ADFAPWEXMBX02.ad.adfa.edu.au> Message-ID: <87iopgl5s0.fsf@jedbrown.org> Zafer Leylek writes: > I have recently started using petsc and have little experience with parallel programming. > > I am having problem with the following section of my code: > > KSPGetPC(ksp,&pc); > PCSetType(pc,PCCHOLESKY); > PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS); > PCFactorSetUpMatSolverPackage(pc); > PCFactorGetMatrix(pc,&L); > MatMumpsSetIcntl(L,7,2); > MatMumpsSetCntl(L,1,0.0); > MatMumpsSetIcntl(L,33,1); > KSPSetUp(ksp); > KSPSolve(ksp, y, alpha); > > VecDot(y, alpha, &sigma); > > when I run it using a single processor (mpiexec -np 1 ....) I get the > correct answer, when I run using 2 processors I get sigma = 4*sigma > and so on. View y and alpha. I suspect you are solving a different problem in these cases. VecDot computes the parallel dot product. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From likunt at caltech.edu Thu May 8 11:08:05 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Thu, 8 May 2014 09:08:05 -0700 (PDT) Subject: [petsc-users] normalization of a vector Message-ID: <59909.131.215.220.161.1399565285.squirrel@webmail.caltech.edu> Dear Petsc developers, I have a vector V={u1, u2, u3, v1, v2, v3}. I need to normalize each 3d vector and reset V, i.e. V={u1/|u|, u2/|u|, u3/|u|, v1/|v|, v2/|v|, v3/|v|}, with |u| and |v| denotes the magnitudes of {u1,u2,u3} and {v1,v2,v3}. I tried VecGetValues(V, 3, col, val); normalization of val; VecSetValues(V, 3, col, val, INSERT_VALUES); but I got the error message PETSC ERROR: Object is in wrong state! PETSC ERROR: You have already added values; you cannot now insert! Is there any fast way to do that? Thanks. From mrestelli at gmail.com Thu May 8 11:25:11 2014 From: mrestelli at gmail.com (marco restelli) Date: Thu, 8 May 2014 18:25:11 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices Message-ID: Hi, I have a Cartesian communicator and some matrices distributed along the "x" direction. I would like to compute an all_reduce operation for these matrices in the y direction, and I wander whether there is a PETSc function for this. More precisely: a matrix A is distributed among processors 0 , 1 , 2 another A is distributed among processors 3 , 4 , 5 another A is distributed among processors 6 , 7 , 8 ... The x direction is 0,1,2; while the y direction is 0,3,6,... I would like to compute a matrix B = "sum of the matrices A" and a copy of B should be distributed among processors 0,1,2, another copy among 3,4,5 and so on. A way of doing this is getting the matrix coefficients, broadcasting them along the y direction and summing them in the matrix B; maybe however there is already a PETSc function doing this. 
Thank you, regards Marco From knepley at gmail.com Thu May 8 11:29:21 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 11:29:21 -0500 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 11:25 AM, marco restelli wrote: > Hi, > I have a Cartesian communicator and some matrices distributed along > the "x" direction. I would like to compute an all_reduce operation for > these matrices in the y direction, and I wander whether there is a > PETSc function for this. > > > More precisely: > > a matrix A is distributed among processors 0 , 1 , 2 > another A is distributed among processors 3 , 4 , 5 > another A is distributed among processors 6 , 7 , 8 > ... > > The x direction is 0,1,2; while the y direction is 0,3,6,... > > I would like to compute a matrix B = "sum of the matrices A" and a > copy of B should be distributed among processors 0,1,2, another copy > among 3,4,5 and so on. > > A way of doing this is getting the matrix coefficients, broadcasting > them along the y direction and summing them in the matrix B; maybe > however there is already a PETSc function doing this. > There is nothing like this in PETSc. There are many tools for this using dense matrices in Elemental, but I have not seen anything for sparse matrices. Matt > Thank you, regards > Marco > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 8 12:44:33 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 8 May 2014 12:44:33 -0500 Subject: [petsc-users] normalization of a vector In-Reply-To: <59909.131.215.220.161.1399565285.squirrel@webmail.caltech.edu> References: <59909.131.215.220.161.1399565285.squirrel@webmail.caltech.edu> Message-ID: Use VecGetArray(). Loop over each the 3 tuples doing the normalization and then use VecRestoreArray(). VecGetArray/RestoreArray() do not copy values so are much faster than VecGetValues(). Barry Each process will just loop over its local part of the vector. On May 8, 2014, at 11:08 AM, likunt at caltech.edu wrote: > Dear Petsc developers, > > I have a vector V={u1, u2, u3, v1, v2, v3}. I need to normalize each 3d > vector and reset V, i.e. > > V={u1/|u|, u2/|u|, u3/|u|, v1/|v|, v2/|v|, v3/|v|}, > with |u| and |v| denotes the magnitudes of {u1,u2,u3} and {v1,v2,v3}. > > I tried > > VecGetValues(V, 3, col, val); > normalization of val; > VecSetValues(V, 3, col, val, INSERT_VALUES); > > but I got the error message > > PETSC ERROR: Object is in wrong state! > PETSC ERROR: You have already added values; you cannot now insert! > > Is there any fast way to do that? Thanks. > > From mrestelli at gmail.com Thu May 8 14:06:32 2014 From: mrestelli at gmail.com (marco restelli) Date: Thu, 8 May 2014 21:06:32 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: 2014-05-08 18:29 GMT+0200, Matthew Knepley : > On Thu, May 8, 2014 at 11:25 AM, marco restelli > wrote: > >> Hi, >> I have a Cartesian communicator and some matrices distributed along >> the "x" direction. I would like to compute an all_reduce operation for >> these matrices in the y direction, and I wander whether there is a >> PETSc function for this. 
>> >> >> More precisely: >> >> a matrix A is distributed among processors 0 , 1 , 2 >> another A is distributed among processors 3 , 4 , 5 >> another A is distributed among processors 6 , 7 , 8 >> ... >> >> The x direction is 0,1,2; while the y direction is 0,3,6,... >> >> I would like to compute a matrix B = "sum of the matrices A" and a >> copy of B should be distributed among processors 0,1,2, another copy >> among 3,4,5 and so on. >> >> A way of doing this is getting the matrix coefficients, broadcasting >> them along the y direction and summing them in the matrix B; maybe >> however there is already a PETSc function doing this. >> > > There is nothing like this in PETSc. There are many tools for this using > dense > matrices in Elemental, but I have not seen anything for sparse matrices. > > Matt > OK, thank you. Now, to do it myself, is MatGetRow the best way to get all the local nonzero entries of a matrix? Marco From knepley at gmail.com Thu May 8 14:13:18 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 14:13:18 -0500 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 2:06 PM, marco restelli wrote: > 2014-05-08 18:29 GMT+0200, Matthew Knepley : > > On Thu, May 8, 2014 at 11:25 AM, marco restelli > > wrote: > > > >> Hi, > >> I have a Cartesian communicator and some matrices distributed along > >> the "x" direction. I would like to compute an all_reduce operation for > >> these matrices in the y direction, and I wander whether there is a > >> PETSc function for this. > >> > >> > >> More precisely: > >> > >> a matrix A is distributed among processors 0 , 1 , 2 > >> another A is distributed among processors 3 , 4 , 5 > >> another A is distributed among processors 6 , 7 , 8 > >> ... > >> > >> The x direction is 0,1,2; while the y direction is 0,3,6,... > >> > >> I would like to compute a matrix B = "sum of the matrices A" and a > >> copy of B should be distributed among processors 0,1,2, another copy > >> among 3,4,5 and so on. > >> > >> A way of doing this is getting the matrix coefficients, broadcasting > >> them along the y direction and summing them in the matrix B; maybe > >> however there is already a PETSc function doing this. > >> > > > > There is nothing like this in PETSc. There are many tools for this using > > dense > > matrices in Elemental, but I have not seen anything for sparse matrices. > > > > Matt > > > > OK, thank you. > > Now, to do it myself, is MatGetRow the best way to get all the local > nonzero entries of a matrix? I think MatGetSubmatrices() is probably better. Matt > > Marco > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrestelli at gmail.com Thu May 8 14:45:03 2014 From: mrestelli at gmail.com (marco restelli) Date: Thu, 8 May 2014 21:45:03 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: 2014-05-08 21:13 GMT+0200, Matthew Knepley : > On Thu, May 8, 2014 at 2:06 PM, marco restelli wrote: > >> 2014-05-08 18:29 GMT+0200, Matthew Knepley : >> > On Thu, May 8, 2014 at 11:25 AM, marco restelli >> > wrote: >> > >> >> Hi, >> >> I have a Cartesian communicator and some matrices distributed along >> >> the "x" direction. 
I would like to compute an all_reduce operation for >> >> these matrices in the y direction, and I wander whether there is a >> >> PETSc function for this. >> >> >> >> >> >> More precisely: >> >> >> >> a matrix A is distributed among processors 0 , 1 , 2 >> >> another A is distributed among processors 3 , 4 , 5 >> >> another A is distributed among processors 6 , 7 , 8 >> >> ... >> >> >> >> The x direction is 0,1,2; while the y direction is 0,3,6,... >> >> >> >> I would like to compute a matrix B = "sum of the matrices A" and a >> >> copy of B should be distributed among processors 0,1,2, another copy >> >> among 3,4,5 and so on. >> >> >> >> A way of doing this is getting the matrix coefficients, broadcasting >> >> them along the y direction and summing them in the matrix B; maybe >> >> however there is already a PETSc function doing this. >> >> >> > >> > There is nothing like this in PETSc. There are many tools for this >> > using >> > dense >> > matrices in Elemental, but I have not seen anything for sparse >> > matrices. >> > >> > Matt >> > >> >> OK, thank you. >> >> Now, to do it myself, is MatGetRow the best way to get all the local >> nonzero entries of a matrix? > > > I think MatGetSubmatrices() is probably better. > > Matt Matt, thanks but this I don't understand. What I want is getting three arrays (i,j,coeff) with all the nonzero local coefficients, so that I can send them around with MPI. MatGetSubmatrices would give me some PETSc objects, which I can not pass to MPI, right? Marco From tlk0812 at hotmail.com Thu May 8 17:14:29 2014 From: tlk0812 at hotmail.com (LikunTan) Date: Fri, 9 May 2014 06:14:29 +0800 Subject: [petsc-users] question on VecView Message-ID: Dear Petsc Developers, Instead of outputting a vector vertically, is there an option to output it horizontally, i.e. v[1] v[2] v[3] .......... Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 8 17:32:32 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 May 2014 17:32:32 -0500 Subject: [petsc-users] question on VecView In-Reply-To: References: Message-ID: On Thu, May 8, 2014 at 5:14 PM, LikunTan wrote: > Dear Petsc Developers, > > Instead of outputting a vector vertically, is there an option to output it > horizontally, i.e. > > v[1] v[2] v[3] .......... > No, you would have to write it. Matt > Thanks, > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedmud at gmail.com Thu May 8 18:36:42 2014 From: friedmud at gmail.com (Derek Gaston) Date: Thu, 8 May 2014 17:36:42 -0600 Subject: [petsc-users] Hiring for the MOOSE Team! Message-ID: We're hiring on the MOOSE Framework team! Come join a high-intensity computational science team that is devoted to open source and innovative development methods! The MOOSE Framework ( http://www.mooseframework.org ) is a high-level, parallel, multiscale, multiphysics, PDE solution framework built on libMesh and PETSc. Working on the MOOSE Framework provides ample opportunity for anyone with a computational science background. You can work on massively parallel algorithms, innovative graphical user interfaces, numerical methods, software development methodologies and much more. 
Most importantly: the work you do every day will have a direct impact on our hundreds of users and the multiple science programs that depend on MOOSE. This position includes opportunities to travel to conferences. In addition, publishing papers is highly encouraged. Here is a direct link to the job posting: http://1.usa.gov/1hAX8zX? Let me know if you have any questions! And please forward this on to anyone that you think may be interested! Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 8 22:47:32 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 08 May 2014 21:47:32 -0600 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: Message-ID: <87lhubk3zf.fsf@jedbrown.org> marco restelli writes: > Matt, thanks but this I don't understand. What I want is getting three > arrays (i,j,coeff) with all the nonzero local coefficients, so that I > can send them around with MPI. > > MatGetSubmatrices would give me some PETSc objects, which I can not > pass to MPI, right? I'm not sure you want this, but you can use MatGetRowIJ and similar to access the representation you're asking for if you are dead set on depending on a specific data format rather than using generic interfaces. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From mrestelli at gmail.com Fri May 9 03:15:28 2014 From: mrestelli at gmail.com (marco restelli) Date: Fri, 9 May 2014 10:15:28 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: <87lhubk3zf.fsf@jedbrown.org> References: <87lhubk3zf.fsf@jedbrown.org> Message-ID: 2014-05-09 5:47 GMT+0200, Jed Brown : > marco restelli writes: >> Matt, thanks but this I don't understand. What I want is getting three >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I >> can send them around with MPI. >> >> MatGetSubmatrices would give me some PETSc objects, which I can not >> pass to MPI, right? > > I'm not sure you want this, but you can use MatGetRowIJ and similar to > access the representation you're asking for if you are dead set on > depending on a specific data format rather than using generic > interfaces. > Jed, thank you. This is probably not the PETSc solution, but still it might a solution! I have found this example for MatGetRowIJ: http://www.stce.rwth-aachen.de/trac/petsc/browser/src/mat/examples/tests/ex79f.F?rev=a52934f9a5da430fdd891fa538a66c376435ec4c My understanding is that I need to: 1) get the sequential part of the matrix, i.e. those rows stored on this processor call MatMPIAIJGetSeqAIJ(A,Ad,Ao,icol,iicol,ierr) 2) get the indexes of these rows call MatGetOwnershipRange(A,rstart,rend,ierr) 3) get the indexes i,j of the local portion of the matrix (compressed form) call MatGetRowIJ(Ad,one,zero,zero,n,ia,iia,ja,jja,done,ierr) 4) get the corresponding elements call MatGetArray(Ad,aa,aaa,ierr) 5) WARNING: the row indexes obtained with MatGetRowIJ are local to this processor, so they must be corrected with rstart to obtain the corresponding global indexes 6) clean-up Does this make sense? 
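In C, the same sequence would look roughly like the sketch below (illustrative names, error checking omitted; MatSeqAIJGetArray is used here in place of the older MatGetArray, and only the diagonal block Ad is shown; the off-diagonal block Ao would need the colmap returned by MatMPIAIJGetSeqAIJ to translate its column indices):

Mat               Ad, Ao;
const PetscInt    *colmap, *ia, *ja;
PetscScalar       *aa, v;
PetscInt          rstart, rend, cstart, nloc, i, k, gi, gj;
PetscBool         done;

/* 1) the sequential (diagonal) block holding this process's rows */
MatMPIAIJGetSeqAIJ(A, &Ad, &Ao, &colmap);
/* 2) global index range of the locally owned rows and columns */
MatGetOwnershipRange(A, &rstart, &rend);
MatGetOwnershipRangeColumn(A, &cstart, NULL);
/* 3) CSR row pointers and column indices of the local block (0-based, no symmetrization) */
MatGetRowIJ(Ad, 0, PETSC_FALSE, PETSC_FALSE, &nloc, &ia, &ja, &done);
/* 4) the matching numerical values */
MatSeqAIJGetArray(Ad, &aa);
/* 5) shift the local indices to global ones before packing the (i,j,coeff) triples */
for (i = 0; i < nloc; i++) {
  for (k = ia[i]; k < ia[i+1]; k++) {
    gi = rstart + i;
    gj = cstart + ja[k];
    v  = aa[k];
    /* ... pack (gi, gj, v) into the buffers sent along the y direction ... */
  }
}
/* 6) clean up */
MatSeqAIJRestoreArray(Ad, &aa);
MatRestoreRowIJ(Ad, 0, PETSC_FALSE, PETSC_FALSE, &nloc, &ia, &ja, &done);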
Marco From knepley at gmail.com Fri May 9 06:14:53 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 May 2014 06:14:53 -0500 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: <87lhubk3zf.fsf@jedbrown.org> Message-ID: On Fri, May 9, 2014 at 3:15 AM, marco restelli wrote: > 2014-05-09 5:47 GMT+0200, Jed Brown : > > marco restelli writes: > >> Matt, thanks but this I don't understand. What I want is getting three > >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I > >> can send them around with MPI. > >> > >> MatGetSubmatrices would give me some PETSc objects, which I can not > >> pass to MPI, right? > > > > I'm not sure you want this, but you can use MatGetRowIJ and similar to > > access the representation you're asking for if you are dead set on > > depending on a specific data format rather than using generic > > interfaces. > > > > Jed, thank you. This is probably not the PETSc solution, but still it > might a solution! > > I have found this example for MatGetRowIJ: > I really do not think you want to do this. It is complex, fragile and I believe the performance improvement to be non-existent. You can get the effect you want JUST by using one function. For example, suppose you want 2 procs to get rows [0,5] and two procs to get rows [1,3], then procs A.B MatGetSubmatrices(A, 2, [0,5], 2, [0,5], ..., &submat) procs C, D MatGetSubmatrices(A, 2, [1,3], 2, [1,3], ..., &submat) and its done. No MPI, no extraction which depends on the Mat data structure. Matt > > http://www.stce.rwth-aachen.de/trac/petsc/browser/src/mat/examples/tests/ex79f.F?rev=a52934f9a5da430fdd891fa538a66c376435ec4c > > My understanding is that I need to: > > 1) get the sequential part of the matrix, i.e. those rows stored on > this processor > call MatMPIAIJGetSeqAIJ(A,Ad,Ao,icol,iicol,ierr) > > 2) get the indexes of these rows > call MatGetOwnershipRange(A,rstart,rend,ierr) > > 3) get the indexes i,j of the local portion of the matrix (compressed > form) > call MatGetRowIJ(Ad,one,zero,zero,n,ia,iia,ja,jja,done,ierr) > > 4) get the corresponding elements > call MatGetArray(Ad,aa,aaa,ierr) > > 5) WARNING: the row indexes obtained with MatGetRowIJ are local to > this processor, so they must be corrected with rstart to obtain the > corresponding global indexes > > 6) clean-up > > > Does this make sense? > > > Marco > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From spk at ldeo.columbia.edu Fri May 9 06:47:58 2014 From: spk at ldeo.columbia.edu (Samar Khatiwala) Date: Fri, 9 May 2014 07:47:58 -0400 Subject: [petsc-users] possible performance issues with PETSc on Cray In-Reply-To: <8761mgw0ol.fsf@jedbrown.org> References: <283192B3-CD15-4118-A688-EB6E500831A6@ldeo.columbia.edu> <8761mgw0ol.fsf@jedbrown.org> Message-ID: Hi Jed et al., Just wanted to report back on the resolution of this issue. The computing support people at HLRN in Germany submitted a test case to CRAY re. performance on their XC30. CRAY has finally gotten back with a solution, which is to use the run-time option -vecscatter_alltoall. Apparently this is a known issue and according to the HLRN folks passing this command line option to PETSc seems to work nicely. Thanks again for your help. 
Samar On Apr 11, 2014, at 7:44 AM, Jed Brown wrote: > Samar Khatiwala writes: > >> Hello, >> >> This is a somewhat vague query but I and a colleague have been running PETSc (3.4.3.0) on a Cray >> XC30 in Germany (https://www.hlrn.de/home/view/System3/WebHome) and the system administrators >> alerted us to some anomalies with our jobs that may or may not be related to PETSc but I thought I'd ask >> here in case others have noticed something similar. >> >> First, there was a large variation in run-time for identical jobs, sometimes as much as 50%. We didn't >> really pick up on this but other users complained to the IT people that their jobs were taking a performance >> hit with a similar variation in run-time. At that point we're told the IT folks started monitoring jobs and >> carrying out tests to see what was going on. They discovered that (1) this always happened when we were >> running our jobs and (2) the problem got worse with physical proximity to the nodes on which our jobs were >> running (what they described as a "strong interaction" between our jobs and others presumably through the >> communication network). > > It sounds like you are strong scaling (smallish subdomains) so that your > application is sensitive to network latency. I see significant > performance variability on XC-30 with this Full Multigrid solver that is > not using PETSc. > > http://59a2.org/files/hopper-vs-edison.3semilogx.png > > See the factor of 2 performance variability for the samples of the ~15M > element case. This operation is limited by instruction issue rather > than bandwidth (indeed, it is several times faster than doing the same > operations with assembled matrices). Here the variability is within the > same application performing repeated solves. If you get a different > partition on a different run, you can see larger variation. > > If your matrices are large enough, your performance will be limited by > memory bandwidth. (This is the typical case, but sufficiently small > matrices can fit in cache.) I once encountered a batch system that did > not properly reset nodes between runs, leaving a partially-filled > ramdisk distributed asymmetrically across the memory busses. This led > to 3x performance reduction on 4-socket nodes because much of the memory > demanded by the application would be faulted onto one memory bus. > Presumably your machine has a resource manager that would not allow such > things to happen. From knepley at gmail.com Fri May 9 06:50:01 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 May 2014 06:50:01 -0500 Subject: [petsc-users] possible performance issues with PETSc on Cray In-Reply-To: References: <283192B3-CD15-4118-A688-EB6E500831A6@ldeo.columbia.edu> <8761mgw0ol.fsf@jedbrown.org> Message-ID: On Fri, May 9, 2014 at 6:47 AM, Samar Khatiwala wrote: > Hi Jed et al., > > Just wanted to report back on the resolution of this issue. The computing > support people at HLRN in Germany > submitted a test case to CRAY re. performance on their XC30. CRAY has > finally gotten back with a solution, > which is to use the run-time option -vecscatter_alltoall. Apparently this > is a known issue and according to the > HLRN folks passing this command line option to PETSc seems to work nicely. > What this does is replace point-to-point communication (MPI_Send/Recv) with collective communication (MI_Alltoall). Thanks, Matt > Thanks again for your help. 
> > Samar > > On Apr 11, 2014, at 7:44 AM, Jed Brown wrote: > > > Samar Khatiwala writes: > > > >> Hello, > >> > >> This is a somewhat vague query but I and a colleague have been running > PETSc (3.4.3.0) on a Cray > >> XC30 in Germany (https://www.hlrn.de/home/view/System3/WebHome) and > the system administrators > >> alerted us to some anomalies with our jobs that may or may not be > related to PETSc but I thought I'd ask > >> here in case others have noticed something similar. > >> > >> First, there was a large variation in run-time for identical jobs, > sometimes as much as 50%. We didn't > >> really pick up on this but other users complained to the IT people that > their jobs were taking a performance > >> hit with a similar variation in run-time. At that point we're told the > IT folks started monitoring jobs and > >> carrying out tests to see what was going on. They discovered that (1) > this always happened when we were > >> running our jobs and (2) the problem got worse with physical proximity > to the nodes on which our jobs were > >> running (what they described as a "strong interaction" between our jobs > and others presumably through the > >> communication network). > > > > It sounds like you are strong scaling (smallish subdomains) so that your > > application is sensitive to network latency. I see significant > > performance variability on XC-30 with this Full Multigrid solver that is > > not using PETSc. > > > > http://59a2.org/files/hopper-vs-edison.3semilogx.png > > > > See the factor of 2 performance variability for the samples of the ~15M > > element case. This operation is limited by instruction issue rather > > than bandwidth (indeed, it is several times faster than doing the same > > operations with assembled matrices). Here the variability is within the > > same application performing repeated solves. If you get a different > > partition on a different run, you can see larger variation. > > > > If your matrices are large enough, your performance will be limited by > > memory bandwidth. (This is the typical case, but sufficiently small > > matrices can fit in cache.) I once encountered a batch system that did > > not properly reset nodes between runs, leaving a partially-filled > > ramdisk distributed asymmetrically across the memory busses. This led > > to 3x performance reduction on 4-socket nodes because much of the memory > > demanded by the application would be faulted onto one memory bus. > > Presumably your machine has a resource manager that would not allow such > > things to happen. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrestelli at gmail.com Fri May 9 07:19:24 2014 From: mrestelli at gmail.com (marco restelli) Date: Fri, 9 May 2014 14:19:24 +0200 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: <87lhubk3zf.fsf@jedbrown.org> Message-ID: 2014-05-09 13:14 GMT+0200, Matthew Knepley : > On Fri, May 9, 2014 at 3:15 AM, marco restelli wrote: > >> 2014-05-09 5:47 GMT+0200, Jed Brown : >> > marco restelli writes: >> >> Matt, thanks but this I don't understand. What I want is getting three >> >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I >> >> can send them around with MPI. 
>> >> >> >> MatGetSubmatrices would give me some PETSc objects, which I can not >> >> pass to MPI, right? >> > >> > I'm not sure you want this, but you can use MatGetRowIJ and similar to >> > access the representation you're asking for if you are dead set on >> > depending on a specific data format rather than using generic >> > interfaces. >> > >> >> Jed, thank you. This is probably not the PETSc solution, but still it >> might a solution! >> >> I have found this example for MatGetRowIJ: >> > > I really do not think you want to do this. It is complex, fragile and I > believe the performance > improvement to be non-existent. You can get the effect you want JUST by > using one function. > For example, suppose you want 2 procs to get rows [0,5] and two procs to > get rows [1,3], then > > procs A.B > > MatGetSubmatrices(A, 2, [0,5], 2, [0,5], ..., &submat) > > procs C, D > > MatGetSubmatrices(A, 2, [1,3], 2, [1,3], ..., &submat) > > and its done. No MPI, no extraction which depends on the Mat data > structure. Matt, I understand that the idea is to avoid using MPI, but I don't see how getting a submatrix is related to my problem. Probably a simpler version of my problem is the following: one matrix is distributed on procs. 0,1 another matrix is distributed on procs. 2,3 The two matrices have the same size and I want to add them. For the resulting matrix, I want two copies, one is distributed among 0,1 and the second one among 2,3. A possibility that I see now is creating a third matrix, with the same size, distributed among all the four processors: 0,1,2,3, setting it to zero and then letting processors 0,1 add their matrix, and also 2,3 add their own. Then I could convert the result into two matrices, making the two copies that I need. This works provided that in MatAXPY I can uses matrices distributed on different processors: given that the function computes Y = a*X + Y in my case it would be Y -> procs. 0,1,2,3 X -> procs. 0,1 Would this work? Marco From ant_mil at hotmail.com Fri May 9 07:29:39 2014 From: ant_mil at hotmail.com (Antonios Mylonakis) Date: Fri, 9 May 2014 15:29:39 +0300 Subject: [petsc-users] Errors in running Message-ID: Dear Sir or Madam I am a new PETSc user. I am using PETSc library with fortran. I have the following problem. I want to use the matrix-free form of krylov solvers. So I am starting by using the example ex14f.F.In this example, within subroutine mymult() I try call another subroutine which calculates the vector I need as the result of the matrix-vector multiplication.In this second subroutine the vector is defined as a simple array. (Is this the problem?) The problem is that I receive errors when I'm attempting to run the program. The problem seems to be related with memory, but I am not sure. The first line of errors can be seen below:"Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range" Could you help me? Thanks in advance -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri May 9 07:30:10 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 May 2014 07:30:10 -0500 Subject: [petsc-users] Equivalent of all_reduce for sparse matrices In-Reply-To: References: <87lhubk3zf.fsf@jedbrown.org> Message-ID: On Fri, May 9, 2014 at 7:19 AM, marco restelli wrote: > 2014-05-09 13:14 GMT+0200, Matthew Knepley : > > On Fri, May 9, 2014 at 3:15 AM, marco restelli > wrote: > > > >> 2014-05-09 5:47 GMT+0200, Jed Brown : > >> > marco restelli writes: > >> >> Matt, thanks but this I don't understand. What I want is getting > three > >> >> arrays (i,j,coeff) with all the nonzero local coefficients, so that I > >> >> can send them around with MPI. > >> >> > >> >> MatGetSubmatrices would give me some PETSc objects, which I can not > >> >> pass to MPI, right? > >> > > >> > I'm not sure you want this, but you can use MatGetRowIJ and similar to > >> > access the representation you're asking for if you are dead set on > >> > depending on a specific data format rather than using generic > >> > interfaces. > >> > > >> > >> Jed, thank you. This is probably not the PETSc solution, but still it > >> might a solution! > >> > >> I have found this example for MatGetRowIJ: > >> > > > > I really do not think you want to do this. It is complex, fragile and I > > believe the performance > > improvement to be non-existent. You can get the effect you want JUST by > > using one function. > > For example, suppose you want 2 procs to get rows [0,5] and two procs to > > get rows [1,3], then > > > > procs A.B > > > > MatGetSubmatrices(A, 2, [0,5], 2, [0,5], ..., &submat) > > > > procs C, D > > > > MatGetSubmatrices(A, 2, [1,3], 2, [1,3], ..., &submat) > > > > and its done. No MPI, no extraction which depends on the Mat data > > structure. > > Matt, I understand that the idea is to avoid using MPI, but I don't > see how getting a submatrix is related to my problem. > > Probably a simpler version of my problem is the following: > > one matrix is distributed on procs. 0,1 > another matrix is distributed on procs. 2,3 > > The two matrices have the same size and I want to add them. For the > resulting matrix, I want two copies, one is distributed among 0,1 and > the second one among 2,3. > If you want distributed matrices to come out you could make one call to MatGetSubmatrix() for each group, but that is unattractive for a large number of groups. I am not seeing the value you get by distributing these matrices if you are just going to make copies later. Matt A possibility that I see now is creating a third matrix, with the same > size, distributed among all the four processors: 0,1,2,3, setting it > to zero and then letting processors 0,1 add their matrix, and also 2,3 > add their own. Then I could convert the result into two matrices, > making the two copies that I need. > > This works provided that in MatAXPY I can uses matrices distributed on > different processors: given that the function computes > > Y = a*X + Y > > in my case it would be > Y -> procs. 0,1,2,3 > X -> procs. 0,1 > > Would this work? > > Marco > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Fri May 9 07:30:08 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 09 May 2014 07:30:08 -0500 Subject: [petsc-users] possible performance issues with PETSc on Cray In-Reply-To: References: <283192B3-CD15-4118-A688-EB6E500831A6@ldeo.columbia.edu> <8761mgw0ol.fsf@jedbrown.org> Message-ID: <87fvkjjfsf.fsf@jedbrown.org> Samar Khatiwala writes: > CRAY has finally gotten back with a solution, > which is to use the run-time option -vecscatter_alltoall. Apparently > this is a known issue and according to the HLRN folks passing this > command line option to PETSc seems to work nicely. This option is good when you have nearly-dense rows or columns (in terms of processors depended on). For problems with actual dense rows or columns, it is good to formulate as a sparse matrix plus a low-rank correction. The other cases are usually poor dof layout, and reordering will make the graph sparser. Sparse problems with good layout usually run faster with the default VecScatter, though there are exceptions (mostly non-PDE problems). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From knepley at gmail.com Fri May 9 07:31:23 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 May 2014 07:31:23 -0500 Subject: [petsc-users] Errors in running In-Reply-To: References: Message-ID: On Fri, May 9, 2014 at 7:29 AM, Antonios Mylonakis wrote: > Dear Sir or Madam > > I am a new PETSc user. I am using PETSc library with fortran. > I have the following problem. I want to use the matrix-free form of krylov > solvers. So I am starting by using the example ex14f.F. > In this example, within subroutine mymult() I try call another subroutine > which calculates the vector I need as the result of the matrix-vector > multiplication.In this second subroutine the vector is defined as a simple > array. (Is this the problem?) > The problem is that I receive errors when I'm attempting to run the > program. The problem seems to be related with memory, but I am not sure. > > The first line of errors can be seen below: > "Caught signal number 11 SEGV: Segmentation Violation, probably memory > access out of range > Always send the entire error meesage. The rest of the message tells you to run valgrind. Matt > Could you help me? > > Thanks in advance > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From spk at ldeo.columbia.edu Fri May 9 07:40:37 2014 From: spk at ldeo.columbia.edu (Samar Khatiwala) Date: Fri, 9 May 2014 08:40:37 -0400 Subject: [petsc-users] possible performance issues with PETSc on Cray In-Reply-To: <87fvkjjfsf.fsf@jedbrown.org> References: <283192B3-CD15-4118-A688-EB6E500831A6@ldeo.columbia.edu> <8761mgw0ol.fsf@jedbrown.org> <87fvkjjfsf.fsf@jedbrown.org> Message-ID: Hi Jed, This is useful to know. My matrices are all very sparse but just may not be ordered optimally (there's a problem-specific reason why I order them in a certain way). That said, this is the first time in many years of similar computations with similar matrices that I've encountered this problem. It may just be peculiar to the XC30's. 
Thanks, Samar On May 9, 2014, at 8:30 AM, Jed Brown wrote: > Samar Khatiwala writes: >> CRAY has finally gotten back with a solution, >> which is to use the run-time option -vecscatter_alltoall. Apparently >> this is a known issue and according to the HLRN folks passing this >> command line option to PETSc seems to work nicely. > > This option is good when you have nearly-dense rows or columns (in terms > of processors depended on). For problems with actual dense rows or > columns, it is good to formulate as a sparse matrix plus a low-rank > correction. The other cases are usually poor dof layout, and reordering > will make the graph sparser. Sparse problems with good layout usually > run faster with the default VecScatter, though there are exceptions > (mostly non-PDE problems). From bsmith at mcs.anl.gov Fri May 9 08:00:29 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 9 May 2014 08:00:29 -0500 Subject: [petsc-users] Errors in running In-Reply-To: References: Message-ID: <85533106-1C5E-498F-8BB5-F078CD218E05@mcs.anl.gov> On May 9, 2014, at 7:29 AM, Antonios Mylonakis wrote: > Dear Sir or Madam > > I am a new PETSc user. I am using PETSc library with fortran. > I have the following problem. I want to use the matrix-free form of krylov solvers. So I am starting by using the example ex14f.F. > In this example, within subroutine mymult() I try call another subroutine which calculates the vector I need as the result of the matrix-vector multiplication.In this second subroutine the vector is defined as a simple array. (Is this the problem?) A PETSc Vec is NOT a simple array you cannot do something like myroutine( x) double x(*) ?. anotherroutine(y) Vec y call myroutine(y) to access local entries in PETSc Vec directly you need to call VecGetArray() or VecGetArrayF90() Barry > The problem is that I receive errors when I'm attempting to run the program. The problem seems to be related with memory, but I am not sure. > > The first line of errors can be seen below: > "Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range" > > > Could you help me? > > Thanks in advance From puesoek at uni-mainz.de Fri May 9 09:50:36 2014 From: puesoek at uni-mainz.de (=?iso-8859-1?Q?P=FCs=F6k=2C_Adina-Erika?=) Date: Fri, 9 May 2014 14:50:36 +0000 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> Message-ID: <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> Yes, I tested the implementation with both MatZeroRowsIS() and MatZeroRowsColumnsIS(). But first, I will be more explicit about the problem I was set to solve: We have a Dirichlet block of size (L,W,H) and centered (xc,yc,zc), which is much smaller than the model domain, and we set Vx = Vpush, Vy=0 within the block (Vz is let free for easier convergence). As I said before, since the code does not have a monolithic matrix, but 4 submatrices (VV VP; PV PP), and the rhs has 2 sub vectors rhs=(f; g), my approach is to modify only (VV, VP, f) for the Dirichlet BC. 
The way I tested the implementation: 1) Output (VV, VP, f, Dirichlet dofs) - unmodified (no Dirichlet BC) 2) Output (VV, VP, f, Dirichlet dofs) - a) modified with MatZeroRowsIS(), - b) modified with MatZeroRowsColumnsIS() -> S_PETSc Again, the only difference between a) and b) is: // ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); // ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); 3) Read them in Matlab and perform the exact same operations on the unmodified matrices and f vector. -> S_Matlab 4) Compare S_PETSc with S_Matlab. If the implementation is correct, they should be equal (VV, VP, f). 5) Check for 1 cpu and 4 cpus. Now to answer your questions: a,b,d) Yes, matrix modification is done correctly (check the spy diagrams below) in all cases: MatZeroRowsIS() and MatZeroRowsColumnsIS() on 1 and 4 cpus. I should have said that in the piece of code above: v_vv = 1.0; v_vp = 0.0; The vector x_push is a duplicate of rhs, with zero elements except the values for the Dirichlet dofs. c) The rhs is a different matter. With MatZeroRows() there is no problem. The rhs is equivalent with the one in Matlab, sequential and parallel. However, with MatZeroRowsColumns(), the residual contains nonzero elements, and in parallel the nonzero pattern is even bigger (1 cpu - 63, 4 cpu - 554). But if you look carefully, the values of the nonzero residuals are very small < +/- 1e-10. So, I did a tolerance filter: tol = 1e-10; res = f_petsc - f_mod_matlab; for i=1:length(res) if abs(res(i))>0 & abs(res(i))> wrote: Hello! I was trying to implement some internal Dirichlet boundary conditions into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to create an internal block (let's say Dirichlet block) that moves with constant velocity within the domain (i.e. check all the dofs within the block and set the values accordingly to the desired motion). Ideally, this means to zero the rows and columns in VV, VP, PV corresponding to the dirichlet dofs and modify the corresponding rhs values. However, since we have submatrices and not a monolithic matrix A, we can choose to modify only VV and PV matrices. The global indices of the velocity points within the Dirichlet block are contained in the arrays rowid_array. What I want to point out is that the function MatZeroRowsColumnsIS() seems to create parallel artefacts, compared to MatZeroRowsIS() when run on more than 1 processor. Moreover, the results on 1 cpu are identical. See below the results of the test (the Dirichlet block is outlined in white) and the piece of the code involved where the 1) - 2) parts are the only difference. I am assuming that you are showing the result of solving the equations. It would be more useful, and presumably just as easy to say: a) Are the correct rows zeroed out? b) Is the diagonal element correct? c) Is the rhs value correct? d) Are the columns zeroed correctly? If we know where the problem is, its easier to fix. For example, if the rhs values are correct and the rows are zeroed, then something is wrong with the solution procedure. Since ZeroRows() works and ZeroRowsColumns() does not, this is a distinct possibility. 
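As an illustration of those checks, a rough debugging fragment in the style of the thread's code (it assumes row is the global index of one Dirichlet dof owned by this process and reuses the thread's names VV_MAT, rhs, v_vv, x_push; it is only meant as a sanity check, not part of the original code):

/* 'row' is assumed to be the global index of a Dirichlet dof owned by this process */
PetscInt          ncols, j;
const PetscInt    *cols;
const PetscScalar *vals;
PetscScalar       bval;

ierr = MatGetRow(VV_MAT, row, &ncols, &cols, &vals); CHKERRQ(ierr);
for (j = 0; j < ncols; j++) {
  if (cols[j] == row) {
    /* (b) the diagonal entry should equal v_vv */
    PetscPrintf(PETSC_COMM_SELF, "diagonal (%D,%D) = %g\n", row, row, (double)PetscRealPart(vals[j]));
  } else if (vals[j] != 0.0) {
    /* (a),(d) a properly zeroed row has no other nonzeros */
    PetscPrintf(PETSC_COMM_SELF, "row %D still has a nonzero in column %D: %g\n", row, cols[j], (double)PetscRealPart(vals[j]));
  }
}
ierr = MatRestoreRow(VV_MAT, row, &ncols, &cols, &vals); CHKERRQ(ierr);

/* (c) the rhs entry of a zeroed row should be v_vv * x_push(row) */
ierr = VecGetValues(rhs, 1, &row, &bval); CHKERRQ(ierr);
PetscPrintf(PETSC_COMM_SELF, "rhs(%D) = %g\n", row, (double)PetscRealPart(bval));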
Thanks, Matt

Thanks, Adina Pusok

// Create an IS required by MatZeroRows()
ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array,PETSC_COPY_VALUES,&isx); CHKERRQ(ierr);
ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array,PETSC_COPY_VALUES,&isy); CHKERRQ(ierr);
ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array,PETSC_COPY_VALUES,&isz); CHKERRQ(ierr);

1) /* ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr);
ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr);
ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr);*/

2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr);
ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr);
ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr);

ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);
ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);
ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);

ierr = ISDestroy(&isx); CHKERRQ(ierr);
ierr = ISDestroy(&isy); CHKERRQ(ierr);
ierr = ISDestroy(&isz); CHKERRQ(ierr);

Results (velocity) with MatZeroRowsColumnsIS(): 1cpu 4cpu
Results (velocity) with MatZeroRowsIS(): 1cpu 4cpu
-- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
-------------- next part -------------- An HTML attachment was scrubbed... URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: spy_zerorows_1cpu.png Type: image/png Size: 15916 bytes Desc: spy_zerorows_1cpu.png URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: spy_zerorows_4cpu.png Type: image/png Size: 17690 bytes Desc: spy_zerorows_4cpu.png URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: spy_zerorowscol_1cpu.png Type: image/png Size: 16300 bytes Desc: spy_zerorowscol_1cpu.png URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: spy_zerorowscol_4cpu.png Type: image/png Size: 18174 bytes Desc: spy_zerorowscol_4cpu.png URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: residual_tol.png Type: image/png Size: 3577 bytes Desc: residual_tol.png URL:
From bsmith at mcs.anl.gov Fri May 9 14:31:01 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 9 May 2014 14:31:01 -0500 Subject: [petsc-users] Errors in running In-Reply-To: References: , <85533106-1C5E-498F-8BB5-F078CD218E05@mcs.anl.gov> Message-ID:

Always respond to ALL on mailing lists, otherwise you may never get an answer.

Since xx_v() is a PetscScalar, pointer :: xx_v(:), I believe it is best if your myroutine() takes a PetscScalar, pointer :: xx_v(:) argument, not a real::.

Also, by default PetscScalar is a double precision number. If you wish PETSc to use single precision numbers then you must ./configure it with --with-precision=single

Barry

On May 9, 2014, at 9:23 AM, Antonios Mylonakis wrote:
> Thanks for your help.
>
> So if I understand well I should do sth like this (?):
>
> myroutine(x)
> real:: x(2)
>
> ....
>
> end
>
>
> anotherroutine(y)
> vec y
> PetscScalar, pointer :: xx_v(:)
>
> call myroutine(y)
> VecGetArrayF90(y,xx_v,ierr)
> edit xx_v
> VecRestoreArray(y,xx_v,ierr)
> ...
> end > > > >> Subject: Re: [petsc-users] Errors in running >> From: bsmith at mcs.anl.gov >> Date: Fri, 9 May 2014 08:00:29 -0500 >> CC: petsc-users at mcs.anl.gov >> To: ant_mil at hotmail.com >> >> >> On May 9, 2014, at 7:29 AM, Antonios Mylonakis wrote: >> >>> Dear Sir or Madam >>> >>> I am a new PETSc user. I am using PETSc library with fortran. >>> I have the following problem. I want to use the matrix-free form of krylov solvers. So I am starting by using the example ex14f.F. >>> In this example, within subroutine mymult() I try call another subroutine which calculates the vector I need as the result of the matrix-vector multiplication.In this second subroutine the vector is defined as a simple array. (Is this the problem?) >> >> A PETSc Vec is NOT a simple array you cannot do something like >> >> myroutine( x) >> double x(*) >> ?. >> >> >> anotherroutine(y) >> Vec y >> call myroutine(y) >> >> to access local entries in PETSc Vec directly you need to call VecGetArray() or VecGetArrayF90() >> >> Barry >> >> >>> The problem is that I receive errors when I'm attempting to run the program. The problem seems to be related with memory, but I am not sure. >>> >>> The first line of errors can be seen below: >>> "Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range" >>> >>> >>> Could you help me? >>> >>> Thanks in advance >> From info at jubileedvds.com Sat May 10 21:40:26 2014 From: info at jubileedvds.com (Jubilee DVDs) Date: Sun, 11 May 2014 04:40:26 +0200 (SAST) Subject: [petsc-users] Jubilee DVDs Newsletter Message-ID: <1195896-1399775882773-133838-250313049-1-0@b.ss51.mailboxesmore.com> An HTML attachment was scrubbed... URL: From likunt at caltech.edu Sat May 10 23:11:22 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Sat, 10 May 2014 21:11:22 -0700 (PDT) Subject: [petsc-users] about VecScatter Message-ID: <51984.131.215.220.161.1399781482.squirrel@webmail.caltech.edu> Dear Petsc developers, I have a vector object M, I need all the elements of it in all the processors. Here is a part of my code ////////////////////////////////////////////////////////////// Vec M; VecScatterCreateToAll(M,&scatter_ctx,&N); VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); VecGetArray(N, &aM); for(i=xs; i References: <51984.131.215.220.161.1399781482.squirrel@webmail.caltech.edu> Message-ID: <87oaz5c5cj.fsf@jedbrown.org> likunt at caltech.edu writes: > Dear Petsc developers, > > I have a vector object M, I need all the elements of it in all the > processors. > > Here is a part of my code > > ////////////////////////////////////////////////////////////// > Vec M; > VecScatterCreateToAll(M,&scatter_ctx,&N); > VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > > VecGetArray(N, &aM); > > for(i=xs; i { > //within the loop, requires all the elements of aM > } > //////////////////////////////////////////////////////////// > > but this seems not working well. The phrase "not working" should never appear unqualified in polite conversation. Send steps to reproduce, what you expect, and what you observe. > Would you please suggest a more efficient way? Thank you. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From likunt at caltech.edu Sat May 10 23:40:55 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Sat, 10 May 2014 21:40:55 -0700 (PDT) Subject: [petsc-users] about VecScatter In-Reply-To: <87oaz5c5cj.fsf@jedbrown.org> References: <51984.131.215.220.161.1399781482.squirrel@webmail.caltech.edu> <87oaz5c5cj.fsf@jedbrown.org> Message-ID: <52104.131.215.220.161.1399783255.squirrel@webmail.caltech.edu> Dear Jed, Thanks for your reply. Below is a more complete version of the code. I need to loop over all the elements of aM to compute a new Vector called result. But this process is very slow, I would appreciate if you can give advice on speeding it up. Many thanks. ////////////////////////////////////////////////////////////// Vec M,N,result; DM da; DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,NODE,1,1,NULL,&da); DMCreateGlobalVector(da, &M); //set values of M .. VecScatterCreateToAll(M,&scatter_ctx,&N); VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); VecGetArray(N, &aM); DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); for(i=xs; i likunt at caltech.edu writes: > >> Dear Petsc developers, >> >> I have a vector object M, I need all the elements of it in all the >> processors. >> >> Here is a part of my code >> >> ////////////////////////////////////////////////////////////// >> Vec M; >> VecScatterCreateToAll(M,&scatter_ctx,&N); >> VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); >> VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); >> >> VecGetArray(N, &aM); >> >> for(i=xs; i > What are xs and xm in this setting. What do you intend? > >> { >> //within the loop, requires all the elements of aM >> } >> //////////////////////////////////////////////////////////// >> >> but this seems not working well. > > The phrase "not working" should never appear unqualified in polite > conversation. Send steps to reproduce, what you expect, and what you > observe. > >> Would you please suggest a more efficient way? Thank you. > From knepley at gmail.com Sun May 11 06:27:25 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 11 May 2014 06:27:25 -0500 Subject: [petsc-users] about VecScatter In-Reply-To: <52104.131.215.220.161.1399783255.squirrel@webmail.caltech.edu> References: <51984.131.215.220.161.1399781482.squirrel@webmail.caltech.edu> <87oaz5c5cj.fsf@jedbrown.org> <52104.131.215.220.161.1399783255.squirrel@webmail.caltech.edu> Message-ID: On Sat, May 10, 2014 at 11:40 PM, wrote: > Dear Jed, > > Thanks for your reply. Below is a more complete version of the code. I > need to loop over all the elements of aM to compute a new Vector called > result. But this process is very slow, I would appreciate if you can give > advice on speeding it up. Many thanks. > > ////////////////////////////////////////////////////////////// > Vec M,N,result; > DM da; > DMDACreate1d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,NODE,1,1,NULL,&da); > DMCreateGlobalVector(da, &M); > //set values of M .. > VecScatterCreateToAll(M,&scatter_ctx,&N); > VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > VecGetArray(N, &aM); > 1) Everything below is nonsensical. The values of M are already in N. 2) Sending all values to a single process is inherently slow. 
It should not be done in parallel computing Matt > DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); > for(i=xs; i { > val=0.0; > for(j=0; j { > val=val+aM[j]; > > } > VecSetValues(result, 1, i, val, INSERT_VALUES); > } > > VecRestoreArray(N, &aM); > VecAssemblyBegin(result); > VecAssemblyEnd(result); > //////////////////////////////////////////////////////////// > > > > likunt at caltech.edu writes: > > > >> Dear Petsc developers, > >> > >> I have a vector object M, I need all the elements of it in all the > >> processors. > >> > >> Here is a part of my code > >> > >> ////////////////////////////////////////////////////////////// > >> Vec M; > >> VecScatterCreateToAll(M,&scatter_ctx,&N); > >> VecScatterBegin(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > >> VecScatterEnd(scatter_ctx,M,N,INSERT_VALUES,SCATTER_FORWARD); > >> > >> VecGetArray(N, &aM); > >> > >> for(i=xs; i > > > What are xs and xm in this setting. What do you intend? > > > >> { > >> //within the loop, requires all the elements of aM > >> } > >> //////////////////////////////////////////////////////////// > >> > >> but this seems not working well. > > > > The phrase "not working" should never appear unqualified in polite > > conversation. Send steps to reproduce, what you expect, and what you > > observe. > > > >> Would you please suggest a more efficient way? Thank you. > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlk0812 at hotmail.com Sun May 11 20:07:17 2014 From: tlk0812 at hotmail.com (LikunTan) Date: Mon, 12 May 2014 09:07:17 +0800 Subject: [petsc-users] Normalize vectors Message-ID: Dear Petsc developers, I have a vector M which consists of a series of 3d vectors, and I want to reset M by normalizing each 3d vector. Here is my code: /**********************************************************************VecGetArray(M, &aM); DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); for(node=xs; nodeNDOF; l++) { aM[node*3+l]=val[l]/mag; }}VecRestoreArray(M, &aM); VecAssemblyBegin(M);VecAssemblyEnd(M); VecView(M, PETSC_VIEWER_STDOUT_WORLD);**********************************************************************/ but I got the error at the last step:--------------------------------------------------------------------------mpiexec noticed that process rank 3 with PID 17156 on node compute-21-8.local exited on signal 6 (Aborted).--------------------------------------------------------------------------and if I commented out VecView, and used vector M for other operations, e.g.KSPSolve(ksp, M, b), I got "memory corruption" message. Your comment on this issue is well appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 11 20:09:40 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 11 May 2014 20:09:40 -0500 Subject: [petsc-users] Normalize vectors In-Reply-To: References: Message-ID: On Sun, May 11, 2014 at 8:07 PM, LikunTan wrote: > Dear Petsc developers, > > I have a vector M which consists of a series of 3d vectors, and I want to > reset M by normalizing each 3d vector. 
Here is my code:
>
> /**********************************************************************
> VecGetArray(M, &aM);
>
It's easier not to make a mistake if you use DMDAVecGetArrayDOF()

   Matt

> DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0);
> for(node=xs; node<xs+xm; node++)
> {
> mag=0.0;
> for(l=0; l<3; l++)
> {
> val[l]=aM[node*3+l];
> mag=mag+val[l]*val[l];
> }
> mag=sqrt(mag);
> for(l=0; l<NDOF; l++)
> {
> aM[node*3+l]=val[l]/mag;
> }
> }
> VecRestoreArray(M, &aM);
> VecAssemblyBegin(M);
> VecAssemblyEnd(M);
> VecView(M, PETSC_VIEWER_STDOUT_WORLD);
> **********************************************************************/
>
> but I got the error at the last step:
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 3 with PID 17156 on node
> compute-21-8.local exited on signal 6 (Aborted).
> --------------------------------------------------------------------------
> and if I commented out VecView, and used vector M for other operations,
> e.g.
> KSPSolve(ksp, M, b), I got "memory corruption" message. Your comment on
> this issue is well appreciated.
>
-- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From bsmith at mcs.anl.gov Sun May 11 21:08:11 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 11 May 2014 21:08:11 -0500 Subject: [petsc-users] Normalize vectors In-Reply-To: References: Message-ID: <8FBEE805-C520-4700-A9D7-97D4DB4EDC79@mcs.anl.gov>

On May 11, 2014, at 8:07 PM, LikunTan wrote:
> Dear Petsc developers,
>
> I have a vector M which consists of a series of 3d vectors, and I want to reset M by normalizing each 3d vector. Here is my code:

This code is wrong in several ways

> /**********************************************************************
> VecGetArray(M, &aM);

This always returns an array that is indexed starting at 0, so the code below when you access with aM[node*3+l] is like totally reading from the wrong place. Instead use

typedef struct { PetscScalar x,y,z; } Field;
Field *aM;
DMDAVecGetArray(da,M,&aM);

This routine returns an array whose index starts at xs

> DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0);
>
> for(node=xs; node<xs+xm; node++)
> {
> mag=PetscSqrtScalar(aM[node].x*aM[node].x + aM[node].y*aM[node].y + aM[node].z*aM[node].z);
> if (mag != 0.0) {
> aM[node].x /= mag;
> aM[node].y /= mag;
> aM[node].z /= mag; }
> }
> DMDAVecRestoreArray(da,M, &aM);
>
When accessing the vector arrays directly you do not need VecAssemblyBegin/End(); they are only for use with VecSetValues(). Also use valgrind to find memory access problems: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind

> VecView(M, PETSC_VIEWER_STDOUT_WORLD);
> **********************************************************************/
>
> but I got the error at the last step:
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 3 with PID 17156 on node compute-21-8.local exited on signal 6 (Aborted).
> --------------------------------------------------------------------------
> and if I commented out VecView, and used vector M for other operations, e.g.
> KSPSolve(ksp, M, b), I got "memory corruption" message. Your comment on this issue is well appreciated.
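Putting Barry's suggestions together, a self-contained sketch of the normalization (it assumes the 1d DMDA was created with dof = 3 and real scalars; the routine name NormalizeNodes and the Field layout are illustrative):

#include <petscdmda.h>

typedef struct { PetscScalar x, y, z; } Field;

/* Normalize each 3-component node of a global vector attached to a 1d DMDA, in place. */
PetscErrorCode NormalizeNodes(DM da, Vec M)
{
  Field          *aM;
  PetscInt       xs, xm, node;
  PetscReal      mag;
  PetscErrorCode ierr;

  ierr = DMDAVecGetArray(da, M, &aM);CHKERRQ(ierr);
  ierr = DMDAGetCorners(da, &xs, NULL, NULL, &xm, NULL, NULL);CHKERRQ(ierr);
  for (node = xs; node < xs + xm; node++) {
    mag = PetscSqrtReal(PetscRealPart(aM[node].x*aM[node].x
                                      + aM[node].y*aM[node].y
                                      + aM[node].z*aM[node].z));
    if (mag != 0.0) {            /* leave zero vectors untouched */
      aM[node].x /= mag;
      aM[node].y /= mag;
      aM[node].z /= mag;
    }
  }
  ierr = DMDAVecRestoreArray(da, M, &aM);CHKERRQ(ierr);
  /* no VecAssemblyBegin/End needed: the array was modified in place */
  return 0;
}

It can be called right after the values of M are set, e.g. ierr = NormalizeNodes(da, M);CHKERRQ(ierr);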
From hzhang at mcs.anl.gov Mon May 12 10:28:29 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Mon, 12 May 2014 10:28:29 -0500 Subject: [petsc-users] MatGetMumpsRINFOG() In-Reply-To: <16ccafbcb0c941b09a063946350a6688@GEORGE.anl.gov> References: <16ccafbcb0c941b09a063946350a6688@GEORGE.anl.gov> Message-ID: Zafer, MatGetMumpsXXX() are added to petsc development https://bitbucket.org/petsc/petsc/commits/d28e04f7b1f5a73d3305399116f492322ea7448c Hong On Tue, May 6, 2014 at 11:39 PM, Zafer Leylek wrote: > Hi, > > I am trying to get mumps to return the matrix determinant. I have set the > ICNTL option using: > > MatMumpsSetIcntl(A,33,1); > > and can view the determinant using > > PCView(pc, PETSC_VIEWER_STDOUT_WORLD); > > I need to use the determinant in my code. Is there a way I can get petsc to > return this parameter. If not, is it possible to implement the > MatGetMumpsRINFOG() as suggested in: > > http://lists.mcs.anl.gov/pipermail/petsc-users/2011-September/010225.html > > King Regards > > ZL From likunt at caltech.edu Mon May 12 12:27:51 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Mon, 12 May 2014 10:27:51 -0700 (PDT) Subject: [petsc-users] solving Ax=b with constant A Message-ID: <58653.131.215.220.165.1399915671.squirrel@webmail.caltech.edu> Dear Petsc developers, I am solving a linear system Ax=b, while A is constant and b is changing in each time step. Here is the code I wrote: /**************************************************************** ...compute matrix A... KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr); KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER); CHKERRQ(ierr); KSPSetTolerances(ksp, 1.e-5, 1.E-50, PETSC_DEFAULT, PETSC_DEFAULT); KSPSetFromOptions(ksp); for(int step=0; step References: <58653.131.215.220.165.1399915671.squirrel@webmail.caltech.edu> Message-ID: On Mon, May 12, 2014 at 12:27 PM, wrote: > Dear Petsc developers, > > I am solving a linear system Ax=b, while A is constant and b is changing > in each time step. Here is the code I wrote: > > /**************************************************************** > ...compute matrix A... > KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr); > KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER); CHKERRQ(ierr); > KSPSetTolerances(ksp, 1.e-5, 1.E-50, PETSC_DEFAULT, PETSC_DEFAULT); > KSPSetFromOptions(ksp); > for(int step=0; step { > ... compute vector b ... > KSPSolve(ksp, b, x); > } > *****************************************************************/ > > I tested a system with size 1725*1725, on 4 processors, it takes 0.06s. > Would you please let me know if there is a way to improve its efficiency? > It would be amazing if we could do that given the description. First, we do not know exactly what solver is being used (-ksp_view), but lets assume its GMRES/ILU(0) which is the default. Second, we have no idea what the convergence was like (-ksp_monitor_true_residual -ksp_converged_reason), so we do not know what the bottleneck is, and have no performance monitoring (-log_summary). Lastly, even if we had that we have no idea what the operator is so that we could make intelligent suggestions for other preconditioners. Matt > Thanks. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
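For the constant-matrix loop above, the first step is the diagnostics Matt lists: run with -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_summary to see which solver is actually used and where the time goes. The same information can also be collected per step in the code; the fragment below is only a sketch that reuses the ksp, b and x from the post, and the idea of printing every step (and the message wording) is illustrative rather than something from the thread.

/* Inside the time loop, after computing b for this step: */
PetscInt           its;
KSPConvergedReason reason;
PetscErrorCode     ierr;

ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD, "step solve: %D iterations, converged reason %D\n", its, (PetscInt)reason);CHKERRQ(ierr);

If the iteration counts stay small and stable, a 1725-by-1725 system is small enough that setup and communication can dominate the 0.06 s on 4 processes; if they grow, a different preconditioner (chosen once the operator is known, as Matt says) is the place to look.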
URL: From lu_qin_2000 at yahoo.com Mon May 12 16:54:17 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Mon, 12 May 2014 14:54:17 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc In-Reply-To: References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> Message-ID: <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> Hello, I have built?PETSc with SuperLU,?but what are?PETSc's command line?options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) ? Do I need to do some programming in order to call SuperLU's preconditioner,?or the command line options would work??? ? Many thanks, Qin??? ?From: Xiaoye S. Li To: Barry Smith Cc: Qin Lu ; "petsc-users at mcs.anl.gov" Sent: Friday, May 2, 2014 3:40 PM Subject: Re: [petsc-users] ILUTP in PETSc The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. ? In SuperLU distribution: ? EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) ? SRC/zgsitrf.c : the actual ILUTP factorization routine Sherry Li On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: >At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html ?there are two listed. ./configure ?download-hypre > >mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid > >you can also add -help to see what options are available. > >? Both pretty much suck and I can?t image much reason for using them. > >? ?Barry > > > >On May 2, 2014, at 10:27 AM, Qin Lu wrote: > >> Hello, >> >> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >> >> Many thanks, >> Qin > >?? ?? From bsmith at mcs.anl.gov Mon May 12 17:11:12 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 12 May 2014 17:11:12 -0500 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> Message-ID: <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html On May 12, 2014, at 4:54 PM, Qin Lu wrote: > Hello, > > I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) > > Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work? > > Many thanks, > Qin > > > From: Xiaoye S. Li > To: Barry Smith > Cc: Qin Lu ; "petsc-users at mcs.anl.gov" > Sent: Friday, May 2, 2014 3:40 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > > > The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. 
> > In SuperLU distribution: > > EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) > > SRC/zgsitrf.c : the actual ILUTP factorization routine > > > Sherry Li > > > > On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: > > >> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre >> >> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid >> >> you can also add -help to see what options are available. >> >> Both pretty much suck and I can?t image much reason for using them. >> >> Barry >> >> >> >> On May 2, 2014, at 10:27 AM, Qin Lu wrote: >> >>> Hello, >>> >>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdf that mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >>> >>> Many thanks, >>> Qin >> >> From zonexo at gmail.com Mon May 12 21:52:08 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Tue, 13 May 2014 10:52:08 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <534C9DB5.9070407@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> Message-ID: <537188D8.2030307@gmail.com> Hi, I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. Thank you. Yours sincerely, TAY wee-beng On 21/4/2014 8:58 AM, Barry Smith wrote: > Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. > > Barry > > On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: > >> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>>>> >>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>>>> >>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>>> Hmm, >>>>>>> >>>>>>> Interface DMDAVecGetArrayF90 >>>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>>> USE_DM_HIDE >>>>>>> DM_HIDE da1 >>>>>>> VEC_HIDE v >>>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>>> PetscErrorCode ierr >>>>>>> End Subroutine >>>>>>> >>>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? 
>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>> Hi, >>>>>>> >>>>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>>>> >>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>>>> >>>>>>> Also, supposed I call: >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> u_array .... >>>>>>> >>>>>>> v_array .... etc >>>>>>> >>>>>>> Now to restore the array, does it matter the sequence they are restored? >>>>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>>>> >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>>> >>>>>>> u_array = 0.d0 >>>>>>> >>>>>>> v_array = 0.d0 >>>>>>> >>>>>>> w_array = 0.d0 >>>>>>> >>>>>>> p_array = 0.d0 >>>>>>> >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>>> >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>>>> >>>>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>> Hi Matt, >>>>>> >>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>> >>>>>> It already has DMDAVecGetArray(). Just run it. >>>>> Hi, >>>>> >>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>> >>>>> No the global/local difference should not matter. >>>>> >>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>> >>>>> DMGetLocalVector() >>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>> >>>> If so, when should I call them? >>>> >>>> You just need a local vector from somewhere. >> Hi, >> >> Anyone can help with the questions below? Still trying to find why my code doesn't work. >> >> Thanks. 
>>> Hi, >>> >>> I insert part of my error region code into ex11f90: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>> >>> u_array = 0.d0 >>> >>> v_array = 0.d0 >>> >>> w_array = 0.d0 >>> >>> p_array = 0.d0 >>> >>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>> >>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>> >>> module solve >>> <- add include file? >>> subroutine RRK >>> <- add include file? >>> end subroutine RRK >>> >>> end module solve >>> >>> So where should the include files (#include ) be placed? >>> >>> After the module or inside the subroutine? >>> >>> Thanks. >>>> Matt >>>> >>>> Thanks. >>>>> Matt >>>>> >>>>> Thanks. >>>>>> Matt >>>>>> >>>>>> Thanks >>>>>> >>>>>> Regards. >>>>>>> Matt >>>>>>> >>>>>>> As in w, then v and u? >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> thanks >>>>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>>>> Hi, >>>>>>> >>>>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>>> >>>>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>>>> >>>>>>> >>>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> Thanks. >>>>>>> Barry >>>>>>> >>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>>>> >>>>>>> However, by re-writing my code, I found out a few things: >>>>>>> >>>>>>> 1. if I write my code this way: >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> u_array = .... >>>>>>> >>>>>>> v_array = .... 
>>>>>>> >>>>>>> w_array = .... >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> The code runs fine. >>>>>>> >>>>>>> 2. if I write my code this way: >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>>> >>>>>>> where the subroutine is: >>>>>>> >>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>> >>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>> >>>>>>> u ... >>>>>>> v... >>>>>>> w ... >>>>>>> >>>>>>> end subroutine uvw_array_change. >>>>>>> >>>>>>> The above will give an error at : >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>> >>>>>>> So they are now in reversed order. Now it works. >>>>>>> >>>>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>>>> >>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>> >>>>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>> >>>>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>> >>>>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>> >>>>>>> u ... >>>>>>> v... >>>>>>> w ... >>>>>>> >>>>>>> end subroutine uvw_array_change. >>>>>>> >>>>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>>>>> >>>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>>>> >>>>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>>>> >>>>>>> Thank you. >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> >>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>>>> >>>>>>> Hi Barry, >>>>>>> >>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>>>> >>>>>>> I have attached my code. 
>>>>>>> >>>>>>> Thank you >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>>> Please send the code that creates da_w and the declarations of w_array >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>>> >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Hi Barry, >>>>>>> >>>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>>> >>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>> >>>>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>>>> >>>>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>>>> >>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>> -------------------------------------------------------------------------- >>>>>>> An MPI process has executed an operation involving a call to the >>>>>>> "fork()" system call to create a child process. Open MPI is currently >>>>>>> operating in a condition that could result in memory corruption or >>>>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>>>> data corruption. The use of fork() (or system() or other calls that >>>>>>> create child processes) is strongly discouraged. >>>>>>> >>>>>>> The process that invoked fork was: >>>>>>> >>>>>>> Local host: n12-76 (PID 20235) >>>>>>> MPI_COMM_WORLD rank: 2 >>>>>>> >>>>>>> If you are *absolutely sure* that your application will successfully >>>>>>> and correctly survive a call to fork(), you may disable this warning >>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>>>> -------------------------------------------------------------------------- >>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>>>> >>>>>>> .... >>>>>>> >>>>>>> 1 >>>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [1]PETSC ERROR: or see >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>> [1]PETSC ERROR: to get more information on the crash. 
>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [3]PETSC ERROR: or see >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>> >>>>>>> ... >>>>>>> Thank you. >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>>> >>>>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>>>> >>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>>> >>>>>>> >>>>>>> >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>>>>> >>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>> -- >>>>>>> Thank you. >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener From altriaex86 at gmail.com Tue May 13 00:02:08 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 13 May 2014 15:02:08 +1000 Subject: [petsc-users] Configured with superlu but cannot find a package Message-ID: Hi, there >From the error message below I am sure I configured PETSc with superLU and superLU-DIST. However, it told me there's no such package. Or, is mpiaij not compatible with superlu? According to the manual, I think it should be compatible with it. Thanks a lot. [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Matrix format mpiaij does not have a solver package superlu for LU. Perhaps you must ./configure with --download-superlu! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named altria-Aspire-5830TG by root Tue May 13 14:53:33 2014 [0]PETSC ERROR: Libraries linked from /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib [0]PETSC ERROR: Configure run at Tue May 13 14:43:13 2014 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --with-cxx=g++ --download-mpich --download-scalapack --download-metis --download-parmetis --download-mumps --download-PASTIX --download-superLU --download-superLU-dist --with-scalar-type=complex --with-clanguage=cxx My code input(Ap,Ai,Ax,Az,size,nz); //Process input MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); EPSCreate( PETSC_COMM_WORLD, &eps ); //Setup Solver EPSSetOperators(eps,A,NULL); EPSSetProblemType(eps,EPS_NHEP); EPSSetDimensions(eps,1,6,0); EPSSetType(eps,type); EPSSetTarget(eps,offset); EPSSetWhichEigenpairs(eps,EPS_TARGET_REAL); //Set Target //EPSSetExtraction(eps,EPS_HARMONIC); EPSGetST(eps,&st); //shift-and-invert STSetType(st,STSINVERT); STSetShift(st,offset); STGetKSP(st,&ksp); KSPSetType(ksp,KSPPREONLY); KSPGetPC(ksp,&pc); PCSetType(pc,PCLU); PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); EPSSolve(eps); EPSGetConverged(eps,&nconv); Function input MatCreate(PETSC_COMM_WORLD,&A); MatSetType(A,MATMPIAIJ); MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); Guoxi -------------- next part -------------- An HTML attachment was scrubbed... 
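Two things go wrong in the configure/solve combination above, and both come up in the replies that follow: the configure option names are misspelled (Satish's point), and SuperLU itself is sequential, so an MPIAIJ matrix needs SuperLU_DIST (Hong's point). A hedged sketch of the corresponding one-line change in the posted code, assuming PETSc was configured with --download-superlu_dist; error checking is omitted to mirror the post:

/* Select the parallel SuperLU_DIST LU factorization for the MPIAIJ matrix. */
KSPSetType(ksp, KSPPREONLY);
KSPGetPC(ksp, &pc);
PCSetType(pc, PCLU);
PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);   /* instead of MATSOLVERSUPERLU */

The same choice can be made from the command line with -st_pc_factor_mat_solver_package superlu_dist, since this KSP lives inside the SLEPc spectral transformation.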
URL: From balay at mcs.anl.gov Tue May 13 00:10:35 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 May 2014 00:10:35 -0500 Subject: [petsc-users] Configured with superlu but cannot find a package In-Reply-To: References: Message-ID: Hm - none of these options should have any 'capitalized' letters [and superlu_dist has an '_' - not a '-' --download-PASTIX --download-superLU--download-superLU-dist They should be: --download-pastix --download-superlu --download-superlu_dist Satish On Tue, 13 May 2014, ??? wrote: > Hi, there > > From the error message below I am sure I configured PETSc with superLU and > superLU-DIST. However, it told me there's no such package. > Or, is mpiaij not compatible with superlu? According to the manual, I think > it should be compatible with it. Thanks a lot. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package superlu > for LU. Perhaps you must ./configure with --download-superlu! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named > altria-Aspire-5830TG by root Tue May 13 14:53:33 2014 > [0]PETSC ERROR: Libraries linked from > /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Tue May 13 14:43:13 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --with-cxx=g++ --download-mpich --download-scalapack --download-metis > --download-parmetis --download-mumps --download-PASTIX --download-superLU > --download-superLU-dist --with-scalar-type=complex --with-clanguage=cxx > > My code > > input(Ap,Ai,Ax,Az,size,nz); //Process input > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); > EPSCreate( PETSC_COMM_WORLD, &eps ); //Setup Solver > EPSSetOperators(eps,A,NULL); > EPSSetProblemType(eps,EPS_NHEP); > EPSSetDimensions(eps,1,6,0); > EPSSetType(eps,type); > EPSSetTarget(eps,offset); > EPSSetWhichEigenpairs(eps,EPS_TARGET_REAL); //Set Target > > > //EPSSetExtraction(eps,EPS_HARMONIC); > EPSGetST(eps,&st); //shift-and-invert > STSetType(st,STSINVERT); > STSetShift(st,offset); > STGetKSP(st,&ksp); > KSPSetType(ksp,KSPPREONLY); > KSPGetPC(ksp,&pc); > PCSetType(pc,PCLU); > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); > EPSSolve(eps); > EPSGetConverged(eps,&nconv); > > > Function input > > MatCreate(PETSC_COMM_WORLD,&A); > MatSetType(A,MATMPIAIJ); > MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); > MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); > MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); > > > > Guoxi > From altriaex86 at gmail.com Tue May 13 01:07:15 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 13 May 2014 16:07:15 +1000 Subject: [petsc-users] Configured with superlu but cannot find a package In-Reply-To: References: Message-ID: Oh, I fixed the spelling and reconfigured it but I got error again. 
[0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: No support for this operation for this object type! [0]PETSC ERROR: Matrix format mpiaij does not have a solver package superlu for LU. Perhaps you must ./configure with --download-superlu! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named altria-Aspire-5830TG by root Tue May 13 16:06:00 2014 [0]PETSC ERROR: Libraries linked from /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib [0]PETSC ERROR: Configure run at Tue May 13 15:58:29 2014 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --with-cxx=g++ --download-mpich --download-scalapack --download-metis --download-parmetis --download-mumps --download-superlu --download-superlu_dist --with-scalar-type=complex --with-clanguage=cxx 2014-05-13 15:10 GMT+10:00 Satish Balay : > Hm - none of these options should have any 'capitalized' letters > > [and superlu_dist has an '_' - not a '-' > > --download-PASTIX --download-superLU--download-superLU-dist > > They should be: > > --download-pastix --download-superlu --download-superlu_dist > > Satish > > On Tue, 13 May 2014, ??? wrote: > > > Hi, there > > > > From the error message below I am sure I configured PETSc with superLU > and > > superLU-DIST. However, it told me there's no such package. > > Or, is mpiaij not compatible with superlu? According to the manual, I > think > > it should be compatible with it. Thanks a lot. > > > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > [0]PETSC ERROR: No support for this operation for this object type! > > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package > superlu > > for LU. Perhaps you must ./configure with --download-superlu! > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named > > altria-Aspire-5830TG by root Tue May 13 14:53:33 2014 > > [0]PETSC ERROR: Libraries linked from > > /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib > > [0]PETSC ERROR: Configure run at Tue May 13 14:43:13 2014 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > > --with-cxx=g++ --download-mpich --download-scalapack --download-metis > > --download-parmetis --download-mumps --download-PASTIX --download-superLU > > --download-superLU-dist --with-scalar-type=complex --with-clanguage=cxx > > > > My code > > > > input(Ap,Ai,Ax,Az,size,nz); //Process input > > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); > > EPSCreate( PETSC_COMM_WORLD, &eps ); //Setup Solver > > EPSSetOperators(eps,A,NULL); > > EPSSetProblemType(eps,EPS_NHEP); > > EPSSetDimensions(eps,1,6,0); > > EPSSetType(eps,type); > > EPSSetTarget(eps,offset); > > EPSSetWhichEigenpairs(eps,EPS_TARGET_REAL); //Set Target > > > > > > //EPSSetExtraction(eps,EPS_HARMONIC); > > EPSGetST(eps,&st); //shift-and-invert > > STSetType(st,STSINVERT); > > STSetShift(st,offset); > > STGetKSP(st,&ksp); > > KSPSetType(ksp,KSPPREONLY); > > KSPGetPC(ksp,&pc); > > PCSetType(pc,PCLU); > > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); > > EPSSolve(eps); > > EPSGetConverged(eps,&nconv); > > > > > > Function input > > > > MatCreate(PETSC_COMM_WORLD,&A); > > MatSetType(A,MATMPIAIJ); > > MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); > > MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); > > MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); > > MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); > > > > > > > > Guoxi > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Tue May 13 02:28:08 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 13 May 2014 17:28:08 +1000 Subject: [petsc-users] Get wrong answer when use multi-process Message-ID: Hi, all I am confused about my code, for it could return right answer when I use 1 process, but return totally wrong answer when more than 1 process. This is how I feed data to it. I have a CSR matrix, represented by Ap(pointer),Ai(index),and temp(data). First determine local matrix for each process. Then feed data to them. int temprank,localsize,line_pos; line_pos = 0; if(rank == 0) { localsize = size/pro + ((size % pro) > rank); } else { for (temprank = 0;temprank temprank); line_pos += localsize; } } Lin_index = new int [localsize+1]; for(i=0;i From romain.veltz at inria.fr Tue May 13 03:22:34 2014 From: romain.veltz at inria.fr (Veltz Romain) Date: Tue, 13 May 2014 10:22:34 +0200 Subject: [petsc-users] Continuation Message-ID: <96748DAF-78E0-4068-AF66-D70E91D2F6A1@inria.fr> Dear Petsc users, I would like to perform numerical continuation with Petsc but it lacks this functionality. Hence, I am wondering if anybody has a class for Moore-Penrose continuation or Pseudo-arclength continuation done in Petsc before I start doing it myself? Thank you for your help, Veltz Romain -------------- next part -------------- An HTML attachment was scrubbed... 
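Returning to the "wrong answer when use multi-process" message above: one common cause of that symptom, when a matrix is filled from a hand-split CSR with MatMPIAIJSetPreallocationCSR(), is that the hand-computed row partition does not match the partition PETSc chooses for PETSC_DECIDE. The sketch below shows two ways to keep them consistent; it is only a guess at the problem (the thread does not confirm it), the names local_Aj and local_vals for the per-rank column-index and value arrays are made up, localsize, size and Lin_index are from the post, and error checking is omitted.

/* Option 1: tell PETSc the local row count your own split produced. */
MatCreate(PETSC_COMM_WORLD, &A);
MatSetSizes(A, localsize, localsize, PETSC_DETERMINE, PETSC_DETERMINE);
MatSetType(A, MATMPIAIJ);
MatMPIAIJSetPreallocationCSR(A, Lin_index, local_Aj, local_vals);  /* local row pointers, global column indices */

/* Option 2: compute the split PETSc itself would choose, then build the local CSR to match it. */
PetscInt nlocal = PETSC_DECIDE, nglobal = size;
PetscSplitOwnership(PETSC_COMM_WORLD, &nlocal, &nglobal);   /* nlocal = rows this rank will own */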
URL: From altriaex86 at gmail.com Tue May 13 07:33:47 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 13 May 2014 22:33:47 +1000 Subject: [petsc-users] Cannot open graphic monitor Message-ID: Hi, I tried to open the graphic monitor by char common_options[] = "-st_ksp_type preonly \ -st_pc_type lu \ -st_pc_factor_mat_solver_package mumps \ -eps_tol 1e-9 \ -eps_monitor_lg_all \ -draw_pause .2"; ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); Then ./program But nothing comes out. Should I install any other package first to get it? Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue May 13 09:16:04 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Tue, 13 May 2014 09:16:04 -0500 Subject: [petsc-users] Configured with superlu but cannot find a package In-Reply-To: <3bc6d9c38c644588b1ca5e0e7bb04721@LUCKMAN.anl.gov> References: <3bc6d9c38c644588b1ca5e0e7bb04721@LUCKMAN.anl.gov> Message-ID: ?? : > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package superlu ^^^^^^ ^^^^^^^^ Superlu is a sequential package. For parallel, you must use superlu_dist. Suggest install both Superlu and superlu_dist ( --download-superlu_dist). Hong > for LU. Perhaps you must ./configure with --download-superlu! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named > altria-Aspire-5830TG by root Tue May 13 16:06:00 2014 > > [0]PETSC ERROR: Libraries linked from > /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Tue May 13 15:58:29 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --with-cxx=g++ --download-mpich --download-scalapack --download-metis > --download-parmetis --download-mumps --download-superlu > --download-superlu_dist --with-scalar-type=complex --with-clanguage=cxx > > > > 2014-05-13 15:10 GMT+10:00 Satish Balay : > >> Hm - none of these options should have any 'capitalized' letters >> >> [and superlu_dist has an '_' - not a '-' >> >> --download-PASTIX --download-superLU--download-superLU-dist >> >> They should be: >> >> --download-pastix --download-superlu --download-superlu_dist >> >> Satish >> >> On Tue, 13 May 2014, ??? wrote: >> >> > Hi, there >> > >> > From the error message below I am sure I configured PETSc with superLU >> > and >> > superLU-DIST. However, it told me there's no such package. >> > Or, is mpiaij not compatible with superlu? According to the manual, I >> > think >> > it should be compatible with it. Thanks a lot. >> > >> > [0]PETSC ERROR: --------------------- Error Message >> > ------------------------------------ >> > [0]PETSC ERROR: No support for this operation for this object type! >> > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package >> > superlu >> > for LU. Perhaps you must ./configure with --download-superlu! 
>> > [0]PETSC ERROR: >> > ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 >> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> > [0]PETSC ERROR: See docs/index.html for manual pages. >> > [0]PETSC ERROR: >> > ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Unknown Name on a arch-linux2-c-debug named >> > altria-Aspire-5830TG by root Tue May 13 14:53:33 2014 >> > [0]PETSC ERROR: Libraries linked from >> > /home/altria/software/petsc-3.4.4/arch-linux2-c-debug/lib >> > [0]PETSC ERROR: Configure run at Tue May 13 14:43:13 2014 >> > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran >> > --with-cxx=g++ --download-mpich --download-scalapack --download-metis >> > --download-parmetis --download-mumps --download-PASTIX >> > --download-superLU >> > --download-superLU-dist --with-scalar-type=complex --with-clanguage=cxx >> > >> > My code >> > >> > input(Ap,Ai,Ax,Az,size,nz); //Process input >> > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); >> > EPSCreate( PETSC_COMM_WORLD, &eps ); //Setup Solver >> > EPSSetOperators(eps,A,NULL); >> > EPSSetProblemType(eps,EPS_NHEP); >> > EPSSetDimensions(eps,1,6,0); >> > EPSSetType(eps,type); >> > EPSSetTarget(eps,offset); >> > EPSSetWhichEigenpairs(eps,EPS_TARGET_REAL); //Set Target >> > >> > >> > //EPSSetExtraction(eps,EPS_HARMONIC); >> > EPSGetST(eps,&st); //shift-and-invert >> > STSetType(st,STSINVERT); >> > STSetShift(st,offset); >> > STGetKSP(st,&ksp); >> > KSPSetType(ksp,KSPPREONLY); >> > KSPGetPC(ksp,&pc); >> > PCSetType(pc,PCLU); >> > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); >> > EPSSolve(eps); >> > EPSGetConverged(eps,&nconv); >> > >> > >> > Function input >> > >> > MatCreate(PETSC_COMM_WORLD,&A); >> > MatSetType(A,MATMPIAIJ); >> > MatSetSizes(A,PETSC_DECIDE, PETSC_DECIDE,size,size); >> > MatMPIAIJSetPreallocationCSR(A,Ap,Ai,temp); >> > MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); >> > MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); >> > >> > >> > >> > Guoxi >> > > > From bsmith at mcs.anl.gov Tue May 13 11:03:21 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 11:03:21 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <537188D8.2030307@gmail.com> References: <534C9A2C.5060404@gmail.com> <534C9DB5.9070407@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> Message-ID: Please send you current code. So we may compile and run it. Barry On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: > Hi, > > I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. > > Thank you. > > Yours sincerely, > > TAY wee-beng > > On 21/4/2014 8:58 AM, Barry Smith wrote: >> Please send the entire code. 
If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >> >> Barry >> >> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >> >>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>>>> Hmm, >>>>>>>> >>>>>>>> Interface DMDAVecGetArrayF90 >>>>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>>>> USE_DM_HIDE >>>>>>>> DM_HIDE da1 >>>>>>>> VEC_HIDE v >>>>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>>>> PetscErrorCode ierr >>>>>>>> End Subroutine >>>>>>>> >>>>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>> Hi, >>>>>>>> >>>>>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>>>>> >>>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>>>>> >>>>>>>> Also, supposed I call: >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> u_array .... >>>>>>>> >>>>>>>> v_array .... etc >>>>>>>> >>>>>>>> Now to restore the array, does it matter the sequence they are restored? >>>>>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>>>>> >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>>>> >>>>>>>> u_array = 0.d0 >>>>>>>> >>>>>>>> v_array = 0.d0 >>>>>>>> >>>>>>>> w_array = 0.d0 >>>>>>>> >>>>>>>> p_array = 0.d0 >>>>>>>> >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>>>> >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. 
But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>>>>> >>>>>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>>> Hi Matt, >>>>>>> >>>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>>> >>>>>>> It already has DMDAVecGetArray(). Just run it. >>>>>> Hi, >>>>>> >>>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>>> >>>>>> No the global/local difference should not matter. >>>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>>> >>>>>> DMGetLocalVector() >>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>>> >>>>> If so, when should I call them? >>>>> >>>>> You just need a local vector from somewhere. >>> Hi, >>> >>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>> >>> Thanks. >>>> Hi, >>>> >>>> I insert part of my error region code into ex11f90: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> v_array = 0.d0 >>>> w_array = 0.d0 >>>> p_array = 0.d0 >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>> >>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>> >>>> module solve >>>> <- add include file? >>>> subroutine RRK >>>> <- add include file? >>>> end subroutine RRK >>>> >>>> end module solve >>>> >>>> So where should the include files (#include ) be placed? >>>> >>>> After the module or inside the subroutine? >>>> >>>> Thanks. >>>>> Matt >>>>> Thanks. >>>>>> Matt >>>>>> Thanks. >>>>>>> Matt >>>>>>> Thanks >>>>>>> >>>>>>> Regards. >>>>>>>> Matt >>>>>>>> As in w, then v and u? >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> thanks >>>>>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>>>>> Hi, >>>>>>>> >>>>>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>>>> >>>>>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>>>>> Not really. 
It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>>>>> >>>>>>>> >>>>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> Thanks. >>>>>>>> Barry >>>>>>>> >>>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>>>>> >>>>>>>> However, by re-writing my code, I found out a few things: >>>>>>>> >>>>>>>> 1. if I write my code this way: >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> u_array = .... >>>>>>>> >>>>>>>> v_array = .... >>>>>>>> >>>>>>>> w_array = .... >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> The code runs fine. >>>>>>>> >>>>>>>> 2. if I write my code this way: >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>>>> >>>>>>>> where the subroutine is: >>>>>>>> >>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>> >>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>> >>>>>>>> u ... >>>>>>>> v... >>>>>>>> w ... >>>>>>>> >>>>>>>> end subroutine uvw_array_change. >>>>>>>> >>>>>>>> The above will give an error at : >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> >>>>>>>> So they are now in reversed order. Now it works. >>>>>>>> >>>>>>>> 4. 
Same as 2 or 3, except the subroutine is changed to : >>>>>>>> >>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>> >>>>>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>> >>>>>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>> >>>>>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>> >>>>>>>> u ... >>>>>>>> v... >>>>>>>> w ... >>>>>>>> >>>>>>>> end subroutine uvw_array_change. >>>>>>>> >>>>>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>>>>>> >>>>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>>>>> >>>>>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>>>>> >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>>> >>>>>>>> >>>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>>>>> >>>>>>>> Hi Barry, >>>>>>>> >>>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>>>>> >>>>>>>> I have attached my code. >>>>>>>> >>>>>>>> Thank you >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>>>> Please send the code that creates da_w and the declarations of w_array >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> Hi Barry, >>>>>>>> >>>>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>>>> >>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>> >>>>>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>>>>> >>>>>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>>>>> >>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>> -------------------------------------------------------------------------- >>>>>>>> An MPI process has executed an operation involving a call to the >>>>>>>> "fork()" system call to create a child process. Open MPI is currently >>>>>>>> operating in a condition that could result in memory corruption or >>>>>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>>>>> data corruption. The use of fork() (or system() or other calls that >>>>>>>> create child processes) is strongly discouraged. >>>>>>>> >>>>>>>> The process that invoked fork was: >>>>>>>> >>>>>>>> Local host: n12-76 (PID 20235) >>>>>>>> MPI_COMM_WORLD rank: 2 >>>>>>>> >>>>>>>> If you are *absolutely sure* that your application will successfully >>>>>>>> and correctly survive a call to fork(), you may disable this warning >>>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. 
>>>>>>>> -------------------------------------------------------------------------- >>>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>>>>> >>>>>>>> .... >>>>>>>> >>>>>>>> 1 >>>>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>>> [1]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>>> [1]PETSC ERROR: to get more information on the crash. >>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>>> [3]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>>> >>>>>>>> ... >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>>>> >>>>>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>>>>> >>>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. 
>>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> -- >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Yours sincerely, >>>>>>>> >>>>>>>> TAY wee-beng >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener > From talebi.hossein at gmail.com Tue May 13 11:07:58 2014 From: talebi.hossein at gmail.com (Hossein Talebi) Date: Tue, 13 May 2014 18:07:58 +0200 Subject: [petsc-users] PetscLayoutCreate for Fortran Message-ID: Hi All, I am using PETSC from Fortran. I would like to define my own layout i.e. which row belongs to which CPU since I have already done the domain decomposition. It appears that "PetscLayoutCreate" and the other routine do this. But in the manual it says it is not provided in Fortran. Is there any way that I can do this using Fortran? Anyone has an example? Cheers Hossein -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 13 11:36:38 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 May 2014 11:36:38 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi wrote: > Hi All, > > > I am using PETSC from Fortran. I would like to define my own layout i.e. > which row belongs to which CPU since I have already done the domain > decomposition. It appears that "PetscLayoutCreate" and the other > routine do this. But in the manual it says it is not provided in Fortran. > > Is there any way that I can do this using Fortran? Anyone has an example? > You can do this for Vec and Mat directly. Do you want it for something else? 
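For the common case this just means giving PETSc the local row count when the Vec and Mat are created. A minimal Fortran sketch (declarations and includes omitted; nlocal_rows, d_nz and o_nz are placeholders for whatever your Metis-based partition and preallocation provide):

    ! each rank passes the number of rows it is to own
    call VecCreateMPI(PETSC_COMM_WORLD,nlocal_rows,PETSC_DETERMINE,x,ierr)
    call MatCreateAIJ(PETSC_COMM_WORLD,nlocal_rows,nlocal_rows, &
                      PETSC_DETERMINE,PETSC_DETERMINE, &
                      d_nz,PETSC_NULL_INTEGER,o_nz,PETSC_NULL_INTEGER,A,ierr)
    ! PETSc then assigns this rank the contiguous global rows rstart..rend-1
    call MatGetOwnershipRange(A,rstart,rend,ierr)

Note that each rank always owns a contiguous block of the global numbering, so the application's own numbering has to be permuted to match, or the data redistributed with a VecScatter, as discussed further down in this thread.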
Thanks, Matt > Cheers > Hossein > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From talebi.hossein at gmail.com Tue May 13 11:42:51 2014 From: talebi.hossein at gmail.com (Hossein Talebi) Date: Tue, 13 May 2014 18:42:51 +0200 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: I have already decomposed the Finite Element system using Metis. I just need to have the global rows exactly like how I define and I like to have the answer in the same layout so I don't have to move things around the processes again. No, I don't need it for something else. Cheers Hossein On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: > On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi > wrote: > >> Hi All, >> >> >> I am using PETSC from Fortran. I would like to define my own layout i.e. >> which row belongs to which CPU since I have already done the domain >> decomposition. It appears that "PetscLayoutCreate" and the other >> routine do this. But in the manual it says it is not provided in Fortran. >> >> Is there any way that I can do this using Fortran? Anyone has an example? >> > > You can do this for Vec and Mat directly. Do you want it for something > else? > > Thanks, > > Matt > > >> Cheers >> Hossein >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- www.permix.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 13 11:45:23 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 May 2014 11:45:23 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: On Tue, May 13, 2014 at 11:42 AM, Hossein Talebi wrote: > > I have already decomposed the Finite Element system using Metis. I just > need to have the global rows exactly like how I define and I like to have > the answer in the same layout so I don't have to move things around the > processes again. > > No, I don't need it for something else. > PetscLayout is only for contiguous sets of indices. If you want to distribute them, you need to use VecScatter. Thanks, Matt > Cheers > Hossein > > > > > On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: > >> On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi < >> talebi.hossein at gmail.com> wrote: >> >>> Hi All, >>> >>> >>> I am using PETSC from Fortran. I would like to define my own layout i.e. >>> which row belongs to which CPU since I have already done the domain >>> decomposition. It appears that "PetscLayoutCreate" and the other >>> routine do this. But in the manual it says it is not provided in Fortran. >>> >>> Is there any way that I can do this using Fortran? Anyone has an example? >>> >> >> You can do this for Vec and Mat directly. Do you want it for something >> else? >> >> Thanks, >> >> Matt >> >> >>> Cheers >>> Hossein >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> > > > > -- > www.permix.org > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Vincent.De-Groof at uibk.ac.at Tue May 13 12:13:29 2014 From: Vincent.De-Groof at uibk.ac.at (De Groof, Vincent Frans Maria) Date: Tue, 13 May 2014 17:13:29 +0000 Subject: [petsc-users] Memory usage during matrix factorization Message-ID: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> Hi, I'm investigating the performance of a few different direct solvers and I'd like to compare the memory requirements of the different solvers and orderings. I am especially interested in the memory usage necessary to store the factored matrix. I experimented with the PetscMemoryGetCurrentUsage and PetscMemoryGetMaximumUsage before and after KSPSolve. But these seem to return the memory usage on 1 process and not the total memory usage. Is this correct? I also noticed that the difference in maximum memory usage is very small before and after KSPSolve. Does it register the memory usage in external packages? thanks, Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From lu_qin_2000 at yahoo.com Tue May 13 12:17:26 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Tue, 13 May 2014 10:17:26 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> Message-ID: <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> I?tried to use command line options as the example suggested ('-ksp_type preonly -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8') without changing my source code, but then the call to KSPSetUp returned error number 56. ? Does this mean I still need to change the source code (such as adding calls to PCFactorSetMatSolverPackage, PCFactorGetMatrix, etc.)in addition to the command line options? ? I ask this since?the use of SuperLU seems to be different from using Hypre, which can?be invoked with command line options?without changing source code. ? Thanks a lot, Qin? ----- Original Message ----- From: Barry Smith To: Qin Lu Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" Sent: Monday, May 12, 2014 5:11 PM Subject: Re: [petsc-users] ILUTP in PETSc ? See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html On May 12, 2014, at 4:54 PM, Qin Lu wrote: > Hello, > > I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) >? > Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work?? >? > Many thanks, > Qin? > > >? From: Xiaoye S. Li > To: Barry Smith > Cc: Qin Lu ; "petsc-users at mcs.anl.gov" > Sent: Friday, May 2, 2014 3:40 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > > > The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily.? > > In SuperLU distribution: > >? 
EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) > >? SRC/zgsitrf.c : the actual ILUTP factorization routine > > > Sherry Li > > > > On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: > > >> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre >> >> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid >> >> you can also add -help to see what options are available. >> >>? Both pretty much suck and I can?t image much reason for using them. >> >>? ? Barry >> >> >> >> On May 2, 2014, at 10:27 AM, Qin Lu wrote: >> >>> Hello, >>> >>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthat mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >>> >>> Many thanks, >>> Qin >> >>? ? ? From talebi.hossein at gmail.com Tue May 13 12:47:49 2014 From: talebi.hossein at gmail.com (Hossein Talebi) Date: Tue, 13 May 2014 19:47:49 +0200 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: Thank you. If I understand correctly, before inserting the values into the Mat and Vec, I should call VecScatter as in the ''ex30f.F" example to set Vec and Mat with the new indexes, right? On Tue, May 13, 2014 at 6:45 PM, Matthew Knepley wrote: > On Tue, May 13, 2014 at 11:42 AM, Hossein Talebi > wrote: > >> >> I have already decomposed the Finite Element system using Metis. I just >> need to have the global rows exactly like how I define and I like to have >> the answer in the same layout so I don't have to move things around the >> processes again. >> >> No, I don't need it for something else. >> > > PetscLayout is only for contiguous sets of indices. If you want to > distribute them, you need to use VecScatter. > > Thanks, > > Matt > > >> Cheers >> Hossein >> >> >> >> >> On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: >> >>> On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi < >>> talebi.hossein at gmail.com> wrote: >>> >>>> Hi All, >>>> >>>> >>>> I am using PETSC from Fortran. I would like to define my own layout >>>> i.e. which row belongs to which CPU since I have already done the domain >>>> decomposition. It appears that "PetscLayoutCreate" and the other >>>> routine do this. But in the manual it says it is not provided in Fortran. >>>> >>>> Is there any way that I can do this using Fortran? Anyone has an >>>> example? >>>> >>> >>> You can do this for Vec and Mat directly. Do you want it for something >>> else? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Cheers >>>> Hossein >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> www.permix.org >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- www.permix.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From atmmachado at gmail.com Tue May 13 12:54:15 2014 From: atmmachado at gmail.com (=?UTF-8?B?QW5kcsOpIFRpbcOzdGhlbw==?=) Date: Tue, 13 May 2014 14:54:15 -0300 Subject: [petsc-users] help: petsc-dev + petsc4py acessing the tao optimizations solvers ? Message-ID: I read about the merger of the TAO solvers on the PETSC-DEV. How can I use the TAO's constrainded optimization solver on the PETSC-DEV (via petsc4py)? Can you show me some simple python script to deal with classical linear constrainded optimization problems like: minimize sum(x) subject to x >= 0 and Ax = b Thanks for your time. Andre -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 13 13:11:58 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 May 2014 13:11:58 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: On Tue, May 13, 2014 at 12:47 PM, Hossein Talebi wrote: > Thank you. > > If I understand correctly, before inserting the values into the Mat and > Vec, I should call VecScatter as in the ''ex30f.F" example to set Vec and > Mat with the new indexes, right? > VecScatter is a way to send information among processes, so if you need to reorganize your information before inserting into the Vec, then yes you would use it. Thanks, Matt > > On Tue, May 13, 2014 at 6:45 PM, Matthew Knepley wrote: > >> On Tue, May 13, 2014 at 11:42 AM, Hossein Talebi < >> talebi.hossein at gmail.com> wrote: >> >>> >>> I have already decomposed the Finite Element system using Metis. I just >>> need to have the global rows exactly like how I define and I like to have >>> the answer in the same layout so I don't have to move things around the >>> processes again. >>> >>> No, I don't need it for something else. >>> >> >> PetscLayout is only for contiguous sets of indices. If you want to >> distribute them, you need to use VecScatter. >> >> Thanks, >> >> Matt >> >> >>> Cheers >>> Hossein >>> >>> >>> >>> >>> On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: >>> >>>> On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi < >>>> talebi.hossein at gmail.com> wrote: >>>> >>>>> Hi All, >>>>> >>>>> >>>>> I am using PETSC from Fortran. I would like to define my own layout >>>>> i.e. which row belongs to which CPU since I have already done the domain >>>>> decomposition. It appears that "PetscLayoutCreate" and the other >>>>> routine do this. But in the manual it says it is not provided in Fortran. >>>>> >>>>> Is there any way that I can do this using Fortran? Anyone has an >>>>> example? >>>>> >>>> >>>> You can do this for Vec and Mat directly. Do you want it for something >>>> else? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Cheers >>>>> Hossein >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> www.permix.org >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > www.permix.org > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Tue May 13 12:45:12 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 13 May 2014 11:45:12 -0600 Subject: [petsc-users] Get wrong answer when use multi-process In-Reply-To: References: Message-ID: <87bnv1a7yv.fsf@jedbrown.org> ??? writes: > Hi, all > > I am confused about my code, for it could return right answer when I use 1 > process, but return totally wrong answer when more than 1 process. > > This is how I feed data to it. > > I have a CSR matrix, represented by Ap(pointer),Ai(index),and temp(data). This matrix is stored redundantly on each process? You should run with valgrind and confirm that you assemble the same matrix in parallel before worrying about solvers. > First determine local matrix for each process. Then feed data to them. > > int temprank,localsize,line_pos; > line_pos = 0; > if(rank == 0) > { > localsize = size/pro + ((size % pro) > rank); > } > else > { > for (temprank = 0;temprank { > localsize = size/pro + ((size % pro) > temprank); > line_pos += localsize; > } > } > > Lin_index = new int [localsize+1]; > for(i=0;i { > Lin_index [i] = Ap[line_pos+i]-Ap[line_pos]; > } > std::cerr<<"line_pos "< MatMPIAIJSetPreallocationCSR(A,Lin_index,Ai+line_pos,temp+line_pos); > > I use spectral transform with MATSOLVERMUMPS to calculate eigenvalue. > > > The strange thing is, when I run it with one process, the eigenvalue is > what I want, typically, > (8.39485e+13,5.3263) (3.93842e+13,-82.6948) first two. > But for 2 process: > eigenvalue (2.76523e+13,7.62222e+12) > eigenvalue (2.76523e+13,-7.62222e+12) > > 3 process: > eigenvalue (6.81292e+13,-3071.82) > eigenvalue (3.49533e+13,2.48858e+13) > > 4 > eigenvalue (9.7562e+13,5012.4) > eigenvalue (7.2019e+13,8.28561e+13) > > However, it could pass simple test like > int n = 12; > int nz = 12; > int Ap[13] = {0,1,2,3,4,5,6,7,8,9,10,11,12}; > int Ai[12] = { 0,1,2,3,4,5,6,7,8,9,10,11}; > double Ax[12] = {-1,-2,-3,-4,-5,6,7,8,9,10,11}; > double Az[12] = {0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0}; > > > Do you have any idea about it? > > Thanks a lot!! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jaolive at MIT.EDU Tue May 13 13:20:07 2014 From: jaolive at MIT.EDU (Jean-Arthur Louis Olive) Date: Tue, 13 May 2014 18:20:07 +0000 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve References: <53725A86.3070804@uidaho.edu> Message-ID: <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> Hi all, we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
RESIDUAL 1 (NO COUPLING): for (j=info->ys; jys+info->ym; j++) { for (i=info->xs; ixs+info->xm; i++) { f[j][i].P = x[j][i].P - 3000000; f[j][i].vx= 2*x[j][i].vx; f[j][i].vy= 3*x[j][i].vy - 2; f[j][i].T = x[j][i].T; } RESIDUAL 2 (ONE COUPLING TERM): for (j=info->ys; jys+info->ym; j++) { for (i=info->xs; ixs+info->xm; i++) { f[j][i].P = x[j][i].P - 3; f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; f[j][i].vy= x[j][i].vy - 2; f[j][i].T = x[j][i].T; } } and our default set of options is: OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: Result from Solve - RESIDUAL 1 0 SNES Function norm 8.485281374240e+07 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 1 SNES Function norm 1.131370849896e+02 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 2 SNES Function norm 1.131370849896e+02 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 With the coupled residual (Residual 2), the norms do not match, see below Result from Solve - RESIDUAL 2: 0 SNES Function norm 1.019803902719e+02 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 1 SNES Function norm 1.697056274848e+02 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 2 SNES Function norm 3.236770473841e-07 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. Result from Solve with -snes_fd - RESIDUAL 2 0 SNES Function norm 8.485281374240e+07 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 1 SNES Function norm 2.039607805429e+02 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 3 SNES Function norm 2.549509757105e+01 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? Thanks a lot, Arthur and Eric -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1855 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue May 13 13:20:23 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 13:20:23 -0500 Subject: [petsc-users] Get wrong answer when use multi-process In-Reply-To: References: Message-ID: 1) make sure the matrix is the same with 1 and 2 processes. Once the matrix is built you can call MatView(mat,NULL) to display it. 2) once the matrices are the same make sure the eigensolver converges in both cases. Barry On May 13, 2014, at 2:28 AM, ??? wrote: > Hi, all > > I am confused about my code, for it could return right answer when I use 1 process, but return totally wrong answer when more than 1 process. > > This is how I feed data to it. > > I have a CSR matrix, represented by Ap(pointer),Ai(index),and temp(data). > > First determine local matrix for each process. Then feed data to them. > > int temprank,localsize,line_pos; > line_pos = 0; > if(rank == 0) > { > localsize = size/pro + ((size % pro) > rank); > } > else > { > for (temprank = 0;temprank { > localsize = size/pro + ((size % pro) > temprank); > line_pos += localsize; > } > } > > Lin_index = new int [localsize+1]; > for(i=0;i { > Lin_index [i] = Ap[line_pos+i]-Ap[line_pos]; > } > std::cerr<<"line_pos "< MatMPIAIJSetPreallocationCSR(A,Lin_index,Ai+line_pos,temp+line_pos); > > I use spectral transform with MATSOLVERMUMPS to calculate eigenvalue. > > > The strange thing is, when I run it with one process, the eigenvalue is what I want, typically, > (8.39485e+13,5.3263) (3.93842e+13,-82.6948) first two. > But for 2 process: > eigenvalue (2.76523e+13,7.62222e+12) > eigenvalue (2.76523e+13,-7.62222e+12) > > 3 process: > eigenvalue (6.81292e+13,-3071.82) > eigenvalue (3.49533e+13,2.48858e+13) > > 4 > eigenvalue (9.7562e+13,5012.4) > eigenvalue (7.2019e+13,8.28561e+13) > > However, it could pass simple test like > int n = 12; > int nz = 12; > int Ap[13] = {0,1,2,3,4,5,6,7,8,9,10,11,12}; > int Ai[12] = { 0,1,2,3,4,5,6,7,8,9,10,11}; > double Ax[12] = {-1,-2,-3,-4,-5,6,7,8,9,10,11}; > double Az[12] = {0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0}; > > > Do you have any idea about it? > > Thanks a lot!! > > From bsmith at mcs.anl.gov Tue May 13 13:28:22 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 13:28:22 -0500 Subject: [petsc-users] Memory usage during matrix factorization In-Reply-To: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> References: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> Message-ID: These return what the operating system reports is being used by the process so it includes any external packages. It is for a single process if you want the value over a set of processes then use MP_Allreduce() to sum them up. Barry Here is the code: it is only as reliable as the OS is at reporting the values. 
#if defined(PETSC_USE_PROCFS_FOR_SIZE) sprintf(proc,"/proc/%d",(int)getpid()); if ((fd = open(proc,O_RDONLY)) == -1) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_FILE_OPEN,"Unable to access system file %s to get memory usage data",file); if (ioctl(fd,PIOCPSINFO,&prusage) == -1) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_FILE_READ,"Unable to access system file %s to get memory usage data",file); *mem = (PetscLogDouble)prusage.pr_byrssize; close(fd); #elif defined(PETSC_USE_SBREAK_FOR_SIZE) *mem = (PetscLogDouble)(8*fd - 4294967296); /* 2^32 - upper bits */ #elif defined(PETSC_USE_PROC_FOR_SIZE) && defined(PETSC_HAVE_GETPAGESIZE) sprintf(proc,"/proc/%d/statm",(int)getpid()); if (!(file = fopen(proc,"r"))) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_FILE_OPEN,"Unable to access system file %s to get memory usage data",proc); if (fscanf(file,"%d %d",&mm,&rss) != 2) SETERRQ1(PETSC_COMM_SELF,PETSC_ERR_SYS,"Failed to read two integers (mm and rss) from %s",proc); *mem = ((PetscLogDouble)rss) * ((PetscLogDouble)getpagesize()); err = fclose(file); if (err) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_SYS,"fclose() failed on file"); #elif defined(PETSC_HAVE_GETRUSAGE) getrusage(RUSAGE_SELF,&temp); #if defined(PETSC_USE_KBYTES_FOR_SIZE) *mem = 1024.0 * ((PetscLogDouble)temp.ru_maxrss); #elif defined(PETSC_USE_PAGES_FOR_SIZE) && defined(PETSC_HAVE_GETPAGESIZE) *mem = ((PetscLogDouble)getpagesize())*((PetscLogDouble)temp.ru_maxrss); #else *mem = temp.ru_maxrss; #endif On May 13, 2014, at 12:13 PM, De Groof, Vincent Frans Maria wrote: > Hi, > > > I'm investigating the performance of a few different direct solvers and I'd like to compare the memory requirements of the different solvers and orderings. I am especially interested in the memory usage necessary to store the factored matrix. > > I experimented with the PetscMemoryGetCurrentUsage and PetscMemoryGetMaximumUsage before and after KSPSolve. But these seem to return the memory usage on 1 process and not the total memory usage. Is this correct? I also noticed that the difference in maximum memory usage is very small before and after KSPSolve. Does it register the memory usage in external packages? > > > > thanks, > Vincent From jed at jedbrown.org Tue May 13 17:59:11 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 13 May 2014 16:59:11 -0600 Subject: [petsc-users] Memory usage during matrix factorization In-Reply-To: References: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> Message-ID: <87vbt98ev4.fsf@jedbrown.org> Barry Smith writes: > Here is the code: it is only as reliable as the OS is at reporting the values. HPC vendors have a habit of implementing these functions to return nonsense. Sometimes they provide non-standard functions to return useful information. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From gideon.simpson at gmail.com Tue May 13 19:16:08 2014 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 13 May 2014 20:16:08 -0400 Subject: [petsc-users] configuration on cluster with intel compilers/mkl Message-ID: I?m trying to set up petsc on an intel cluster that has the intel compilers and the MKL, but when I try to configure, I get the error, TESTING: checkLib from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:113) ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- You set a value for --with-blas-lapack-lib=, but ['/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a'] cannot be used -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 13 19:55:28 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 19:55:28 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> Message-ID: <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> What do you mean by ?''the default ?coloring? method??? If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. Barry On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: > Hi all, > we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. > > So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
> > RESIDUAL 1 (NO COUPLING): > for (j=info->ys; jys+info->ym; j++) { > for (i=info->xs; ixs+info->xm; i++) { > f[j][i].P = x[j][i].P - 3000000; > f[j][i].vx= 2*x[j][i].vx; > f[j][i].vy= 3*x[j][i].vy - 2; > f[j][i].T = x[j][i].T; > } > > RESIDUAL 2 (ONE COUPLING TERM): > for (j=info->ys; jys+info->ym; j++) { > for (i=info->xs; ixs+info->xm; i++) { > f[j][i].P = x[j][i].P - 3; > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > f[j][i].vy= x[j][i].vy - 2; > f[j][i].T = x[j][i].T; > } > } > > > and our default set of options is: > > > OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > > > With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: > > > Result from Solve - RESIDUAL 1 > 0 SNES Function norm 8.485281374240e+07 > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > 1 SNES Function norm 1.131370849896e+02 > 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > 2 SNES Function norm 1.131370849896e+02 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > > > With the coupled residual (Residual 2), the norms do not match, see below > > > Result from Solve - RESIDUAL 2: > 0 SNES Function norm 1.019803902719e+02 > 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > 1 SNES Function norm 1.697056274848e+02 > 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > 2 SNES Function norm 3.236770473841e-07 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > > > Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. > > > Result from Solve with -snes_fd - RESIDUAL 2 > 0 SNES Function norm 8.485281374240e+07 > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > 1 SNES Function norm 2.039607805429e+02 > 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > 3 SNES Function norm 2.549509757105e+01 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > > > Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? 
> > Thanks a lot, > Arthur and Eric From bsmith at mcs.anl.gov Tue May 13 19:56:27 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 19:56:27 -0500 Subject: [petsc-users] configuration on cluster with intel compilers/mkl In-Reply-To: References: Message-ID: You always need to send configure.log so we can see why the library was unacceptable. On May 13, 2014, at 7:16 PM, Gideon Simpson wrote: > I?m trying to set up petsc on an intel cluster that has the intel compilers and the MKL, but when I try to configure, I get the error, > > TESTING: checkLib from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:113) ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > ------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-lib=, but ['/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a'] cannot be used > > > > -gideon > From gideon.simpson at gmail.com Tue May 13 19:59:42 2014 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 13 May 2014 20:59:42 -0400 Subject: [petsc-users] configuration on cluster with intel compilers/mkl In-Reply-To: References: Message-ID: Log attached, -gideon On May 13, 2014, at 8:56 PM, Barry Smith wrote: > > You always need to send configure.log so we can see why the library was unacceptable. > > On May 13, 2014, at 7:16 PM, Gideon Simpson wrote: > >> I?m trying to set up petsc on an intel cluster that has the intel compilers and the MKL, but when I try to configure, I get the error, >> >> TESTING: checkLib from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:113) ******************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >> ------------------------------------------------------------------------------- >> You set a value for --with-blas-lapack-lib=, but ['/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a'] cannot be used >> >> >> >> -gideon >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2137370 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 13 20:06:39 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 20:06:39 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: Message-ID: <44B5B5E9-5F44-4649-A01E-003189B05AD4@mcs.anl.gov> On May 13, 2014, at 11:42 AM, Hossein Talebi wrote: > > I have already decomposed the Finite Element system using Metis. I just need to have the global rows exactly like how I define and I like to have the answer in the same layout so I don't have to move things around the processes again. Metis tells you a good partitioning IT DOES NOT MOVE the elements to form a good partitioning. 
Do you move the elements around based on what metis told you and similarly do you renumber the elements (and vertices) to be contiquously numbered on each process with the first process getting the first set of numbers, the second process the second set of numbers etc? If you do all that then when you create Vec and Mat you should simply set the local size (based on the number of local vertices on each process). You never need to use PetscLayoutCreate and in fact if your code was in C you would never use PetscLayoutCreate() If you do not do all that then you need to do that first before you start calling PETSc. Barry > > No, I don't need it for something else. > > Cheers > Hossein > > > > > On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: > On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi wrote: > Hi All, > > > I am using PETSC from Fortran. I would like to define my own layout i.e. which row belongs to which CPU since I have already done the domain decomposition. It appears that "PetscLayoutCreate" and the other routine do this. But in the manual it says it is not provided in Fortran. > > Is there any way that I can do this using Fortran? Anyone has an example? > > You can do this for Vec and Mat directly. Do you want it for something else? > > Thanks, > > Matt > > Cheers > Hossein > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > -- > www.permix.org From bsmith at mcs.anl.gov Tue May 13 20:14:16 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 20:14:16 -0500 Subject: [petsc-users] configuration on cluster with intel compilers/mkl In-Reply-To: References: Message-ID: Your MPI compiler is using 32 bit pointers (why?) TEST configureCompilerFlags from config.compilerFlags(/home/simpson/software/petsc-intel/config/BuildSystem/config/compilerFlags.py:65) TESTING: configureCompilerFlags from config.compilerFlags(config/BuildSystem/config/compilerFlags.py:65) Get the default compiler flags Pushing language C sh: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc --version Executing: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc --version sh: gcc (GCC) 4.8.1 Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. getCompilerVersion: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc gcc (GCC) 4.8.1 sh: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc -show Executing: /cm/shared/apps/intel/mpi/4.1.1.036/bin/mpicc -show sh: gcc -m32 -I/cm/shared/apps/intel/mpi/4.1.1.036/ia32/include -L/cm/shared/apps/intel/mpi/4.1.1.036/ia32/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /cm/shared/apps/intel/mpi/4.1.1.036/ia32/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/4.1 -lmpigf -lmpi -lmpigi -ldl -lrt -lpthread But you ask to use 64 bit pointer MKL libraries with /cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a Cannot be done. Either use a 64 bit pointer mpicc or a 32 bit pointer mkl library. Barry On May 13, 2014, at 7:59 PM, Gideon Simpson wrote: > Log attached, > -gideon > > On May 13, 2014, at 8:56 PM, Barry Smith wrote: > >> >> You always need to send configure.log so we can see why the library was unacceptable. 
>> >> On May 13, 2014, at 7:16 PM, Gideon Simpson wrote: >> >>> I?m trying to set up petsc on an intel cluster that has the intel compilers and the MKL, but when I try to configure, I get the error, >>> >>> TESTING: checkLib from config.packages.BlasLapack(config/BuildSystem/config/packages/BlasLapack.py:113) ******************************************************************************* >>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >>> ------------------------------------------------------------------------------- >>> You set a value for --with-blas-lapack-lib=, but ['/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.a', '/cm/shared/apps/intel/composer_xe/2013_sp1.1.106/mkl/lib/intel64/libmkl_core.a'] cannot be used >>> >>> >>> >>> -gideon >>> >> > > From knepley at gmail.com Tue May 13 20:27:38 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 May 2014 20:27:38 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> Message-ID: On Tue, May 13, 2014 at 7:55 PM, Barry Smith wrote: > > What do you mean by ?''the default ?coloring? method??? > > If you are using DMDA and either DMGetColoring or the SNESSetDM > approach and dof is 4 then we color each of the 4 variables per grid point > with a different color so coupling between variables within a grid point is > not a problem. This would not explain the problem you are seeing below. > > Run your code with -snes_type test and read the results and follow the > directions to debug your Jacobian. I think there may actually be a bug with the coloring for unstructured grids. I am distilling it down to a nice test case. Matt > > Barry > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive > wrote: > > > Hi all, > > we are using PETSc to solve the steady state Stokes equations with > non-linear viscosities using finite difference. Recently we have realized > that our true residual norm after the last KSP solve did not match next > SNES function norm when solving the linear Stokes equations. > > > > So to understand this better, we set up two extremely simple linear > residuals, one with no coupling between variables (vx, vy, P and T), the > other with one coupling term (shown below). 
> > > > RESIDUAL 1 (NO COUPLING): > > for (j=info->ys; jys+info->ym; j++) { > > for (i=info->xs; ixs+info->xm; i++) { > > f[j][i].P = x[j][i].P - 3000000; > > f[j][i].vx= 2*x[j][i].vx; > > f[j][i].vy= 3*x[j][i].vy - 2; > > f[j][i].T = x[j][i].T; > > } > > > > RESIDUAL 2 (ONE COUPLING TERM): > > for (j=info->ys; jys+info->ym; j++) { > > for (i=info->xs; ixs+info->xm; i++) { > > f[j][i].P = x[j][i].P - 3; > > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > > f[j][i].vy= x[j][i].vy - 2; > > f[j][i].T = x[j][i].T; > > } > > } > > > > > > and our default set of options is: > > > > > > OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 > -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor > -snes_converged_reason -snes_view -log_summary -options_left 1 > -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > > > > > > With the uncoupled residual (Residual 1), we get matching KSP and SNES > norm, highlighted below: > > > > > > Result from Solve - RESIDUAL 1 > > 0 SNES Function norm 8.485281374240e+07 > > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm > 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm > 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > > 1 SNES Function norm 1.131370849896e+02 > > 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm > 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 2 SNES Function norm 1.131370849896e+02 > > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > > > > > > With the coupled residual (Residual 2), the norms do not match, see below > > > > > > Result from Solve - RESIDUAL 2: > > 0 SNES Function norm 1.019803902719e+02 > > 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm > 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm > 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > > 1 SNES Function norm 1.697056274848e+02 > > 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm > 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm > 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > > 2 SNES Function norm 3.236770473841e-07 > > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > > > > > > Lastly, if we add -snes_fd to our options, the norms for residual 2 get > better - they match after the first iteration but not after the second. 
> > > > > > Result from Solve with -snes_fd - RESIDUAL 2 > > 0 SNES Function norm 8.485281374240e+07 > > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm > 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm > 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > > 1 SNES Function norm 2.039607805429e+02 > > 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm > 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm > 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > > 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > > 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm > 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > > 3 SNES Function norm 2.549509757105e+01 > > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > > > > > > Does this mean that our Jacobian is not approximated properly by the > default ?coloring? method when it has off-diagonal terms? > > > > Thanks a lot, > > Arthur and Eric > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 13 20:31:49 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 20:31:49 -0500 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> Message-ID: Works fine for me. Please please please ALWAYS cut and paste the entire error message that is printed. We print the information for a reason, because it provides clues as to what went wrong. 
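The command-line route in the run below needs no source changes at all. If the solver choice is to be hard-wired in the code instead (the PCFactorSetMatSolverPackage route raised in the quoted question below), a rough Fortran sketch, assuming a KSP object ksp already exists and with declarations omitted, is:

    call KSPSetType(ksp,KSPPREONLY,ierr)
    call KSPGetPC(ksp,pc,ierr)
    call PCSetType(pc,PCILU,ierr)
    call PCFactorSetMatSolverPackage(pc,'superlu',ierr)   ! i.e. MATSOLVERSUPERLU
    ! keep this call so -mat_superlu_ilu_droptol etc. can still be given at run time
    call KSPSetFromOptions(ksp,ierr)
    call KSPSetUp(ksp,ierr)

Error code 56 (PETSC_ERR_SUP, "no support for this operation") typically means the requested factorization is not available for that matrix type or build: the SuperLU ILUTP interface only handles sequential AIJ matrices, and PETSc must have been configured with SuperLU (e.g. --download-superlu).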
./ex10 -f0 ~/Datafiles/Matrices/arco1 -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8 -ksp_monitor_true_residual -ksp_rtol 1.e-12 -ksp_view 0 KSP preconditioned resid norm 2.544968580491e+03 true resid norm 7.410897708964e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.467110329809e-06 true resid norm 1.439993537311e-07 ||r(i)||/||b|| 1.943075716143e-08 2 KSP preconditioned resid norm 1.522204461523e-12 true resid norm 2.699724724531e-11 ||r(i)||/||b|| 3.642911871885e-12 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=1501, cols=1501 package used to perform factorization: superlu total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU run parameters: Equil: YES ColPerm: 3 IterRefine: 0 SymmetricMode: NO DiagPivotThresh: 0.1 PivotGrowth: NO ConditionNumber: NO RowPerm: 1 ReplaceTinyPivot: NO PrintStat: NO lwork: 0 ILU_DropTol: 1e-08 ILU_FillTol: 0.01 ILU_FillFactor: 10 ILU_DropRule: 9 ILU_Norm: 2 ILU_MILU: 0 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1501, cols=1501 total: nonzeros=26131, allocated nonzeros=26131 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 501 nodes, limit used is 5 Number of iterations = 2 Residual norm 2.69972e-11 ~/Src/petsc/src/ksp/ksp/examples/tutorials master On May 13, 2014, at 12:17 PM, Qin Lu wrote: > I tried to use command line options as the example suggested ('-ksp_type preonly -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8') without changing my source code, but then the call to KSPSetUp returned error number 56. > > Does this mean I still need to change the source code (such as adding calls to PCFactorSetMatSolverPackage, PCFactorGetMatrix, etc.)in addition to the command line options? > > I ask this since the use of SuperLU seems to be different from using Hypre, which can be invoked with command line options without changing source code. > > Thanks a lot, > Qin > > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu > Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" > Sent: Monday, May 12, 2014 5:11 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > > See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html > > > > > > On May 12, 2014, at 4:54 PM, Qin Lu wrote: > >> Hello, >> >> I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) >> >> Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work? >> >> Many thanks, >> Qin >> >> >> From: Xiaoye S. 
Li >> To: Barry Smith >> Cc: Qin Lu ; "petsc-users at mcs.anl.gov" >> Sent: Friday, May 2, 2014 3:40 PM >> Subject: Re: [petsc-users] ILUTP in PETSc >> >> >> >> The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. >> >> In SuperLU distribution: >> >> EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) >> >> SRC/zgsitrf.c : the actual ILUTP factorization routine >> >> >> Sherry Li >> >> >> >> On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: >> >> >>> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html there are two listed. ./configure ?download-hypre >>> >>> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid >>> >>> you can also add -help to see what options are available. >>> >>> Both pretty much suck and I can?t image much reason for using them. >>> >>> Barry >>> >>> >>> >>> On May 2, 2014, at 10:27 AM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthat mentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >>>> >>>> Many thanks, >>>> Qin >>> >>> From bsmith at mcs.anl.gov Tue May 13 20:38:51 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 20:38:51 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> Message-ID: <91410EE5-8E83-4BEC-95C1-D3BA72BDABA7@mcs.anl.gov> Matt, The code fragments they sent sure look like they are using DMDA 2d and they talk about finite differences. Barry I am sure there are bugs in the unstructured grids code also :-) On May 13, 2014, at 8:27 PM, Matthew Knepley wrote: > On Tue, May 13, 2014 at 7:55 PM, Barry Smith wrote: > > What do you mean by ?''the default ?coloring? method??? > > If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. > > Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. > > I think there may actually be a bug with the coloring for unstructured grids. I am distilling it down to a nice test case. > > Matt > > > Barry > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: > > > Hi all, > > we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. > > > > So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
> > > > RESIDUAL 1 (NO COUPLING): > > for (j=info->ys; j<info->ys+info->ym; j++) { > > for (i=info->xs; i<info->xs+info->xm; i++) { > > f[j][i].P = x[j][i].P - 3000000; > > f[j][i].vx= 2*x[j][i].vx; > > f[j][i].vy= 3*x[j][i].vy - 2; > > f[j][i].T = x[j][i].T; > > } > > } > > > > RESIDUAL 2 (ONE COUPLING TERM): > > for (j=info->ys; j<info->ys+info->ym; j++) { > > for (i=info->xs; i<info->xs+info->xm; i++) { > > f[j][i].P = x[j][i].P - 3; > > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > > f[j][i].vy= x[j][i].vy - 2; > > f[j][i].T = x[j][i].T; > > } > > } > > > > > > and our default set of options is: > > > > > > OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > > > > > > With the uncoupled residual (Residual 1), we get matching KSP and SNES norms, highlighted below: > > > > > > Result from Solve - RESIDUAL 1 > > 0 SNES Function norm 8.485281374240e+07 > > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > > 1 SNES Function norm 1.131370849896e+02 > > 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 2 SNES Function norm 1.131370849896e+02 > > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > > > > > > With the coupled residual (Residual 2), the norms do not match, see below > > > > > > Result from Solve - RESIDUAL 2: > > 0 SNES Function norm 1.019803902719e+02 > > 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > > 1 SNES Function norm 1.697056274848e+02 > > 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > > 2 SNES Function norm 3.236770473841e-07 > > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > > > > > > Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second.
> > > > > > Result from Solve with -snes_fd - RESIDUAL 2 > > 0 SNES Function norm 8.485281374240e+07 > > 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > > 1 SNES Function norm 2.039607805429e+02 > > 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > > 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > > 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > > 3 SNES Function norm 2.549509757105e+01 > > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > > > > > > Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? > > > > Thanks a lot, > > Arthur and Eric > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Tue May 13 21:00:50 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 13 May 2014 21:00:50 -0500 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> Message-ID: <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> Given that Jed wrote MatZeroRowsColumns_MPIAIJ() it is unlikely to be wrong. You wrote Calculating the norm2 of the residuals defined above in each case gives: MatZeroRowsIS() 1cpu: norm(res,2) = 0 MatZeroRowsIS() 4cpu: norm(res,2) = 0 MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 why do you conclude this is wrong? MatZeroRowsColumnsIS() IS suppose to change the right hand side in a way different than MatZeroRowsIS(). Explanation. For simplicity reorder the matrix rows/columns so that zeroed ones come last and the matrix is symmetric. Then you have ( A B ) (x_A) = (b_A) ( B D ) (x_B) (b_B) with MatZeroRows the new system is ( A B ) (x_A) = (b_A) ( 0 I ) (x_B) (x_B) it has the same solution as the original problem with the give x_B with MatZeroRowsColumns the new system is ( A 0 ) (x_A) = (b_A) - B*x_B ( 0 I ) (x_B) (x_B) note the right hand side needs to be changed so that the new problem has the same solution. Barry On May 9, 2014, at 9:50 AM, P?s?k, Adina-Erika wrote: > Yes, I tested the implementation with both MatZeroRowsIS() and MatZeroRowsColumnsIS(). But first, I will be more explicit about the problem I was set to solve: > > We have a Dirichlet block of size (L,W,H) and centered (xc,yc,zc), which is much smaller than the model domain, and we set Vx = Vpush, Vy=0 within the block (Vz is let free for easier convergence). > As I said before, since the code does not have a monolithic matrix, but 4 submatrices (VV VP; PV PP), and the rhs has 2 sub vectors rhs=(f; g), my approach is to modify only (VV, VP, f) for the Dirichlet BC. 
> > The way I tested the implementation: > 1) Output (VV, VP, f, Dirichlet dofs) - unmodified (no Dirichlet BC) > 2) Output (VV, VP, f, Dirichlet dofs) - a) modified with MatZeroRowsIS(), > - b) modified with MatZeroRowsColumnsIS() -> S_PETSc > Again, the only difference between a) and b) is: > // > ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > // ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > > ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > 3) Read them in Matlab and perform the exact same operations on the unmodified matrices and f vector. -> S_Matlab > 4) Compare S_PETSc with S_Matlab. If the implementation is correct, they should be equal (VV, VP, f). > 5) Check for 1 cpu and 4 cpus. > > Now to answer your questions: > > a,b,d) Yes, matrix modification is done correctly (check the spy diagrams below) in all cases: MatZeroRowsIS() and MatZeroRowsColumnsIS() on 1 and 4 cpus. > > I should have said that in the piece of code above: > v_vv = 1.0; > v_vp = 0.0; > The vector x_push is a duplicate of rhs, with zero elements except the values for the Dirichlet dofs. > > c) The rhs is a different matter. With MatZeroRows() there is no problem. The rhs is equivalent with the one in Matlab, sequential and parallel. > However, with MatZeroRowsColumns(), the residual contains nonzero elements, and in parallel the nonzero pattern is even bigger (1 cpu - 63, 4 cpu - 554). But if you look carefully, the values of the nonzero residuals are very small < +/- 1e-10. > So, I did a tolerance filter: > > tol = 1e-10; > res = f_petsc - f_mod_matlab; > for i=1:length(res) > if abs(res(i))>0 & abs(res(i)) res(i)=0; > end > end > > and then the f_petsc and f_mod_matlab are equivalent on 1 and 4 cpus (figure 5). So it seems that MatZeroRowsColumnsIS() might give some nonzero residuals. > > Calculating the norm2 of the residuals defined above in each case gives: > MatZeroRowsIS() 1cpu: norm(res,2) = 0 > MatZeroRowsIS() 4cpu: norm(res,2) = 0 > MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 > MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 > > Since this is purely a problem of matrix and vector assembly/manipulation, I think the nonzero residuals of the rhs with MatZeroRowsColumnsIS() give the parallel artefacts that I showed last time. > If you need the raw data and the matlab scripts that I used for testing for your consideration, please let me know. > > Thanks, > Adina > > When performing the manual operations on the unmodified matrices and rhs vector in Matlab, I took into account: > - matlab indexing = petsc indexing +1; > - the vectors written to file for matlab (PETSC_VIEWER_BINARY_MATLAB) have the natural ordering, rather than the petsc ordering. On 1 cpu, they are equivalent, but on 4 cpus, the Dirichlet BC indices had to be converted to natural indices in order to perform the correct operations on the rhs. > > > > > > > > On May 6, 2014, at 4:22 PM, Matthew Knepley wrote: > >> On Tue, May 6, 2014 at 7:23 AM, P?s?k, Adina-Erika wrote: >> Hello! >> >> I was trying to implement some internal Dirichlet boundary conditions into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to create an internal block (let's say Dirichlet block) that moves with constant velocity within the domain (i.e. check all the dofs within the block and set the values accordingly to the desired motion). 
>> >> Ideally, this means to zero the rows and columns in VV, VP, PV corresponding to the dirichlet dofs and modify the corresponding rhs values. However, since we have submatrices and not a monolithic matrix A, we can choose to modify only VV and PV matrices. >> The global indices of the velocity points within the Dirichlet block are contained in the arrays rowid_array. >> >> What I want to point out is that the function MatZeroRowsColumnsIS() seems to create parallel artefacts, compared to MatZeroRowsIS() when run on more than 1 processor. Moreover, the results on 1 cpu are identical. >> See below the results of the test (the Dirichlet block is outlined in white) and the piece of the code involved where the 1) - 2) parts are the only difference. >> >> I am assuming that you are showing the result of solving the equations. It would be more useful, and presumably just as easy >> to say: >> >> a) Are the correct rows zeroed out? >> >> b) Is the diagonal element correct? >> >> c) Is the rhs value correct? >> >> d) Are the columns zeroed correctly? >> >> If we know where the problem is, its easier to fix. For example, if the rhs values are >> correct and the rows are zeroed, then something is wrong with the solution procedure. >> Since ZeroRows() works and ZeroRowsColumns() does not, this is a distinct possibility. >> >> Thanks, >> >> Matt >> >> Thanks, >> Adina Pusok >> >> // Create an IS required by MatZeroRows() >> ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array,PETSC_COPY_VALUES,&isx); CHKERRQ(ierr); >> ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array,PETSC_COPY_VALUES,&isy); CHKERRQ(ierr); >> ierr = ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array,PETSC_COPY_VALUES,&isz); CHKERRQ(ierr); >> >> 1) /* >> ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); >> ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); >> ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr);*/ >> >> 2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); >> ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); >> ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr); >> >> ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr); >> >> ierr = ISDestroy(&isx); CHKERRQ(ierr); >> ierr = ISDestroy(&isy); CHKERRQ(ierr); >> ierr = ISDestroy(&isz); CHKERRQ(ierr); >> >> >> Results (velocity) with MatZeroRowsColumnsIS(). >> 1cpu 4cpu >> >> Results (velocity) with MatZeroRowsIS(): >> 1cpu 4cpu >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From jed at jedbrown.org Tue May 13 23:28:19 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 13 May 2014 22:28:19 -0600 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> Message-ID: <87k39p7zmk.fsf@jedbrown.org> Barry Smith writes: > Given that Jed wrote MatZeroRowsColumns_MPIAIJ() it is unlikely to be wrong. Haha. Though MatZeroRowsColumns_MPIAIJ uses PetscSF, the implementation was written by Matt. 
I think it's correct, however, at least as of Matt's January changes in 'master'. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From talebi.hossein at gmail.com Wed May 14 00:43:21 2014 From: talebi.hossein at gmail.com (Hossein Talebi) Date: Wed, 14 May 2014 07:43:21 +0200 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: <44B5B5E9-5F44-4649-A01E-003189B05AD4@mcs.anl.gov> References: <44B5B5E9-5F44-4649-A01E-003189B05AD4@mcs.anl.gov> Message-ID: Thank you. Well, only the first part. I move around the elements and identify the Halo nodes etc. However, I do not renumber the vertices to be contiguous on the CPUs like what you said. BUT, I just noticed: I partition the domain based on the computational wight of the elements which is different to that of Mat-Vec calculation. This means my portioning may not be efficient for the solution process. I think I will then go with the copy-in, solve, copy-out option. On Wed, May 14, 2014 at 3:06 AM, Barry Smith wrote: > > On May 13, 2014, at 11:42 AM, Hossein Talebi > wrote: > > > > > I have already decomposed the Finite Element system using Metis. I just > need to have the global rows exactly like how I define and I like to have > the answer in the same layout so I don't have to move things around the > processes again. > > Metis tells you a good partitioning IT DOES NOT MOVE the elements to > form a good partitioning. Do you move the elements around based on what > metis told you and similarly do you renumber the elements (and vertices) to > be contiquously numbered on each process with the first process getting the > first set of numbers, the second process the second set of numbers etc? > > If you do all that then when you create Vec and Mat you should simply > set the local size (based on the number of local vertices on each process). > You never need to use PetscLayoutCreate and in fact if your code was in C > you would never use PetscLayoutCreate() > > If you do not do all that then you need to do that first before you > start calling PETSc. > > Barry > > > > > No, I don't need it for something else. > > > > Cheers > > Hossein > > > > > > > > > > On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley > wrote: > > On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi < > talebi.hossein at gmail.com> wrote: > > Hi All, > > > > > > I am using PETSC from Fortran. I would like to define my own layout i.e. > which row belongs to which CPU since I have already done the domain > decomposition. It appears that "PetscLayoutCreate" and the other routine > do this. But in the manual it says it is not provided in Fortran. > > > > Is there any way that I can do this using Fortran? Anyone has an example? > > > > You can do this for Vec and Mat directly. Do you want it for something > else? > > > > Thanks, > > > > Matt > > > > Cheers > > Hossein > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > -- > > www.permix.org > > -- www.permix.org -------------- next part -------------- An HTML attachment was scrubbed... 
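Barry's point above - renumber so that each process owns a contiguous block of rows, then simply give the local sizes when creating the Vec and Mat - can be sketched in a few lines of C. This is not code from the thread: nlocal is a stand-in for the number of locally owned rows that the application computes from its partition, and the Mat is left with a default setup where a real code would add preallocation.

  PetscInt nlocal;   /* number of locally owned rows after renumbering (assumed known) */
  PetscInt offset;   /* global index of the first row owned by this process */
  Vec      x;
  Mat      A;

  ierr = MPI_Scan(&nlocal,&offset,1,MPIU_INT,MPI_SUM,PETSC_COMM_WORLD);CHKERRQ(ierr);
  offset -= nlocal;  /* MPI_Scan is inclusive, so remove this process's own count */

  ierr = VecCreateMPI(PETSC_COMM_WORLD,nlocal,PETSC_DETERMINE,&x);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);  /* a real code would preallocate here instead */
  /* global rows offset .. offset+nlocal-1 now belong to this process */

The same calls have Fortran interfaces, which is why PetscLayoutCreate() is not needed from Fortran either.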
URL: From Vincent.De-Groof at uibk.ac.at Wed May 14 03:29:47 2014 From: Vincent.De-Groof at uibk.ac.at (De Groof, Vincent Frans Maria) Date: Wed, 14 May 2014 08:29:47 +0000 Subject: [petsc-users] Memory usage during matrix factorization In-Reply-To: <87vbt98ev4.fsf@jedbrown.org> References: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> , <87vbt98ev4.fsf@jedbrown.org> Message-ID: <17A78B9D13564547AC894B88C159674720382707@XMBX4.uibk.ac.at> Thanks. I made a new function based on the PetscGetCurrentUsage which does what I want. It seems like I am being lucky as the numbers returned by the OS seem to be reasonable. thanks again, Vincent ________________________________________ Von: Jed Brown [jed at jedbrown.org] Gesendet: Mittwoch, 14. Mai 2014 00:59 An: Barry Smith; De Groof, Vincent Frans Maria Cc: petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Memory usage during matrix factorization Barry Smith writes: > Here is the code: it is only as reliable as the OS is at reporting the values. HPC vendors have a habit of implementing these functions to return nonsense. Sometimes they provide non-standard functions to return useful information. From C.Klaij at marin.nl Wed May 14 04:02:39 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 14 May 2014 09:02:39 +0000 Subject: [petsc-users] petsc 3.4, mat_view and prefix problem Message-ID: I'm having problems using mat_view in petsc 3.4.3 in combination with a prefix. For example in ../snes/examples/tutorials/ex70: mpiexec -n 2 ./ex70 -nx 16 -ny 24 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type lower -user_ksp -a00_mat_view does not print the matrix a00 to screen. This used to work in 3.3 versions before the single consistent -xxx_view scheme. Similarly, if I add this at line 105 of ../ksp/ksp/examples/tutorials/ex1f.F: call MatSetOptionsPrefix(A,"a_",ierr) then running with -mat_view still prints the matrix to screen but running with -a_mat_view doesn't. I expected the opposite. The problem only occurs with mat, not with ksp. For example, if I add this at line 184 of ../ksp/ksp/examples/tutorials/ex1f.F: call KSPSetOptionsPrefix(ksp,"a_",ierr) then running with -a_ksp_monitor does print the residuals to screen and -ksp_monitor doesn't, as expected. dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From christophe.ortiz at ciemat.es Wed May 14 06:29:55 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 14 May 2014 13:29:55 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero Message-ID: Hi all, I am experiencing some problems of memory corruption with PetscMemzero(). I set the values of the Jacobian by blocks using MatSetValuesBlocked(). To do so, I use some temporary two-dimensional arrays[dof][dof] that I must reset at each loop. Inside FormIJacobian, for instance, I declare the following two-dimensional array: PetscScalar diag[dof][dof]; and then, to zero the array diag[][] I do ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); So far no problem. It works fine. Now, what I want is to have diag[][] as a global array so all functions can have access to it. Therefore, I declare it outside main(). 
Since outside the main() I still do not know dof, which is determined later inside main(), I declare the two-dimensional array diag as follows: PetscScalar **diag; Then, inside main(), once dof is determined, I allocate memory for diag as follows: diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); for (k = 0; k < dof; k++){ diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); } That is, the classical way to allocate memory using the pointer notation. Then, when it comes to zero the two-dimensional array diag[][] inside FormIJacobian, I do as before: ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); Compilation goes well but when I launch the executable, after few timesteps I get the following memory corruption message: TSAdapt 'basic': step 0 accepted t=0 + 1.000e-16 wlte=8.5e-05 family='arkimex' scheme=0:'1bee' dt=1.100e-16 TSAdapt 'basic': step 1 accepted t=1e-16 + 1.100e-16 wlte=4.07e-13 family='arkimex' scheme=0:'3' dt=1.210e-16 TSAdapt 'basic': step 2 accepted t=2.1e-16 + 1.210e-16 wlte=1.15e-13 family='arkimex' scheme=0:'3' dt=1.331e-16 TSAdapt 'basic': step 3 accepted t=3.31e-16 + 1.331e-16 wlte=1.14e-13 family='arkimex' scheme=0:'3' dt=1.464e-16 [0]PETSC ERROR: PetscMallocValidate: error detected at TSComputeIJacobian() line 719 in src/ts/interface/ts.c [0]PETSC ERROR: Memory [id=0(0)] at address 0x243c260 is corrupted (probably write past end of array) [0]PETSC ERROR: Memory originally allocated in (null)() line 0 in src/mat/impls/aij/seq/(null) [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Memory corruption! [0]PETSC ERROR: ! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.1, Jun, 10, 2013 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./diffusion on a icc-nompi-double-blas-debug named mazinger.ciemat.es by u5751 Wed May 14 13:23:26 2014 [0]PETSC ERROR: Libraries linked from /home/u5751/petsc-3.4.1/icc-nompi-double-blas-debug/lib [0]PETSC ERROR: Configure run at Wed Apr 2 14:01:51 2014 [0]PETSC ERROR: Configure options --with-mpi=0 --with-cc=icc --with-cxx=icc --with-clanguage=cxx --with-debugging=1 --with-scalar-type=real --with-precision=double --download-f-blas-lapack [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocValidate() line 149 in src/sys/memory/mtr.c [0]PETSC ERROR: TSComputeIJacobian() line 719 in src/ts/interface/ts.c [0]PETSC ERROR: SNESTSFormJacobian_ARKIMEX() line 995 in src/ts/impls/arkimex/arkimex.c [0]PETSC ERROR: SNESTSFormJacobian() line 3397 in src/ts/interface/ts.c [0]PETSC ERROR: SNESComputeJacobian() line 2152 in src/snes/interface/snes.c [0]PETSC ERROR: SNESSolve_NEWTONLS() line 218 in src/snes/impls/ls/ls.c [0]PETSC ERROR: SNESSolve() line 3636 in src/snes/interface/snes.c [0]PETSC ERROR: TSStep_ARKIMEX() line 765 in src/ts/impls/arkimex/arkimex.c [0]PETSC ERROR: TSStep() line 2458 in src/ts/interface/ts.c [0]PETSC ERROR: TSSolve() line 2583 in src/ts/interface/ts.c [0]PETSC ERROR: main() line 2690 in src/ts/examples/tutorials/diffusion.cxx ./compile_diffusion: line 25: 17061 Aborted ./diffusion -ts_adapt_monitor -ts_adapt_basic_clip 0.01,1.10 -draw_pause -2 -ts_arkimex_type 3 -ts_max_snes_failures -1 -snes_type newtonls -snes_linesearch_type basic -ksp_type gmres -pc_type ilu Did I do something wrong ? Or is it due to the pointer notation to declare the two-dimensional array that conflicts with PetscMemzero ? Many thanks in advance for your help. Christophe -- Q Por favor, piense en el medio ambiente antes de imprimir este mensaje. Please consider the environment before printing this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Wed May 14 06:53:04 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 14 May 2014 13:53:04 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero Message-ID: Ok, I just found the answer regarding the memory corruption with the two-dimensional array and PetscMemzero. Instead of ierr = PetscMemzero(diagbl,dof*dofsizeof(PetscScalar));CHKERRQ(ierr); One must do the following: for (k = 0; k < dof; k++) { ierr = PetscMemzero(diagbl[k],dof*sizeof(PetscScalar));CHKERRQ(ierr); } Indeed, due to the ** notation, the two-dimensional array is made of dof rows of dof columns. You cannot set dof*dof values to just one row but you must iterate through the rows and set dof values each time. Now it works fine. Christophe -- Q Por favor, piense en el medio ambiente antes de imprimir este mensaje. Please consider the environment before printing this email. On Wed, May 14, 2014 at 1:29 PM, wrote: > Send petsc-users mailing list submissions to > petsc-users at mcs.anl.gov > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.mcs.anl.gov/mailman/listinfo/petsc-users > or, via email, send a message with subject or body 'help' to > petsc-users-request at mcs.anl.gov > > You can reach the person managing the list at > petsc-users-owner at mcs.anl.gov > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of petsc-users digest..." 
> > > Today's Topics: > > 1. Re: Problem with MatZeroRowsColumnsIS() (Barry Smith) > 2. Re: Problem with MatZeroRowsColumnsIS() (Jed Brown) > 3. Re: PetscLayoutCreate for Fortran (Hossein Talebi) > 4. Re: Memory usage during matrix factorization > (De Groof, Vincent Frans Maria) > 5. petsc 3.4, mat_view and prefix problem (Klaij, Christiaan) > 6. Memory corruption with two-dimensional array and > PetscMemzero (Christophe Ortiz) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 13 May 2014 21:00:50 -0500 > From: Barry Smith > To: "P?s?k, Adina-Erika" > Cc: "" > Subject: Re: [petsc-users] Problem with MatZeroRowsColumnsIS() > Message-ID: <1C839F22-8ADF-4904-B136-D1461AE38187 at mcs.anl.gov> > Content-Type: text/plain; charset="iso-8859-1" > > > Given that Jed wrote MatZeroRowsColumns_MPIAIJ() it is unlikely to be > wrong. > > You wrote > > Calculating the norm2 of the residuals defined above in each case gives: > MatZeroRowsIS() 1cpu: norm(res,2) = 0 > MatZeroRowsIS() 4cpu: norm(res,2) = 0 > MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 > MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 > > why do you conclude this is wrong? MatZeroRowsColumnsIS() IS suppose to > change the right hand side in a way different than MatZeroRowsIS(). > > Explanation. For simplicity reorder the matrix rows/columns so that > zeroed ones come last and the matrix is symmetric. Then you have > > ( A B ) (x_A) = (b_A) > ( B D ) (x_B) (b_B) > > with MatZeroRows the new system is > > ( A B ) (x_A) = (b_A) > ( 0 I ) (x_B) (x_B) > > it has the same solution as the original problem with the give x_B > > with MatZeroRowsColumns the new system is > > ( A 0 ) (x_A) = (b_A) - B*x_B > ( 0 I ) (x_B) (x_B) > > note the right hand side needs to be changed so that the new problem has > the same solution. > > Barry > > > On May 9, 2014, at 9:50 AM, P?s?k, Adina-Erika > wrote: > > > Yes, I tested the implementation with both MatZeroRowsIS() and > MatZeroRowsColumnsIS(). But first, I will be more explicit about the > problem I was set to solve: > > > > We have a Dirichlet block of size (L,W,H) and centered (xc,yc,zc), which > is much smaller than the model domain, and we set Vx = Vpush, Vy=0 within > the block (Vz is let free for easier convergence). > > As I said before, since the code does not have a monolithic matrix, but > 4 submatrices (VV VP; PV PP), and the rhs has 2 sub vectors rhs=(f; g), my > approach is to modify only (VV, VP, f) for the Dirichlet BC. > > > > The way I tested the implementation: > > 1) Output (VV, VP, f, Dirichlet dofs) - unmodified (no Dirichlet BC) > > 2) Output (VV, VP, f, Dirichlet dofs) - a) modified with MatZeroRowsIS(), > > - b) modified with MatZeroRowsColumnsIS() -> S_PETSc > > Again, the only difference between a) and b) is: > > // > > ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > > // ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); > CHKERRQ(ierr); > > > > ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > > ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > > 3) Read them in Matlab and perform the exact same operations on the > unmodified matrices and f vector. -> S_Matlab > > 4) Compare S_PETSc with S_Matlab. If the implementation is correct, they > should be equal (VV, VP, f). > > 5) Check for 1 cpu and 4 cpus. 
> ---------------------------- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 14 07:24:49 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 May 2014 07:24:49 -0500 Subject: [petsc-users] Problem with MatZeroRowsColumnsIS() In-Reply-To: <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> References: <22FD65B0-CDA5-4663-8854-7237A674D5D1@uni-mainz.de> <57A1DAFE-C5BC-42E7-8389-AC67BB7ABBBA@uni-mainz.de> <1C839F22-8ADF-4904-B136-D1461AE38187@mcs.anl.gov> Message-ID: On Tue, May 13, 2014 at 9:00 PM, Barry Smith wrote: > > Given that Jed wrote MatZeroRowsColumns_MPIAIJ() it is unlikely to be > wrong. > > You wrote > > Calculating the norm2 of the residuals defined above in each case gives: > MatZeroRowsIS() 1cpu: norm(res,2) = 0 > MatZeroRowsIS() 4cpu: norm(res,2) = 0 > MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 > MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 > I think the issue is that the RHS changes between 1 and 4 processes. This could be a bug, but I have gone over the code and our test which look correct. I think it could also be a misintepretation of how it works since you are using a composite matrix. Does it work if you put everything in a single matrix? Thanks, Matt > why do you conclude this is wrong? MatZeroRowsColumnsIS() IS suppose to > change the right hand side in a way different than MatZeroRowsIS(). > > Explanation. For simplicity reorder the matrix rows/columns so that > zeroed ones come last and the matrix is symmetric. Then you have > > ( A B ) (x_A) = (b_A) > ( B D ) (x_B) (b_B) > > with MatZeroRows the new system is > > ( A B ) (x_A) = (b_A) > ( 0 I ) (x_B) (x_B) > > it has the same solution as the original problem with the give x_B > > with MatZeroRowsColumns the new system is > > ( A 0 ) (x_A) = (b_A) - B*x_B > ( 0 I ) (x_B) (x_B) > > note the right hand side needs to be changed so that the new problem has > the same solution. > > Barry > > > On May 9, 2014, at 9:50 AM, P?s?k, Adina-Erika > wrote: > > > Yes, I tested the implementation with both MatZeroRowsIS() and > MatZeroRowsColumnsIS(). But first, I will be more explicit about the > problem I was set to solve: > > > > We have a Dirichlet block of size (L,W,H) and centered (xc,yc,zc), which > is much smaller than the model domain, and we set Vx = Vpush, Vy=0 within > the block (Vz is let free for easier convergence). > > As I said before, since the code does not have a monolithic matrix, but > 4 submatrices (VV VP; PV PP), and the rhs has 2 sub vectors rhs=(f; g), my > approach is to modify only (VV, VP, f) for the Dirichlet BC. > > > > The way I tested the implementation: > > 1) Output (VV, VP, f, Dirichlet dofs) - unmodified (no Dirichlet BC) > > 2) Output (VV, VP, f, Dirichlet dofs) - a) modified with MatZeroRowsIS(), > > - b) modified with MatZeroRowsColumnsIS() -> S_PETSc > > Again, the only difference between a) and b) is: > > // > > ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > > // ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); > CHKERRQ(ierr); > > > > ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > > ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > > 3) Read them in Matlab and perform the exact same operations on the > unmodified matrices and f vector. -> S_Matlab > > 4) Compare S_PETSc with S_Matlab. If the implementation is correct, they > should be equal (VV, VP, f). > > 5) Check for 1 cpu and 4 cpus. 
> > > > Now to answer your questions: > > > > a,b,d) Yes, matrix modification is done correctly (check the spy > diagrams below) in all cases: MatZeroRowsIS() and MatZeroRowsColumnsIS() > on 1 and 4 cpus. > > > > I should have said that in the piece of code above: > > v_vv = 1.0; > > v_vp = 0.0; > > The vector x_push is a duplicate of rhs, with zero elements except the > values for the Dirichlet dofs. > > > > c) The rhs is a different matter. With MatZeroRows() there is no > problem. The rhs is equivalent with the one in Matlab, sequential and > parallel. > > However, with MatZeroRowsColumns(), the residual contains nonzero > elements, and in parallel the nonzero pattern is even bigger (1 cpu - 63, 4 > cpu - 554). But if you look carefully, the values of the nonzero residuals > are very small < +/- 1e-10. > > So, I did a tolerance filter: > > > > tol = 1e-10; > > res = f_petsc - f_mod_matlab; > > for i=1:length(res) > > if abs(res(i))>0 & abs(res(i)) > res(i)=0; > > end > > end > > > > and then the f_petsc and f_mod_matlab are equivalent on 1 and 4 cpus > (figure 5). So it seems that MatZeroRowsColumnsIS() might give some nonzero > residuals. > > > > Calculating the norm2 of the residuals defined above in each case gives: > > MatZeroRowsIS() 1cpu: norm(res,2) = 0 > > MatZeroRowsIS() 4cpu: norm(res,2) = 0 > > MatZeroRowsColumnsIS() 1cpu: norm(res,2) = 1.6880e-10 > > MatZeroRowsColumnsIS() 4cpu: norm(res,2) = 7.3786e+06 > > > > Since this is purely a problem of matrix and vector > assembly/manipulation, I think the nonzero residuals of the rhs with > MatZeroRowsColumnsIS() give the parallel artefacts that I showed last time. > > If you need the raw data and the matlab scripts that I used for testing > for your consideration, please let me know. > > > > Thanks, > > Adina > > > > When performing the manual operations on the unmodified matrices and rhs > vector in Matlab, I took into account: > > - matlab indexing = petsc indexing +1; > > - the vectors written to file for matlab (PETSC_VIEWER_BINARY_MATLAB) > have the natural ordering, rather than the petsc ordering. On 1 cpu, they > are equivalent, but on 4 cpus, the Dirichlet BC indices had to be converted > to natural indices in order to perform the correct operations on the rhs. > > > > > > > > > > > > > > > > On May 6, 2014, at 4:22 PM, Matthew Knepley wrote: > > > >> On Tue, May 6, 2014 at 7:23 AM, P?s?k, Adina-Erika < > puesoek at uni-mainz.de> wrote: > >> Hello! > >> > >> I was trying to implement some internal Dirichlet boundary conditions > into an aij matrix of the form: A=( VV VP; PV PP ). The idea was to > create an internal block (let's say Dirichlet block) that moves with > constant velocity within the domain (i.e. check all the dofs within the > block and set the values accordingly to the desired motion). > >> > >> Ideally, this means to zero the rows and columns in VV, VP, PV > corresponding to the dirichlet dofs and modify the corresponding rhs > values. However, since we have submatrices and not a monolithic matrix A, > we can choose to modify only VV and PV matrices. > >> The global indices of the velocity points within the Dirichlet block > are contained in the arrays rowid_array. > >> > >> What I want to point out is that the function MatZeroRowsColumnsIS() > seems to create parallel artefacts, compared to MatZeroRowsIS() when run on > more than 1 processor. Moreover, the results on 1 cpu are identical. 
> >> See below the results of the test (the Dirichlet block is outlined in > white) and the piece of the code involved where the 1) - 2) parts are the > only difference. > >> > >> I am assuming that you are showing the result of solving the equations. > It would be more useful, and presumably just as easy > >> to say: > >> > >> a) Are the correct rows zeroed out? > >> > >> b) Is the diagonal element correct? > >> > >> c) Is the rhs value correct? > >> > >> d) Are the columns zeroed correctly? > >> > >> If we know where the problem is, its easier to fix. For example, if the > rhs values are > >> correct and the rows are zeroed, then something is wrong with the > solution procedure. > >> Since ZeroRows() works and ZeroRowsColumns() does not, this is a > distinct possibility. > >> > >> Thanks, > >> > >> Matt > >> > >> Thanks, > >> Adina Pusok > >> > >> // Create an IS required by MatZeroRows() > >> ierr = > ISCreateGeneral(PETSC_COMM_WORLD,numRowsx,rowidx_array,PETSC_COPY_VALUES,&isx); > CHKERRQ(ierr); > >> ierr = > ISCreateGeneral(PETSC_COMM_WORLD,numRowsy,rowidy_array,PETSC_COPY_VALUES,&isy); > CHKERRQ(ierr); > >> ierr = > ISCreateGeneral(PETSC_COMM_WORLD,numRowsz,rowidz_array,PETSC_COPY_VALUES,&isz); > CHKERRQ(ierr); > >> > >> 1) /* > >> ierr = MatZeroRowsColumnsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > >> ierr = MatZeroRowsColumnsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > >> ierr = MatZeroRowsColumnsIS(VV_MAT,isz,v_vv,x_push,rhs); > CHKERRQ(ierr);*/ > >> > >> 2) ierr = MatZeroRowsIS(VV_MAT,isx,v_vv,x_push,rhs); CHKERRQ(ierr); > >> ierr = MatZeroRowsIS(VV_MAT,isy,v_vv,x_push,rhs); CHKERRQ(ierr); > >> ierr = MatZeroRowsIS(VV_MAT,isz,v_vv,x_push,rhs); CHKERRQ(ierr); > >> > >> ierr = MatZeroRowsIS(VP_MAT,isx,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > >> ierr = MatZeroRowsIS(VP_MAT,isy,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > >> ierr = MatZeroRowsIS(VP_MAT,isz,v_vp,PETSC_NULL,PETSC_NULL); > CHKERRQ(ierr); > >> > >> ierr = ISDestroy(&isx); CHKERRQ(ierr); > >> ierr = ISDestroy(&isy); CHKERRQ(ierr); > >> ierr = ISDestroy(&isz); CHKERRQ(ierr); > >> > >> > >> Results (velocity) with MatZeroRowsColumnsIS(). > >> 1cpu 4cpu > >> > >> Results (velocity) with MatZeroRowsIS(): > >> 1cpu 4cpu > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >> -- Norbert Wiener > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliver.browne at upm.es Wed May 14 07:42:58 2014 From: oliver.browne at upm.es (Oliver Browne) Date: Wed, 14 May 2014 14:42:58 +0200 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR Message-ID: Hi, I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. 
If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR Thanks in advance, Ollie From atmmachado at gmail.com Wed May 14 08:07:16 2014 From: atmmachado at gmail.com (=?UTF-8?B?QW5kcsOpIFRpbcOzdGhlbw==?=) Date: Wed, 14 May 2014 10:07:16 -0300 Subject: [petsc-users] help: (petsc-dev) + petsc4py acessing the tao optimizations solvers ? Message-ID: I read about the merger of the TAO solvers on the PETSC-DEV. How can I use the TAO's constrainded optimization solver on the PETSC-DEV (via petsc4py)? Can you show me some simple python script to deal with classical linear constrainded optimization problems like: minimize sum(x) subject to x >= 0 and Ax = b Thanks for your time. Andre -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 14 08:16:42 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 08:16:42 -0500 Subject: [petsc-users] PetscLayoutCreate for Fortran In-Reply-To: References: <44B5B5E9-5F44-4649-A01E-003189B05AD4@mcs.anl.gov> Message-ID: <1670730F-7EB3-4CCF-91FE-452A7446215C@mcs.anl.gov> On May 14, 2014, at 12:43 AM, Hossein Talebi wrote: > > Thank you. > > Well, only the first part. I move around the elements and identify the Halo nodes etc. However, I do not renumber the vertices to be contiguous on the CPUs like what you said. You need to do this! Once this is done then using the PETSc solvers is easy. Note you can do this by simply counting the number of local vertices on each process and using an MPI_Scan to get the first number on each process from the previous process. > > BUT, I just noticed: I partition the domain based on the computational wight of the elements which is different to that of Mat-Vec calculation. This means my portioning may not be efficient for the solution process. That is fine, it is what we do to. > > I think I will then go with the copy-in, solve, copy-out option. I do not know what you mean here but it sounds bad. > > > > > On Wed, May 14, 2014 at 3:06 AM, Barry Smith wrote: > > On May 13, 2014, at 11:42 AM, Hossein Talebi wrote: > > > > > I have already decomposed the Finite Element system using Metis. I just need to have the global rows exactly like how I define and I like to have the answer in the same layout so I don't have to move things around the processes again. > > Metis tells you a good partitioning IT DOES NOT MOVE the elements to form a good partitioning. Do you move the elements around based on what metis told you and similarly do you renumber the elements (and vertices) to be contiquously numbered on each process with the first process getting the first set of numbers, the second process the second set of numbers etc? > > If you do all that then when you create Vec and Mat you should simply set the local size (based on the number of local vertices on each process). You never need to use PetscLayoutCreate and in fact if your code was in C you would never use PetscLayoutCreate() > > If you do not do all that then you need to do that first before you start calling PETSc. > > Barry > > > > > No, I don't need it for something else. > > > > Cheers > > Hossein > > > > > > > > > > On Tue, May 13, 2014 at 6:36 PM, Matthew Knepley wrote: > > On Tue, May 13, 2014 at 11:07 AM, Hossein Talebi wrote: > > Hi All, > > > > > > I am using PETSC from Fortran. 
I would like to define my own layout i.e. which row belongs to which CPU since I have already done the domain decomposition. It appears that "PetscLayoutCreate" and the other routine do this. But in the manual it says it is not provided in Fortran. > > > > Is there any way that I can do this using Fortran? Anyone has an example? > > > > You can do this for Vec and Mat directly. Do you want it for something else? > > > > Thanks, > > > > Matt > > > > Cheers > > Hossein > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > -- > > www.permix.org > > > > > -- > www.permix.org From bsmith at mcs.anl.gov Wed May 14 08:27:59 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 08:27:59 -0500 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: References: Message-ID: On May 14, 2014, at 7:42 AM, Oliver Browne wrote: > Hi, > > I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. > > If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; What do you mean by ?separate? the vectors? Each processor needs to provide ITS rows to the function call. You cannot have processor zero deliver all the rows. Barry > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR > > Thanks in advance, > > Ollie > From atmmachado at gmail.com Wed May 14 09:29:52 2014 From: atmmachado at gmail.com (=?UTF-8?B?QW5kcsOpIFRpbcOzdGhlbw==?=) Date: Wed, 14 May 2014 11:29:52 -0300 Subject: [petsc-users] help: (petsc-dev) + petsc4py acessing the tao optimizations solvers ? In-Reply-To: References: Message-ID: p.s. I am not quite sure of the need of petsc-dev. I Just read about the inclusion of tao solvers in it and installed petsc-dev, cython and petsc4py on my ubuntu 12.04. Unfortunately petsc4py (and the old tao4py) documentation does not have this type of python documented examples. After some correspondence in petsc4py mailing list they suggested that I might find some help on petsc-users mailing list. 2014-05-14 10:07 GMT-03:00 Andr? Tim?theo : > I read about the merger of the TAO solvers on the PETSC-DEV. > > How can I use the TAO's constrainded optimization solver on > the PETSC-DEV (via petsc4py)? > > Can you show me some simple python script to deal with classical > linear > constrainded optimization problems like: > > minimize sum(x) subject to x >= 0 and Ax = b > > Thanks for your time. > > Andre > -------------- next part -------------- An HTML attachment was scrubbed... 
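[Picking up the PetscLayoutCreate / Fortran layout thread above: a hedged C sketch of the MPI_Scan renumbering Barry describes (the Fortran calls are analogous). The variable names are made up; each process only needs the count of vertices it owns after the Metis redistribution, and preallocation is omitted for brevity.]

  PetscInt nlocal;        /* number of vertices this process owns after redistribution */
  PetscInt end, first;
  Vec      x;
  Mat      A;

  ierr  = MPI_Scan(&nlocal,&end,1,MPIU_INT,MPI_SUM,PETSC_COMM_WORLD); CHKERRQ(ierr);
  first = end - nlocal;   /* global number of this rank's first vertex; renumber local vertices as first..end-1 */

  /* Vec and Mat are then created with the same local sizes, so PetscLayoutCreate is never needed */
  ierr = VecCreateMPI(PETSC_COMM_WORLD,nlocal,PETSC_DETERMINE,&x); CHKERRQ(ierr);
  ierr = MatCreateAIJ(PETSC_COMM_WORLD,nlocal,nlocal,PETSC_DETERMINE,PETSC_DETERMINE,
                      0,NULL,0,NULL,&A); CHKERRQ(ierr);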
URL: From lu_qin_2000 at yahoo.com Wed May 14 09:48:32 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Wed, 14 May 2014 07:48:32 -0700 (PDT) Subject: [petsc-users] ILUTP in PETSc In-Reply-To: References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> Message-ID: <1400078912.39037.YahooMailNeo@web160203.mail.bf1.yahoo.com> It turns out that I can not set PC side as right when KSP type?is set to pre_only. After I fixed that, it works fine. ? This brings me a question: why do I have to set KSP type?to pre_only when SuperLU's ILUTP is used as preconditioner? Can I still?set KSP type?as KSPBCGS (which?seems to be?the fastest with PETSc's ILU for my cases)? ? Thanks, Qin ________________________________ From: Barry Smith To: Qin Lu Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" Sent: Tuesday, May 13, 2014 8:31 PM Subject: Re: [petsc-users] ILUTP in PETSc ? Works fine for me.? Please please please ALWAYS cut and paste the entire error message that is printed. We print the information for a reason, because it provides clues as to what went wrong. ? ./ex10 -f0 ~/Datafiles/Matrices/arco1 -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8 -ksp_monitor_true_residual -ksp_rtol 1.e-12 -ksp_view ? 0 KSP preconditioned resid norm 2.544968580491e+03 true resid norm 7.410897708964e+00 ||r(i)||/||b|| 1.000000000000e+00 ? 1 KSP preconditioned resid norm 2.467110329809e-06 true resid norm 1.439993537311e-07 ||r(i)||/||b|| 1.943075716143e-08 ? 2 KSP preconditioned resid norm 1.522204461523e-12 true resid norm 2.699724724531e-11 ||r(i)||/||b|| 3.642911871885e-12 KSP Object: 1 MPI processes ? type: gmres ? ? GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement ? ? GMRES: happy breakdown tolerance 1e-30 ? maximum iterations=10000, initial guess is zero ? tolerances:? relative=1e-12, absolute=1e-50, divergence=10000 ? left preconditioning ? using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes ? type: ilu ? ? ILU: out-of-place factorization ? ? 0 levels of fill ? ? tolerance for zero pivot 2.22045e-14 ? ? using diagonal shift on blocks to prevent zero pivot [INBLOCKS] ? ? matrix ordering: natural ? ? factor fill ratio given 0, needed 0 ? ? ? Factored matrix follows: ? ? ? ? Mat Object:? ? ? ? 1 MPI processes ? ? ? ? ? type: seqaij ? ? ? ? ? rows=1501, cols=1501 ? ? ? ? ? package used to perform factorization: superlu ? ? ? ? ? total: nonzeros=0, allocated nonzeros=0 ? ? ? ? ? total number of mallocs used during MatSetValues calls =0 ? ? ? ? ? ? SuperLU run parameters: ? ? ? ? ? ? ? Equil: YES ? ? ? ? ? ? ? ColPerm: 3 ? ? ? ? ? ? ? IterRefine: 0 ? ? ? ? ? ? ? SymmetricMode: NO ? ? ? ? ? ? ? DiagPivotThresh: 0.1 ? ? ? ? ? ? ? PivotGrowth: NO ? ? ? ? ? ? ? ConditionNumber: NO ? ? ? ? ? ? ? RowPerm: 1 ? ? ? ? ? ? ? ReplaceTinyPivot: NO ? ? ? ? ? ? ? PrintStat: NO ? ? ? ? ? ? ? lwork: 0 ? ? ? ? ? ? ? ILU_DropTol: 1e-08 ? ? ? ? ? ? ? ILU_FillTol: 0.01 ? ? ? ? ? ? ? ILU_FillFactor: 10 ? ? ? ? ? ? ? ILU_DropRule: 9 ? ? ? ? ? ? ? ILU_Norm: 2 ? ? ? ? ? ? ? ILU_MILU: 0 ? linear system matrix = precond matrix: ? Mat Object:? 1 MPI processes ? ? type: seqaij ? ? rows=1501, cols=1501 ? ? total: nonzeros=26131, allocated nonzeros=26131 ? ? 
total number of mallocs used during MatSetValues calls =0 ? ? ? using I-node routines: found 501 nodes, limit used is 5 Number of iterations =? 2 Residual norm 2.69972e-11 ~/Src/petsc/src/ksp/ksp/examples/tutorials? master On May 13, 2014, at 12:17 PM, Qin Lu wrote: > I tried to use command line options as the example suggested ('-ksp_type preonly -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8') without changing my source code, but then the call to KSPSetUp returned error number 56. >? > Does this mean I still need to change the source code (such as adding calls to PCFactorSetMatSolverPackage, PCFactorGetMatrix, etc.)in addition to the command line options? >? > I ask this since the use of SuperLU seems to be different from using Hypre, which can be invoked with command line options without changing source code. >? > Thanks a lot, > Qin > > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu > Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" > Sent: Monday, May 12, 2014 5:11 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > >? ? See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html > > > > > > On May 12, 2014, at 4:54 PM, Qin Lu wrote: > >> Hello, >> >> I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) >>? >> Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work?? >>? >> Many thanks, >> Qin? >> >> >>? From: Xiaoye S. Li >> To: Barry Smith >> Cc: Qin Lu ; "petsc-users at mcs.anl.gov" >> Sent: Friday, May 2, 2014 3:40 PM >> Subject: Re: [petsc-users] ILUTP in PETSc >> >> >> >> The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily.? >> >> In SuperLU distribution: >> >>? ? EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) >> >>? ? SRC/zgsitrf.c : the actual ILUTP factorization routine >> >> >> Sherry Li >> >> >> >> On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: >> >> >>> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.htmlthere are two listed. ./configure ?download-hypre >>> >>> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid >>> >>> you can also add -help to see what options are available. >>> >>>? ? Both pretty much suck and I can?t image much reason for using them. >>> >>>? ? Barry >>> >>> >>> >>> On May 2, 2014, at 10:27 AM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthatmentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? >>>> >>>> Many thanks, >>>> Qin >>> >>>? ? ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.sarich at gmail.com Wed May 14 09:57:19 2014 From: jason.sarich at gmail.com (Jason Sarich) Date: Wed, 14 May 2014 09:57:19 -0500 Subject: [petsc-users] help: (petsc-dev) + petsc4py acessing the tao optimizations solvers ? In-Reply-To: References: Message-ID: Hi Andre, TAO specializes in unconstrained and bound-constrained optimization, there is not a lot of support for linear constrained optimization. 
There is an interior point solver (ipm) that can accept general constraints, there are new functions for setting up these constraints, they aren't quite solid yet, and I've not very experienced in using the petsc4py package, but I can give you some general help. There is a simple C example using nonlinear constraints in src/tao/examples/tutorials/toy.c, you should be able to easily modify this for your example, where the equality constraint function will evaluate Ax-b and the equality jacobian will be the A matrix (you won't need to set the inequality constraint or jacobian). Because the ipm method builds a KKT system and solves it, it doesn't work well with iterative methods, a direct solver like superlu may be necessary. I don't know enough about the actual python bindings to give you an example program in python, but it should follow pretty directly from the C example. Please let me know if you have any specific questions. Jason Sarich On Wed, May 14, 2014 at 9:29 AM, Andr? Tim?theo wrote: > p.s. I am not quite sure of the need of petsc-dev. I Just read about the > inclusion of tao solvers in it and installed petsc-dev, cython and petsc4py > on my ubuntu 12.04. > > Unfortunately petsc4py (and the old tao4py) documentation does not have > this type of python documented examples. After some correspondence in > petsc4py mailing list they suggested that I might find some help on > petsc-users mailing list. > > > 2014-05-14 10:07 GMT-03:00 Andr? Tim?theo : > > I read about the merger of the TAO solvers on the PETSC-DEV. >> >> How can I use the TAO's constrainded optimization solver on >> the PETSC-DEV (via petsc4py)? >> >> Can you show me some simple python script to deal with classical >> linear >> constrainded optimization problems like: >> >> minimize sum(x) subject to x >= 0 and Ax = b >> >> Thanks for your time. >> >> Andre >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 14 11:34:10 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 11:34:10 -0500 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> References: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> Message-ID: <373F07AB-A56A-4A9A-BD2E-F854FEC16D50@mcs.anl.gov> See the manual page for MatMPIAIJSetPreallocationCSR() it gives an explicit simple example The format which is used for the sparse matrix input, is equivalent to a row-major ordering.. i.e for the following matrix, the input data expected is as shown: 1 0 0 2 0 3 P0 ------- 4 5 6 P1 Process0 [P0]: rows_owned=[0,1] i = {0,1,3} [size = nrow+1 = 2+1] j = {0,0,2} [size = nz = 6] v = {1,2,3} [size = nz = 6] Process1 [P1]: rows_owned=[2] i = {0,3} [size = nrow+1 = 1+1] j = {0,1,2} [size = nz = 6] v = {4,5,6} [size = nz = 6] The column indices are global, the numerical values are just numerical values and do not need to be adjusted. On each process the i indices start with 0 because they just point into the local part of the j indices. Are you saying each process of yours HAS the entire matrix? If so you just need to adjust the local portion of the i vales and pass that plus the appropriate location in j and v to the routine as in the example above. Barry On May 14, 2014, at 8:36 AM, Oliver Browne wrote: > > > > On 14-05-2014 15:27, Barry Smith wrote: >> On May 14, 2014, at 7:42 AM, Oliver Browne wrote: >>> Hi, >>> I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). 
I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. >>> If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; >> What do you mean by ?separate? the vectors? Each processor >> needs to provide ITS rows to the function call. You cannot have >> processor zero deliver all the rows. > > I mean split them so they change from global numbering to local numbering. > > At the moment I just have > > CALL MatMPIAIJSetPreallocationCSR(A,NVPN,NNVI,CONT,ierr) - 3 vectors have global numbering > > How can submit this to a specific processor? > > Ollie > > >> Barry >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR >>> Thanks in advance, >>> Ollie From bsmith at mcs.anl.gov Wed May 14 11:37:57 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 11:37:57 -0500 Subject: [petsc-users] ILUTP in PETSc In-Reply-To: <1400078912.39037.YahooMailNeo@web160203.mail.bf1.yahoo.com> References: <1399044433.78885.YahooMailNeo@web160202.mail.bf1.yahoo.com> <2E9F2B52-5BB3-46ED-8AA6-59269D205EB7@mcs.anl.gov> <1399931657.91489.YahooMailNeo@web160205.mail.bf1.yahoo.com> <8AE05934-684C-47B5-9A24-2C2B00699F46@mcs.anl.gov> <1400001446.45172.YahooMailNeo@web160206.mail.bf1.yahoo.com> <1400078912.39037.YahooMailNeo@web160203.mail.bf1.yahoo.com> Message-ID: On May 14, 2014, at 9:48 AM, Qin Lu wrote: > It turns out that I can not set PC side as right when KSP type is set to pre_only. After I fixed that, it works fine. > > This brings me a question: why do I have to set KSP type to pre_only when SuperLU's ILUTP is used as preconditioner? You don?t and you shouldn?t. Since you are using SuperLU ILU as a preconditioner, not a direct solver, using preonly means that it will apply the ILU triangular solves only once which means it will not return the correct solution. For LU you can use preonly since LU is a direct (?exact?) solver. > Can I still set KSP type as KSPBCGS (which seems to be the fastest with PETSc's ILU for my cases)? Yes. Barry > > Thanks, > Qin > > From: Barry Smith > To: Qin Lu > Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" > Sent: Tuesday, May 13, 2014 8:31 PM > Subject: Re: [petsc-users] ILUTP in PETSc > > > Works fine for me. Please please please ALWAYS cut and paste the entire error message that is printed. We print the information for a reason, because it provides clues as to what went wrong. 
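[As a concrete, hedged illustration of the point above -- not code from the thread: selecting SuperLU's ILU as the preconditioner for a BiCGStab solve in source rather than on the command line, using the petsc 3.4 names. ksp is assumed to already exist with its operators set.]

  PC pc;

  ierr = KSPSetType(ksp,KSPBCGS); CHKERRQ(ierr);                 /* a real Krylov method, not preonly */
  ierr = KSPGetPC(ksp,&pc); CHKERRQ(ierr);
  ierr = PCSetType(pc,PCILU); CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU); CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);                  /* still honors -mat_superlu_ilu_droptol etc. */

The equivalent command line is -ksp_type bcgs -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol <tol>.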
> > > ./ex10 -f0 ~/Datafiles/Matrices/arco1 -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8 -ksp_monitor_true_residual -ksp_rtol 1.e-12 -ksp_view > > 0 KSP preconditioned resid norm 2.544968580491e+03 true resid norm 7.410897708964e+00 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.467110329809e-06 true resid norm 1.439993537311e-07 ||r(i)||/||b|| 1.943075716143e-08 > 2 KSP preconditioned resid norm 1.522204461523e-12 true resid norm 2.699724724531e-11 ||r(i)||/||b|| 3.642911871885e-12 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: natural > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=1501, cols=1501 > package used to perform factorization: superlu > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU run parameters: > Equil: YES > ColPerm: 3 > IterRefine: 0 > SymmetricMode: NO > DiagPivotThresh: 0.1 > PivotGrowth: NO > ConditionNumber: NO > RowPerm: 1 > ReplaceTinyPivot: NO > PrintStat: NO > lwork: 0 > ILU_DropTol: 1e-08 > ILU_FillTol: 0.01 > ILU_FillFactor: 10 > ILU_DropRule: 9 > ILU_Norm: 2 > ILU_MILU: 0 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1501, cols=1501 > total: nonzeros=26131, allocated nonzeros=26131 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 501 nodes, limit used is 5 > Number of iterations = 2 > Residual norm 2.69972e-11 > ~/Src/petsc/src/ksp/ksp/examples/tutorials master > > On May 13, 2014, at 12:17 PM, Qin Lu wrote: > > > I tried to use command line options as the example suggested ('-ksp_type preonly -pc_type ilu -pc_factor_mat_solver_package superlu -mat_superlu_ilu_droptol 1.e-8') without changing my source code, but then the call to KSPSetUp returned error number 56. > > > > Does this mean I still need to change the source code (such as adding calls to PCFactorSetMatSolverPackage, PCFactorGetMatrix, etc.)in addition to the command line options? > > > > I ask this since the use of SuperLU seems to be different from using Hypre, which can be invoked with command line options without changing source code. > > > > Thanks a lot, > > Qin > > > > > > ----- Original Message ----- > > From: Barry Smith > > To: Qin Lu > > Cc: Xiaoye S. Li ; "petsc-users at mcs.anl.gov" > > Sent: Monday, May 12, 2014 5:11 PM > > Subject: Re: [petsc-users] ILUTP in PETSc > > > > > > See for example: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERSUPERLU.html > > > > > > > > > > > > On May 12, 2014, at 4:54 PM, Qin Lu wrote: > > > >> Hello, > >> > >> I have built PETSc with SuperLU, but what are PETSc's command line options to invoke SuperLU's ILUTP preconditioner and to set the dropping tolerance? (-mat_superlu_ilu_droptol for the latter?) 
> >> > >> Do I need to do some programming in order to call SuperLU's preconditioner, or the command line options would work? > >> > >> Many thanks, > >> Qin > >> > >> > >> From: Xiaoye S. Li > >> To: Barry Smith > >> Cc: Qin Lu ; "petsc-users at mcs.anl.gov" > >> Sent: Friday, May 2, 2014 3:40 PM > >> Subject: Re: [petsc-users] ILUTP in PETSc > >> > >> > >> > >> The sequential SuperLU has ILUTP implementation, not in parallel versions. PETSc already supports the option of using SuperLU, so you should be able to try easily. > >> > >> In SuperLU distribution: > >> > >> EXAMPLE/zitersol.c : an example to use GMRES with ILUTP preconditioner (returned from driver SRC/zgsisx.c) > >> > >> SRC/zgsitrf.c : the actual ILUTP factorization routine > >> > >> > >> Sherry Li > >> > >> > >> > >> On Fri, May 2, 2014 at 12:25 PM, Barry Smith wrote: > >> > >> > >>> At http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.htmlthere are two listed. ./configure ?download-hypre > >>> > >>> mpiexec -n 23 ./yourprogram -pc_type hypre -pc_hypre_type ilupt or euclid > >>> > >>> you can also add -help to see what options are available. > >>> > >>> Both pretty much suck and I can?t image much reason for using them. > >>> > >>> Barry > >>> > >>> > >>> > >>> On May 2, 2014, at 10:27 AM, Qin Lu wrote: > >>> > >>>> Hello, > >>>> > >>>> I am interested in using ILUTP preconditioner with PETSc linear solver. There is an online doc https://fs.hlrs.de/projects/par/par_prog_ws/pdf/petsc_nersc01_short.pdfthatmentioned it is available in PETSc with other packages (page 62-63). Is there any instructions or examples on how to use it? > >>>> > >>>> Many thanks, > >>>> Qin > >>> > >>> > > From bsmith at mcs.anl.gov Wed May 14 13:03:04 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 13:03:04 -0500 Subject: [petsc-users] petsc 3.4, mat_view and prefix problem In-Reply-To: References: Message-ID: Yes, some of this handling of prefixes and viewing is wonky in 3.4 it is all fixed and coherent in master of the development version and will be correct in the next release. Barry On May 14, 2014, at 4:02 AM, Klaij, Christiaan wrote: > I'm having problems using mat_view in petsc 3.4.3 in combination > with a prefix. For example in ../snes/examples/tutorials/ex70: > > mpiexec -n 2 ./ex70 -nx 16 -ny 24 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type lower -user_ksp -a00_mat_view > > does not print the matrix a00 to screen. This used to work in 3.3 > versions before the single consistent -xxx_view scheme. > > Similarly, if I add this at line 105 of > ../ksp/ksp/examples/tutorials/ex1f.F: > > call MatSetOptionsPrefix(A,"a_",ierr) > > then running with -mat_view still prints the matrix to screen but > running with -a_mat_view doesn't. I expected the opposite. > > The problem only occurs with mat, not with ksp. For example, if I > add this at line 184 of ../ksp/ksp/examples/tutorials/ex1f.F: > > call KSPSetOptionsPrefix(ksp,"a_",ierr) > > then running with -a_ksp_monitor does print the residuals to > screen and -ksp_monitor doesn't, as expected. > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > > MARIN > 2, Haagsteeg, P.O. 
Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > From bsmith at mcs.anl.gov Wed May 14 13:05:12 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 14 May 2014 13:05:12 -0500 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: <4ede9bde820986a689a2ba2fcb6291db@upm.es> References: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> <373F07AB-A56A-4A9A-BD2E-F854FEC16D50@mcs.anl.gov> <4ede9bde820986a689a2ba2fcb6291db@upm.es> Message-ID: <0CC4B85D-12C2-4C4B-BEFE-6E42437101E3@mcs.anl.gov> On May 14, 2014, at 12:27 PM, Oliver Browne wrote: > > > On 14-05-2014 17:34, Barry Smith wrote: >> See the manual page for MatMPIAIJSetPreallocationCSR() it gives an >> explicit simple example >> The format which is used for the sparse matrix input, is equivalent to a >> row-major ordering.. i.e for the following matrix, the input data >> expected is >> as shown: >> 1 0 0 >> 2 0 3 P0 >> ------- >> 4 5 6 P1 >> Process0 [P0]: rows_owned=[0,1] >> i = {0,1,3} [size = nrow+1 = 2+1] >> j = {0,0,2} [size = nz = 6] >> v = {1,2,3} [size = nz = 6] >> Process1 [P1]: rows_owned=[2] >> i = {0,3} [size = nrow+1 = 1+1] >> j = {0,1,2} [size = nz = 6] >> v = {4,5,6} [size = nz = 6] >> The column indices are global, the numerical values are just >> numerical values and do not need to be adjusted. On each process the i >> indices start with 0 because they just point into the local part of >> the j indices. >> Are you saying each process of yours HAS the entire matrix? > > I am not entirely sure about this and what it means. Each processor has a portion of the matrix. > > > If so >> you just need to adjust the local portion of the i vales and pass that >> plus the appropriate location in j and v to the routine as in the >> example above. > > So this MatMPIAIJSetPreallocationCSR call should be in some sort of loop; > > Do counter = 1, No of Processors > > calculate local numbering for i and isolate parts of j and v needed > > Call MatMPIAIJSetPreallocationCSR(A,i,j,v) > > END DO > > Is this correct? Oh boy, oh boy. No absolutely not. Each process is calling MatMPIAIJSetPreallocationCSR() once with its part of the data. Barry > > Ollie > >> Barry >> On May 14, 2014, at 8:36 AM, Oliver Browne wrote: >>> On 14-05-2014 15:27, Barry Smith wrote: >>>> On May 14, 2014, at 7:42 AM, Oliver Browne wrote: >>>>> Hi, >>>>> I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. >>>>> If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; >>>> What do you mean by ?separate? the vectors? Each processor >>>> needs to provide ITS rows to the function call. You cannot have >>>> processor zero deliver all the rows. >>> I mean split them so they change from global numbering to local numbering. >>> At the moment I just have >>> CALL MatMPIAIJSetPreallocationCSR(A,NVPN,NNVI,CONT,ierr) - 3 vectors have global numbering >>> How can submit this to a specific processor? 
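[To make the "each process passes only its part" point concrete, a hedged, self-contained sketch -- not Ollie's code -- of the 3x3 example from the manual page, written for exactly two MPI ranks. Note that each rank's i array starts at 0, its j indices are global, and its j/v arrays hold just that rank's nonzeros (three apiece here).]

  #include <petscmat.h>

  int main(int argc, char **argv)
  {
    Mat            A;
    PetscMPIInt    rank;
    PetscInt       nlocal;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc,&argv,NULL,NULL); if (ierr) return ierr;
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr);

    nlocal = (rank == 0) ? 2 : 1;                  /* rank 0 owns rows 0-1, rank 1 owns row 2 */
    ierr = MatCreate(PETSC_COMM_WORLD,&A); CHKERRQ(ierr);
    ierr = MatSetSizes(A,nlocal,PETSC_DECIDE,3,3); CHKERRQ(ierr);
    ierr = MatSetType(A,MATMPIAIJ); CHKERRQ(ierr);

    if (rank == 0) {
      PetscInt    i[] = {0,1,3};                   /* local row pointers, always starting at 0 */
      PetscInt    j[] = {0,0,2};                   /* GLOBAL column indices of this rank's rows */
      PetscScalar v[] = {1,2,3};
      ierr = MatMPIAIJSetPreallocationCSR(A,i,j,v); CHKERRQ(ierr);
    } else {
      PetscInt    i[] = {0,3};
      PetscInt    j[] = {0,1,2};
      PetscScalar v[] = {4,5,6};
      ierr = MatMPIAIJSetPreallocationCSR(A,i,j,v); CHKERRQ(ierr);
    }

    /* harmless if the preallocation call already left the matrix assembled */
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

    ierr = MatView(A,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
    ierr = MatDestroy(&A); CHKERRQ(ierr);
    ierr = PetscFinalize();
    return 0;
  }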
>>> Ollie >>>> Barry >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR >>>>> Thanks in advance, >>>>> Ollie From jed at jedbrown.org Wed May 14 19:08:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 14 May 2014 18:08:21 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: Message-ID: <87zjij6gzu.fsf@jedbrown.org> Christophe Ortiz writes: > Hi all, > > I am experiencing some problems of memory corruption with PetscMemzero(). > > I set the values of the Jacobian by blocks using MatSetValuesBlocked(). To > do so, I use some temporary two-dimensional arrays[dof][dof] that I must > reset at each loop. > > Inside FormIJacobian, for instance, I declare the following two-dimensional > array: > > PetscScalar diag[dof][dof]; > > and then, to zero the array diag[][] I do > > ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); Note that this can also be spelled PetscMemzero(diag,sizeof diag); > Then, inside main(), once dof is determined, I allocate memory for diag as > follows: > > diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > > for (k = 0; k < dof; k++){ > diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > } > That is, the classical way to allocate memory using the pointer notation. Note that you can do a contiguous allocation by creating a Vec, then use VecGetArray2D to get 2D indexing of it. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jed at jedbrown.org Wed May 14 19:58:23 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 14 May 2014 18:58:23 -0600 Subject: [petsc-users] Memory usage during matrix factorization In-Reply-To: <17A78B9D13564547AC894B88C159674720382707@XMBX4.uibk.ac.at> References: <17A78B9D13564547AC894B88C1596747203826DE@XMBX4.uibk.ac.at> <87vbt98ev4.fsf@jedbrown.org> <17A78B9D13564547AC894B88C159674720382707@XMBX4.uibk.ac.at> Message-ID: <87r43v6eog.fsf@jedbrown.org> "De Groof, Vincent Frans Maria" writes: > Thanks. I made a new function based on the PetscGetCurrentUsage which > does what I want. It seems like I am being lucky as the numbers > returned by the OS seem to be reasonable. For example, Blue Gene/Q wants us to use Kernel_GetMemorySize, which resides in a header with inline assembly. So we need compiler flags that support the inline assembly just to access the function. getrusage() is useless on BG/Q and some other HPC systems. If it works on your system, you should think the vendor for doing something reasonable with the useful POSIX function. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From mairhofer at itt.uni-stuttgart.de Thu May 15 09:39:00 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Thu, 15 May 2014 16:39:00 +0200 Subject: [petsc-users] PetscMalloc with Fortran Message-ID: <5374D184.5070207@itt.uni-stuttgart.de> Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. Therefore I need an array 'colors' which in C can be creates as (from example ex5s.c) int *colors PetscMalloc(...,&colors) colors(i) = .... ISColoringCreate(...) How do I have to define the array colors in Fortran? 
I tried: Integer, allocatable :: colors(:) and allocate() instead of PetscMalloc and Integer, pointer :: colors but neither worked. Thanks, Jonas From jed at jedbrown.org Thu May 15 09:45:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 15 May 2014 08:45:21 -0600 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: <5374D184.5070207@itt.uni-stuttgart.de> References: <5374D184.5070207@itt.uni-stuttgart.de> Message-ID: <87ha4r3xtq.fsf@jedbrown.org> Jonas Mairhofer writes: > Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. > Therefore I need an array 'colors' which in C can be creates as (from > example ex5s.c) > > int *colors > PetscMalloc(...,&colors) There is no PetscMalloc in Fortran, due to language "deficiencies". > colors(i) = .... > > ISColoringCreate(...) > > How do I have to define the array colors in Fortran? > > I tried: > > Integer, allocatable :: colors(:) and allocate() instead of > PetscMalloc > > and > > Integer, pointer :: colors > > but neither worked. The ISColoringCreate Fortran binding copies from the array you pass into one allocated using PetscMalloc. You should pass a normal Fortran array (statically or dynamically allocated). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From mairhofer at itt.uni-stuttgart.de Thu May 15 11:56:44 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Thu, 15 May 2014 18:56:44 +0200 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: <87ha4r3xtq.fsf@jedbrown.org> References: <5374D184.5070207@itt.uni-stuttgart.de> <87ha4r3xtq.fsf@jedbrown.org> Message-ID: <5374F1CC.3080906@itt.uni-stuttgart.de> If 'colors' can be a dynamically allocated array then I dont know where the mistake is in this code: ISColoring iscoloring Integer, allocatable :: colors(:) PetscInt maxc ... !calculate max. number of colors maxc = 2*irc+1 !irc is the number of ghost nodes needed to calculate the function I want to solve allocate(colors(user%xm)) !where user%xm is the number of locally owned nodes of a global array !Set colors DO i=1,user%xm colors(i) = mod(i,maxc) END DO call ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) ... deallocate(colors) call ISColoringDestroy(iscoloring,ierr) On execution I get the following error message (running the DO Loop from 0 to user%xm-1 does not change anything): [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Arguments are incompatible! [0]PETSC ERROR: Number of colors passed in 291 is less then the actual number of colors in array 61665! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named aries.itt.uni-stuttgart.de by mhofer Thu May 15 18:01:41 2014 [0]PETSC ERROR: Libraries linked from /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack --download-mpich [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ISColoringCreate() line 276 in /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c But when I print out colors, it only has entries from 0 to 218, so no entry is larger then 291 as stated in the error message. Am 15.05.2014 16:45, schrieb Jed Brown: > Jonas Mairhofer writes: > >> Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. >> Therefore I need an array 'colors' which in C can be creates as (from >> example ex5s.c) >> >> int *colors >> PetscMalloc(...,&colors) > There is no PetscMalloc in Fortran, due to language "deficiencies". > >> colors(i) = .... >> >> ISColoringCreate(...) >> >> How do I have to define the array colors in Fortran? >> >> I tried: >> >> Integer, allocatable :: colors(:) and allocate() instead of >> PetscMalloc >> >> and >> >> Integer, pointer :: colors >> >> but neither worked. > The ISColoringCreate Fortran binding copies from the array you pass into > one allocated using PetscMalloc. You should pass a normal Fortran array > (statically or dynamically allocated). From prbrune at gmail.com Thu May 15 12:16:27 2014 From: prbrune at gmail.com (Peter Brune) Date: Thu, 15 May 2014 12:16:27 -0500 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: <5374F1CC.3080906@itt.uni-stuttgart.de> References: <5374D184.5070207@itt.uni-stuttgart.de> <87ha4r3xtq.fsf@jedbrown.org> <5374F1CC.3080906@itt.uni-stuttgart.de> Message-ID: You should be using an array of type ISColoringValue. ISColoringValue is by default a short, not an int, so you're getting nonsense entries. We should either maintain or remove ex5s if it does something like this. - Peter On Thu, May 15, 2014 at 11:56 AM, Jonas Mairhofer < mairhofer at itt.uni-stuttgart.de> wrote: > > If 'colors' can be a dynamically allocated array then I dont know where > the mistake is in this code: > > > > > > ISColoring iscoloring > Integer, allocatable :: colors(:) > PetscInt maxc > > ... > > > !calculate max. number of colors > maxc = 2*irc+1 !irc is the number of ghost nodes needed to > calculate the function I want to solve > > allocate(colors(user%xm)) !where user%xm is the number of locally > owned nodes of a global array > > !Set colors > DO i=1,user%xm > colors(i) = mod(i,maxc) > END DO > > call > ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) > > ... > > deallocate(colors) > call ISColoringDestroy(iscoloring,ierr) > > > > > On execution I get the following error message (running the DO Loop from > 0 to user%xm-1 does not change anything): > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Arguments are incompatible! > [0]PETSC ERROR: Number of colors passed in 291 is less then the actual > number of colors in array 61665! 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named > aries.itt.uni-stuttgart.de by mhofer Thu May 15 18:01:41 2014 > [0]PETSC ERROR: Libraries linked from > /usr/ITT/mhofer/Documents/Diss/NumericalMethods/ > Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack --download-mpich > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ISColoringCreate() line 276 in > /usr/ITT/mhofer/Documents/Diss/NumericalMethods/ > Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c > > > > > > But when I print out colors, it only has entries from 0 to 218, so no > entry is larger then 291 as stated in the error message. > > > > > > > > > > > Am 15.05.2014 16:45, schrieb Jed Brown: > > Jonas Mairhofer writes: >> >> Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. >>> Therefore I need an array 'colors' which in C can be creates as (from >>> example ex5s.c) >>> >>> int *colors >>> PetscMalloc(...,&colors) >>> >> There is no PetscMalloc in Fortran, due to language "deficiencies". >> >> colors(i) = .... >>> >>> ISColoringCreate(...) >>> >>> How do I have to define the array colors in Fortran? >>> >>> I tried: >>> >>> Integer, allocatable :: colors(:) and allocate() instead of >>> PetscMalloc >>> >>> and >>> >>> Integer, pointer :: colors >>> >>> but neither worked. >>> >> The ISColoringCreate Fortran binding copies from the array you pass into >> one allocated using PetscMalloc. You should pass a normal Fortran array >> (statically or dynamically allocated). >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrosso at uci.edu Thu May 15 17:15:23 2014 From: mrosso at uci.edu (Michele Rosso) Date: Thu, 15 May 2014 15:15:23 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid Message-ID: <53753C7B.8010201@uci.edu> Hi, I am solving an inhomogeneous Laplacian in 3D (basically a slightly modified version of example ex34). The laplacian is discretized by using a cell-center finite difference 7-point stencil with periodic BCs. I am solving a time-dependent problem so the solution of the laplacian is repeated at each time step with a different matrix (always SPD though) and rhs. Also, the laplacian features large magnitude variations in the coefficients. I solve by means of CG + GAMG as preconditioner. Everything works fine for a while until I receive a DIVERGED_INDEFINITE_PC message. Before checking my model is incorrect I would like to rule out the possibility of improper use of the linear solver. I attached the full output of a serial run with -log-summary -ksp_view -ksp_converged_reason ksp_monitor_true_residual. I would appreciate if you could help me in locating the issue. Thanks. 
Michele -------------- next part -------------- 0 KSP unpreconditioned resid norm 1.436519531784e-03 true resid norm 1.436519531784e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.442655469101e-04 true resid norm 5.442655469101e-04 ||r(i)||/||b|| 3.788779302111e-01 2 KSP unpreconditioned resid norm 9.032635039164e-05 true resid norm 9.032635039164e-05 ||r(i)||/||b|| 6.287860930055e-02 3 KSP unpreconditioned resid norm 2.083274324922e-05 true resid norm 2.083274324922e-05 ||r(i)||/||b|| 1.450223459430e-02 4 KSP unpreconditioned resid norm 3.472803766647e-06 true resid norm 3.472803766647e-06 ||r(i)||/||b|| 2.417512390058e-03 5 KSP unpreconditioned resid norm 5.985774054401e-07 true resid norm 5.985774054401e-07 ||r(i)||/||b|| 4.166858801403e-04 6 KSP unpreconditioned resid norm 1.076601847506e-07 true resid norm 1.076601847507e-07 ||r(i)||/||b|| 7.494515902407e-05 7 KSP unpreconditioned resid norm 1.899550823170e-08 true resid norm 1.899550823182e-08 ||r(i)||/||b|| 1.322328573440e-05 8 KSP unpreconditioned resid norm 3.253138971746e-09 true resid norm 3.253138971800e-09 ||r(i)||/||b|| 2.264597800324e-06 9 KSP unpreconditioned resid norm 5.542615532199e-10 true resid norm 5.542615531507e-10 ||r(i)||/||b|| 3.858364198240e-07 Linear solve converged due to CONVERGED_RTOL iterations 9 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 
------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 1 **************************************** Simulation time = 0.0391 sec Time/time step = 28.3848 sec Average time/time step = 28.3848 sec U MAX = 0.000000000000000E+00 V MAX = 0.000000000000000E+00 W MAX = 0.000000000000000E+00 U MIN = 0.000000000000000E+00 V MIN = 0.000000000000000E+00 W MIN = -2.393531935714555E-04 U MAX = 0.000000000000000E+00 V MAX = 0.000000000000000E+00 W MAX = 2.393531935714555E-04 max(|divU|) = 1.519695631721842E-03 sum(divU*dV) = -5.185694521768437E-19 sum(|divU|*dV) = 7.171332934751148E-04 Convective cfl = 1.531860438857315E-03 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.222653406554768E-01 Iterations to convergence = 9 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.387273402621e-03 true resid norm 1.387273402621e-03 ||r(i)||/||b|| 4.913739999200e-01 1 KSP unpreconditioned resid norm 5.438934302605e-04 true resid norm 5.438934302605e-04 ||r(i)||/||b|| 1.926477433016e-01 2 KSP unpreconditioned resid norm 8.987734737919e-05 true resid norm 8.987734737919e-05 ||r(i)||/||b|| 3.183467051302e-02 3 KSP unpreconditioned resid norm 2.082485764685e-05 true resid norm 2.082485764685e-05 ||r(i)||/||b|| 7.376191009187e-03 4 KSP unpreconditioned resid norm 3.456700428092e-06 true resid norm 3.456700428091e-06 ||r(i)||/||b|| 1.224367678835e-03 5 KSP unpreconditioned resid norm 5.978250288791e-07 true resid norm 5.978250288791e-07 ||r(i)||/||b|| 2.117503839817e-04 6 KSP unpreconditioned resid norm 1.072323731238e-07 true resid norm 1.072323731239e-07 ||r(i)||/||b|| 3.798184265859e-05 7 KSP unpreconditioned resid norm 1.896537313626e-08 true resid norm 1.896537313628e-08 ||r(i)||/||b|| 6.717559235506e-06 8 KSP unpreconditioned resid norm 3.238581391500e-09 true resid norm 3.238581391409e-09 
||r(i)||/||b|| 1.147109639208e-06 9 KSP unpreconditioned resid norm 5.535175902890e-10 true resid norm 5.535175902904e-10 ||r(i)||/||b|| 1.960566329991e-07 Linear solve converged due to CONVERGED_RTOL iterations 9 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear 
**************************************** TIME STEP = 2 ****************************************
 Simulation time = 0.0781 sec   Time/time step = 5.6450 sec   Average time/time step = 17.0149 sec
 U MAX = 2.461642558262655E-06   V MAX = 2.461642558262655E-06   W MAX = 0.000000000000000E+00
 U MIN = -2.461642558262655E-06   V MIN = -2.461642558262655E-06   W MIN = -4.787063871429022E-04
 U MAX = 2.461642558262655E-06   V MAX = 2.461642558262655E-06   W MAX = 4.787063871429022E-04
 max(|divU|) = 2.991458565354511E-03   sum(divU*dV) = -3.567187440313492E-19   sum(|divU|*dV) = 1.437560546432204E-03
 Convective cfl = 3.095229902460337E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.215998719111754E-01
 Iterations to convergence = 9
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.358918535350e-03 true resid norm 1.358918535350e-03 ||r(i)||/||b|| 3.249904384726e-01
  1 KSP unpreconditioned resid norm 5.438197237016e-04 true resid norm 5.438197237016e-04 ||r(i)||/||b|| 1.300565161621e-01
  2 KSP unpreconditioned resid norm 8.964805696977e-05 true resid norm 8.964805696977e-05 ||r(i)||/||b|| 2.143966734202e-02
  3 KSP unpreconditioned resid norm 2.082401686586e-05 true resid norm 2.082401686586e-05 ||r(i)||/||b|| 4.980141337354e-03
  4 KSP unpreconditioned resid norm 3.448023436879e-06 true resid norm 3.448023436880e-06 ||r(i)||/||b|| 8.246076710745e-04
  5 KSP unpreconditioned resid norm 5.975866292463e-07 true resid norm 5.975866292461e-07 ||r(i)||/||b|| 1.429150722519e-04
  6 KSP unpreconditioned resid norm 1.070083763305e-07 true resid norm 1.070083763303e-07 ||r(i)||/||b|| 2.559145249634e-05
  7 KSP unpreconditioned resid norm 1.895097623737e-08 true resid norm 1.895097623731e-08 ||r(i)||/||b|| 4.532196681870e-06
  8 KSP unpreconditioned resid norm 3.230731653657e-09 true resid norm 3.230731653621e-09 ||r(i)||/||b|| 7.726415302937e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 3 ****************************************
 Simulation time = 0.1172 sec   Time/time step = 5.2815 sec   Average time/time step = 13.1038 sec
 U MAX = 6.252436586750240E-06   V MAX = 6.252436586750240E-06   W MAX = 0.000000000000000E+00
 U MIN = -6.252436586750240E-06   V MIN = -6.252436586750240E-06   W MIN = -7.180595806250817E-04
 U MAX = 6.252436586750240E-06   V MAX = 6.252436586750240E-06   W MAX = 7.180595806250817E-04
 max(|divU|) = 4.445068216431675E-03   sum(divU*dV) = 8.811539788548160E-18   sum(|divU|*dV) = 2.160108093875001E-03
 Convective cfl = 4.675612504310926E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.209274045579312E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.332134890432e-03 true resid norm 1.332134890432e-03 ||r(i)||/||b|| 2.416626272755e-01
  1 KSP unpreconditioned resid norm 5.438184509314e-04 true resid norm 5.438184509314e-04 ||r(i)||/||b|| 9.865412020722e-02
  2 KSP unpreconditioned resid norm 8.944495318173e-05 true resid norm 8.944495318173e-05 ||r(i)||/||b|| 1.622621142774e-02
  3 KSP unpreconditioned resid norm 2.082458866812e-05 true resid norm 2.082458866812e-05 ||r(i)||/||b|| 3.777789205592e-03
  4 KSP unpreconditioned resid norm 3.440115169336e-06 true resid norm 3.440115169336e-06 ||r(i)||/||b|| 6.240713879072e-04
  5 KSP unpreconditioned resid norm 5.974692409002e-07 true resid norm 5.974692409003e-07 ||r(i)||/||b|| 1.083869114977e-04
  6 KSP unpreconditioned resid norm 1.068060495192e-07 true resid norm 1.068060495197e-07 ||r(i)||/||b|| 1.937568839403e-05
  7 KSP unpreconditioned resid norm 1.893864251792e-08 true resid norm 1.893864251800e-08 ||r(i)||/||b|| 3.435659662397e-06
  8 KSP unpreconditioned resid norm 3.223578380294e-09 true resid norm 3.223578380319e-09 ||r(i)||/||b|| 5.847894430296e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 4 ****************************************
 Simulation time = 0.1562 sec   Time/time step = 5.2840 sec   Average time/time step = 11.1488 sec
 U MAX = 1.130662420195935E-05   V MAX = 1.130662420195935E-05   W MAX = 0.000000000000000E+00
 U MIN = -1.130662420195935E-05   V MIN = -1.130662420195935E-05   W MIN = -9.574127740050222E-04
 U MAX = 1.130662420195935E-05   V MAX = 1.130662420195935E-05   W MAX = 9.574127740050222E-04
 max(|divU|) = 5.875355255447298E-03   sum(divU*dV) = -6.773730321252588E-19   sum(|divU|*dV) = 2.884755678217677E-03
 Convective cfl = 6.272166543417223E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.202483055484767E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.307223362909e-03 true resid norm 1.307223362909e-03 ||r(i)||/||b|| 1.917339325704e-01
  1 KSP unpreconditioned resid norm 5.438982564847e-04 true resid norm 5.438982564847e-04 ||r(i)||/||b|| 7.977500601115e-02
  2 KSP unpreconditioned resid norm 8.927299097308e-05 true resid norm 8.927299097308e-05 ||r(i)||/||b|| 1.309390737441e-02
  3 KSP unpreconditioned resid norm 2.082704964639e-05 true resid norm 2.082704964639e-05 ||r(i)||/||b|| 3.054758846763e-03
  4 KSP unpreconditioned resid norm 3.433123136852e-06 true resid norm 3.433123136852e-06 ||r(i)||/||b|| 5.035453149813e-04
  5 KSP unpreconditioned resid norm 5.974837111114e-07 true resid norm 5.974837111118e-07 ||r(i)||/||b|| 8.763452737203e-05
  6 KSP unpreconditioned resid norm 1.066300405697e-07 true resid norm 1.066300405697e-07 ||r(i)||/||b|| 1.563971207115e-05
  7 KSP unpreconditioned resid norm 1.892883581926e-08 true resid norm 1.892883581947e-08 ||r(i)||/||b|| 2.776342768669e-06
  8 KSP unpreconditioned resid norm 3.217263431375e-09 true resid norm 3.217263430635e-09 ||r(i)||/||b|| 4.718845969046e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
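The per-step lines "Iterations to convergence = N" and "Linear solve converged due to CONVERGED_RTOL iterations N" suggest the application queries the KSP after each solve. A hedged sketch of how that is typically done with the standard PETSc query calls follows; the function and variable names are illustrative and not taken from the code that produced this log.

/* Sketch only: report iteration count and convergence reason after a solve. */
#include <petscksp.h>

static PetscErrorCode ReportSolve(KSP ksp, Vec b, Vec x)
{
  PetscErrorCode     ierr;
  KSPConvergedReason reason;
  PetscInt           its;
  PetscReal          rnorm;

  PetscFunctionBegin;
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPGetConvergedReason(ksp,&reason);CHKERRQ(ierr);  /* e.g. KSP_CONVERGED_RTOL */
  ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);     /* the "iterations 8" seen above */
  ierr = KSPGetResidualNorm(ksp,&rnorm);CHKERRQ(ierr);      /* final residual norm used by the convergence test */
  ierr = PetscPrintf(PETSC_COMM_WORLD," Iterations to convergence = %D\n",its);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}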
**************************************** TIME STEP = 5 ****************************************
 Simulation time = 0.1953 sec   Time/time step = 5.2903 sec   Average time/time step = 9.9771 sec
 U MAX = 1.751966015931765E-05   V MAX = 1.751966015931765E-05   W MAX = 0.000000000000000E+00
 U MIN = -1.751966015931765E-05   V MIN = -1.751966015931765E-05   W MIN = -1.196765967264871E-03
 U MAX = 1.751966015931765E-05   V MAX = 1.751966015931765E-05   W MAX = 1.196765967264871E-03
 max(|divU|) = 7.282771149617234E-03   sum(divU*dV) = 1.540113958245672E-18   sum(|divU|*dV) = 3.611478531740617E-03
 Convective cfl = 7.883553840534440E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.195631525122878E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.284025180478e-03 true resid norm 1.284025180478e-03 ||r(i)||/||b|| 1.585293618291e-01
  1 KSP unpreconditioned resid norm 5.440529475024e-04 true resid norm 5.440529475024e-04 ||r(i)||/||b|| 6.717030777910e-02
  2 KSP unpreconditioned resid norm 8.912993213531e-05 true resid norm 8.912993213531e-05 ||r(i)||/||b|| 1.100423221921e-02
  3 KSP unpreconditioned resid norm 2.083128820889e-05 true resid norm 2.083128820889e-05 ||r(i)||/||b|| 2.571889458278e-03
  4 KSP unpreconditioned resid norm 3.426985329255e-06 true resid norm 3.426985329255e-06 ||r(i)||/||b|| 4.231052517541e-04
  5 KSP unpreconditioned resid norm 5.976212412688e-07 true resid norm 5.976212412689e-07 ||r(i)||/||b|| 7.378400006037e-05
  6 KSP unpreconditioned resid norm 1.064784783066e-07 true resid norm 1.064784783067e-07 ||r(i)||/||b|| 1.314613254563e-05
  7 KSP unpreconditioned resid norm 1.892141014094e-08 true resid norm 1.892141014064e-08 ||r(i)||/||b|| 2.336090537872e-06
  8 KSP unpreconditioned resid norm 3.211733810102e-09 true resid norm 3.211733810296e-09 ||r(i)||/||b|| 3.965296935390e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 6 ****************************************
 Simulation time = 0.2344 sec   Time/time step = 5.2861 sec   Average time/time step = 9.1953 sec
 U MAX = 2.479691939489266E-05   V MAX = 2.479691939489266E-05   W MAX = 0.000000000000000E+00
 U MIN = -2.479691939489266E-05   V MIN = -2.479691939489266E-05   W MIN = -1.436119160392139E-03
 U MAX = 2.479691939489266E-05   V MAX = 2.479691939489266E-05   W MAX = 1.436119160392139E-03
 max(|divU|) = 8.667778929815951E-03   sum(divU*dV) = 1.929412357010522E-17   sum(|divU|*dV) = 4.340254834708106E-03
 Convective cfl = 9.508563194764315E-03   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.188724683969847E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.262404152977e-03 true resid norm 1.262404152977e-03 ||r(i)||/||b|| 1.348871940068e-01
  1 KSP unpreconditioned resid norm 5.442761720331e-04 true resid norm 5.442761720331e-04 ||r(i)||/||b|| 5.815561160596e-02
  2 KSP unpreconditioned resid norm 8.901362921075e-05 true resid norm 8.901362921075e-05 ||r(i)||/||b|| 9.511057646855e-03
  3 KSP unpreconditioned resid norm 2.083716594647e-05 true resid norm 2.083716594647e-05 ||r(i)||/||b|| 2.226439796593e-03
  4 KSP unpreconditioned resid norm 3.421634895428e-06 true resid norm 3.421634895429e-06 ||r(i)||/||b|| 3.655998191003e-04
  5 KSP unpreconditioned resid norm 5.978718503849e-07 true resid norm 5.978718503845e-07 ||r(i)||/||b|| 6.388228055477e-05
  6 KSP unpreconditioned resid norm 1.063494715136e-07 true resid norm 1.063494715136e-07 ||r(i)||/||b|| 1.136338292514e-05
  7 KSP unpreconditioned resid norm 1.891618348142e-08 true resid norm 1.891618348117e-08 ||r(i)||/||b|| 2.021183869741e-06
  8 KSP unpreconditioned resid norm 3.206931177866e-09 true resid norm 3.206931177846e-09 ||r(i)||/||b|| 3.426588441841e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 7 ****************************************
 Simulation time = 0.2734 sec   Time/time step = 5.2908 sec   Average time/time step = 8.6375 sec
 U MAX = 3.305085009380369E-05   V MAX = 3.305085009380369E-05   W MAX = 0.000000000000000E+00
 U MIN = -3.305085009380369E-05   V MIN = -3.305085009380369E-05   W MIN = -1.675472353378616E-03
 U MAX = 3.305085009380369E-05   V MAX = 3.305085009380369E-05   W MAX = 1.675472353378616E-03
 max(|divU|) = 1.003084665173450E-02   sum(divU*dV) = 1.650583198534465E-17   sum(|divU|*dV) = 5.071065394490515E-03
 Convective cfl = 1.114607394282383E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.181767370247083E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.242229955292e-03 true resid norm 1.242229955292e-03 ||r(i)||/||b|| 1.172208952731e-01
  1 KSP unpreconditioned resid norm 5.445606391168e-04 true resid norm 5.445606391168e-04 ||r(i)||/||b|| 5.138652902053e-02
  2 KSP unpreconditioned resid norm 8.892190380460e-05 true resid norm 8.892190380460e-05 ||r(i)||/||b|| 8.390962662720e-03
  3 KSP unpreconditioned resid norm 2.084447741222e-05 true resid norm 2.084447741222e-05 ||r(i)||/||b|| 1.966953295042e-03
  4 KSP unpreconditioned resid norm 3.416997089687e-06 true resid norm 3.416997089688e-06 ||r(i)||/||b|| 3.224390591231e-04
  5 KSP unpreconditioned resid norm 5.982254535369e-07 true resid norm 5.982254535377e-07 ||r(i)||/||b|| 5.645051702395e-05
  6 KSP unpreconditioned resid norm 1.062409410453e-07 true resid norm 1.062409410446e-07 ||r(i)||/||b|| 1.002524385348e-05
  7 KSP unpreconditioned resid norm 1.891292046742e-08 true resid norm 1.891292046669e-08 ||r(i)||/||b|| 1.784685242769e-06
  8 KSP unpreconditioned resid norm 3.202788322478e-09 true resid norm 3.202788322675e-09 ||r(i)||/||b|| 3.022256168875e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 8 ****************************************
 Simulation time = 0.3125 sec   Time/time step = 5.2868 sec   Average time/time step = 8.2187 sec
 U MAX = 4.220050557427550E-05   V MAX = 4.220050557427550E-05   W MAX = 0.000000000000000E+00
 U MIN = -4.220050557427550E-05   V MIN = -4.220050557427550E-05   W MIN = -1.914825546219209E-03
 U MAX = 4.220050557427550E-05   V MAX = 4.220050557427550E-05   W MAX = 1.914825546219209E-03
 max(|divU|) = 1.137244546422055E-02   sum(divU*dV) = 2.000003067929342E-18   sum(|divU|*dV) = 5.803893153988580E-03
 Convective cfl = 1.279504996715366E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.174764056878764E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.223379873944e-03 true resid norm 1.223379873944e-03 ||r(i)||/||b|| 1.035358121775e-01
  1 KSP unpreconditioned resid norm 5.448986744158e-04 true resid norm 5.448986744158e-04 ||r(i)||/||b|| 4.611529747356e-02
  2 KSP unpreconditioned resid norm 8.885276850463e-05 true resid norm 8.885276850464e-05 ||r(i)||/||b|| 7.519695024645e-03
  3 KSP unpreconditioned resid norm 2.085297091140e-05 true resid norm 2.085297091140e-05 ||r(i)||/||b|| 1.764806930055e-03
  4 KSP unpreconditioned resid norm 3.412995973201e-06 true resid norm 3.412995973200e-06 ||r(i)||/||b|| 2.888451229009e-04
  5 KSP unpreconditioned resid norm 5.986734251487e-07 true resid norm 5.986734251486e-07 ||r(i)||/||b|| 5.066630620792e-05
  6 KSP unpreconditioned resid norm 1.061508214252e-07 true resid norm 1.061508214263e-07 ||r(i)||/||b|| 8.983645835408e-06
  7 KSP unpreconditioned resid norm 1.891136234479e-08 true resid norm 1.891136234533e-08 ||r(i)||/||b|| 1.600486734750e-06
  8 KSP unpreconditioned resid norm 3.199236033375e-09 true resid norm 3.199236033557e-09 ||r(i)||/||b|| 2.707544141740e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 9 ****************************************
 Simulation time = 0.3516 sec   Time/time step = 5.2901 sec   Average time/time step = 7.8933 sec
 U MAX = 5.217110505333921E-05   V MAX = 5.217110505333921E-05   W MAX = 0.000000000000000E+00
 U MIN = -5.217110505333921E-05   V MIN = -5.217110505333921E-05   W MIN = -2.154178738910741E-03
 U MAX = 5.217110505333921E-05   V MAX = 5.217110505333921E-05   W MAX = 2.154178738910741E-03
 max(|divU|) = 1.269304839536958E-02   sum(divU*dV) = -2.694000336582297E-17   sum(|divU|*dV) = 6.538722584835714E-03
 Convective cfl = 1.445453407371149E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.167718875855106E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.205741071227e-03 true resid norm 1.205741071227e-03 ||r(i)||/||b|| 9.263443341308e-02
  1 KSP unpreconditioned resid norm 5.452830457760e-04 true resid norm 5.452830457760e-04 ||r(i)||/||b|| 4.189289657671e-02
  2 KSP unpreconditioned resid norm 8.880461896504e-05 true resid norm 8.880461896504e-05 ||r(i)||/||b|| 6.822663471120e-03
  3 KSP unpreconditioned resid norm 2.086238471379e-05 true resid norm 2.086238471379e-05 ||r(i)||/||b|| 1.602811112373e-03
  4 KSP unpreconditioned resid norm 3.409561410338e-06 true resid norm 3.409561410337e-06 ||r(i)||/||b|| 2.619491008233e-04
  5 KSP unpreconditioned resid norm 5.992097002504e-07 true resid norm 5.992097002501e-07 ||r(i)||/||b|| 4.603596278080e-05
  6 KSP unpreconditioned resid norm 1.060772787794e-07 true resid norm 1.060772787798e-07 ||r(i)||/||b|| 8.149683918264e-06
  7 KSP unpreconditioned resid norm 1.891126501641e-08 true resid norm 1.891126501618e-08 ||r(i)||/||b|| 1.452910879212e-06
  8 KSP unpreconditioned resid norm 3.196215804033e-09 true resid norm 3.196215804331e-09 ||r(i)||/||b|| 2.455582273554e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
**************************************** TIME STEP = 10 ****************************************
 Simulation time = 0.3906 sec   Time/time step = 5.2867 sec   Average time/time step = 7.6326 sec
 U MAX = 6.289364203444345E-05   V MAX = 6.289364203444345E-05   W MAX = 0.000000000000000E+00
 U MIN = -6.289364203444345E-05   V MIN = -6.289364203444345E-05   W MIN = -2.393531931450745E-03
 U MAX = 6.289364203444345E-05   V MAX = 6.289364203444345E-05   W MAX = 2.393531931450745E-03
 max(|divU|) = 1.399312989607586E-02   sum(divU*dV) = 1.953489400469769E-17   sum(|divU|*dV) = 7.275539118175428E-03
 Convective cfl = 1.612364297932565E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
 DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.160635639983364E-01
 Iterations to convergence = 8
 Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.189212180809e-03 true resid norm 1.189212180809e-03 ||r(i)||/||b|| 8.375462057382e-02
  1 KSP unpreconditioned resid norm 5.457076531056e-04 true resid norm 5.457076531056e-04 ||r(i)||/||b|| 3.843345886265e-02
  2 KSP unpreconditioned resid norm 8.877633152293e-05 true resid norm 8.877633152293e-05 ||r(i)||/||b|| 6.252398085579e-03
  3 KSP unpreconditioned resid norm 2.087248091046e-05 true resid norm 2.087248091046e-05 ||r(i)||/||b|| 1.470020865327e-03
  4 KSP unpreconditioned resid norm 3.406633345812e-06 true resid norm 3.406633345813e-06 ||r(i)||/||b|| 2.399246222979e-04
  5 KSP unpreconditioned resid norm 5.998310926621e-07 true resid norm 5.998310926614e-07 ||r(i)||/||b|| 4.224530019533e-05
  6 KSP unpreconditioned resid norm 1.060188550981e-07 true resid norm 1.060188550980e-07 ||r(i)||/||b|| 7.466765919236e-06
  7 KSP unpreconditioned resid norm 1.891242496403e-08 true resid norm 1.891242496489e-08 ||r(i)||/||b|| 1.331976751186e-06
  8 KSP unpreconditioned resid norm 3.193674607405e-09 true resid norm 3.193674607819e-09 ||r(i)||/||b|| 2.249262237056e-07
 Linear solve converged due to CONVERGED_RTOL iterations 8
 [KSP/PC view identical to the first one shown above; omitted]
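The finest-level operator in the solver view above reports "has attached null space", which is consistent with a singular pressure Poisson operator whose null space is the constant vector. A hedged sketch of attaching such a constant null space follows; it is not taken from the code that produced this log, the function name is illustrative, and the exact call to hand the null space to the solver depends on the PETSc version.

/* Sketch only: attach the constant null space of a singular (all-Neumann)
   Poisson operator so that the Krylov solve is kept orthogonal to it. */
#include <petscksp.h>

static PetscErrorCode AttachConstantNullSpace(Mat A)
{
  PetscErrorCode ierr;
  MatNullSpace   nullsp;

  PetscFunctionBegin;
  ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,NULL,&nullsp);CHKERRQ(ierr);
  ierr = MatSetNullSpace(A,nullsp);CHKERRQ(ierr);
  /* Newer PETSc releases pick the Mat-attached null space up in KSP automatically;
     releases from around the time of this log would instead call
     KSPSetNullSpace(ksp,nullsp) on the KSP that owns this operator. */
  ierr = MatNullSpaceDestroy(&nullsp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}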
Average time/time step = 7.4199 sec U MAX = 7.430448596718565E-05 V MAX = 7.430448596718565E-05 W MAX = 2.709060511180018E-21 U MIN = -7.430448596718565E-05 V MIN = -7.430448596718565E-05 W MIN = -2.632885123836222E-03 U MAX = 7.430448596718565E-05 V MAX = 7.430448596718565E-05 W MAX = 2.632885123836222E-03 max(|divU|) = 1.527316561904287E-02 sum(divU*dV) = -1.972843168610879E-17 sum(|divU|*dV) = 8.014328726644185E-03 Convective cfl = 1.780156221293180E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.153517864980171E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.173703535188e-03 true resid norm 1.173703535188e-03 ||r(i)||/||b|| 7.638845186070e-02 1 KSP unpreconditioned resid norm 5.461678657327e-04 true resid norm 5.461678657327e-04 ||r(i)||/||b|| 3.554638498442e-02 2 KSP unpreconditioned resid norm 8.876725102317e-05 true resid norm 8.876725102317e-05 ||r(i)||/||b|| 5.777262041306e-03 3 KSP unpreconditioned resid norm 2.088306633812e-05 true resid norm 2.088306633812e-05 ||r(i)||/||b|| 1.359138027490e-03 4 KSP unpreconditioned resid norm 3.404162327555e-06 true resid norm 3.404162327556e-06 ||r(i)||/||b|| 2.215539804460e-04 5 KSP unpreconditioned resid norm 6.005368354461e-07 true resid norm 6.005368354463e-07 ||r(i)||/||b|| 3.908489475386e-05 6 KSP unpreconditioned resid norm 1.059745081597e-07 true resid norm 1.059745081589e-07 ||r(i)||/||b|| 6.897166424276e-06 7 KSP unpreconditioned resid norm 1.891469142575e-08 true resid norm 1.891469142698e-08 ||r(i)||/||b|| 1.231029772180e-06 8 KSP unpreconditioned resid norm 3.191570541883e-09 true resid norm 3.191570541947e-09 ||r(i)||/||b|| 2.077178140768e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: 
seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 12 **************************************** Simulation time = 0.4688 sec Time/time step = 5.2795 sec Average time/time step = 7.2416 sec U MAX = 8.634496512897778E-05 V MAX = 8.634496512897778E-05 W MAX = 8.489812655424283E-19 U MIN = -8.634496512897778E-05 V MIN = -8.634496512897778E-05 W MIN = -2.872238316062392E-03 U MAX = 8.634496512897778E-05 V MAX = 8.634496512897778E-05 W MAX = 2.872238316062392E-03 max(|divU|) = 1.653363196596731E-02 sum(divU*dV) = -2.462346780191219E-17 sum(|divU|*dV) = 8.755077706950073E-03 Convective cfl = 1.948754077645023E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.146368792554558E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.159136096728e-03 true resid norm 1.159136096728e-03 ||r(i)||/||b|| 7.018451665043e-02 1 KSP unpreconditioned resid norm 5.466604833622e-04 true resid norm 5.466604833622e-04 ||r(i)||/||b|| 3.309973859407e-02 2 KSP unpreconditioned resid norm 8.877709917005e-05 true resid norm 8.877709917005e-05 ||r(i)||/||b|| 5.375363438739e-03 3 KSP unpreconditioned resid norm 2.089399796449e-05 true resid norm 2.089399796449e-05 ||r(i)||/||b|| 1.265110414706e-03 4 KSP unpreconditioned resid norm 3.402106992486e-06 true resid norm 3.402106992486e-06 ||r(i)||/||b|| 
2.059941326429e-04
  5 KSP unpreconditioned resid norm 6.013276058776e-07 true resid norm 6.013276058795e-07 ||r(i)||/||b|| 3.640977749406e-05
  6 KSP unpreconditioned resid norm 1.059435670909e-07 true resid norm 1.059435670911e-07 ||r(i)||/||b|| 6.414775684667e-06
  7 KSP unpreconditioned resid norm 1.891796347385e-08 true resid norm 1.891796347347e-08 ||r(i)||/||b|| 1.145463527660e-06
  8 KSP unpreconditioned resid norm 3.189873620070e-09 true resid norm 3.189873619632e-09 ||r(i)||/||b|| 1.931436168727e-07
Linear solve converged due to CONVERGED_RTOL iterations 8
KSP Object: 1 MPI processes
  type: cg
  maximum iterations=10000
  tolerances: relative=1e-06, absolute=1e-50, divergence=10000
  left preconditioning
  using nonzero initial guess
  using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: gamg
    MG: type is MULTIPLICATIVE, levels=3 cycles=v
      Cycles per PCApply=1
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object: (mg_coarse_) 1 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_coarse_) 1 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 1
        Local solve info for each block is in the following KSP and PC objects:
        [0] number of local blocks = 1, first local block number = 0
        [0] local block number 0
        KSP Object: (mg_coarse_sub_) 1 MPI processes
          type: preonly
          maximum iterations=1, initial guess is zero
          tolerances: relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using NONE norm type for convergence test
        PC Object: (mg_coarse_sub_) 1 MPI processes
          type: lu
            LU: out-of-place factorization
            tolerance for zero pivot 2.22045e-14
            using diagonal shift on blocks to prevent zero pivot
            matrix ordering: nd
            factor fill ratio given 5, needed 2.26888
              Factored matrix follows:
                Matrix Object: 1 MPI processes
                  type: seqaij
                  rows=155, cols=155
                  package used to perform factorization: petsc
                  total: nonzeros=16851, allocated nonzeros=16851
                  total number of mallocs used during MatSetValues calls =0
                    using I-node routines: found 110 nodes, limit used is 5
          linear system matrix = precond matrix:
          Matrix Object: 1 MPI processes
            type: seqaij
            rows=155, cols=155
            total: nonzeros=7427, allocated nonzeros=7427
            total number of mallocs used during MatSetValues calls =0
              not using I-node routines
        - - - - - - - - - - - - - - - - - -
      linear system matrix = precond matrix:
      Matrix Object: 1 MPI processes
        type: seqaij
        rows=155, cols=155
        total: nonzeros=7427, allocated nonzeros=7427
        total number of mallocs used during MatSetValues calls =0
          not using I-node routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object: (mg_levels_1_) 1 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033
      maximum iterations=2
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object: (mg_levels_1_) 1 MPI processes
      type: jacobi
      linear system matrix = precond matrix:
      Matrix Object: 1 MPI processes
        type: seqaij
        rows=11665, cols=11665
        total: nonzeros=393037, allocated nonzeros=393037
        total number of mallocs used during MatSetValues calls =0
          not using I-node routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object: (mg_levels_2_) 1 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449
      maximum iterations=2
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      has attached null space
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object: (mg_levels_2_) 1 MPI processes
      type: jacobi
      linear system matrix = precond matrix:
      Matrix Object: 1 MPI processes
        type: seqaij
        rows=131072, cols=131072
        total: nonzeros=917504, allocated nonzeros=917504
        total number of mallocs used during MatSetValues calls =0
          not using I-node routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object: 1 MPI processes
    type: seqaij
    rows=131072, cols=131072
    total: nonzeros=917504, allocated nonzeros=917504
    total number of mallocs used during MatSetValues calls =0
      not using I-node routines

**************************************** TIME STEP = 13 ****************************************
Simulation time        = 0.5078 sec
Time/time step         = 5.2942 sec
Average time/time step = 7.0918 sec
U MAX =  9.896094562848124E-05   V MAX =  9.896094562848124E-05   W MAX =  9.909002805648082E-18
U MIN = -9.896094562848124E-05   V MIN = -9.896094562848124E-05   W MIN = -3.111591508121445E-03
U MAX =  9.896094562848124E-05   V MAX =  9.896094562848124E-05   W MAX =  3.111591508121445E-03
max(|divU|)    = 1.777500515817062E-02
sum(divU*dV)   = 1.234132652517441E-18
sum(|divU|*dV) = 9.497772642299201E-03
Convective cfl = 2.118088575602181E-02   Viscous cfl = 7.281777777777779E-01   Gravity cfl = 6.205004906490450E-01
DT = 3.906250000000000E-02   MAX DT ALLOWED = 9.139191413667278E-01
Iterations to convergence = 8
Convergence in 1 iterations
  0 KSP unpreconditioned resid norm 1.145439560085e-03 true resid norm 1.145439560085e-03 ||r(i)||/||b|| 6.489199472629e-02
  1 KSP unpreconditioned resid norm 5.471834130806e-04 true resid norm 5.471834130806e-04 ||r(i)||/||b|| 3.099929877864e-02
  2 KSP unpreconditioned resid norm 8.880585182682e-05 true resid norm 8.880585182682e-05 ||r(i)||/||b|| 5.031071973787e-03
  3 KSP unpreconditioned resid norm 2.090517530672e-05 true resid norm 2.090517530672e-05 ||r(i)||/||b|| 1.184330079935e-03
  4 KSP unpreconditioned resid norm 3.400429960731e-06 true resid norm 3.400429960732e-06 ||r(i)||/||b|| 1.926427991213e-04
  5 KSP unpreconditioned resid norm 6.022043913180e-07 true resid norm 6.022043913185e-07 ||r(i)||/||b|| 3.411637378990e-05
  6 KSP unpreconditioned resid norm 1.059256456909e-07 true resid norm 1.059256456905e-07 ||r(i)||/||b|| 6.000950797457e-06
  7 KSP unpreconditioned resid norm 1.892217526757e-08 true resid norm 1.892217526718e-08 ||r(i)||/||b|| 1.071988204735e-06
  8 KSP unpreconditioned resid norm 3.188547743643e-09 true resid norm 3.188547741446e-09 ||r(i)||/||b|| 1.806391453837e-07
Linear solve converged due to CONVERGED_RTOL iterations 8
[-ksp_view output for this and every following time step is identical to the KSP/PC description printed above]
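The solver description above corresponds to CG with an algebraic multigrid (GAMG) preconditioner. A sketch of the runtime options that typically produce this kind of monitoring and -ksp_view output is given below; these are not taken from the poster's command line, and the smoother options simply restate what the view above already reports:

    -ksp_type cg -ksp_initial_guess_nonzero \
    -ksp_monitor_true_residual -ksp_converged_reason -ksp_view \
    -pc_type gamg -mg_levels_ksp_type chebyshev -mg_levels_pc_type jacobi

GAMG determines the number of levels from its coarsening, so the three levels reported above follow from aggregating the 131072-row fine-grid operator down to 11665 and then 155 rows, rather than from an explicit option.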
[For time steps 14 through 24 the per-step output follows the same pattern as time step 13, and the -ksp_view block is repeated unchanged each time; the step-dependent quantities are collected below.]

TIME STEPS 14-24 (summary of the per-step output)

 Step  Sim. time  Time/step  max|U|=max|V|   max|W|          max(|divU|)     Convective cfl  KSP its  final ||r(i)||/||b||
  14   0.5469 s   5.2864 s   1.1210243E-04   3.3509447E-03   1.8997760E-02   2.2880957E-02      8     1.697909212779e-07
  15   0.5859 s   5.2800 s   1.2572320E-04   3.5902979E-03   2.0202368E-02   2.4587163E-02      7     9.522645508544e-07
  16   0.6250 s   4.9413 s   1.3978051E-04   3.8296511E-03   2.1389296E-02   2.6298957E-02      7     9.028239215219e-07
  17   0.6641 s   4.9385 s   1.5423479E-04   4.0690043E-03   2.2559007E-02   2.8015833E-02      7     8.587934174348e-07
  18   0.7031 s   4.9321 s   1.6904946E-04   4.3083575E-03   2.3711956E-02   2.9737321E-02      7     8.193256797143e-07
  19   0.7422 s   4.9272 s   1.8419068E-04   4.5477107E-03   2.4848593E-02   3.1462989E-02      7     7.837443197499e-07
  20   0.7812 s   4.9370 s   1.9962716E-04   4.7870638E-03   2.5969364E-02   3.3192436E-02      7     7.515052582182e-07
  21   0.8203 s   4.9311 s   2.1532994E-04   5.0264170E-03   2.7074704E-02   3.4925292E-02      7     7.221667954637e-07
  22   0.8594 s   4.9284 s   2.3127224E-04   5.2657704E-03   2.8165046E-02   3.6661215E-02      7     6.953676963341e-07
  23   0.8984 s   4.9308 s   2.4742925E-04   5.5051240E-03   2.9240814E-02   3.8399888E-02      7     6.708118520521e-07
  24   0.9375 s   4.9346 s   2.6377796E-04   5.7444776E-03   3.0302423E-02   4.0141015E-02      7     6.482533145433e-07

Quantities that are constant (or at round-off level) over these steps: DT = 3.906250000000000E-02, Viscous cfl = 7.281777777777779E-01, Gravity cfl = 6.205004906490450E-01; U MIN = -U MAX and V MIN = -V MAX at every step, |sum(divU*dV)| stays below 3.4E-17, and the W MAX value of the first velocity print stays below 5E-13. The average time/time step drops from 6.9628 s (step 14) to 6.1317 s (step 24), sum(|divU|*dV) grows from 1.024240047107154E-02 to 1.779233150950448E-02, and MAX DT ALLOWED decreases from 9.131988490632803E-01 to 9.059031763149156E-01. Each step also prints "Convergence in 1 iterations" and an "Iterations to convergence" count of 8 (steps 14-15) or 7 (steps 16-24). Every linear solve reports "Linear solve converged due to CONVERGED_RTOL", and each step's -ksp_view output again matches the KSP/PC description above (the log for step 24 is cut off partway through that block).
total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 25 **************************************** Simulation time = 0.9766 sec Time/time step = 4.9259 sec Average time/time step = 6.0835 sec U MAX = 2.802970539392997E-04 V MAX = 2.802970539392997E-04 W MAX = 7.118669076767622E-13 U MIN = -2.802970539392997E-04 V MIN = -2.802970539392997E-04 W MIN = -5.983831310938474E-03 U MAX = 2.802970539392997E-04 V MAX = 2.802970539392997E-04 W MAX = 5.983831310938474E-03 max(|divU|) = 3.135028298108410E-02 sum(divU*dV) = 2.828981006113311E-17 sum(|divU|*dV) = 1.855823305025077E-02 Convective cfl = 4.188432268042927E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.051680600073254E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.030060522589e-03 true resid norm 1.030060522589e-03 ||r(i)||/||b|| 3.390966630881e-02 1 KSP unpreconditioned resid norm 5.553212043042e-04 true resid norm 5.553212043042e-04 ||r(i)||/||b|| 1.828121388910e-02 2 KSP unpreconditioned resid norm 9.082756384795e-05 true resid norm 9.082756384795e-05 ||r(i)||/||b|| 2.990049918606e-03 3 KSP unpreconditioned resid norm 2.105184319773e-05 
true resid norm 2.105184319773e-05 ||r(i)||/||b|| 6.930281885051e-04 4 KSP unpreconditioned resid norm 3.388544492928e-06 true resid norm 3.388544492927e-06 ||r(i)||/||b|| 1.115511278298e-04 5 KSP unpreconditioned resid norm 6.174461021315e-07 true resid norm 6.174461021323e-07 ||r(i)||/||b|| 2.032636998296e-05 6 KSP unpreconditioned resid norm 1.068247480447e-07 true resid norm 1.068247480449e-07 ||r(i)||/||b|| 3.516678370141e-06 7 KSP unpreconditioned resid norm 1.906085595275e-08 true resid norm 1.906085595211e-08 ||r(i)||/||b|| 6.274847455291e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 
------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 26 **************************************** Simulation time = 1.0156 sec Time/time step = 4.9355 sec Average time/time step = 6.0393 sec U MAX = 2.969667062277622E-04 V MAX = 2.969667062277622E-04 W MAX = 1.041779941880119E-12 U MIN = -2.969667062277622E-04 V MIN = -2.969667062277622E-04 W MIN = -6.226019377274523E-03 U MAX = 2.969667062277622E-04 V MAX = 2.969667062277622E-04 W MAX = 6.226019377274523E-03 max(|divU|) = 3.238479376966569E-02 sum(divU*dV) = -6.519427921141458E-17 sum(|divU|*dV) = 1.932639723845376E-02 Convective cfl = 4.364769785427231E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.044248168636828E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.023395463874e-03 true resid norm 1.023395463874e-03 ||r(i)||/||b|| 3.261613207602e-02 1 KSP unpreconditioned resid norm 5.561278383905e-04 true resid norm 5.561278383905e-04 ||r(i)||/||b|| 1.772407604723e-02 2 KSP unpreconditioned resid norm 9.118747527464e-05 true resid norm 9.118747527464e-05 ||r(i)||/||b|| 2.906191049526e-03 3 KSP unpreconditioned resid norm 2.107273617243e-05 true resid norm 2.107273617243e-05 ||r(i)||/||b|| 6.715987811802e-04 4 KSP unpreconditioned resid norm 3.385312807046e-06 true resid norm 3.385312807042e-06 ||r(i)||/||b|| 1.078916347891e-04 5 KSP unpreconditioned resid norm 6.184362436807e-07 true resid norm 6.184362436783e-07 ||r(i)||/||b|| 1.970987650077e-05 6 KSP unpreconditioned resid norm 1.069966168774e-07 true resid norm 1.069966168782e-07 ||r(i)||/||b|| 3.410036404283e-06 7 KSP unpreconditioned resid norm 1.908732989312e-08 true resid norm 1.908732989601e-08 ||r(i)||/||b|| 6.083228769749e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local 
solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 27 **************************************** Simulation time = 1.0547 sec Time/time step = 4.9416 sec Average time/time step = 5.9987 sec U MAX = 3.137685038647847E-04 V MAX = 3.137685038647847E-04 W MAX = 1.490637308130234E-12 U MIN = -3.137685038647847E-04 V MIN = -3.137685038647847E-04 W MIN = -6.468958239699217E-03 U MAX = 3.137685038647847E-04 V MAX = 3.137685038647847E-04 
W MAX = 6.468958239699217E-03 max(|divU|) = 3.340634722407749E-02 sum(divU*dV) = 1.986223127301173E-17 sum(|divU|*dV) = 2.009669468666401E-02 Convective cfl = 4.541756958354424E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.036791757405083E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.017065243907e-03 true resid norm 1.017065243907e-03 ||r(i)||/||b|| 3.142006095747e-02 1 KSP unpreconditioned resid norm 5.569540235575e-04 true resid norm 5.569540235575e-04 ||r(i)||/||b|| 1.720590638164e-02 2 KSP unpreconditioned resid norm 9.159428781950e-05 true resid norm 9.159428781949e-05 ||r(i)||/||b|| 2.829610119789e-03 3 KSP unpreconditioned resid norm 2.109988559131e-05 true resid norm 2.109988559131e-05 ||r(i)||/||b|| 6.518359519669e-04 4 KSP unpreconditioned resid norm 3.381216180592e-06 true resid norm 3.381216180594e-06 ||r(i)||/||b|| 1.044554605922e-04 5 KSP unpreconditioned resid norm 6.190897167649e-07 true resid norm 6.190897167645e-07 ||r(i)||/||b|| 1.912545606628e-05 6 KSP unpreconditioned resid norm 1.071592799258e-07 true resid norm 1.071592799256e-07 ||r(i)||/||b|| 3.310457345377e-06 7 KSP unpreconditioned resid norm 1.911726630815e-08 true resid norm 1.911726630583e-08 ||r(i)||/||b|| 5.905871587568e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not 
using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 28 **************************************** Simulation time = 1.0938 sec Time/time step = 4.9359 sec Average time/time step = 5.9607 sec U MAX = 3.306853077415233E-04 V MAX = 3.306853077415233E-04 W MAX = 2.091014663450294E-12 U MIN = -3.306853077415233E-04 V MIN = -3.306853077415233E-04 W MIN = -6.712267098442212E-03 U MAX = 3.306853077415233E-04 V MAX = 3.306853077415233E-04 W MAX = 6.712267098442212E-03 max(|divU|) = 3.441532656931547E-02 sum(divU*dV) = -9.662507560945147E-17 sum(|divU|*dV) = 2.086896801303882E-02 Convective cfl = 4.719128136912165E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.029322602551064E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.011051571776e-03 true resid norm 1.011051571776e-03 ||r(i)||/||b|| 3.031112740975e-02 1 KSP unpreconditioned resid norm 5.577999890334e-04 true resid norm 5.577999890334e-04 ||r(i)||/||b|| 1.672273404120e-02 2 KSP unpreconditioned resid norm 9.205514482216e-05 true resid norm 9.205514482217e-05 ||r(i)||/||b|| 2.759795149248e-03 3 KSP unpreconditioned resid norm 2.113634676424e-05 true resid norm 2.113634676424e-05 ||r(i)||/||b|| 6.336635218537e-04 4 KSP unpreconditioned resid norm 3.376357337148e-06 true resid norm 3.376357337147e-06 ||r(i)||/||b|| 1.012225293783e-04 5 KSP unpreconditioned resid norm 6.192812171961e-07 true resid norm 6.192812171938e-07 ||r(i)||/||b|| 1.856592917792e-05 6 KSP unpreconditioned resid norm 1.072861280251e-07 true resid norm 1.072861280260e-07 ||r(i)||/||b|| 3.216417032202e-06 7 KSP unpreconditioned resid norm 1.914838022261e-08 true resid norm 1.914838022181e-08 ||r(i)||/||b|| 5.740646756266e-07 Linear solve converged due to 
CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during 
MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 29 **************************************** Simulation time = 1.1328 sec Time/time step = 4.9249 sec Average time/time step = 5.9250 sec U MAX = 3.477011451042520E-04 V MAX = 3.477011451042520E-04 W MAX = 2.881901841773920E-12 U MIN = -3.477011451042520E-04 V MIN = -3.477011451042520E-04 W MIN = -6.955954894808746E-03 U MAX = 3.477011451042520E-04 V MAX = 3.477011451042520E-04 W MAX = 6.955954894808746E-03 max(|divU|) = 3.541210624580458E-02 sum(divU*dV) = 3.454128104623594E-17 sum(|divU|*dV) = 2.164316982199105E-02 Convective cfl = 4.896868598411040E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.021841360604639E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.005337398416e-03 true resid norm 1.005337398416e-03 ||r(i)||/||b|| 2.928039669666e-02 1 KSP unpreconditioned resid norm 5.586660114961e-04 true resid norm 5.586660114961e-04 ||r(i)||/||b|| 1.627111700342e-02 2 KSP unpreconditioned resid norm 9.257879095466e-05 true resid norm 9.257879095465e-05 ||r(i)||/||b|| 2.696352218787e-03 3 KSP unpreconditioned resid norm 2.118646758439e-05 true resid norm 2.118646758439e-05 ||r(i)||/||b|| 6.170547086473e-04 4 KSP unpreconditioned resid norm 3.371181273599e-06 true resid norm 3.371181273596e-06 ||r(i)||/||b|| 9.818547005487e-05 5 KSP unpreconditioned resid norm 6.189056124530e-07 true resid norm 6.189056124541e-07 ||r(i)||/||b|| 1.802559208380e-05 6 KSP unpreconditioned resid norm 1.073375933067e-07 true resid norm 1.073375933085e-07 ||r(i)||/||b|| 3.126201529446e-06 7 KSP unpreconditioned resid norm 1.917441834395e-08 true resid norm 1.917441834284e-08 ||r(i)||/||b|| 5.584538846266e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, 
needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 30 **************************************** Simulation time = 1.1719 sec Time/time step = 4.9409 sec Average time/time step = 5.8922 sec U MAX = 3.648011082886927E-04 V MAX = 3.648011082886927E-04 W MAX = 3.909498888809396E-12 U MIN = -3.648011082886927E-04 V MIN = -3.648011082886927E-04 W MIN = -7.200030149701984E-03 U MAX = 3.648011082886927E-04 V MAX = 3.648011082886927E-04 W MAX = 7.200030149701984E-03 max(|divU|) = 3.639705168563705E-02 sum(divU*dV) = 1.507642367696106E-17 sum(|divU|*dV) = 2.241929289046668E-02 Convective cfl = 5.074964714418797E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.014348641117621E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.999070192231e-04 true resid norm 9.999070192231e-04 ||r(i)||/||b|| 2.832010531776e-02 1 KSP unpreconditioned resid norm 5.595524902876e-04 true resid norm 5.595524902876e-04 
||r(i)||/||b|| 1.584805902060e-02 2 KSP unpreconditioned resid norm 9.317595589083e-05 true resid norm 9.317595589083e-05 ||r(i)||/||b|| 2.638998260020e-03 3 KSP unpreconditioned resid norm 2.125644558449e-05 true resid norm 2.125644558449e-05 ||r(i)||/||b|| 6.020407558511e-04 4 KSP unpreconditioned resid norm 3.366807458257e-06 true resid norm 3.366807458254e-06 ||r(i)||/||b|| 9.535720818967e-05 5 KSP unpreconditioned resid norm 6.179712823500e-07 true resid norm 6.179712823509e-07 ||r(i)||/||b|| 1.750263920852e-05 6 KSP unpreconditioned resid norm 1.072668127310e-07 true resid norm 1.072668127304e-07 ||r(i)||/||b|| 3.038089917587e-06 7 KSP unpreconditioned resid norm 1.918241746540e-08 true resid norm 1.918241746529e-08 ||r(i)||/||b|| 5.432985991920e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated 
nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 31 **************************************** Simulation time = 1.2109 sec Time/time step = 4.9422 sec Average time/time step = 5.8615 sec U MAX = 3.819712638762183E-04 V MAX = 3.819712638762183E-04 W MAX = 5.228048290315647E-12 U MIN = -3.819712638762183E-04 V MIN = -3.819712638762183E-04 W MIN = -7.444500960418116E-03 U MAX = 3.819712638762183E-04 V MAX = 3.819712638762183E-04 W MAX = 7.444500960418116E-03 max(|divU|) = 3.737051905435520E-02 sum(divU*dV) = 3.823246617899632E-17 sum(|divU|*dV) = 2.319732986848551E-02 Convective cfl = 5.253403832429154E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 9.006845011728921E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.947462645860e-04 true resid norm 9.947462645860e-04 ||r(i)||/||b|| 2.742348500190e-02 1 KSP unpreconditioned resid norm 5.604599968758e-04 true resid norm 5.604599968758e-04 ||r(i)||/||b|| 1.545094147690e-02 2 KSP unpreconditioned resid norm 9.385983767545e-05 true resid norm 9.385983767545e-05 ||r(i)||/||b|| 2.587558196908e-03 3 KSP unpreconditioned resid norm 2.135513624454e-05 true resid norm 2.135513624454e-05 ||r(i)||/||b|| 5.887252652911e-04 4 KSP unpreconditioned resid norm 3.365634334577e-06 true resid norm 3.365634334576e-06 ||r(i)||/||b|| 9.278488995838e-05 5 KSP unpreconditioned resid norm 6.168289689638e-07 true resid norm 6.168289689632e-07 ||r(i)||/||b|| 1.700493943160e-05 6 KSP unpreconditioned resid norm 1.070545756011e-07 true resid norm 1.070545756046e-07 ||r(i)||/||b|| 2.951314976488e-06 7 KSP unpreconditioned resid norm 1.915262615660e-08 true resid norm 1.915262615561e-08 ||r(i)||/||b|| 5.280057586784e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: 
relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 32 **************************************** Simulation time = 1.2500 sec Time/time step = 4.9284 sec Average time/time step = 5.8324 sec U MAX = 3.991985725649520E-04 V MAX 
= 3.991985725649520E-04 W MAX = 6.900699263091350E-12 U MIN = -3.991985725649520E-04 V MIN = -3.991985725649520E-04 W MIN = -7.689374996953326E-03 U MAX = 3.991985725649520E-04 V MAX = 3.991985725649520E-04 W MAX = 7.689374996953326E-03 max(|divU|) = 3.833285500851496E-02 sum(divU*dV) = -3.832772970957615E-17 sum(|divU|*dV) = 2.397727321507414E-02 Convective cfl = 5.432174170933268E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.999331002654465E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.898426613669e-04 true resid norm 9.898426613669e-04 ||r(i)||/||b|| 2.658461796242e-02 1 KSP unpreconditioned resid norm 5.613892905236e-04 true resid norm 5.613892905236e-04 ||r(i)||/||b|| 1.507746675230e-02 2 KSP unpreconditioned resid norm 9.464672133814e-05 true resid norm 9.464672133815e-05 ||r(i)||/||b|| 2.541966543143e-03 3 KSP unpreconditioned resid norm 2.149522795860e-05 true resid norm 2.149522795859e-05 ||r(i)||/||b|| 5.773063190723e-04 4 KSP unpreconditioned resid norm 3.372418915567e-06 true resid norm 3.372418915564e-06 ||r(i)||/||b|| 9.057446398170e-05 5 KSP unpreconditioned resid norm 6.166967184143e-07 true resid norm 6.166967184150e-07 ||r(i)||/||b|| 1.656288145341e-05 6 KSP unpreconditioned resid norm 1.068366215632e-07 true resid norm 1.068366215608e-07 ||r(i)||/||b|| 2.869355786976e-06 7 KSP unpreconditioned resid norm 1.907817749690e-08 true resid norm 1.907817750218e-08 ||r(i)||/||b|| 5.123905849990e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - 
linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 33 **************************************** Simulation time = 1.2891 sec Time/time step = 4.9415 sec Average time/time step = 5.8054 sec U MAX = 4.164708188215090E-04 V MAX = 4.164708188215090E-04 W MAX = 9.000399662903064E-12 U MIN = -4.164708188215090E-04 V MIN = -4.164708188215090E-04 W MIN = -7.934659497922977E-03 U MAX = 4.164708188215090E-04 V MAX = 4.164708188215090E-04 W MAX = 7.934659497922977E-03 max(|divU|) = 3.928439652168028E-02 sum(divU*dV) = -5.266414533402403E-17 sum(|divU|*dV) = 2.475911515855458E-02 Convective cfl = 5.611264726762238E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.991807110650943E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.851854712036e-04 true resid norm 9.851854712036e-04 ||r(i)||/||b|| 2.579831449281e-02 1 KSP unpreconditioned resid norm 5.623413102085e-04 true resid norm 5.623413102085e-04 ||r(i)||/||b|| 1.472561096068e-02 2 KSP unpreconditioned resid norm 9.555677903554e-05 true resid norm 9.555677903554e-05 ||r(i)||/||b|| 2.502273845419e-03 3 KSP unpreconditioned resid norm 2.169494954217e-05 true resid norm 2.169494954217e-05 ||r(i)||/||b|| 5.681094043244e-04 4 KSP unpreconditioned resid norm 3.396150651318e-06 true resid norm 3.396150651318e-06 ||r(i)||/||b|| 8.893245498294e-05 5 KSP unpreconditioned resid norm 6.207853792369e-07 true resid norm 6.207853792359e-07 ||r(i)||/||b|| 1.625604204912e-05 6 KSP unpreconditioned resid norm 1.072872335213e-07 true resid 
norm 1.072872335227e-07 ||r(i)||/||b|| 2.809450476469e-06 7 KSP unpreconditioned resid norm 1.907040101434e-08 true resid norm 1.907040101209e-08 ||r(i)||/||b|| 4.993823165225e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI 
processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 34 **************************************** Simulation time = 1.3281 sec Time/time step = 4.9264 sec Average time/time step = 5.7795 sec U MAX = 4.337765483258318E-04 V MAX = 4.337765483258318E-04 W MAX = 1.161081079721469E-11 U MIN = -4.337765483258318E-04 V MIN = -4.337765483258318E-04 W MIN = -8.180361266170116E-03 U MAX = 4.337765483258318E-04 V MAX = 4.337765483258318E-04 W MAX = 8.180361266170116E-03 max(|divU|) = 4.022547080844885E-02 sum(divU*dV) = 1.233544127631819E-17 sum(|divU|*dV) = 2.554284768086973E-02 Convective cfl = 5.790665192205939E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.984273802559585E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.807655573968e-04 true resid norm 9.807655573968e-04 ||r(i)||/||b|| 2.506000711107e-02 1 KSP unpreconditioned resid norm 5.633171668328e-04 true resid norm 5.633171668328e-04 ||r(i)||/||b|| 1.439358478706e-02 2 KSP unpreconditioned resid norm 9.661510804421e-05 true resid norm 9.661510804421e-05 ||r(i)||/||b|| 2.468658566121e-03 3 KSP unpreconditioned resid norm 2.198053894015e-05 true resid norm 2.198053894015e-05 ||r(i)||/||b|| 5.616352022061e-04 4 KSP unpreconditioned resid norm 3.453141439878e-06 true resid norm 3.453141439878e-06 ||r(i)||/||b|| 8.823285889908e-05 5 KSP unpreconditioned resid norm 6.364463899108e-07 true resid norm 6.364463899125e-07 ||r(i)||/||b|| 1.626214433891e-05 6 KSP unpreconditioned resid norm 1.105673245591e-07 true resid norm 1.105673245571e-07 ||r(i)||/||b|| 2.825158284520e-06 7 KSP unpreconditioned resid norm 1.972920790450e-08 true resid norm 1.972920790619e-08 ||r(i)||/||b|| 5.041103724495e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC 
Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 35 **************************************** Simulation time = 1.3672 sec Time/time step = 4.9401 sec Average time/time step = 5.7555 sec U MAX = 4.511050108339042E-04 V MAX = 4.511050108339042E-04 W MAX = 1.482724021178660E-11 U MIN = -4.511050108339042E-04 V MIN = -4.511050108339042E-04 W MIN = -8.426486664096112E-03 U MAX = 4.511050108339042E-04 V MAX = 4.511050108339042E-04 W MAX = 8.426486664096112E-03 max(|divU|) = 4.115639534074935E-02 sum(divU*dV) = -1.145860583228735E-17 sum(|divU|*dV) = 2.632846251912788E-02 Convective cfl = 5.970365878888909E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.976731518557743E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 
KSP unpreconditioned resid norm 9.765751078588e-04 true resid norm 9.765751078588e-04 ||r(i)||/||b|| 2.436565860028e-02 1 KSP unpreconditioned resid norm 5.643181597471e-04 true resid norm 5.643181597471e-04 ||r(i)||/||b|| 1.407980145273e-02 2 KSP unpreconditioned resid norm 9.785307283726e-05 true resid norm 9.785307283727e-05 ||r(i)||/||b|| 2.441445155169e-03 3 KSP unpreconditioned resid norm 2.238976860976e-05 true resid norm 2.238976860975e-05 ||r(i)||/||b|| 5.586272409508e-04 4 KSP unpreconditioned resid norm 3.571652839321e-06 true resid norm 3.571652839320e-06 ||r(i)||/||b|| 8.911313940040e-05 5 KSP unpreconditioned resid norm 6.783193984635e-07 true resid norm 6.783193984689e-07 ||r(i)||/||b|| 1.692414515999e-05 6 KSP unpreconditioned resid norm 1.217975177427e-07 true resid norm 1.217975177415e-07 ||r(i)||/||b|| 3.038861744242e-06 7 KSP unpreconditioned resid norm 2.276836536065e-08 true resid norm 2.276836535724e-08 ||r(i)||/||b|| 5.680732723130e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for 
convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 36 **************************************** Simulation time = 1.4062 sec Time/time step = 4.9283 sec Average time/time step = 5.7326 sec U MAX = 4.684461064760571E-04 V MAX = 4.684461064760571E-04 W MAX = 1.875758734736681E-11 U MIN = -4.684461064760571E-04 V MIN = -4.684461064760571E-04 W MIN = -8.673041608687577E-03 U MAX = 4.684461064760571E-04 V MAX = 4.684461064760571E-04 W MAX = 8.673041608687577E-03 max(|divU|) = 4.207747792094865E-02 sum(divU*dV) = 4.599775939863605E-17 sum(|divU|*dV) = 2.711595117503776E-02 Convective cfl = 6.150357645849402E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.969180675226200E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.726073108163e-04 true resid norm 9.726073108163e-04 ||r(i)||/||b|| 2.371168358058e-02 1 KSP unpreconditioned resid norm 5.653458120211e-04 true resid norm 5.653458120211e-04 ||r(i)||/||b|| 1.378285034379e-02 2 KSP unpreconditioned resid norm 9.931002100599e-05 true resid norm 9.931002100599e-05 ||r(i)||/||b|| 2.421129029452e-03 3 KSP unpreconditioned resid norm 2.297687015357e-05 true resid norm 2.297687015357e-05 ||r(i)||/||b|| 5.601646920547e-04 4 KSP unpreconditioned resid norm 3.797658632871e-06 true resid norm 3.797658632874e-06 ||r(i)||/||b|| 9.258503287849e-05 5 KSP unpreconditioned resid norm 7.702859013373e-07 true resid norm 7.702859013399e-07 ||r(i)||/||b|| 1.877918801970e-05 6 KSP unpreconditioned resid norm 1.486225606933e-07 true resid norm 1.486225606894e-07 ||r(i)||/||b|| 3.623344275549e-06 7 KSP unpreconditioned resid norm 3.002239710354e-08 true resid norm 3.002239710279e-08 ||r(i)||/||b|| 7.319311427287e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin 
computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines 
**************************************** TIME STEP = 37 **************************************** Simulation time = 1.4453 sec Time/time step = 4.9175 sec Average time/time step = 5.7105 sec U MAX = 4.857903344348851E-04 V MAX = 4.857903344348851E-04 W MAX = 2.352329683379803E-11 U MIN = -4.857903344348851E-04 V MIN = -4.857903344348851E-04 W MIN = -8.920031566154332E-03 U MAX = 4.857903344348851E-04 V MAX = 4.857903344348851E-04 W MAX = 8.920031566154332E-03 max(|divU|) = 4.298901676450798E-02 sum(divU*dV) = 1.778694787993800E-17 sum(|divU|*dV) = 2.790530492402937E-02 Convective cfl = 6.330631830415426E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.961621668491502E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.688561023665e-04 true resid norm 9.688561023665e-04 ||r(i)||/||b|| 2.309488393758e-02 1 KSP unpreconditioned resid norm 5.664018426655e-04 true resid norm 5.664018426655e-04 ||r(i)||/||b|| 1.350147332142e-02 2 KSP unpreconditioned resid norm 1.010354191468e-04 true resid norm 1.010354191468e-04 ||r(i)||/||b|| 2.408408506775e-03 3 KSP unpreconditioned resid norm 2.381918152940e-05 true resid norm 2.381918152940e-05 ||r(i)||/||b|| 5.677842473883e-04 4 KSP unpreconditioned resid norm 4.199762467923e-06 true resid norm 4.199762467922e-06 ||r(i)||/||b|| 1.001108694317e-04 5 KSP unpreconditioned resid norm 9.414129391480e-07 true resid norm 9.414129391472e-07 ||r(i)||/||b|| 2.244071386229e-05 6 KSP unpreconditioned resid norm 1.950841469631e-07 true resid norm 1.950841469624e-07 ||r(i)||/||b|| 4.650273369962e-06 7 KSP unpreconditioned resid norm 3.989489021571e-08 true resid norm 3.989489021477e-08 ||r(i)||/||b|| 9.509852463769e-07 Linear solve converged due to CONVERGED_RTOL iterations 7 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix 
Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 38 **************************************** Simulation time = 1.4844 sec Time/time step = 4.9318 sec Average time/time step = 5.6900 sec U MAX = 5.031287439701298E-04 V MAX = 5.031287439701298E-04 W MAX = 2.926031410161659E-11 U MIN = -5.031287439701298E-04 V MIN = -5.031287439701298E-04 W MIN = -9.167461546052079E-03 U MAX = 5.031287439701298E-04 V MAX = 5.031287439701298E-04 W MAX = 9.167461546052079E-03 max(|divU|) = 4.389130055351834E-02 sum(divU*dV) = -1.294684815588316E-16 sum(|divU|*dV) = 2.869651481910813E-02 Convective cfl = 6.511180181755097E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.954054876448685E-01 Iterations to convergence = 7 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.653160648367e-04 true resid norm 9.653160648367e-04 ||r(i)||/||b|| 2.251239751711e-02 1 KSP unpreconditioned resid norm 5.674880156214e-04 true resid norm 5.674880156214e-04 ||r(i)||/||b|| 1.323454178298e-02 2 KSP unpreconditioned resid norm 1.030914649249e-04 true resid norm 1.030914649249e-04 ||r(i)||/||b|| 2.404223987926e-03 3 KSP unpreconditioned resid norm 2.502572173112e-05 true resid norm 2.502572173112e-05 ||r(i)||/||b|| 5.836316376429e-04 4 KSP unpreconditioned resid norm 4.870445401901e-06 true resid norm 4.870445401897e-06 
||r(i)||/||b|| 1.135849769489e-04 5 KSP unpreconditioned resid norm 1.215482808089e-06 true resid norm 1.215482808082e-06 ||r(i)||/||b|| 2.834660392334e-05 6 KSP unpreconditioned resid norm 2.551625330565e-07 true resid norm 2.551625330483e-07 ||r(i)||/||b|| 5.950714573914e-06 7 KSP unpreconditioned resid norm 4.861220178024e-08 true resid norm 4.861220177235e-08 ||r(i)||/||b|| 1.133698329849e-06 8 KSP unpreconditioned resid norm 8.144841115238e-09 true resid norm 8.144841112622e-09 ||r(i)||/||b|| 1.899480465728e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: 
(mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 39 **************************************** Simulation time = 1.5234 sec Time/time step = 5.2763 sec Average time/time step = 5.6794 sec U MAX = 5.204528884791881E-04 V MAX = 5.204528884791881E-04 W MAX = 3.612003794311162E-11 U MIN = -5.204528884791881E-04 V MIN = -5.204528884791881E-04 W MIN = -9.415336094758673E-03 U MAX = 5.204528884791881E-04 V MAX = 5.204528884791881E-04 W MAX = 9.415336094758673E-03 max(|divU|) = 4.478460844464908E-02 sum(divU*dV) = 1.137276829931661E-16 sum(|divU|*dV) = 2.948957168820364E-02 Convective cfl = 6.691994797898912E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.946480662031222E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.619817564018e-04 true resid norm 9.619817564018e-04 ||r(i)||/||b|| 2.196164143736e-02 1 KSP unpreconditioned resid norm 5.686026262463e-04 true resid norm 5.686026262463e-04 ||r(i)||/||b|| 1.298096030913e-02 2 KSP unpreconditioned resid norm 1.055538212906e-04 true resid norm 1.055538212906e-04 ||r(i)||/||b|| 2.409749623732e-03 3 KSP unpreconditioned resid norm 2.674695797372e-05 true resid norm 2.674695797372e-05 ||r(i)||/||b|| 6.106218716207e-04 4 KSP unpreconditioned resid norm 5.924085158408e-06 true resid norm 5.924085158412e-06 ||r(i)||/||b|| 1.352443881889e-04 5 KSP unpreconditioned resid norm 1.602272846882e-06 true resid norm 1.602272846877e-06 ||r(i)||/||b|| 3.657921942258e-05 6 KSP unpreconditioned resid norm 3.168298436773e-07 true resid norm 3.168298436703e-07 ||r(i)||/||b|| 7.233092911626e-06 7 KSP unpreconditioned resid norm 5.586014529979e-08 true resid norm 5.586014529694e-08 ||r(i)||/||b|| 1.275263770322e-06 8 KSP unpreconditioned resid norm 9.407129276743e-09 true resid norm 9.407129273877e-09 ||r(i)||/||b|| 2.147608296030e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: 
(mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 40 **************************************** Simulation time = 1.5625 sec Time/time step = 5.2801 sec Average time/time step = 5.6694 sec U MAX = 5.377547834995266E-04 V MAX = 5.377547834995266E-04 W MAX = 4.427026464318359E-11 U MIN = -5.377547834995266E-04 V MIN = -5.377547834995266E-04 W MIN 
= -9.663659288215718E-03 U MAX = 5.377547834995266E-04 V MAX = 5.377547834995266E-04 W MAX = 9.663659288215718E-03 max(|divU|) = 4.566921003955636E-02 sum(divU*dV) = 6.161473781623374E-17 sum(|divU|*dV) = 3.028446612649909E-02 Convective cfl = 6.873068067337454E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.938899375482311E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.588526158093e-04 true resid norm 9.588526158093e-04 ||r(i)||/||b|| 2.144038799229e-02 1 KSP unpreconditioned resid norm 5.697560491174e-04 true resid norm 5.697560491174e-04 ||r(i)||/||b|| 1.274000879032e-02 2 KSP unpreconditioned resid norm 1.085274114053e-04 true resid norm 1.085274114053e-04 ||r(i)||/||b|| 2.426723116738e-03 3 KSP unpreconditioned resid norm 2.919112513317e-05 true resid norm 2.919112513317e-05 ||r(i)||/||b|| 6.527270598919e-04 4 KSP unpreconditioned resid norm 7.501781719811e-06 true resid norm 7.501781719801e-06 ||r(i)||/||b|| 1.677433090906e-04 5 KSP unpreconditioned resid norm 2.097754413791e-06 true resid norm 2.097754413779e-06 ||r(i)||/||b|| 4.690675897672e-05 6 KSP unpreconditioned resid norm 3.750084920979e-07 true resid norm 3.750084920952e-07 ||r(i)||/||b|| 8.385363337763e-06 7 KSP unpreconditioned resid norm 6.407190922772e-08 true resid norm 6.407190922495e-08 ||r(i)||/||b|| 1.432677525764e-06 8 KSP unpreconditioned resid norm 1.109760823695e-08 true resid norm 1.109760823122e-08 ||r(i)||/||b|| 2.481476530813e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - 
linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 41 **************************************** Simulation time = 1.6016 sec Time/time step = 5.2897 sec Average time/time step = 5.6602 sec U MAX = 5.550268693159551E-04 V MAX = 5.550268693159551E-04 W MAX = 5.389611832243758E-11 U MIN = -5.550268693159551E-04 V MIN = -5.550268693159551E-04 W MIN = -9.912434723936841E-03 U MAX = 5.550268693159551E-04 V MAX = 5.550268693159551E-04 W MAX = 9.912434723936841E-03 max(|divU|) = 4.654536534164120E-02 sum(divU*dV) = 7.663481308386309E-17 sum(|divU|*dV) = 3.108213844533452E-02 Convective cfl = 7.054392616044000E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.931311356592495E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.559282785354e-04 true resid norm 9.559282785354e-04 ||r(i)||/||b|| 2.094662207925e-02 1 KSP unpreconditioned resid norm 5.709496722979e-04 true resid norm 5.709496722979e-04 ||r(i)||/||b|| 1.251084132611e-02 2 KSP unpreconditioned resid norm 1.121376956818e-04 true resid norm 1.121376956818e-04 ||r(i)||/||b|| 2.457198918608e-03 3 KSP unpreconditioned resid norm 3.263680051620e-05 true resid norm 3.263680051620e-05 ||r(i)||/||b|| 7.151485541742e-04 4 KSP unpreconditioned resid norm 9.787969787122e-06 true resid norm 9.787969787117e-06 ||r(i)||/||b|| 2.144772873212e-04 5 KSP unpreconditioned resid norm 2.697948086712e-06 true resid norm 2.697948086711e-06 ||r(i)||/||b|| 5.911834625121e-05 6 KSP unpreconditioned resid norm 4.388419847123e-07 true resid norm 
4.388419847060e-07 ||r(i)||/||b|| 9.616053225489e-06 7 KSP unpreconditioned resid norm 7.533046586121e-08 true resid norm 7.533046585203e-08 ||r(i)||/||b|| 1.650666514097e-06 8 KSP unpreconditioned resid norm 1.327386810018e-08 true resid norm 1.327386809678e-08 ||r(i)||/||b|| 2.908614639783e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null 
space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 42 **************************************** Simulation time = 1.6406 sec Time/time step = 5.2955 sec Average time/time step = 5.6515 sec U MAX = 5.722619783523600E-04 V MAX = 5.722619783523600E-04 W MAX = 6.520096218797720E-11 U MIN = -5.722619783523600E-04 V MIN = -5.722619783523600E-04 W MIN = -1.016166551240614E-02 U MAX = 5.722619783523600E-04 V MAX = 5.722619783523600E-04 W MAX = 1.016166551240614E-02 max(|divU|) = 4.741332470628808E-02 sum(divU*dV) = -6.679930840361994E-17 sum(|divU|*dV) = 3.188275111165742E-02 Convective cfl = 7.235961260230947E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.923716936691350E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.532094014231e-04 true resid norm 9.532094014231e-04 ||r(i)||/||b|| 2.047853444210e-02 1 KSP unpreconditioned resid norm 5.721881245947e-04 true resid norm 5.721881245947e-04 ||r(i)||/||b|| 1.229275980638e-02 2 KSP unpreconditioned resid norm 1.165466916946e-04 true resid norm 1.165466916946e-04 ||r(i)||/||b|| 2.503862673215e-03 3 KSP unpreconditioned resid norm 3.745944846755e-05 true resid norm 3.745944846755e-05 ||r(i)||/||b|| 8.047702891721e-04 4 KSP unpreconditioned resid norm 1.305337758017e-05 true resid norm 1.305337758017e-05 ||r(i)||/||b|| 2.804358013697e-04 5 KSP unpreconditioned resid norm 3.436611414495e-06 true resid norm 3.436611414503e-06 ||r(i)||/||b|| 7.383137966426e-05 6 KSP unpreconditioned resid norm 5.329476257960e-07 true resid norm 5.329476257973e-07 ||r(i)||/||b|| 1.144972583614e-05 7 KSP unpreconditioned resid norm 9.361074626380e-08 true resid norm 9.361074626318e-08 ||r(i)||/||b|| 2.011112027052e-06 8 KSP unpreconditioned resid norm 1.654045230649e-08 true resid norm 1.654045231200e-08 ||r(i)||/||b|| 3.553513234904e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP 
Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 43 **************************************** Simulation time = 1.6797 sec Time/time step = 5.2843 sec Average time/time step = 5.6430 sec U MAX = 5.894533070614432E-04 V MAX = 5.894533070614432E-04 W MAX = 7.840728546853753E-11 U MIN = -5.894533070614432E-04 V MIN = -5.894533070614432E-04 W MIN = -1.041135426811980E-02 U MAX = 5.894533070614432E-04 V MAX = 5.894533070614432E-04 W MAX = 1.041135426811980E-02 max(|divU|) = 4.827332879033855E-02 sum(divU*dV) = -1.944361779260113E-17 sum(|divU|*dV) = 3.268620746643017E-02 Convective 
cfl = 7.417766964635319E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.916116440402518E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.506987107380e-04 true resid norm 9.506987107380e-04 ||r(i)||/||b|| 2.003452088410e-02 1 KSP unpreconditioned resid norm 5.734767148944e-04 true resid norm 5.734767148944e-04 ||r(i)||/||b|| 1.208514442202e-02 2 KSP unpreconditioned resid norm 1.219614101567e-04 true resid norm 1.219614101567e-04 ||r(i)||/||b|| 2.570150134044e-03 3 KSP unpreconditioned resid norm 4.417104052445e-05 true resid norm 4.417104052446e-05 ||r(i)||/||b|| 9.308371031371e-04 4 KSP unpreconditioned resid norm 1.774091199672e-05 true resid norm 1.774091199672e-05 ||r(i)||/||b|| 3.738625790554e-04 5 KSP unpreconditioned resid norm 4.454157853553e-06 true resid norm 4.454157853559e-06 ||r(i)||/||b|| 9.386456248470e-05 6 KSP unpreconditioned resid norm 7.200394014205e-07 true resid norm 7.200394014228e-07 ||r(i)||/||b|| 1.517372881886e-05 7 KSP unpreconditioned resid norm 1.379733077289e-07 true resid norm 1.379733077328e-07 ||r(i)||/||b|| 2.907576379351e-06 8 KSP unpreconditioned resid norm 2.662633277832e-08 true resid norm 2.662633278849e-08 ||r(i)||/||b|| 5.611092287101e-07 Linear solve converged due to CONVERGED_RTOL iterations 8 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down 
solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 44 **************************************** Simulation time = 1.7188 sec Time/time step = 5.2837 sec Average time/time step = 5.6348 sec U MAX = 6.065943917762114E-04 V MAX = 6.065943917762114E-04 W MAX = 9.375756091715709E-11 U MIN = -6.065943917762114E-04 V MIN = -6.065943917762114E-04 W MIN = -1.066150310063001E-02 U MAX = 6.065943917762114E-04 V MAX = 6.065943917762114E-04 W MAX = 1.066150310063001E-02 max(|divU|) = 4.912560853256981E-02 sum(divU*dV) = 6.111669858309849E-17 sum(|divU|*dV) = 3.349197454532234E-02 Convective cfl = 7.599802805876757E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.908510187181871E-01 Iterations to convergence = 8 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.484010305825e-04 true resid norm 9.484010305825e-04 ||r(i)||/||b|| 1.961316094007e-02 1 KSP unpreconditioned resid norm 5.748213015688e-04 true resid norm 5.748213015688e-04 ||r(i)||/||b|| 1.188744248045e-02 2 KSP unpreconditioned resid norm 1.286469364839e-04 true resid norm 1.286469364839e-04 ||r(i)||/||b|| 2.660449523295e-03 3 KSP unpreconditioned resid norm 5.348907510610e-05 true resid norm 5.348907510610e-05 ||r(i)||/||b|| 1.106166911214e-03 4 KSP unpreconditioned resid norm 2.465650792099e-05 true resid norm 2.465650792099e-05 ||r(i)||/||b|| 5.099025016637e-04 5 KSP unpreconditioned resid norm 6.146862394919e-06 true resid norm 6.146862394916e-06 ||r(i)||/||b|| 1.271185896476e-04 6 KSP unpreconditioned resid norm 1.188069342317e-06 true resid norm 1.188069342320e-06 ||r(i)||/||b|| 2.456955915007e-05 7 KSP unpreconditioned resid norm 2.741412806621e-07 true resid norm 2.741412806753e-07 ||r(i)||/||b|| 5.669307481560e-06 8 KSP unpreconditioned resid norm 6.049923165219e-08 true 
resid norm 6.049923166410e-08 ||r(i)||/||b|| 1.251138631355e-06 9 KSP unpreconditioned resid norm 1.176107887875e-08 true resid norm 1.176107888235e-08 ||r(i)||/||b|| 2.432219340870e-07 Linear solve converged due to CONVERGED_RTOL iterations 9 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI 
processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 45 **************************************** Simulation time = 1.7578 sec Time/time step = 5.6541 sec Average time/time step = 5.6352 sec U MAX = 6.236790880062606E-04 V MAX = 6.236790880062606E-04 W MAX = 1.115150678938498E-10 U MIN = -6.236790880062606E-04 V MIN = -6.236790880062606E-04 W MIN = -1.091211360600322E-02 U MAX = 6.236790880062606E-04 V MAX = 6.236790880062606E-04 W MAX = 1.091211360600322E-02 max(|divU|) = 4.997038517101132E-02 sum(divU*dV) = -1.366059071355439E-16 sum(|divU|*dV) = 3.429973463439592E-02 Convective cfl = 7.782061940490073E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.900898492656075E-01 Iterations to convergence = 9 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.463226338129e-04 true resid norm 9.463226338129e-04 ||r(i)||/||b|| 1.921318554716e-02 1 KSP unpreconditioned resid norm 5.762320387465e-04 true resid norm 5.762320387465e-04 ||r(i)||/||b|| 1.169923732464e-02 2 KSP unpreconditioned resid norm 1.369454100402e-04 true resid norm 1.369454100402e-04 ||r(i)||/||b|| 2.780402242240e-03 3 KSP unpreconditioned resid norm 6.645141899166e-05 true resid norm 6.645141899167e-05 ||r(i)||/||b|| 1.349162957051e-03 4 KSP unpreconditioned resid norm 3.542849089818e-05 true resid norm 3.542849089818e-05 ||r(i)||/||b|| 7.193045426169e-04 5 KSP unpreconditioned resid norm 9.634571468376e-06 true resid norm 9.634571468390e-06 ||r(i)||/||b|| 1.956106751286e-04 6 KSP unpreconditioned resid norm 2.490903677247e-06 true resid norm 2.490903677264e-06 ||r(i)||/||b|| 5.057280975999e-05 7 KSP unpreconditioned resid norm 6.460104461267e-07 true resid norm 6.460104461372e-07 ||r(i)||/||b|| 1.311594811701e-05 8 KSP unpreconditioned resid norm 1.308945022433e-07 true resid norm 1.308945022564e-07 ||r(i)||/||b|| 2.657550679965e-06 9 KSP unpreconditioned resid norm 2.332721980281e-08 true resid norm 2.332721980822e-08 ||r(i)||/||b|| 4.736124725971e-07 Linear solve converged due to CONVERGED_RTOL iterations 9 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block 
number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 46 **************************************** Simulation time = 1.7969 sec Time/time step = 5.6396 sec Average time/time step = 5.6353 sec U MAX = 6.407015528670302E-04 V MAX = 6.407015528670302E-04 W MAX = 1.319646761900996E-10 U MIN = -6.407015528670302E-04 V MIN = -6.407015528670302E-04 W MIN = -1.116318685908394E-02 U MAX = 6.407015528670302E-04 V MAX = 6.407015528670302E-04 W MAX = 1.116318685908394E-02 max(|divU|) = 5.080787027504124E-02 sum(divU*dV) = -1.802459127280384E-16 sum(|divU|*dV) = 3.510958606679235E-02 
Convective cfl = 7.964537577483519E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.893281669768407E-01 Iterations to convergence = 9 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.444723150048e-04 true resid norm 9.444723150048e-04 ||r(i)||/||b|| 1.883348287760e-02 1 KSP unpreconditioned resid norm 5.777155860646e-04 true resid norm 5.777155860646e-04 ||r(i)||/||b|| 1.152007996996e-02 2 KSP unpreconditioned resid norm 1.472956246270e-04 true resid norm 1.472956246270e-04 ||r(i)||/||b|| 2.937184690633e-03 3 KSP unpreconditioned resid norm 8.458940194082e-05 true resid norm 8.458940194082e-05 ||r(i)||/||b|| 1.686775808850e-03 4 KSP unpreconditioned resid norm 5.365762440872e-05 true resid norm 5.365762440871e-05 ||r(i)||/||b|| 1.069973078617e-03 5 KSP unpreconditioned resid norm 1.864657810331e-05 true resid norm 1.864657810332e-05 ||r(i)||/||b|| 3.718266844412e-04 6 KSP unpreconditioned resid norm 6.524460622530e-06 true resid norm 6.524460622545e-06 ||r(i)||/||b|| 1.301026144103e-04 7 KSP unpreconditioned resid norm 1.622750138813e-06 true resid norm 1.622750138828e-06 ||r(i)||/||b|| 3.235884892410e-05 8 KSP unpreconditioned resid norm 2.958932862640e-07 true resid norm 2.958932862731e-07 ||r(i)||/||b|| 5.900332971215e-06 9 KSP unpreconditioned resid norm 6.413461809109e-08 true resid norm 6.413461809469e-08 ||r(i)||/||b|| 1.278892152325e-06 10 KSP unpreconditioned resid norm 1.891915012387e-08 true resid norm 1.891915013001e-08 ||r(i)||/||b|| 3.772619740903e-07 Linear solve converged due to CONVERGED_RTOL iterations 10 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - 
- - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 47 **************************************** Simulation time = 1.8359 sec Time/time step = 5.9881 sec Average time/time step = 5.6428 sec U MAX = 6.576562305696783E-04 V MAX = 6.576562305696783E-04 W MAX = 1.554135859239686E-10 U MIN = -6.576562305696783E-04 V MIN = -6.576562305696783E-04 W MIN = -1.141472340685389E-02 U MAX = 6.576562305696783E-04 V MAX = 6.576562305696783E-04 W MAX = 1.141472340685389E-02 max(|divU|) = 5.163826577654140E-02 sum(divU*dV) = -7.239041132354072E-17 sum(|divU|*dV) = 3.592336567287319E-02 Convective cfl = 8.147222955515682E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.885660029728505E-01 Iterations to convergence = 10 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.428598999965e-04 true resid norm 9.428598999965e-04 ||r(i)||/||b|| 1.847305363556e-02 1 KSP unpreconditioned resid norm 5.792840703613e-04 true resid norm 5.792840703613e-04 ||r(i)||/||b|| 1.134966679785e-02 2 KSP unpreconditioned resid norm 1.602663517902e-04 true resid norm 1.602663517901e-04 ||r(i)||/||b|| 3.140030573583e-03 3 KSP unpreconditioned resid norm 1.101142848680e-04 true resid norm 1.101142848680e-04 ||r(i)||/||b|| 2.157422423433e-03 4 KSP unpreconditioned resid norm 8.731041325492e-05 true resid norm 8.731041325492e-05 ||r(i)||/||b|| 1.710635850572e-03 5 KSP unpreconditioned resid norm 5.035997710797e-05 true resid norm 5.035997710799e-05 ||r(i)||/||b|| 9.866816461331e-04 6 KSP unpreconditioned resid norm 2.457879661966e-05 
true resid norm 2.457879661967e-05 ||r(i)||/||b|| 4.815619247933e-04 7 KSP unpreconditioned resid norm 5.634575585722e-06 true resid norm 5.634575585741e-06 ||r(i)||/||b|| 1.103958467312e-04 8 KSP unpreconditioned resid norm 1.134851225677e-06 true resid norm 1.134851225701e-06 ||r(i)||/||b|| 2.223465815104e-05 9 KSP unpreconditioned resid norm 3.682136578226e-07 true resid norm 3.682136578415e-07 ||r(i)||/||b|| 7.214253836309e-06 10 KSP unpreconditioned resid norm 1.353419828913e-07 true resid norm 1.353419829004e-07 ||r(i)||/||b|| 2.651697998050e-06 11 KSP unpreconditioned resid norm 3.039998964113e-08 true resid norm 3.039998964440e-08 ||r(i)||/||b|| 5.956140877596e-07 Linear solve converged due to CONVERGED_RTOL iterations 11 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using 
I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 48 **************************************** Simulation time = 1.8750 sec Time/time step = 6.3739 sec Average time/time step = 5.6581 sec U MAX = 6.745378410411323E-04 V MAX = 6.745378410411323E-04 W MAX = 1.821920190168132E-10 U MIN = -6.745378410411323E-04 V MIN = -6.745378410411323E-04 W MIN = -1.166672326300875E-02 U MAX = 6.745378410411323E-04 V MAX = 6.745378410411323E-04 W MAX = 1.166672326300875E-02 max(|divU|) = 5.252541757016951E-02 sum(divU*dV) = 9.877092826794907E-17 sum(|divU|*dV) = 3.673921294188139E-02 Convective cfl = 8.330111324858251E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.878033882759807E-01 Iterations to convergence = 11 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.414933765577e-04 true resid norm 9.414933765577e-04 ||r(i)||/||b|| 1.813094466660e-02 1 KSP unpreconditioned resid norm 5.809485671766e-04 true resid norm 5.809485671766e-04 ||r(i)||/||b|| 1.118770093118e-02 2 KSP unpreconditioned resid norm 1.765886995066e-04 true resid norm 1.765886995066e-04 ||r(i)||/||b|| 3.400682383137e-03 3 KSP unpreconditioned resid norm 1.457796058411e-04 true resid norm 1.457796058411e-04 ||r(i)||/||b|| 2.807371812521e-03 4 KSP unpreconditioned resid norm 1.438887535261e-04 true resid norm 1.438887535261e-04 ||r(i)||/||b|| 2.770958450994e-03 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 5 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, 
initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 49 **************************************** Simulation time = 1.9141 sec Time/time step = 4.1844 sec Average time/time step = 5.6280 sec U MAX = 6.913413717411347E-04 V MAX = 6.913413717411347E-04 W MAX = 2.126538579577966E-10 U MIN = -6.913413717411347E-04 V MIN = -6.913413717411347E-04 W MIN = -1.191918590367497E-02 U MAX = 6.913413717411347E-04 V MAX = 6.913413717411347E-04 W MAX = 1.191918590367497E-02 max(|divU|) = 5.340673281799226E-02 sum(divU*dV) = -5.499226812896440E-17 sum(|divU|*dV) = 3.755711646535303E-02 Convective cfl = 8.513195934180635E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 
6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.870403538643769E-01 Iterations to convergence = 5 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.519521198687e-04 true resid norm 9.519521198687e-04 ||r(i)||/||b|| 1.802538489112e-02 1 KSP unpreconditioned resid norm 6.431597000660e-04 true resid norm 6.431597000660e-04 ||r(i)||/||b|| 1.217834479085e-02 2 KSP unpreconditioned resid norm 3.702102145333e-04 true resid norm 3.702102145333e-04 ||r(i)||/||b|| 7.009997108370e-03 3 KSP unpreconditioned resid norm 3.869684006500e-04 true resid norm 3.869684006501e-04 ||r(i)||/||b|| 7.327316381605e-03 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 4 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver 
(post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 50 **************************************** Simulation time = 1.9531 sec Time/time step = 3.8233 sec Average time/time step = 5.5919 sec U MAX = 7.080620726354203E-04 V MAX = 7.080620726354203E-04 W MAX = 2.471772277684510E-10 U MIN = -7.080620726354203E-04 V MIN = -7.080620726354203E-04 W MIN = -1.217211026400828E-02 U MAX = 7.080620726354203E-04 V MAX = 7.080620726354203E-04 W MAX = 1.217211026400828E-02 max(|divU|) = 5.428216207969611E-02 sum(divU*dV) = 2.375536962971300E-17 sum(|divU|*dV) = 3.837706418998891E-02 Convective cfl = 8.696470021938640E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.862769307070695E-01 Iterations to convergence = 4 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.016568468033e-03 true resid norm 1.016568468033e-03 ||r(i)||/||b|| 1.893304776508e-02 1 KSP unpreconditioned resid norm 8.594004065921e-04 true resid norm 8.594004065921e-04 ||r(i)||/||b|| 1.600587610083e-02 2 KSP unpreconditioned resid norm 8.414870106025e-04 true resid norm 8.414870106025e-04 ||r(i)||/||b|| 1.567224861525e-02 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 3 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using 
diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 51 **************************************** Simulation time = 1.9922 sec Time/time step = 3.4677 sec Average time/time step = 5.5502 sec U MAX = 7.246954541516481E-04 V MAX = 7.246954541516481E-04 W MAX = 2.861650172945249E-10 U MIN = -7.246954541516481E-04 V MIN = -7.246954541516481E-04 W MIN = -1.242549473530513E-02 U MAX = 7.246954541516481E-04 V MAX = 7.246954541516481E-04 W MAX = 1.242549473530513E-02 max(|divU|) = 5.515171879413290E-02 sum(divU*dV) = -4.605848273045875E-20 sum(|divU|*dV) = 3.919904340353457E-02 Convective cfl = 8.879926811909390E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.855131497816892E-01 Iterations to convergence = 3 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.258666639469e-03 true resid norm 1.258666639469e-03 ||r(i)||/||b|| 2.306487833771e-02 1 KSP 
unpreconditioned resid norm 1.504157912911e-03 true resid norm 1.504157912911e-03 ||r(i)||/||b|| 2.756346928893e-02 2 KSP unpreconditioned resid norm 1.513545089714e-03 true resid norm 1.513545089714e-03 ||r(i)||/||b|| 2.773548790299e-02 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 3 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using 
NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 52 **************************************** Simulation time = 2.0312 sec Time/time step = 3.4708 sec Average time/time step = 5.5103 sec U MAX = 7.412372878275829E-04 V MAX = 7.412372878275829E-04 W MAX = 3.300453361744023E-10 U MIN = -7.412372878275829E-04 V MIN = -7.412372878275829E-04 W MIN = -1.267933716225437E-02 U MAX = 7.412372878275829E-04 V MAX = 7.412372878275829E-04 W MAX = 1.267933716225437E-02 max(|divU|) = 5.601540601387010E-02 sum(divU*dV) = -1.943497884490673E-16 sum(|divU|*dV) = 4.002304072475085E-02 Convective cfl = 9.063559512262100E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.847490420774172E-01 Iterations to convergence = 3 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.761560659866e-03 true resid norm 1.761560659866e-03 ||r(i)||/||b|| 3.177100637149e-02 1 KSP unpreconditioned resid norm 2.631712886238e-03 true resid norm 2.631712886238e-03 ||r(i)||/||b|| 4.746482410829e-02 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, 
allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 53 **************************************** Simulation time = 2.0703 sec Time/time step = 3.1133 sec Average time/time step = 5.4650 sec U MAX = 7.576836087104737E-04 V MAX = 7.576836087104737E-04 W MAX = 3.792719040608200E-10 U MIN = -7.576836087104737E-04 V MIN = -7.576836087104737E-04 W MIN = -1.293363484006254E-02 U MAX = 7.576836087104737E-04 V MAX = 7.576836087104737E-04 W MAX = 1.293363484006254E-02 max(|divU|) = 5.687321627651611E-02 sum(divU*dV) = -4.700826762012491E-17 sum(|divU|*dV) = 4.084904209733418E-02 Convective cfl = 9.247361316789432E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.839846385889424E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 2.799899355961e-03 true resid norm 2.799899355961e-03 ||r(i)||/||b|| 4.971640159634e-02 1 KSP unpreconditioned resid norm 4.527402695250e-03 true resid norm 4.527402695250e-03 ||r(i)||/||b|| 8.039080765750e-02 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using 
Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines 
**************************************** TIME STEP = 54 **************************************** Simulation time = 2.1094 sec Time/time step = 3.1127 sec Average time/time step = 5.4215 sec U MAX = 7.740307160562104E-04 V MAX = 7.740307160562104E-04 W MAX = 4.343243689038809E-10 U MIN = -7.740307160562104E-04 V MIN = -7.740307160562104E-04 W MIN = -1.318838451137465E-02 U MAX = 7.740307160562104E-04 V MAX = 7.740307160562104E-04 W MAX = 1.318838451137465E-02 max(|divU|) = 5.772513218099847E-02 sum(divU*dV) = 1.862846001320152E-16 sum(|divU|*dV) = 4.167703278667138E-02 Convective cfl = 9.431325403831724E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.832199703200240E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 4.697531508109e-03 true resid norm 4.697531508109e-03 ||r(i)||/||b|| 8.214417102469e-02 1 KSP unpreconditioned resid norm 6.214787472969e-03 true resid norm 6.214787472969e-03 ||r(i)||/||b|| 1.086759214239e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, 
divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 55 **************************************** Simulation time = 2.1484 sec Time/time step = 3.1091 sec Average time/time step = 5.3794 sec U MAX = 7.902751731180926E-04 V MAX = 7.902751731180926E-04 W MAX = 4.957085513370435E-10 U MIN = -7.902751731180926E-04 V MIN = -7.902751731180926E-04 W MIN = -1.344358236313642E-02 U MAX = 7.902751731180926E-04 V MAX = 7.902751731180926E-04 W MAX = 1.344358236313642E-02 max(|divU|) = 5.857112669786742E-02 sum(divU*dV) = -1.033909168771564E-16 sum(|divU|*dV) = 4.250699737772344E-02 Convective cfl = 9.615444933998471E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.824550682920411E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 6.443380291215e-03 true resid norm 6.443380291215e-03 ||r(i)||/||b|| 1.109919986786e-01 1 KSP unpreconditioned resid norm 8.054889194093e-03 true resid norm 8.054889194093e-03 ||r(i)||/||b|| 1.387514333130e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum 
iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 56 **************************************** Simulation time = 2.1875 sec Time/time step = 3.1099 sec Average time/time step = 5.3389 sec U MAX = 8.064138093027779E-04 V MAX = 8.064138093027779E-04 W MAX = 5.639566124425724E-10 U MIN = -8.064138093027779E-04 V MIN = -8.064138093027779E-04 W MIN = -1.369922402375323E-02 U MAX = 8.064138093027779E-04 V MAX = 8.064138093027779E-04 W MAX = 1.369922402375323E-02 max(|divU|) = 5.941116286987973E-02 sum(divU*dV) = -1.848161969535551E-16 sum(|divU|*dV) = 4.333892934140238E-02 Convective cfl = 9.799713051109626E-02 Viscous cfl = 7.281777777777779E-01 
Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.816899635391449E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 8.320817865971e-03 true resid norm 8.320817865971e-03 ||r(i)||/||b|| 1.412314148924e-01 1 KSP unpreconditioned resid norm 9.218063404335e-03 true resid norm 9.218063404335e-03 ||r(i)||/||b|| 1.564605977601e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 
1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 57 **************************************** Simulation time = 2.2266 sec Time/time step = 3.1149 sec Average time/time step = 5.2999 sec U MAX = 8.224437242859797E-04 V MAX = 8.224437242859797E-04 W MAX = 6.396271424076921E-10 U MIN = -8.224437242859797E-04 V MIN = -8.224437242859797E-04 W MIN = -1.395530456105393E-02 U MAX = 8.224437242859797E-04 V MAX = 8.224437242859797E-04 W MAX = 1.395530456105393E-02 max(|divU|) = 6.024519384412073E-02 sum(divU*dV) = 1.301277135451108E-16 sum(|divU|*dV) = 4.417280564319946E-02 Convective cfl = 9.984122886160571E-02 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.809246870908851E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 9.540635888297e-03 true resid norm 9.540635888297e-03 ||r(i)||/||b|| 1.596034321147e-01 1 KSP unpreconditioned resid norm 1.056716658719e-02 true resid norm 1.056716658719e-02 ||r(i)||/||b|| 1.767760634396e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node 
routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 58 **************************************** Simulation time = 2.2656 sec Time/time step = 3.1126 sec Average time/time step = 5.2622 sec U MAX = 8.383622937791725E-04 V MAX = 8.383622937791725E-04 W MAX = 7.233051678163470E-10 U MIN = -8.383622937791725E-04 V MIN = -8.383622937791725E-04 W MIN = -1.421181848163963E-02 U MAX = 8.383622937791725E-04 V MAX = 8.383622937791725E-04 W MAX = 1.421181848163963E-02 max(|divU|) = 6.107316302490359E-02 sum(divU*dV) = 2.143023305597198E-16 sum(|divU|*dV) = 4.500860689979907E-02 Convective cfl = 1.016866756428670E-01 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.801592699424566E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.093039378003e-02 true resid norm 1.093039378003e-02 ||r(i)||/||b|| 1.802637775768e-01 1 KSP unpreconditioned resid norm 1.120586105321e-02 true resid norm 1.120586105321e-02 ||r(i)||/||b|| 1.848067768742e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess 
using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij 
rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 59 **************************************** Simulation time = 2.3047 sec Time/time step = 3.1166 sec Average time/time step = 5.2258 sec U MAX = 8.541671768674378E-04 V MAX = 8.541671768674378E-04 W MAX = 8.156020755531066E-10 U MIN = -8.541671768674378E-04 V MIN = -8.541671768674378E-04 W MIN = -1.446875973217087E-02 U MAX = 8.541671768674378E-04 V MAX = 8.541671768674378E-04 W MAX = 1.446875973217087E-02 max(|divU|) = 6.189500431536687E-02 sum(divU*dV) = -1.994552209034640E-16 sum(|divU|*dV) = 4.584631503542637E-02 Convective cfl = 1.035334021497968E-01 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.793937430115675E-01 Iterations to convergence = 2 Convergence in 1 iterations 0 KSP unpreconditioned resid norm 1.162684118472e-02 true resid norm 1.162684118472e-02 ||r(i)||/||b|| 1.890803640116e-01 1 KSP unpreconditioned resid norm 1.216327907130e-02 true resid norm 1.216327907130e-02 ||r(i)||/||b|| 1.978041325102e-01 Linear solve did not converge due to DIVERGED_INDEFINITE_PC iterations 2 KSP Object: 1 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-06, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=3 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve info for each block is in the following KSP and PC objects: [0] number of local blocks = 1, first local block number = 0 [0] local block number 0 KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot matrix ordering: nd factor fill ratio given 5, needed 2.26888 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 package used to perform factorization: petsc total: nonzeros=16851, allocated nonzeros=16851 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 110 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines - - - - - - - - - - - - - - - - - - linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=155, cols=155 total: nonzeros=7427, allocated nonzeros=7427 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI 
processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0704919, max = 1.48033 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=11665, cols=11665 total: nonzeros=393037, allocated nonzeros=393037 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.0925946, max = 1.94449 maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning has attached null space using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=131072, cols=131072 total: nonzeros=917504, allocated nonzeros=917504 total number of mallocs used during MatSetValues calls =0 not using I-node routines **************************************** TIME STEP = 60 **************************************** Simulation time = 2.3438 sec Time/time step = 3.1097 sec Average time/time step = 5.1905 sec U MAX = 8.745386602467303E-04 V MAX = 8.745386602467303E-04 W MAX = 9.171554515245112E-10 U MIN = -8.745386602467303E-04 V MIN = -8.745386602467303E-04 W MIN = -1.472612170301261E-02 U MAX = 8.745386602467303E-04 V MAX = 8.745386602467303E-04 W MAX = 1.472612170301261E-02 max(|divU|) = 6.271064239595950E-02 sum(divU*dV) = 5.434308178504297E-17 sum(|divU|*dV) = 4.668591125631307E-02 Convective cfl = 1.054412737504388E-01 Viscous cfl = 7.281777777777779E-01 Gravity cfl = 6.205004906490450E-01 DT = 3.906250000000000E-02 MAX DT ALLOWED = 8.786033133991950E-01 Iterations to convergence = 2 ---------------------------------------- SUMMARY ---------------------------------------- Setup time = 0.0290 min Initialization time = 0.0021 min Processing time = 5.1905 min Post-processing time = 0.0000 min Total simulation time = 5.2216 min Processing time per time step = 5.1905 sec Total number of time steps = 60 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./hit on a interlagos-64idx-gnu-dbg named nid21503 with 1 processor, by Unknown Thu May 15 13:14:15 2014 Using Petsc Release Version 3.4.2, Jul, 02, 2013 Max Max/Min Avg Total Time (sec): 3.135e+02 1.00000 3.135e+02 Objects: 1.264e+03 1.00000 1.264e+03 Flops: 1.133e+10 1.00000 1.133e+10 1.133e+10 Flops/sec: 3.614e+07 1.00000 3.614e+07 3.614e+07 Memory: 1.210e+08 1.00000 1.210e+08 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 2.820e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 3.1346e+02 100.0% 1.1329e+10 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 2.819e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. 
# # # ########################################################## Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKer 1 1.0 5.9605e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ThreadCommBarrie 1 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 20 1.0 1.8802e-02 1.0 1.57e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 835 VecTDot 790 1.0 2.0298e-01 1.0 2.07e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1020 VecNorm 980 1.0 1.0135e-01 1.0 2.54e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 2509 VecScale 3230 1.0 4.8646e-01 1.0 2.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 474 VecCopy 1314 1.0 2.0094e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 3909 1.0 2.9169e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 7196 1.0 1.2969e+00 1.0 1.12e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 10 0 0 0 0 10 0 0 0 864 VecAYPX 7253 1.0 4.8285e+00 1.0 7.25e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 6 0 0 0 2 6 0 0 0 150 VecMAXPY 22 1.0 3.6978e-02 1.0 1.86e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 502 VecAssemblyBegin 62 1.0 1.8764e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 62 1.0 1.7571e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 4834 1.0 2.3412e+00 1.0 3.45e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 3 0 0 0 147 VecScatterBegin 120 1.0 1.4015e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSetRandom 2 1.0 1.5882e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 22 1.0 1.3849e-02 1.0 4.71e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 340 MatMult 5729 1.0 4.0637e+01 1.0 7.52e+09 1.0 0.0e+00 0.0e+00 0.0e+00 13 66 0 0 0 13 66 0 0 0 185 MatMultAdd 802 1.0 2.9998e+00 1.0 4.08e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 136 MatMultTranspose 802 1.0 3.5211e+00 1.0 4.08e+08 1.0 0.0e+00 0.0e+00 8.0e+02 1 4 0 0 3 1 4 0 0 3 116 MatSolve 401 1.0 6.9252e-02 1.0 1.35e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 194 MatLUFactorSym 1 1.0 2.0640e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 9.8920e-03 1.0 1.08e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 109 MatConvert 2 1.0 3.3493e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatScale 6 1.0 2.2011e-02 1.0 3.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 165 MatAssemblyBegin 79 1.0 3.0279e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 79 1.0 5.6627e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRow 428211 1.0 1.1904e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 1.1492e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 5.8603e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 2 1.0 7.4472e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 360 1.0 5.0706e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 2 1.0 7.3698e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMult 2 1.0 2.6457e-01 1.0 3.13e+06 1.0 0.0e+00 0.0e+00 1.2e+01 0 0 
0 0 0 0 0 0 0 0 12 MatMatMultSym 2 1.0 2.0506e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 MatMatMultNum 2 1.0 5.9428e-02 1.0 3.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 53 MatPtAP 2 1.0 8.5463e-01 1.0 1.91e+07 1.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 22 MatPtAPSymbolic 2 1.0 2.9844e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 MatPtAPNumeric 2 1.0 5.5616e-01 1.0 1.91e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 34 MatTrnMatMult 2 1.0 2.5088e+00 1.0 3.99e+07 1.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 0 1 0 0 0 0 16 MatGetSymTrans 4 1.0 5.5090e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 20 1.0 5.0538e-02 1.0 3.14e+07 1.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 0 0 0 0 0 0 621 KSPSetUp 8 1.0 5.7340e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 60 1.0 1.6715e+02 1.0 1.13e+10 1.0 0.0e+00 0.0e+00 2.7e+04 53100 0 0 96 53100 0 0 96 68 PCSetUp 2 1.0 2.2745e+01 1.0 1.32e+08 1.0 0.0e+00 0.0e+00 5.8e+02 7 1 0 0 2 7 1 0 0 2 6 PCSetUpOnBlocks 401 1.0 1.4589e-02 1.0 1.08e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 74 PCApply 401 1.0 1.1839e+02 1.0 8.85e+09 1.0 0.0e+00 0.0e+00 2.1e+04 38 78 0 0 75 38 78 0 0 75 75 PCGAMGgraph_AGG 2 1.0 5.8934e+00 1.0 2.62e+06 1.0 0.0e+00 0.0e+00 3.0e+01 2 0 0 0 0 2 0 0 0 0 0 PCGAMGcoarse_AGG 2 1.0 2.7180e+00 1.0 3.99e+07 1.0 0.0e+00 0.0e+00 3.4e+01 1 0 0 0 0 1 0 0 0 0 15 PCGAMGProl_AGG 2 1.0 1.1115e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 4 0 0 0 0 4 0 0 0 0 0 PCGAMGPOpt_AGG 2 1.0 2.1251e+00 1.0 6.98e+07 1.0 0.0e+00 0.0e+00 3.5e+02 1 1 0 0 1 1 1 0 0 1 33 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Vector 1093 1093 1124012864 0 Vector Scatter 2 2 1384 0 Matrix 16 16 80634240 0 Matrix Coarsen 2 2 1368 0 Matrix Null Space 120 120 78240 0 Distributed Mesh 1 1 2409608 0 Bipartite Graph 2 2 1712 0 Index Set 10 10 1206832 0 IS L to G Mapping 1 1 1202876 0 Krylov Solver 7 7 85064 0 Preconditioner 7 7 7752 0 Viewer 1 0 0 0 PetscRandom 2 2 1344 0 ======================================================================================================================== Average time to get PetscTime(): 1.90735e-07 #PETSc Option Table entries: -finput input_droplet.txt -ksp_converged_reason -ksp_monitor_true_residual -ksp_view -log_summary -options_left -pc_gamg_agg_nsmooths 1 -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure run at: Wed May 14 12:49:45 2014 Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib="-L/opt/acml/5.3.0/gfortran64/lib -lacml" --with-x="0 " --with-debugging="1 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-dynamic-loading="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " PETSC_ARCH=interlagos-64idx-gnu-dbg ----------------------------------------- Libraries compiled on Wed May 14 12:49:45 2014 on h2ologin2 Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64 Using PETSc directory: /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2 Using PETSc arch: interlagos-64idx-gnu-dbg ----------------------------------------- Using C compiler: cc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -fno-inline -O0 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: ftn -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-gnu-dbg/lib -lsuperlu_dist_3.3 -L/opt/acml/5.3.0/gfortran64/lib -lacml -lpthread -lparmetis -lmetis -ldl ----------------------------------------- #PETSc Option Table entries: -ksp_converged_reason 
-ksp_monitor_true_residual -ksp_view -log_summary -options_left -pc_gamg_agg_nsmooths 1 -pc_type gamg #End of PETSc Option Table entries There is no unused database option. Application 4423330 resources: utime ~282s, stime ~33s, Rss ~206632, inblocks ~23827, outblocks ~77441 From mairhofer at itt.uni-stuttgart.de Fri May 16 02:59:58 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Fri, 16 May 2014 09:59:58 +0200 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: References: <5374D184.5070207@itt.uni-stuttgart.de> <87ha4r3xtq.fsf@jedbrown.org> <5374F1CC.3080906@itt.uni-stuttgart.de> Message-ID: <5375C57E.4070001@itt.uni-stuttgart.de> I tried to use ISColoringValue, but when I include IScoloringValue colors into my code, I get an error message from the compiler(gfortran): ISColoringValue colors 1 error: unclassifiable statement at (1) I'm including these header files, am I missing one? #include #include #include #include #include #include #include #include #include #include Thank you for your fast responses! Am 15.05.2014 19:16, schrieb Peter Brune: > You should be using an array of type ISColoringValue. ISColoringValue > is by default a short, not an int, so you're getting nonsense entries. > We should either maintain or remove ex5s if it does something like this. > > - Peter > > > On Thu, May 15, 2014 at 11:56 AM, Jonas Mairhofer > > wrote: > > > If 'colors' can be a dynamically allocated array then I dont know > where > the mistake is in this code: > > > > > > ISColoring iscoloring > Integer, allocatable :: colors(:) > PetscInt maxc > > ... > > > !calculate max. number of colors > maxc = 2*irc+1 !irc is the number of ghost nodes needed to > calculate the function I want to solve > > allocate(colors(user%xm)) !where user%xm is the number of > locally > owned nodes of a global array > > !Set colors > DO i=1,user%xm > colors(i) = mod(i,maxc) > END DO > > call > ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) > > ... > > deallocate(colors) > call ISColoringDestroy(iscoloring,ierr) > > > > > On execution I get the following error message (running the DO > Loop from > 0 to user%xm-1 does not change anything): > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Arguments are incompatible! > [0]PETSC ERROR: Number of colors passed in 291 is less then the actual > number of colors in array 61665! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named > aries.itt.uni-stuttgart.de by > mhofer Thu May 15 18:01:41 2014 > [0]PETSC ERROR: Libraries linked from > /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib > [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran > --download-f-blas-lapack --download-mpich > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ISColoringCreate() line 276 in > /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c > > > > > > But when I print out colors, it only has entries from 0 to 218, so no > entry is larger then 291 as stated in the error message. > > > > > > > > > > > Am 15.05.2014 16:45, schrieb Jed Brown: > > Jonas Mairhofer > writes: > > Hi, I'm trying to set the coloring of a matrix using > ISColoringCreate. > Therefore I need an array 'colors' which in C can be > creates as (from > example ex5s.c) > > int *colors > PetscMalloc(...,&colors) > > There is no PetscMalloc in Fortran, due to language > "deficiencies". > > colors(i) = .... > > ISColoringCreate(...) > > How do I have to define the array colors in Fortran? > > I tried: > > Integer, allocatable :: colors(:) and allocate() > instead of > PetscMalloc > > and > > Integer, pointer :: colors > > but neither worked. > > The ISColoringCreate Fortran binding copies from the array you > pass into > one allocated using PetscMalloc. You should pass a normal > Fortran array > (statically or dynamically allocated). > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliver.browne at upm.es Fri May 16 03:39:55 2014 From: oliver.browne at upm.es (Oliver Browne) Date: Fri, 16 May 2014 10:39:55 +0200 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: <0CC4B85D-12C2-4C4B-BEFE-6E42437101E3@mcs.anl.gov> References: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> <373F07AB-A56A-4A9A-BD2E-F854FEC16D50@mcs.anl.gov> <4ede9bde820986a689a2ba2fcb6291db@upm.es> <0CC4B85D-12C2-4C4B-BEFE-6E42437101E3@mcs.anl.gov> Message-ID: <8ae9445ef0992d6905969309fd6fbb56@upm.es> >> >> >> On 14-05-2014 17:34, Barry Smith wrote: >>> See the manual page for MatMPIAIJSetPreallocationCSR() it gives an >>> explicit simple example >>> The format which is used for the sparse matrix input, is >>> equivalent to a >>> row-major ordering.. i.e for the following matrix, the input data >>> expected is >>> as shown: >>> 1 0 0 >>> 2 0 3 P0 >>> ------- >>> 4 5 6 P1 >>> Process0 [P0]: rows_owned=[0,1] >>> i = {0,1,3} [size = nrow+1 = 2+1] >>> j = {0,0,2} [size = nz = 6] >>> v = {1,2,3} [size = nz = 6] >>> Process1 [P1]: rows_owned=[2] >>> i = {0,3} [size = nrow+1 = 1+1] >>> j = {0,1,2} [size = nz = 6] >>> v = {4,5,6} [size = nz = 6] >>> The column indices are global, the numerical values are just >>> numerical values and do not need to be adjusted. On each process the >>> i >>> indices start with 0 because they just point into the local part of >>> the j indices. >>> Are you saying each process of yours HAS the entire matrix? >> >> I am not entirely sure about this and what it means. Each processor >> has a portion of the matrix. 
>> >> >> If so >>> you just need to adjust the local portion of the i vales and pass >>> that >>> plus the appropriate location in j and v to the routine as in the >>> example above. >> >> So this MatMPIAIJSetPreallocationCSR call should be in some sort of >> loop; >> >> Do counter = 1, No of Processors >> >> calculate local numbering for i and isolate parts of j and v needed >> >> Call MatMPIAIJSetPreallocationCSR(A,i,j,v) >> >> END DO >> >> Is this correct? > > Oh boy, oh boy. No absolutely not. Each process is calling > MatMPIAIJSetPreallocationCSR() once with its part of the data. > > Barry so using the above example I should CALL MatMPIAIJSetPreallocationCSR(A,i,j,v) where i = [0, 1, 3, 6] j = [0, 0, 2, 0 , 1 , 2] v = [1,2,3,4,5,6] Ollie > >> >> Ollie >> >>> Barry >>> On May 14, 2014, at 8:36 AM, Oliver Browne >>> wrote: >>>> On 14-05-2014 15:27, Barry Smith wrote: >>>>> On May 14, 2014, at 7:42 AM, Oliver Browne >>>>> wrote: >>>>>> Hi, >>>>>> I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, >>>>>> columns and values for my matrix (efficiency). I have my 3 vectors >>>>>> in CSR format. If I run on a single processor, with my test case, >>>>>> everything works fine. I also worked without >>>>>> MatMPIAIJSetPreallocationCSR, and individually input each value >>>>>> with the call MatSetValues in MPI and this also works fine. >>>>>> If I want to use MatMPIAIJSetPreallocationCSR, do I need to >>>>>> separate the vectors for each processor as they have done here; >>>>> What do you mean by ?separate? the vectors? Each processor >>>>> needs to provide ITS rows to the function call. You cannot have >>>>> processor zero deliver all the rows. >>>> I mean split them so they change from global numbering to local >>>> numbering. >>>> At the moment I just have >>>> CALL MatMPIAIJSetPreallocationCSR(A,NVPN,NNVI,CONT,ierr) - 3 vectors >>>> have global numbering >>>> How can submit this to a specific processor? >>>> Ollie >>>>> Barry >>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR >>>>>> Thanks in advance, >>>>>> Ollie From bsmith at mcs.anl.gov Fri May 16 07:12:25 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 07:12:25 -0500 Subject: [petsc-users] MatMPIAIJSetPreallocationCSR In-Reply-To: <8ae9445ef0992d6905969309fd6fbb56@upm.es> References: <30468df2eb47c9fc3cb02433a6ecc1f9@upm.es> <373F07AB-A56A-4A9A-BD2E-F854FEC16D50@mcs.anl.gov> <4ede9bde820986a689a2ba2fcb6291db@upm.es> <0CC4B85D-12C2-4C4B-BEFE-6E42437101E3@mcs.anl.gov> <8ae9445ef0992d6905969309fd6fbb56@upm.es> Message-ID: No, process zero calls with > i = [0, 1, 3 > j = [0, 0, 2] > v = [1,2,3] and process one calls with > i = [0, 3] > j = [ 0 , 1 , 2] > v = [4,5,6] this is how MPI works, each process figures out its own information and calls appropriate routines with its own values. Barry On May 16, 2014, at 3:39 AM, Oliver Browne wrote: > >>> On 14-05-2014 17:34, Barry Smith wrote: >>>> See the manual page for MatMPIAIJSetPreallocationCSR() it gives an >>>> explicit simple example >>>> The format which is used for the sparse matrix input, is equivalent to a >>>> row-major ordering.. 
i.e for the following matrix, the input data >>>> expected is >>>> as shown: >>>> 1 0 0 >>>> 2 0 3 P0 >>>> ------- >>>> 4 5 6 P1 >>>> Process0 [P0]: rows_owned=[0,1] >>>> i = {0,1,3} [size = nrow+1 = 2+1] >>>> j = {0,0,2} [size = nz = 6] >>>> v = {1,2,3} [size = nz = 6] >>>> Process1 [P1]: rows_owned=[2] >>>> i = {0,3} [size = nrow+1 = 1+1] >>>> j = {0,1,2} [size = nz = 6] >>>> v = {4,5,6} [size = nz = 6] >>>> The column indices are global, the numerical values are just >>>> numerical values and do not need to be adjusted. On each process the i >>>> indices start with 0 because they just point into the local part of >>>> the j indices. >>>> Are you saying each process of yours HAS the entire matrix? >>> I am not entirely sure about this and what it means. Each processor has a portion of the matrix. >>> If so >>>> you just need to adjust the local portion of the i vales and pass that >>>> plus the appropriate location in j and v to the routine as in the >>>> example above. >>> So this MatMPIAIJSetPreallocationCSR call should be in some sort of loop; >>> Do counter = 1, No of Processors >>> calculate local numbering for i and isolate parts of j and v needed >>> Call MatMPIAIJSetPreallocationCSR(A,i,j,v) >>> END DO >>> Is this correct? >> Oh boy, oh boy. No absolutely not. Each process is calling >> MatMPIAIJSetPreallocationCSR() once with its part of the data. >> Barry > > > so using the above example > > I should > > CALL MatMPIAIJSetPreallocationCSR(A,i,j,v) > > where > > i = [0, 1, 3, 6] > j = [0, 0, 2, 0 , 1 , 2] > v = [1,2,3,4,5,6] > > Ollie > > >>> Ollie >>>> Barry >>>> On May 14, 2014, at 8:36 AM, Oliver Browne wrote: >>>>> On 14-05-2014 15:27, Barry Smith wrote: >>>>>> On May 14, 2014, at 7:42 AM, Oliver Browne wrote: >>>>>>> Hi, >>>>>>> I am using MatMPIAIJSetPreallocationCSR to preallocate the rows, columns and values for my matrix (efficiency). I have my 3 vectors in CSR format. If I run on a single processor, with my test case, everything works fine. I also worked without MatMPIAIJSetPreallocationCSR, and individually input each value with the call MatSetValues in MPI and this also works fine. >>>>>>> If I want to use MatMPIAIJSetPreallocationCSR, do I need to separate the vectors for each processor as they have done here; >>>>>> What do you mean by ?separate? the vectors? Each processor >>>>>> needs to provide ITS rows to the function call. You cannot have >>>>>> processor zero deliver all the rows. >>>>> I mean split them so they change from global numbering to local numbering. >>>>> At the moment I just have >>>>> CALL MatMPIAIJSetPreallocationCSR(A,NVPN,NNVI,CONT,ierr) - 3 vectors have global numbering >>>>> How can submit this to a specific processor? >>>>> Ollie >>>>>> Barry >>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocationCSR.html#MatMPIAIJSetPreallocationCSR >>>>>>> Thanks in advance, >>>>>>> Ollie From dario.isola at newmerical.com Fri May 16 10:46:33 2014 From: dario.isola at newmerical.com (Dario Isola) Date: Fri, 16 May 2014 11:46:33 -0400 Subject: [petsc-users] hypre support Message-ID: <537632D9.2090603@newmerical.com> Dear all, I am investigating the use of hypre+petsc. I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. I tried to run it with the following options -pc_type hypre -pc_type_hypre pilut -ksp_type richardson and, although he did not complain, it does not solve the system either. To what extent is hypre supported by petsc? 
More specifically, what kind of matrices? I am using a *baij*//matrix. Thanks in advance, D -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 16 10:49:55 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 10:49:55 -0500 Subject: [petsc-users] PetscMalloc with Fortran In-Reply-To: <5375C57E.4070001@itt.uni-stuttgart.de> References: <5374D184.5070207@itt.uni-stuttgart.de> <87ha4r3xtq.fsf@jedbrown.org> <5374F1CC.3080906@itt.uni-stuttgart.de> <5375C57E.4070001@itt.uni-stuttgart.de> Message-ID: <5B38C1F8-60E6-44B0-8268-F23563C437B4@mcs.anl.gov> Sorry it is missing in the fortran includes. You can use a short unsigned integer (16 bit) to represent it. I don?t know how that it is indicated in Fortran but a Fortran programmer would know. Request-assigned: Satish, please add ISColoringValue to Fortran include; note that its value is configure assigned on the C side. Barry On May 16, 2014, at 2:59 AM, Jonas Mairhofer wrote: > > > I tried to use ISColoringValue, but when I include > > IScoloringValue colors > > > into my code, I get an error message from the compiler(gfortran): > > > ISColoringValue colors > 1 > error: unclassifiable statement at (1) > > I'm including these header files, am I missing one? > > > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > > > Thank you for your fast responses! > > > > Am 15.05.2014 19:16, schrieb Peter Brune: >> You should be using an array of type ISColoringValue. ISColoringValue is by default a short, not an int, so you're getting nonsense entries. We should either maintain or remove ex5s if it does something like this. >> >> - Peter >> >> >> On Thu, May 15, 2014 at 11:56 AM, Jonas Mairhofer wrote: >> >> If 'colors' can be a dynamically allocated array then I dont know where >> the mistake is in this code: >> >> >> >> >> >> ISColoring iscoloring >> Integer, allocatable :: colors(:) >> PetscInt maxc >> >> ... >> >> >> !calculate max. number of colors >> maxc = 2*irc+1 !irc is the number of ghost nodes needed to >> calculate the function I want to solve >> >> allocate(colors(user%xm)) !where user%xm is the number of locally >> owned nodes of a global array >> >> !Set colors >> DO i=1,user%xm >> colors(i) = mod(i,maxc) >> END DO >> >> call >> ISColoringCreate(PETSC_COMM_WORLD,maxc,user%xm,colors,iscoloring,ierr) >> >> ... >> >> deallocate(colors) >> call ISColoringDestroy(iscoloring,ierr) >> >> >> >> >> On execution I get the following error message (running the DO Loop from >> 0 to user%xm-1 does not change anything): >> >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Arguments are incompatible! >> [0]PETSC ERROR: Number of colors passed in 291 is less then the actual >> number of colors in array 61665! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./DFT on a arch-linux2-c-debug named >> aries.itt.uni-stuttgart.de by mhofer Thu May 15 18:01:41 2014 >> [0]PETSC ERROR: Libraries linked from >> /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/arch-linux2-c-debug/lib >> [0]PETSC ERROR: Configure run at Wed Mar 19 11:00:35 2014 >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-fc=gfortran >> --download-f-blas-lapack --download-mpich >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ISColoringCreate() line 276 in >> /usr/ITT/mhofer/Documents/Diss/NumericalMethods/Libraries/Petsc/petsc-3.4.4/src/vec/is/is/utils/iscoloring.c >> >> >> >> >> >> But when I print out colors, it only has entries from 0 to 218, so no >> entry is larger then 291 as stated in the error message. >> >> >> >> >> >> >> >> >> >> >> Am 15.05.2014 16:45, schrieb Jed Brown: >> >> Jonas Mairhofer writes: >> >> Hi, I'm trying to set the coloring of a matrix using ISColoringCreate. >> Therefore I need an array 'colors' which in C can be creates as (from >> example ex5s.c) >> >> int *colors >> PetscMalloc(...,&colors) >> There is no PetscMalloc in Fortran, due to language "deficiencies". >> >> colors(i) = .... >> >> ISColoringCreate(...) >> >> How do I have to define the array colors in Fortran? >> >> I tried: >> >> Integer, allocatable :: colors(:) and allocate() instead of >> PetscMalloc >> >> and >> >> Integer, pointer :: colors >> >> but neither worked. >> The ISColoringCreate Fortran binding copies from the array you pass into >> one allocated using PetscMalloc. You should pass a normal Fortran array >> (statically or dynamically allocated). >> >> > From jed at jedbrown.org Fri May 16 10:50:11 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 16 May 2014 09:50:11 -0600 Subject: [petsc-users] hypre support In-Reply-To: <537632D9.2090603@newmerical.com> References: <537632D9.2090603@newmerical.com> Message-ID: <87y4y1wwng.fsf@jedbrown.org> Dario Isola writes: > Dear all, > > I am investigating the use of hypre+petsc. I was able to successfully > configure, install, compile petsc 3.3 with the external package for hypre. > > I tried to run it with the following options > > -pc_type hypre -pc_type_hypre pilut -ksp_type richardson > > and, although he did not complain, it does not solve the system either. There is no reason pilut (which is deprecated; Hypre recommends using euclid) can be expected to create a contractive Richardson iteration. You should probably use Krylov (perhaps GMRES, the default), which will also fix the scaling. Use -ksp_monitor_true_residual -ksp_converged_reason while debugging/tuning the solver. > To what extent is hypre supported by petsc? More specifically, what kind > of matrices? I am using a *baij*//matrix. AIJ and BAIJ work with Hypre. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Fri May 16 10:54:19 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 10:54:19 -0500 Subject: [petsc-users] hypre support In-Reply-To: <537632D9.2090603@newmerical.com> References: <537632D9.2090603@newmerical.com> Message-ID: On May 16, 2014, at 10:46 AM, Dario Isola wrote: > Dear all, > > I am investigating the use of hypre+petsc. 
I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. > > I tried to run it with the following options > -pc_type hypre -pc_type_hypre pilut -ksp_type richardson > and, although he did not complain, it does not solve the system either. Do you meaning it did not converge? At first always run with -ksp_view (or -snes_view if using snes or -ts_view if using ts) and -ksp_monitor_true_residual to see what is going on. > -pc_type_hypre pilut is wrong it is -pc_hypre_type pilut Note that pilut will generally not work with Richardson you need a ?real? Krylov method like GMRES. Also the ilu type preconditioners don?t scale particularly well though occasionally they can be fine. > > To what extent is hypre supported by petsc? More specifically, what kind of matrices? If it cannot handle the matrix type it would give an error message. Hypre uses a format like AIJ so you should use AIJ. Note that you can make the matrix type a runtime option so you don?t have to compile in that it is BAIJ. > I am using a baij matrix. > > Thanks in advance, > D From dario.isola at newmerical.com Fri May 16 11:49:40 2014 From: dario.isola at newmerical.com (Dario Isola) Date: Fri, 16 May 2014 12:49:40 -0400 Subject: [petsc-users] hypre support In-Reply-To: References: <537632D9.2090603@newmerical.com> Message-ID: <537641A4.3050907@newmerical.com> Thanks a lot for your answers. I ran it with -ksp_type gmres -pc_type hypre -pc_hypre_type euclid and it worked very well. Thanks. I then tried to use boomeramg as a preconditioner coupled with Richardson but I was not successful, it failed to solve the system and returned nans. -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_relax_type_all SOR/Jacobi -pc_hypre_boomeramg_print_debug -ksp_view -ksp_monitor_true_residual and i got the following ===== Proc = 0 Level = 0 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 18308 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 7 nc_offd = 0 ===== Proc = 0 Level = 1 ===== Proc = 0 Coarsen 1st pass = 0.010000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 8725 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 16 nc_offd = 0 ===== Proc = 0 Level = 2 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 4721 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 Proc = 0 iter 3 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 ===== Proc = 0 Level = 3 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 2495 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 22 nc_offd = 0 Proc = 0 iter 3 comm. 
and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 ===== Proc = 0 Level = 4 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 1337 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 Proc = 0 iter 3 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 ===== Proc = 0 Level = 5 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 695 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 3 nc_offd = 0 ===== Proc = 0 Level = 6 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 343 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 21 nc_offd = 0 Proc = 0 iter 3 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 ===== Proc = 0 Level = 7 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 174 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 15 nc_offd = 0 Proc = 0 iter 3 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 ===== Proc = 0 Level = 8 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 81 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 ===== Proc = 0 Level = 9 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 37 nc_offd = 0 Proc = 0 iter 2 comm. and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 6 nc_offd = 0 ===== Proc = 0 Level = 10 ===== Proc = 0 Coarsen 1st pass = 0.000000 Proc = 0 Coarsen 2nd pass = 0.000000 Proc = 0 Initialize CLJP phase = 0.000000 Proc = 0 iter 1 comm. 
and subgraph update = 0.000000 Proc = 0 CLJP phase = 0.000000 graph_size = 11 nc_offd = 0 0 KSP preconditioned resid norm 7.299769365830e+14 true resid norm 8.197927963033e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.319459389445e+28 true resid norm 6.152576199945e+12 ||r(i)||/||b|| 7.505038136086e+14 KSP Object: 1 MPI processes type: richardson Richardson: damping factor=1 maximum iterations=90, initial guess is zero tolerances: relative=0.1, absolute=1e-50, divergence=100000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.25 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down SOR/Jacobi HYPRE BoomerAMG: Relax up SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type Falgout HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqbaij rows=22905, cols=22905, bs=5 total: nonzeros=785525, allocated nonzeros=785525 total number of mallocs used during MatSetValues calls =0 block size is 5 Do you guys have any suggestion? Is it possible that I am haven't initialized boomeramg properly? Or it is just my system equations that can not be solved by AMG? Sincerely, Dario On 05/16/2014 11:54 AM, Barry Smith wrote: > On May 16, 2014, at 10:46 AM, Dario Isola wrote: > >> Dear all, >> >> I am investigating the use of hypre+petsc. I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. >> >> I tried to run it with the following options >> -pc_type hypre -pc_type_hypre pilut -ksp_type richardson >> and, although he did not complain, it does not solve the system either. > Do you meaning it did not converge? At first always run with -ksp_view (or -snes_view if using snes or -ts_view if using ts) and -ksp_monitor_true_residual to see what is going on. > >> -pc_type_hypre pilut > is wrong it is -pc_hypre_type pilut > > Note that pilut will generally not work with Richardson you need a ?real? Krylov method like GMRES. > > Also the ilu type preconditioners don?t scale particularly well though occasionally they can be fine. > >> To what extent is hypre supported by petsc? More specifically, what kind of matrices? > If it cannot handle the matrix type it would give an error message. Hypre uses a format like AIJ so you should use AIJ. Note that you can make the matrix type a runtime option so you don?t have to compile in that it is BAIJ. > > >> I am using a baij matrix. >> >> Thanks in advance, >> D -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Fri May 16 12:56:49 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 12:56:49 -0500 Subject: [petsc-users] hypre support In-Reply-To: <537641A4.3050907@newmerical.com> References: <537632D9.2090603@newmerical.com> <537641A4.3050907@newmerical.com> Message-ID: <98D9D6B3-D03E-4254-9634-FA3D229918DD@mcs.anl.gov> Algebraic multigrid is not for everything. On May 16, 2014, at 11:49 AM, Dario Isola wrote: > Thanks a lot for your answers. > > I ran it with > -ksp_type gmres -pc_type hypre -pc_hypre_type euclid > and it worked very well. Thanks. > > I then tried to use boomeramg as a preconditioner coupled with Richardson but I was not successful, it failed to solve the system and returned nans. > -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_relax_type_all SOR/Jacobi -pc_hypre_boomeramg_print_debug -ksp_view -ksp_monitor_true_residual > and i got the following > > ===== Proc = 0 Level = 0 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 18308 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 7 nc_offd = 0 > > ===== Proc = 0 Level = 1 ===== > Proc = 0 Coarsen 1st pass = 0.010000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 8725 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 16 nc_offd = 0 > > ===== Proc = 0 Level = 2 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 4721 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 > > ===== Proc = 0 Level = 3 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 2495 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 22 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 > > ===== Proc = 0 Level = 4 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 1337 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 > > ===== Proc = 0 Level = 5 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 695 nc_offd = 0 > Proc = 0 iter 2 comm. 
and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 3 nc_offd = 0 > > ===== Proc = 0 Level = 6 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 343 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 21 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 > > ===== Proc = 0 Level = 7 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 174 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 15 nc_offd = 0 > Proc = 0 iter 3 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 > > ===== Proc = 0 Level = 8 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 81 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 > > ===== Proc = 0 Level = 9 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 37 nc_offd = 0 > Proc = 0 iter 2 comm. and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 6 nc_offd = 0 > > ===== Proc = 0 Level = 10 ===== > Proc = 0 Coarsen 1st pass = 0.000000 > Proc = 0 Coarsen 2nd pass = 0.000000 > Proc = 0 Initialize CLJP phase = 0.000000 > Proc = 0 iter 1 comm. 
and subgraph update = 0.000000 > Proc = 0 CLJP phase = 0.000000 graph_size = 11 nc_offd = 0 > > > 0 KSP preconditioned resid norm 7.299769365830e+14 true resid norm 8.197927963033e-03 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.319459389445e+28 true resid norm 6.152576199945e+12 ||r(i)||/||b|| 7.505038136086e+14 > KSP Object: 1 MPI processes > type: richardson > Richardson: damping factor=1 > maximum iterations=90, initial guess is zero > tolerances: relative=0.1, absolute=1e-50, divergence=100000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down SOR/Jacobi > HYPRE BoomerAMG: Relax up SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type Falgout > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqbaij > rows=22905, cols=22905, bs=5 > total: nonzeros=785525, allocated nonzeros=785525 > total number of mallocs used during MatSetValues calls =0 > block size is 5 > Do you guys have any suggestion? Is it possible that I am haven't initialized boomeramg properly? Or it is just my system equations that can not be solved by AMG? > > Sincerely, > Dario > > > > > On 05/16/2014 11:54 AM, Barry Smith wrote: >> On May 16, 2014, at 10:46 AM, Dario Isola >> wrote: >> >> >>> Dear all, >>> >>> I am investigating the use of hypre+petsc. I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. >>> >>> I tried to run it with the following options >>> -pc_type hypre -pc_type_hypre pilut -ksp_type richardson >>> and, although he did not complain, it does not solve the system either. >>> >> Do you meaning it did not converge? At first always run with -ksp_view (or -snes_view if using snes or -ts_view if using ts) and -ksp_monitor_true_residual to see what is going on. >> >> >>> -pc_type_hypre pilut >>> >> is wrong it is -pc_hypre_type pilut >> >> Note that pilut will generally not work with Richardson you need a ?real? Krylov method like GMRES. >> >> Also the ilu type preconditioners don?t scale particularly well though occasionally they can be fine. >> >> >>> To what extent is hypre supported by petsc? More specifically, what kind of matrices? >>> >> If it cannot handle the matrix type it would give an error message. Hypre uses a format like AIJ so you should use AIJ. Note that you can make the matrix type a runtime option so you don?t have to compile in that it is BAIJ. >> >> >> >>> I am using a baij matrix. 
>>> >>> Thanks in advance, >>> D >>> > From dario.isola at newmerical.com Fri May 16 13:55:29 2014 From: dario.isola at newmerical.com (Dario Isola) Date: Fri, 16 May 2014 14:55:29 -0400 Subject: [petsc-users] hypre support In-Reply-To: <98D9D6B3-D03E-4254-9634-FA3D229918DD@mcs.anl.gov> References: <537632D9.2090603@newmerical.com> <537641A4.3050907@newmerical.com> <98D9D6B3-D03E-4254-9634-FA3D229918DD@mcs.anl.gov> Message-ID: <53765F21.5080809@newmerical.com> I was eventually able to make it run adopting a very small time-step (Courant number of about 1). So either my problem is not well solved by AMG, as you said, or I am not using it very well. But I guess I should be able to take it from there. Thanks again for the support! Dario On 05/16/2014 01:56 PM, Barry Smith wrote: > Algebraic multigrid is not for everything. > > On May 16, 2014, at 11:49 AM, Dario Isola wrote: > >> Thanks a lot for your answers. >> >> I ran it with >> -ksp_type gmres -pc_type hypre -pc_hypre_type euclid >> and it worked very well. Thanks. >> >> I then tried to use boomeramg as a preconditioner coupled with Richardson but I was not successful, it failed to solve the system and returned nans. >> -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_relax_type_all SOR/Jacobi -pc_hypre_boomeramg_print_debug -ksp_view -ksp_monitor_true_residual >> and i got the following >> >> ===== Proc = 0 Level = 0 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 18308 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 7 nc_offd = 0 >> >> ===== Proc = 0 Level = 1 ===== >> Proc = 0 Coarsen 1st pass = 0.010000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 8725 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 16 nc_offd = 0 >> >> ===== Proc = 0 Level = 2 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 4721 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 >> >> ===== Proc = 0 Level = 3 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 2495 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 22 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 >> >> ===== Proc = 0 Level = 4 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 1337 nc_offd = 0 >> Proc = 0 iter 2 comm. 
and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >> >> ===== Proc = 0 Level = 5 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 695 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 3 nc_offd = 0 >> >> ===== Proc = 0 Level = 6 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 343 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 21 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >> >> ===== Proc = 0 Level = 7 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 174 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 15 nc_offd = 0 >> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >> >> ===== Proc = 0 Level = 8 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 81 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >> >> ===== Proc = 0 Level = 9 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 37 nc_offd = 0 >> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 6 nc_offd = 0 >> >> ===== Proc = 0 Level = 10 ===== >> Proc = 0 Coarsen 1st pass = 0.000000 >> Proc = 0 Coarsen 2nd pass = 0.000000 >> Proc = 0 Initialize CLJP phase = 0.000000 >> Proc = 0 iter 1 comm. 
and subgraph update = 0.000000 >> Proc = 0 CLJP phase = 0.000000 graph_size = 11 nc_offd = 0 >> >> >> 0 KSP preconditioned resid norm 7.299769365830e+14 true resid norm 8.197927963033e-03 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 2.319459389445e+28 true resid norm 6.152576199945e+12 ||r(i)||/||b|| 7.505038136086e+14 >> KSP Object: 1 MPI processes >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=90, initial guess is zero >> tolerances: relative=0.1, absolute=1e-50, divergence=100000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: hypre >> HYPRE BoomerAMG preconditioning >> HYPRE BoomerAMG: Cycle type V >> HYPRE BoomerAMG: Maximum number of levels 25 >> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 >> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 >> HYPRE BoomerAMG: Threshold for strong coupling 0.25 >> HYPRE BoomerAMG: Interpolation truncation factor 0 >> HYPRE BoomerAMG: Interpolation: max elements per row 0 >> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 >> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 >> HYPRE BoomerAMG: Maximum row sums 0.9 >> HYPRE BoomerAMG: Sweeps down 1 >> HYPRE BoomerAMG: Sweeps up 1 >> HYPRE BoomerAMG: Sweeps on coarse 1 >> HYPRE BoomerAMG: Relax down SOR/Jacobi >> HYPRE BoomerAMG: Relax up SOR/Jacobi >> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination >> HYPRE BoomerAMG: Relax weight (all) 1 >> HYPRE BoomerAMG: Outer relax weight (all) 1 >> HYPRE BoomerAMG: Using CF-relaxation >> HYPRE BoomerAMG: Measure type local >> HYPRE BoomerAMG: Coarsen type Falgout >> HYPRE BoomerAMG: Interpolation type classical >> linear system matrix = precond matrix: >> Matrix Object: 1 MPI processes >> type: seqbaij >> rows=22905, cols=22905, bs=5 >> total: nonzeros=785525, allocated nonzeros=785525 >> total number of mallocs used during MatSetValues calls =0 >> block size is 5 >> Do you guys have any suggestion? Is it possible that I am haven't initialized boomeramg properly? Or it is just my system equations that can not be solved by AMG? >> >> Sincerely, >> Dario >> >> >> >> >> On 05/16/2014 11:54 AM, Barry Smith wrote: >>> On May 16, 2014, at 10:46 AM, Dario Isola >>> wrote: >>> >>> >>>> Dear all, >>>> >>>> I am investigating the use of hypre+petsc. I was able to successfully configure, install, compile petsc 3.3 with the external package for hypre. >>>> >>>> I tried to run it with the following options >>>> -pc_type hypre -pc_type_hypre pilut -ksp_type richardson >>>> and, although he did not complain, it does not solve the system either. >>>> >>> Do you meaning it did not converge? At first always run with -ksp_view (or -snes_view if using snes or -ts_view if using ts) and -ksp_monitor_true_residual to see what is going on. >>> >>> >>>> -pc_type_hypre pilut >>>> >>> is wrong it is -pc_hypre_type pilut >>> >>> Note that pilut will generally not work with Richardson you need a ?real? Krylov method like GMRES. >>> >>> Also the ilu type preconditioners don?t scale particularly well though occasionally they can be fine. >>> >>> >>>> To what extent is hypre supported by petsc? More specifically, what kind of matrices? >>>> >>> If it cannot handle the matrix type it would give an error message. Hypre uses a format like AIJ so you should use AIJ. Note that you can make the matrix type a runtime option so you don?t have to compile in that it is BAIJ. >>> >>> >>> >>>> I am using a baij matrix. 
>>>> >>>> Thanks in advance, >>>> D >>>> From knepley at gmail.com Fri May 16 14:13:13 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 16 May 2014 14:13:13 -0500 Subject: [petsc-users] hypre support In-Reply-To: <53765F21.5080809@newmerical.com> References: <537632D9.2090603@newmerical.com> <537641A4.3050907@newmerical.com> <98D9D6B3-D03E-4254-9634-FA3D229918DD@mcs.anl.gov> <53765F21.5080809@newmerical.com> Message-ID: On Fri, May 16, 2014 at 1:55 PM, Dario Isola wrote: > I was eventually able to make it run adopting a very small time-step > (Courant number of about 1). > AMG is intended for elliptic systems. Matt > So either my problem is not well solved by AMG, as you said, or I am not > using it very well. > > But I guess I should be able to take it from there. > > Thanks again for the support! > > Dario > > > On 05/16/2014 01:56 PM, Barry Smith wrote: > >> Algebraic multigrid is not for everything. >> >> On May 16, 2014, at 11:49 AM, Dario Isola >> wrote: >> >> Thanks a lot for your answers. >>> >>> I ran it with >>> -ksp_type gmres -pc_type hypre -pc_hypre_type euclid >>> and it worked very well. Thanks. >>> >>> I then tried to use boomeramg as a preconditioner coupled with >>> Richardson but I was not successful, it failed to solve the system and >>> returned nans. >>> -ksp_type richardson -pc_type hypre -pc_hypre_type boomeramg >>> -pc_hypre_boomeramg_relax_type_all SOR/Jacobi -pc_hypre_boomeramg_print_debug >>> -ksp_view -ksp_monitor_true_residual >>> and i got the following >>> >>> ===== Proc = 0 Level = 0 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 18308 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 7 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 1 ===== >>> Proc = 0 Coarsen 1st pass = 0.010000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 8725 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 16 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 2 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 4721 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >>> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 3 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 2495 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 22 nc_offd = 0 >>> Proc = 0 iter 3 comm. 
and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 4 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 4 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 1337 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >>> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 5 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 695 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 3 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 6 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 343 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 21 nc_offd = 0 >>> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 7 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 174 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 15 nc_offd = 0 >>> Proc = 0 iter 3 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 2 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 8 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 81 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 13 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 9 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 37 nc_offd = 0 >>> Proc = 0 iter 2 comm. and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 6 nc_offd = 0 >>> >>> ===== Proc = 0 Level = 10 ===== >>> Proc = 0 Coarsen 1st pass = 0.000000 >>> Proc = 0 Coarsen 2nd pass = 0.000000 >>> Proc = 0 Initialize CLJP phase = 0.000000 >>> Proc = 0 iter 1 comm. 
and subgraph update = 0.000000 >>> Proc = 0 CLJP phase = 0.000000 graph_size = 11 nc_offd = 0 >>> >>> >>> 0 KSP preconditioned resid norm 7.299769365830e+14 true resid norm >>> 8.197927963033e-03 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP preconditioned resid norm 2.319459389445e+28 true resid norm >>> 6.152576199945e+12 ||r(i)||/||b|| 7.505038136086e+14 >>> KSP Object: 1 MPI processes >>> type: richardson >>> Richardson: damping factor=1 >>> maximum iterations=90, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-50, divergence=100000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: hypre >>> HYPRE BoomerAMG preconditioning >>> HYPRE BoomerAMG: Cycle type V >>> HYPRE BoomerAMG: Maximum number of levels 25 >>> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 >>> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 >>> HYPRE BoomerAMG: Threshold for strong coupling 0.25 >>> HYPRE BoomerAMG: Interpolation truncation factor 0 >>> HYPRE BoomerAMG: Interpolation: max elements per row 0 >>> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 >>> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 >>> HYPRE BoomerAMG: Maximum row sums 0.9 >>> HYPRE BoomerAMG: Sweeps down 1 >>> HYPRE BoomerAMG: Sweeps up 1 >>> HYPRE BoomerAMG: Sweeps on coarse 1 >>> HYPRE BoomerAMG: Relax down SOR/Jacobi >>> HYPRE BoomerAMG: Relax up SOR/Jacobi >>> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination >>> HYPRE BoomerAMG: Relax weight (all) 1 >>> HYPRE BoomerAMG: Outer relax weight (all) 1 >>> HYPRE BoomerAMG: Using CF-relaxation >>> HYPRE BoomerAMG: Measure type local >>> HYPRE BoomerAMG: Coarsen type Falgout >>> HYPRE BoomerAMG: Interpolation type classical >>> linear system matrix = precond matrix: >>> Matrix Object: 1 MPI processes >>> type: seqbaij >>> rows=22905, cols=22905, bs=5 >>> total: nonzeros=785525, allocated nonzeros=785525 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 5 >>> Do you guys have any suggestion? Is it possible that I am haven't >>> initialized boomeramg properly? Or it is just my system equations that can >>> not be solved by AMG? >>> >>> Sincerely, >>> Dario >>> >>> >>> >>> >>> On 05/16/2014 11:54 AM, Barry Smith wrote: >>> >>>> On May 16, 2014, at 10:46 AM, Dario Isola >>>> wrote: >>>> >>>> >>>> Dear all, >>>>> >>>>> I am investigating the use of hypre+petsc. I was able to successfully >>>>> configure, install, compile petsc 3.3 with the external package for hypre. >>>>> >>>>> I tried to run it with the following options >>>>> -pc_type hypre -pc_type_hypre pilut -ksp_type richardson >>>>> and, although he did not complain, it does not solve the system either. >>>>> >>>>> Do you meaning it did not converge? At first always run with >>>> -ksp_view (or -snes_view if using snes or -ts_view if using ts) and >>>> -ksp_monitor_true_residual to see what is going on. >>>> >>>> >>>> -pc_type_hypre pilut >>>>> >>>>> is wrong it is -pc_hypre_type pilut >>>> >>>> Note that pilut will generally not work with Richardson you need a >>>> ?real? Krylov method like GMRES. >>>> >>>> Also the ilu type preconditioners don?t scale particularly well though >>>> occasionally they can be fine. >>>> >>>> >>>> To what extent is hypre supported by petsc? More specifically, what >>>>> kind of matrices? >>>>> >>>>> If it cannot handle the matrix type it would give an error >>>> message. Hypre uses a format like AIJ so you should use AIJ. 
Note that you >>>> can make the matrix type a runtime option so you don?t have to compile in >>>> that it is BAIJ. >>>> >>>> >>>> >>>> I am using a baij matrix. >>>>> >>>>> Thanks in advance, >>>>> D >>>>> >>>>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Fri May 16 18:21:23 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Sat, 17 May 2014 09:21:23 +1000 Subject: [petsc-users] SlepcInitialize not return Message-ID: Hi, I write a piece of code containing SLEPc calling and compile it into a .so library. Then I call this library in my program. I call MPI_Init in the main program, but it turns out that function SlepcInitialize in SLEPc .so library will not return, and the program stuck at that line. I don't know if this problem related to the place of MPI_Init calling, but the library works well if I only use one process and not initialize MPI outside. Guoxi -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 16 18:28:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 18:28:52 -0500 Subject: [petsc-users] SlepcInitialize not return In-Reply-To: References: Message-ID: <249234C0-A064-42A4-8E96-7AC5259C92C3@mcs.anl.gov> Reproduce this in a small about of code (sounds easy in this case) and then email the code out; your verbal descriptions is not detailed enough to determine the problem. Also send a make file that links the .so library, it could be a problem related to that. Barry On May 16, 2014, at 6:21 PM, ??? wrote: > Hi, > > I write a piece of code containing SLEPc calling and compile it into a .so library. Then I call this library in my program. I call MPI_Init in the main program, but it turns out that function SlepcInitialize in SLEPc .so library will not return, and the program stuck at that line. I don't know if this problem related to the place of MPI_Init calling, but the library works well if I only use one process and not initialize MPI outside. > > Guoxi From bsmith at mcs.anl.gov Fri May 16 19:54:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 16 May 2014 19:54:52 -0500 Subject: [petsc-users] SlepcInitialize not return In-Reply-To: References: <249234C0-A064-42A4-8E96-7AC5259C92C3@mcs.anl.gov> Message-ID: <2303C406-C3A0-4FD8-819F-4E8CF9A5AC50@mcs.anl.gov> Sounds like some issue with how you are using the shared libraries. If you don?t show us the offending code it is going to be awfully difficult for us to debug it. Barry On May 16, 2014, at 6:39 PM, ??? wrote: > execute program A calls .so library B, and .so libraryB calls SLEPclibrary. > I used to put MPI_Init() only in A. But this time I try to put it in A,B( and C). > Now it returns to this to me. > I guess maybe this is why it does not return. However I don't know how to fix it. > > INTERNAL ERROR: Invalid error class (59) encountered while returning from > MPI_Init. Please file a bug report. > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > INTERNAL ERROR: Invalid error class (59) encountered while returning from > MPI_Init. Please file a bug report. > Fatal error in MPI_Init: Unknown error. 
Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > [cli_1]: aborting job: > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > INTERNAL ERROR: Invalid error class (59) encountered while returning from > MPI_Init. Please file a bug report. > INTERNAL ERROR: Invalid error class (59) encountered while returning from > MPI_Init. Please file a bug report. > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > [cli_3]: aborting job: > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > [cli_0]: aborting job: > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > [cli_2]: aborting job: > Fatal error in MPI_Init: Unknown error. Please file a bug report., error stack: > (unknown)(): unable to bind socket to port > > > > 2014-05-17 9:28 GMT+10:00 Barry Smith : > > Reproduce this in a small about of code (sounds easy in this case) and then email the code out; your verbal descriptions is not detailed enough to determine the problem. Also send a make file that links the .so library, it could be a problem related to that. > > Barry > > On May 16, 2014, at 6:21 PM, ??? wrote: > > > Hi, > > > > I write a piece of code containing SLEPc calling and compile it into a .so library. Then I call this library in my program. I call MPI_Init in the main program, but it turns out that function SlepcInitialize in SLEPc .so library will not return, and the program stuck at that line. I don't know if this problem related to the place of MPI_Init calling, but the library works well if I only use one process and not initialize MPI outside. > > > > Guoxi > > From jed at jedbrown.org Sat May 17 02:26:03 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 17 May 2014 01:26:03 -0600 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <53753C7B.8010201@uci.edu> References: <53753C7B.8010201@uci.edu> Message-ID: <87y4y0uar8.fsf@jedbrown.org> Michele Rosso writes: > Hi, > > I am solving an inhomogeneous Laplacian in 3D (basically a slightly > modified version of example ex34). > The laplacian is discretized by using a cell-center finite difference > 7-point stencil with periodic BCs. > I am solving a time-dependent problem so the solution of the laplacian > is repeated at each time step with a different matrix (always SPD > though) and rhs. Also, the laplacian features large magnitude variations > in the coefficients. I solve by means of CG + GAMG as preconditioner. > Everything works fine for a while until I receive a > DIVERGED_INDEFINITE_PC message. What is changing as you time step? Is there a nonlinearity that activates suddenly? Especially a bifurcation or perhaps a source term that is incompatible with the boundary conditions? You could try -mg_levels_ksp_type richardson -mg_levels_pc_type sor. Can you reproduce with a small problem? The configuration looks okay to me. > Before checking my model is incorrect I would like to rule out the > possibility of improper use of the linear solver. I attached the full > output of a serial run with -log-summary -ksp_view > -ksp_converged_reason ksp_monitor_true_residual. 
I would appreciate if > you could help me in locating the issue. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From zonexo at gmail.com Sun May 18 20:18:04 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 19 May 2014 09:18:04 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <534C9DB5.9070407@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> Message-ID: <53795BCC.8020500@gmail.com> Hi Barry, I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? Thank you Yours sincerely, TAY wee-beng On 14/5/2014 12:03 AM, Barry Smith wrote: > Please send you current code. So we may compile and run it. > > Barry > > > > On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: > >> Hi, >> >> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >> >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 21/4/2014 8:58 AM, Barry Smith wrote: >>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >>> >>> Barry >>> >>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>> >>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>>>>> Hmm, >>>>>>>>> >>>>>>>>> Interface DMDAVecGetArrayF90 >>>>>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>>>>> USE_DM_HIDE >>>>>>>>> DM_HIDE da1 >>>>>>>>> VEC_HIDE v >>>>>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>>>>> PetscErrorCode ierr >>>>>>>>> End Subroutine >>>>>>>>> >>>>>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? 
Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>>>>>> >>>>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>>>>>> >>>>>>>>> Also, supposed I call: >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> u_array .... >>>>>>>>> >>>>>>>>> v_array .... etc >>>>>>>>> >>>>>>>>> Now to restore the array, does it matter the sequence they are restored? >>>>>>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>>>>>> >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>>>>> >>>>>>>>> u_array = 0.d0 >>>>>>>>> >>>>>>>>> v_array = 0.d0 >>>>>>>>> >>>>>>>>> w_array = 0.d0 >>>>>>>>> >>>>>>>>> p_array = 0.d0 >>>>>>>>> >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>>>>> >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>>>>>> >>>>>>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>>>> Hi Matt, >>>>>>>> >>>>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>>>> >>>>>>>> It already has DMDAVecGetArray(). Just run it. >>>>>>> Hi, >>>>>>> >>>>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>>>> >>>>>>> No the global/local difference should not matter. >>>>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>>>> >>>>>>> DMGetLocalVector() >>>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>>>> >>>>>> If so, when should I call them? >>>>>> >>>>>> You just need a local vector from somewhere. >>>> Hi, >>>> >>>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>>> >>>> Thanks. 
>>>>> Hi, >>>>> >>>>> I insert part of my error region code into ex11f90: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> u_array = 0.d0 >>>>> v_array = 0.d0 >>>>> w_array = 0.d0 >>>>> p_array = 0.d0 >>>>> >>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>>> >>>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>>> >>>>> module solve >>>>> <- add include file? >>>>> subroutine RRK >>>>> <- add include file? >>>>> end subroutine RRK >>>>> >>>>> end module solve >>>>> >>>>> So where should the include files (#include ) be placed? >>>>> >>>>> After the module or inside the subroutine? >>>>> >>>>> Thanks. >>>>>> Matt >>>>>> Thanks. >>>>>>> Matt >>>>>>> Thanks. >>>>>>>> Matt >>>>>>>> Thanks >>>>>>>> >>>>>>>> Regards. >>>>>>>>> Matt >>>>>>>>> As in w, then v and u? >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> thanks >>>>>>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>>>>> >>>>>>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>>>>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>>>>>> >>>>>>>>> >>>>>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> Thanks. >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>>>>>> >>>>>>>>> However, by re-writing my code, I found out a few things: >>>>>>>>> >>>>>>>>> 1. 
if I write my code this way: >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> u_array = .... >>>>>>>>> >>>>>>>>> v_array = .... >>>>>>>>> >>>>>>>>> w_array = .... >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> The code runs fine. >>>>>>>>> >>>>>>>>> 2. if I write my code this way: >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>>>>> >>>>>>>>> where the subroutine is: >>>>>>>>> >>>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>>> >>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>>> >>>>>>>>> u ... >>>>>>>>> v... >>>>>>>>> w ... >>>>>>>>> >>>>>>>>> end subroutine uvw_array_change. >>>>>>>>> >>>>>>>>> The above will give an error at : >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> >>>>>>>>> So they are now in reversed order. Now it works. >>>>>>>>> >>>>>>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>>>>>> >>>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>>> >>>>>>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>>> >>>>>>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>>> >>>>>>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>>>>>> >>>>>>>>> u ... >>>>>>>>> v... >>>>>>>>> w ... >>>>>>>>> >>>>>>>>> end subroutine uvw_array_change. >>>>>>>>> >>>>>>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>>>>>>> >>>>>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>>>>>> >>>>>>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>>>>>> >>>>>>>>> Thank you. 
>>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>>>> >>>>>>>>> >>>>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>>>>>> >>>>>>>>> Hi Barry, >>>>>>>>> >>>>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>>>>>> >>>>>>>>> I have attached my code. >>>>>>>>> >>>>>>>>> Thank you >>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>>>>> Please send the code that creates da_w and the declarations of w_array >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Barry, >>>>>>>>> >>>>>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>>>>> >>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>>> >>>>>>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>>>>>> >>>>>>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>>>>>> >>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>>> -------------------------------------------------------------------------- >>>>>>>>> An MPI process has executed an operation involving a call to the >>>>>>>>> "fork()" system call to create a child process. Open MPI is currently >>>>>>>>> operating in a condition that could result in memory corruption or >>>>>>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>>>>>> data corruption. The use of fork() (or system() or other calls that >>>>>>>>> create child processes) is strongly discouraged. >>>>>>>>> >>>>>>>>> The process that invoked fork was: >>>>>>>>> >>>>>>>>> Local host: n12-76 (PID 20235) >>>>>>>>> MPI_COMM_WORLD rank: 2 >>>>>>>>> >>>>>>>>> If you are *absolutely sure* that your application will successfully >>>>>>>>> and correctly survive a call to fork(), you may disable this warning >>>>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>>>>>> -------------------------------------------------------------------------- >>>>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>>>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>>>>>> >>>>>>>>> .... 
>>>>>>>>> >>>>>>>>> 1 >>>>>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>>>> [1]PETSC ERROR: or see >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>>>> [1]PETSC ERROR: to get more information on the crash. >>>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>>>> [3]PETSC ERROR: or see >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>>>>>> >>>>>>>>> ... >>>>>>>>> Thank you. >>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>>>>> >>>>>>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>>>>>> >>>>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. 
>>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>>>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> -- >>>>>>>>> Thank you. >>>>>>>>> >>>>>>>>> Yours sincerely, >>>>>>>>> >>>>>>>>> TAY wee-beng >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener From knepley at gmail.com Sun May 18 20:53:19 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 May 2014 20:53:19 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <53795BCC.8020500@gmail.com> References: <534C9A2C.5060404@gmail.com> <534C9DB5.9070407@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> Message-ID: On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: > Hi Barry, > > I am trying to sort out the details so that it's easier to pinpoint the > error. However, I tried on gnu gfortran and it worked well. On intel ifort, > it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that > it's a bug in ifort? Do you work with both intel and gnu? > Yes it works with Intel. Is this using optimization? Matt > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 14/5/2014 12:03 AM, Barry Smith wrote: > >> Please send you current code. So we may compile and run it. >> >> Barry >> >> >> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >> >> Hi, >>> >>> I have sent the entire code a while ago. Is there any answer? 
I was also >>> trying myself but it worked for some intel compiler, and some not. I'm >>> still not able to find the answer. gnu compilers for most cluster are old >>> versions so they are not able to compile since I have allocatable >>> structures. >>> >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>> >>>> Please send the entire code. If we can run it and reproduce the >>>> problem we can likely track down the issue much faster than through endless >>>> rounds of email. >>>> >>>> Barry >>>> >>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>> >>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>>> >>>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>>> >>>>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng >>>>>>> wrote: >>>>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>>>> >>>>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng >>>>>>>> wrote: >>>>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>>>>> >>>>>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng >>>>>>>>> wrote: >>>>>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>>>>>> >>>>>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>>>>>>> Hmm, >>>>>>>>>> >>>>>>>>>> Interface DMDAVecGetArrayF90 >>>>>>>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>>>>>>> USE_DM_HIDE >>>>>>>>>> DM_HIDE da1 >>>>>>>>>> VEC_HIDE v >>>>>>>>>> PetscScalar,pointer :: d1(:,:,:) >>>>>>>>>> PetscErrorCode ierr >>>>>>>>>> End Subroutine >>>>>>>>>> >>>>>>>>>> So the d1 is a F90 POINTER. But your subroutine seems to be >>>>>>>>>> treating it as a ?plain old Fortran array?? >>>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> So d1 is a pointer, and it's different if I declare it as "plain >>>>>>>>>> old Fortran array"? Because I declare it as a Fortran array and it works >>>>>>>>>> w/o any problem if I only call DMDAVecGetArrayF90 and >>>>>>>>>> DMDAVecRestoreArrayF90 with "u". >>>>>>>>>> >>>>>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with >>>>>>>>>> "u", "v" and "w", error starts to happen. I wonder why... >>>>>>>>>> >>>>>>>>>> Also, supposed I call: >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> u_array .... >>>>>>>>>> >>>>>>>>>> v_array .... etc >>>>>>>>>> >>>>>>>>>> Now to restore the array, does it matter the sequence they are >>>>>>>>>> restored? >>>>>>>>>> No it should not matter. If it matters that is a sign that >>>>>>>>>> memory has been written to incorrectly earlier in the code. >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Hmm, I have been getting different results on different intel >>>>>>>>>> compilers. I'm not sure if MPI played a part but I'm only using a single >>>>>>>>>> processor. In the debug mode, things run without problem. 
In optimized >>>>>>>>>> mode, in some cases, the code aborts even doing simple initialization: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>>>>>> >>>>>>>>>> u_array = 0.d0 >>>>>>>>>> >>>>>>>>>> v_array = 0.d0 >>>>>>>>>> >>>>>>>>>> w_array = 0.d0 >>>>>>>>>> >>>>>>>>>> p_array = 0.d0 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), >>>>>>>>>> giving segmentation error. But other >>>>>>>>>> version of intel compiler passes thru this part w/o error. >>>>>>>>>> Since the response is different among different compilers, is this PETSc or >>>>>>>>>> intel 's bug? Or mvapich or openmpi? >>>>>>>>>> >>>>>>>>>> We do this is a bunch of examples. Can you reproduce this >>>>>>>>>> different behavior in src/dm/examples/tutorials/ex11f90.F? >>>>>>>>>> >>>>>>>>> Hi Matt, >>>>>>>>> >>>>>>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>>>>>> >>>>>>>>> It already has DMDAVecGetArray(). Just run it. >>>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> It worked. The differences between mine and the code is the way the >>>>>>>> fortran modules are defined, and the ex11f90 only uses global vectors. Does >>>>>>>> it make a difference whether global or local vectors are used? Because the >>>>>>>> way it accesses x1 only touches the local region. >>>>>>>> >>>>>>>> No the global/local difference should not matter. >>>>>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be >>>>>>>> used 1st, is that so? I can't find the equivalent for local vector though. >>>>>>>> >>>>>>>> DMGetLocalVector() >>>>>>>> >>>>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my >>>>>>> code. Does it matter? >>>>>>> >>>>>>> If so, when should I call them? >>>>>>> >>>>>>> You just need a local vector from somewhere. >>>>>>> >>>>>> Hi, >>>>> >>>>> Anyone can help with the questions below? Still trying to find why my >>>>> code doesn't work. >>>>> >>>>> Thanks. >>>>> >>>>>> Hi, >>>>>> >>>>>> I insert part of my error region code into ex11f90: >>>>>> >>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>>> >>>>>> u_array = 0.d0 >>>>>> v_array = 0.d0 >>>>>> w_array = 0.d0 >>>>>> p_array = 0.d0 >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>> >>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>> >>>>>> It worked w/o error. I'm going to change the way the modules are >>>>>> defined in my code. >>>>>> >>>>>> My code contains a main program and a number of modules files, with >>>>>> subroutines inside e.g. >>>>>> >>>>>> module solve >>>>>> <- add include file? >>>>>> subroutine RRK >>>>>> <- add include file? 
>>>>>> end subroutine RRK >>>>>> >>>>>> end module solve >>>>>> >>>>>> So where should the include files (#include ) >>>>>> be placed? >>>>>> >>>>>> After the module or inside the subroutine? >>>>>> >>>>>> Thanks. >>>>>> >>>>>>> Matt >>>>>>> Thanks. >>>>>>> >>>>>>>> Matt >>>>>>>> Thanks. >>>>>>>> >>>>>>>>> Matt >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> Regards. >>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> As in w, then v and u? >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> thanks >>>>>>>>>> Note also that the beginning and end indices of the u,v,w, >>>>>>>>>> are different for each process see for example >>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/ >>>>>>>>>> tutorials/ex11f90.F (and they do not start at 1). This is how >>>>>>>>>> to get the loop bounds. >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> In my case, I fixed the u,v,w such that their indices are the >>>>>>>>>> same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the >>>>>>>>>> problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>>>>>>> >>>>>>>>>> If I declare them as pointers, their indices follow the C 0 start >>>>>>>>>> convention, is that so? >>>>>>>>>> Not really. It is that in each process you need to access >>>>>>>>>> them from the indices indicated by DMDAGetCorners() for global vectors and >>>>>>>>>> DMDAGetGhostCorners() for local vectors. So really C or Fortran >>>>>>>>>> doesn?t make any difference. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow >>>>>>>>>> the Fortran 1 start convention. Is there some way to manipulate such that I >>>>>>>>>> do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>>>>>>>> If you code wishes to access them with indices plus one from >>>>>>>>>> the values returned by DMDAGetCorners() for global vectors and >>>>>>>>>> DMDAGetGhostCorners() for local vectors then you need to manually subtract >>>>>>>>>> off the 1. >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> Thanks. >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I tried to pinpoint the problem. I reduced my job size and hence >>>>>>>>>> I can run on 1 processor. Tried using valgrind but perhaps I'm using the >>>>>>>>>> optimized version, it didn't catch the error, besides saying "Segmentation >>>>>>>>>> fault (core dumped)" >>>>>>>>>> >>>>>>>>>> However, by re-writing my code, I found out a few things: >>>>>>>>>> >>>>>>>>>> 1. if I write my code this way: >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> u_array = .... >>>>>>>>>> >>>>>>>>>> v_array = .... >>>>>>>>>> >>>>>>>>>> w_array = .... >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> The code runs fine. >>>>>>>>>> >>>>>>>>>> 2. 
if I write my code this way: >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call uvw_array_change(u_array,v_array,w_array) -> this >>>>>>>>>> subroutine does the same modification as the above. >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>>>>>>> >>>>>>>>>> where the subroutine is: >>>>>>>>>> >>>>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>>>> >>>>>>>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>>>>>>> >>>>>>>>>> u ... >>>>>>>>>> v... >>>>>>>>>> w ... >>>>>>>>>> >>>>>>>>>> end subroutine uvw_array_change. >>>>>>>>>> >>>>>>>>>> The above will give an error at : >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> 3. Same as above, except I change the order of the last 3 lines >>>>>>>>>> to: >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> >>>>>>>>>> So they are now in reversed order. Now it works. >>>>>>>>>> >>>>>>>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>>>>>>> >>>>>>>>>> subroutine uvw_array_change(u,v,w) >>>>>>>>>> >>>>>>>>>> real(8), intent(inout) :: u(start_indices(1):end_ >>>>>>>>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>>>>>>>> 3):end_indices(3)) >>>>>>>>>> >>>>>>>>>> real(8), intent(inout) :: v(start_indices(1):end_ >>>>>>>>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>>>>>>>> 3):end_indices(3)) >>>>>>>>>> >>>>>>>>>> real(8), intent(inout) :: w(start_indices(1):end_ >>>>>>>>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>>>>>>>> 3):end_indices(3)) >>>>>>>>>> >>>>>>>>>> u ... >>>>>>>>>> v... >>>>>>>>>> w ... >>>>>>>>>> >>>>>>>>>> end subroutine uvw_array_change. >>>>>>>>>> >>>>>>>>>> The start_indices and end_indices are simply to shift the 0 >>>>>>>>>> indices of C convention to that of the 1 indices of the Fortran convention. >>>>>>>>>> This is necessary in my case because most of my codes start array counting >>>>>>>>>> at 1, hence the "trick". >>>>>>>>>> >>>>>>>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 >>>>>>>>>> (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> " >>>>>>>>>> >>>>>>>>>> So did I violate and cause memory corruption due to the trick >>>>>>>>>> above? But I can't think of any way other >>>>>>>>>> than the "trick" to continue using the 1 indices >>>>>>>>>> convention. >>>>>>>>>> >>>>>>>>>> Thank you. >>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>>>>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/ >>>>>>>>>> documentation/faq.html#valgrind >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi Barry, >>>>>>>>>> >>>>>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode >>>>>>>>>> but fails in non-debug mode. >>>>>>>>>> >>>>>>>>>> I have attached my code. 
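One way to act on the valgrind suggestion for a crash that only shows up in the optimized build (the executable name, rank count and log-file name below are placeholders): rebuild PETSc and the application with debugging enabled, then run the MPI job under valgrind so each rank writes its own log, e.g.

      mpirun -n 4 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./a.out

An "invalid read" or "invalid write" reported before the DMDAVecRestoreArrayF90 call is a better lead than the segmentation fault itself, since the fault usually fires long after the memory was first overwritten.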
>>>>>>>>>> >>>>>>>>>> Thank you >>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>>>>>>> Please send the code that creates da_w and the declarations >>>>>>>>>> of w_array >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Barry, >>>>>>>>>> >>>>>>>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>>>>>>> >>>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>>>> >>>>>>>>>> I got the msg below. Before the gdb windows appear (thru x11), >>>>>>>>>> the program aborts. >>>>>>>>>> >>>>>>>>>> Also I tried running in another cluster and it worked. Also tried >>>>>>>>>> in the current cluster in debug mode and it worked too. >>>>>>>>>> >>>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>>>>>>> ------------------------------------------------------------ >>>>>>>>>> -------------- >>>>>>>>>> An MPI process has executed an operation involving a call to the >>>>>>>>>> "fork()" system call to create a child process. Open MPI is >>>>>>>>>> currently >>>>>>>>>> operating in a condition that could result in memory corruption or >>>>>>>>>> other system errors; your MPI job may hang, crash, or produce >>>>>>>>>> silent >>>>>>>>>> data corruption. The use of fork() (or system() or other calls >>>>>>>>>> that >>>>>>>>>> create child processes) is strongly discouraged. >>>>>>>>>> >>>>>>>>>> The process that invoked fork was: >>>>>>>>>> >>>>>>>>>> Local host: n12-76 (PID 20235) >>>>>>>>>> MPI_COMM_WORLD rank: 2 >>>>>>>>>> >>>>>>>>>> If you are *absolutely sure* that your application will >>>>>>>>>> successfully >>>>>>>>>> and correctly survive a call to fork(), you may disable this >>>>>>>>>> warning >>>>>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>>>>>>> ------------------------------------------------------------ >>>>>>>>>> -------------- >>>>>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on >>>>>>>>>> display localhost:50.0 on machine n12-76 >>>>>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on >>>>>>>>>> display localhost:50.0 on machine n12-76 >>>>>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on >>>>>>>>>> display localhost:50.0 on machine n12-76 >>>>>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on >>>>>>>>>> display localhost:50.0 on machine n12-76 >>>>>>>>>> [n12-76:20232] 3 more processes have sent help message >>>>>>>>>> help-mpi-runtime.txt / mpi_init:warn-fork >>>>>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 >>>>>>>>>> to see all help / error messages >>>>>>>>>> >>>>>>>>>> .... >>>>>>>>>> >>>>>>>>>> 1 >>>>>>>>>> [1]PETSC ERROR: ------------------------------ >>>>>>>>>> ------------------------------------------ >>>>>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>>> Violation, probably memory access out of range >>>>>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>>> -on_error_attach_debugger >>>>>>>>>> [1]PETSC ERROR: or see >>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html# >>>>>>>>>> valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption >>>>>>>>>> errors >>>>>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>>>>>>>> link, and run >>>>>>>>>> [1]PETSC ERROR: to get more information on the crash. 
>>>>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown >>>>>>>>>> directory unknown file (null) >>>>>>>>>> [3]PETSC ERROR: ------------------------------ >>>>>>>>>> ------------------------------------------ >>>>>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>>> Violation, probably memory access out of range >>>>>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>>> -on_error_attach_debugger >>>>>>>>>> [3]PETSC ERROR: or see >>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html# >>>>>>>>>> valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>>>>>>> on GNU/linux and Apple Mac OS X to find memory corruption >>>>>>>>>> errors >>>>>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>>>>>>>> link, and run >>>>>>>>>> [3]PETSC ERROR: to get more information on the crash. >>>>>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown >>>>>>>>>> directory unknown file (null) >>>>>>>>>> >>>>>>>>>> ... >>>>>>>>>> Thank you. >>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>>>>>>> >>>>>>>>>> Because IO doesn?t always get flushed immediately it may not >>>>>>>>>> be hanging at this point. It is better to use the option >>>>>>>>>> -start_in_debugger then type cont in each debugger window and then when you >>>>>>>>>> think it is ?hanging? do a control C in each debugger window and type where >>>>>>>>>> to see where each process is you can also look around in the debugger at >>>>>>>>>> variables to see why it is ?hanging? at that point. >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> This routines don?t have any parallel communication in them >>>>>>>>>> so are unlikely to hang. >>>>>>>>>> >>>>>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> My code hangs and I added in mpi_barrier and print to catch the >>>>>>>>>> bug. I found that it hangs after printing "7". Is it because I'm doing >>>>>>>>>> something wrong? I need to access the u,v,w array so I use >>>>>>>>>> DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>>>>>>>> >>>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"3" >>>>>>>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"4" >>>>>>>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"5" >>>>>>>>>> call I_IIB_uv_initial_1st_dm(I_ >>>>>>>>>> cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_ >>>>>>>>>> v1,I_cell_w1,u_array,v_array,w_array) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"6" >>>>>>>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>>>>>>> !must be in reverse order >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"7" >>>>>>>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>>>>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) >>>>>>>>>> print *,"8" >>>>>>>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>>> -- >>>>>>>>>> Thank you. 
>>>>>>>>>> >>>>>>>>>> Yours sincerely, >>>>>>>>>> >>>>>>>>>> TAY wee-beng >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Sun May 18 22:28:13 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 19 May 2014 11:28:13 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> Message-ID: <53797A4D.6090602@gmail.com> On 19/5/2014 9:53 AM, Matthew Knepley wrote: > On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng > wrote: > > Hi Barry, > > I am trying to sort out the details so that it's easier to > pinpoint the error. However, I tried on gnu gfortran and it worked > well. On intel ifort, it stopped at one of the > "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in > ifort? Do you work with both intel and gnu? > > > Yes it works with Intel. Is this using optimization? Hi Matt, I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? > > Matt > > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 14/5/2014 12:03 AM, Barry Smith wrote: > > Please send you current code. So we may compile and run it. > > Barry > > > On May 12, 2014, at 9:52 PM, TAY wee-beng > wrote: > > Hi, > > I have sent the entire code a while ago. Is there any > answer? I was also trying myself but it worked for some > intel compiler, and some not. I'm still not able to find > the answer. gnu compilers for most cluster are old > versions so they are not able to compile since I have > allocatable structures. > > Thank you. > > Yours sincerely, > > TAY wee-beng > > On 21/4/2014 8:58 AM, Barry Smith wrote: > > Please send the entire code. 
If we can run it and > reproduce the problem we can likely track down the > issue much faster than through endless rounds of email. > > Barry > > On Apr 20, 2014, at 7:49 PM, TAY wee-beng > > wrote: > > On 20/4/2014 8:39 AM, TAY wee-beng wrote: > > On 20/4/2014 1:02 AM, Matthew Knepley wrote: > > On Sat, Apr 19, 2014 at 10:49 AM, TAY > wee-beng > wrote: > On 19/4/2014 11:39 PM, Matthew Knepley wrote: > > On Sat, Apr 19, 2014 at 10:16 AM, TAY > wee-beng > wrote: > On 19/4/2014 10:55 PM, Matthew Knepley > wrote: > > On Sat, Apr 19, 2014 at 9:14 AM, > TAY wee-beng > wrote: > On 19/4/2014 6:48 PM, Matthew > Knepley wrote: > > On Sat, Apr 19, 2014 at 4:59 > AM, TAY wee-beng > > wrote: > On 19/4/2014 1:17 PM, Barry > Smith wrote: > On Apr 19, 2014, at 12:11 AM, > TAY wee-beng > wrote: > > On 19/4/2014 12:10 PM, Barry > Smith wrote: > On Apr 18, 2014, at 9:57 PM, > TAY wee-beng > wrote: > > On 19/4/2014 3:53 AM, Barry > Smith wrote: > Hmm, > > Interface > DMDAVecGetArrayF90 > Subroutine > DMDAVecGetArrayF903(da1, > v,d1,ierr) > USE_DM_HIDE > DM_HIDE da1 > VEC_HIDE v > > PetscScalar,pointer :: d1(:,:,:) > PetscErrorCode ierr > End Subroutine > > So the d1 is a F90 > POINTER. But your subroutine > seems to be treating it as a > ?plain old Fortran array?? > real(8), intent(inout) :: > u(:,:,:),v(:,:,:),w(:,:,:) > Hi, > > So d1 is a pointer, and it's > different if I declare it as > "plain old Fortran array"? > Because I declare it as a > Fortran array and it works w/o > any problem if I only call > DMDAVecGetArrayF90 and > DMDAVecRestoreArrayF90 with "u". > > But if I call > DMDAVecGetArrayF90 and > DMDAVecRestoreArrayF90 with > "u", "v" and "w", error starts > to happen. I wonder why... > > Also, supposed I call: > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > > u_array .... > > v_array .... etc > > Now to restore the array, does > it matter the sequence they > are restored? > No it should not matter. > If it matters that is a sign > that memory has been written > to incorrectly earlier in the > code. > > Hi, > > Hmm, I have been getting > different results on different > intel compilers. I'm not sure > if MPI played a part but I'm > only using a single processor. > In the debug mode, things run > without problem. In optimized > mode, in some cases, the code > aborts even doing simple > initialization: > > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) > > u_array = 0.d0 > > v_array = 0.d0 > > w_array = 0.d0 > > p_array = 0.d0 > > > call > DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) > > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > The code aborts at call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), > giving segmentation error. But > other version of intel > compiler passes thru this part > w/o error. Since the response > is different among different > compilers, is this PETSc or > intel 's bug? Or mvapich or > openmpi? > > We do this is a bunch of > examples. Can you reproduce > this different behavior in > src/dm/examples/tutorials/ex11f90.F? > > Hi Matt, > > Do you mean putting the above > lines into ex11f90.F and test? 
> > It already has DMDAVecGetArray(). > Just run it. > > Hi, > > It worked. The differences between > mine and the code is the way the > fortran modules are defined, and the > ex11f90 only uses global vectors. Does > it make a difference whether global or > local vectors are used? Because the > way it accesses x1 only touches the > local region. > > No the global/local difference should > not matter. > Also, before using > DMDAVecGetArrayF90, DMGetGlobalVector > must be used 1st, is that so? I can't > find the equivalent for local vector > though. > > DMGetLocalVector() > > Ops, I do not have DMGetLocalVector and > DMRestoreLocalVector in my code. Does it > matter? > > If so, when should I call them? > > You just need a local vector from somewhere. > > Hi, > > Anyone can help with the questions below? Still > trying to find why my code doesn't work. > > Thanks. > > Hi, > > I insert part of my error region code into > ex11f90: > > call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > call > DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) > > u_array = 0.d0 > v_array = 0.d0 > w_array = 0.d0 > p_array = 0.d0 > > call > DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > It worked w/o error. I'm going to change the > way the modules are defined in my code. > > My code contains a main program and a number > of modules files, with subroutines inside e.g. > > module solve > <- add include file? > subroutine RRK > <- add include file? > end subroutine RRK > > end module solve > > So where should the include files (#include > ) be placed? > > After the module or inside the subroutine? > > Thanks. > > Matt > Thanks. > > Matt > Thanks. > > Matt > Thanks > > Regards. > > Matt > As in w, then v and u? > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > thanks > Note also that the > beginning and end indices of > the u,v,w, are different for > each process see for example > http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F > (and they do not start at 1). > This is how to get the loop > bounds. > Hi, > > In my case, I fixed the u,v,w > such that their indices are > the same. I also checked using > DMDAGetCorners and > DMDAGetGhostCorners. Now the > problem lies in my subroutine > treating it as a ?plain old > Fortran array?. > > If I declare them as pointers, > their indices follow the C 0 > start convention, is that so? > Not really. It is that in > each process you need to > access them from the indices > indicated by DMDAGetCorners() > for global vectors and > DMDAGetGhostCorners() for > local vectors. So really C or > Fortran doesn?t make any > difference. > > > So my problem now is that in > my old MPI code, the u(i,j,k) > follow the Fortran 1 start > convention. Is there some way > to manipulate such that I do > not have to change my u(i,j,k) > to u(i-1,j-1,k-1)? > If you code wishes to > access them with indices plus > one from the values returned > by DMDAGetCorners() for global > vectors and > DMDAGetGhostCorners() for > local vectors then you need to > manually subtract off the 1. > > Barry > > Thanks. 
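Put concretely, the advice to use the DMDAGetCorners()/DMDAGetGhostCorners() values directly looks like the sketch below (placeholder names; DMDAGetCorners is used here because a global vector has no ghost points):

      subroutine zero_owned(da_u, u_global, ierr)
#include <finclude/petscsys.h>
#include <finclude/petscvec.h>
#include <finclude/petscdm.h>
#include <finclude/petscdmda.h>
#include <finclude/petscdmda.h90>
      DM da_u
      Vec u_global
      PetscErrorCode ierr
      PetscInt i0, j0, k0, ni, nj, nk, i, j, k
      PetscScalar, pointer :: u_array(:,:,:)

      ! i0,j0,k0 are the first owned indices on this rank; ni,nj,nk the widths
      call DMDAGetCorners(da_u, i0, j0, k0, ni, nj, nk, ierr)
      call DMDAVecGetArrayF90(da_u, u_global, u_array, ierr)
      do k = k0, k0+nk-1
        do j = j0, j0+nj-1
          do i = i0, i0+ni-1
            u_array(i,j,k) = 0.d0
          end do
        end do
      end do
      call DMDAVecRestoreArrayF90(da_u, u_global, u_array, ierr)
      end subroutine zero_owned

Keeping the loops in this numbering removes the need to subtract 1 anywhere; only code that must interoperate with pre-existing 1-based arrays needs the offset, and then it has to be applied consistently on every access.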
> Barry > > On Apr 18, 2014, at 10:58 AM, > TAY wee-beng > wrote: > > Hi, > > I tried to pinpoint the > problem. I reduced my job size > and hence I can run on 1 > processor. Tried using > valgrind but perhaps I'm using > the optimized version, it > didn't catch the error, > besides saying "Segmentation > fault (core dumped)" > > However, by re-writing my > code, I found out a few things: > > 1. if I write my code this way: > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > > u_array = .... > > v_array = .... > > w_array = .... > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > The code runs fine. > > 2. if I write my code this way: > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > > call > uvw_array_change(u_array,v_array,w_array) > -> this subroutine does the > same modification as the above. > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > -> error > > where the subroutine is: > > subroutine uvw_array_change(u,v,w) > > real(8), intent(inout) :: > u(:,:,:),v(:,:,:),w(:,:,:) > > u ... > v... > w ... > > end subroutine uvw_array_change. > > The above will give an error at : > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > 3. Same as above, except I > change the order of the last 3 > lines to: > > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > > So they are now in reversed > order. Now it works. > > 4. Same as 2 or 3, except the > subroutine is changed to : > > subroutine uvw_array_change(u,v,w) > > real(8), intent(inout) :: > u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) > > real(8), intent(inout) :: > v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) > > real(8), intent(inout) :: > w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) > > u ... > v... > w ... > > end subroutine uvw_array_change. > > The start_indices and > end_indices are simply to > shift the 0 indices of C > convention to that of the 1 > indices of the Fortran > convention. This is necessary > in my case because most of my > codes start array counting at > 1, hence the "trick". > > However, now no matter which > order of the > DMDAVecRestoreArrayF90 (as in > 2 or 3), error will occur at > "call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > " > > So did I violate and cause > memory corruption due to the > trick above? But I can't think > of any way other than > the "trick" to continue using > the 1 indices convention. > > Thank you. > > Yours sincerely, > > TAY wee-beng > > On 15/4/2014 8:00 PM, Barry > Smith wrote: > Try running under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > On Apr 14, 2014, at 9:47 PM, > TAY wee-beng > wrote: > > Hi Barry, > > As I mentioned earlier, the > code works fine in PETSc debug > mode but fails in non-debug mode. > > I have attached my code. 
> > Thank you > > Yours sincerely, > > TAY wee-beng > > On 15/4/2014 2:26 AM, Barry > Smith wrote: > Please send the code that > creates da_w and the > declarations of w_array > > Barry > > On Apr 14, 2014, at 9:40 AM, > TAY wee-beng > > > wrote: > > > Hi Barry, > > I'm not too sure how to do it. > I'm running mpi. So I run: > > mpirun -n 4 ./a.out > -start_in_debugger > > I got the msg below. Before > the gdb windows appear (thru > x11), the program aborts. > > Also I tried running in > another cluster and it worked. > Also tried in the current > cluster in debug mode and it > worked too. > > mpirun -n 4 ./a.out > -start_in_debugger > -------------------------------------------------------------------------- > An MPI process has executed an > operation involving a call to the > "fork()" system call to create > a child process. Open MPI is > currently > operating in a condition that > could result in memory > corruption or > other system errors; your MPI > job may hang, crash, or > produce silent > data corruption. The use of > fork() (or system() or other > calls that > create child processes) is > strongly discouraged. > > The process that invoked fork was: > > Local host: > n12-76 (PID 20235) > MPI_COMM_WORLD rank: 2 > > If you are *absolutely sure* > that your application will > successfully > and correctly survive a call > to fork(), you may disable > this warning > by setting the > mpi_warn_on_fork MCA parameter > to 0. > -------------------------------------------------------------------------- > [2]PETSC ERROR: PETSC: > Attaching gdb to ./a.out of > pid 20235 on display > localhost:50.0 on machine n12-76 > [0]PETSC ERROR: PETSC: > Attaching gdb to ./a.out of > pid 20233 on display > localhost:50.0 on machine n12-76 > [1]PETSC ERROR: PETSC: > Attaching gdb to ./a.out of > pid 20234 on display > localhost:50.0 on machine n12-76 > [3]PETSC ERROR: PETSC: > Attaching gdb to ./a.out of > pid 20236 on display > localhost:50.0 on machine n12-76 > [n12-76:20232] 3 more > processes have sent help > message help-mpi-runtime.txt / > mpi_init:warn-fork > [n12-76:20232] Set MCA > parameter > "orte_base_help_aggregate" to > 0 to see all help / error messages > > .... > > 1 > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal > number 11 SEGV: Segmentation > Violation, probably memory > access out of range > [1]PETSC ERROR: Try option > -start_in_debugger or > -on_error_attach_debugger > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC > ERROR: or try http://valgrind.org > on GNU/linux and Apple Mac > OS X to find memory corruption > errors > [1]PETSC ERROR: configure > using --with-debugging=yes, > recompile, link, and run > [1]PETSC ERROR: to get more > information on the crash. 
> [1]PETSC ERROR: User provided > function() line 0 in unknown > directory unknown file (null) > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal > number 11 SEGV: Segmentation > Violation, probably memory > access out of range > [3]PETSC ERROR: Try option > -start_in_debugger or > -on_error_attach_debugger > [3]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC > ERROR: or try http://valgrind.org > on GNU/linux and Apple Mac > OS X to find memory corruption > errors > [3]PETSC ERROR: configure > using --with-debugging=yes, > recompile, link, and run > [3]PETSC ERROR: to get more > information on the crash. > [3]PETSC ERROR: User provided > function() line 0 in unknown > directory unknown file (null) > > ... > Thank you. > > Yours sincerely, > > TAY wee-beng > > On 14/4/2014 9:05 PM, Barry > Smith wrote: > > Because IO doesn?t always > get flushed immediately it may > not be hanging at this point. > It is better to use the > option -start_in_debugger then > type cont in each debugger > window and then when you think > it is ?hanging? do a control C > in each debugger window and > type where to see where each > process is you can also look > around in the debugger at > variables to see why it is > ?hanging? at that point. > > Barry > > This routines don?t have > any parallel communication in > them so are unlikely to hang. > > On Apr 14, 2014, at 6:52 AM, > TAY wee-beng > > > > > wrote: > > > > Hi, > > My code hangs and I added in > mpi_barrier and print to catch > the bug. I found that it hangs > after printing "7". Is it > because I'm doing something > wrong? I need to access the > u,v,w array so I use > DMDAVecGetArrayF90. After > access, I use > DMDAVecRestoreArrayF90. > > call > DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"3" > call > DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"4" > call > DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"5" > call > I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"6" > call > DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) > !must be in reverse order > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"7" > call > DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) > call > MPI_Barrier(MPI_COMM_WORLD,ierr); > if (myid==0) print *,"8" > call > DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) > -- > Thank you. > > Yours sincerely, > > TAY wee-beng > > > > > > > > > -- > What most experimenters take > for granted before they begin > their experiments is > infinitely more interesting > than any results to which > their experiments lead. > -- Norbert Wiener > > > > -- > What most experimenters take for > granted before they begin their > experiments is infinitely more > interesting than any results to > which their experiments lead. > -- Norbert Wiener > > > > -- > What most experimenters take for > granted before they begin their > experiments is infinitely more > interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > > > -- > What most experimenters take for granted > before they begin their experiments is > infinitely more interesting than any > results to which their experiments lead. > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.wales at unimelb.edu.au Sun May 18 22:28:51 2014 From: scott.wales at unimelb.edu.au (Scott Wales) Date: Mon, 19 May 2014 13:28:51 +1000 Subject: [petsc-users] DMPlexDistribute error Message-ID: <53797A73.9070106@unimelb.edu.au> Hi, I'm trying to create a distributed unstructured grid in PETSc, and have encountered the following error: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Wrong type of object: Parameter # 1! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: bin/celllist_square on a arch-linux2-c-debug named raijin5 by saw562 Mon May 19 13:00:31 2014 [0]PETSC ERROR: Libraries linked from /home/562/saw562/opt/petsc/3.4.4/lib [0]PETSC ERROR: Configure run at Fri May 16 14:23:02 2014 [0]PETSC ERROR: Configure options --with-shared-libraries=1 --prefix=/home/562/saw562/opt/petsc/3.4.4 --with-blas-lapack-lib="-L/apps/intel-ct/12.1.9.293/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" --with-mpi-dir=/apps/openmpi/1.6.3 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ISGetIndices() line 372 in src/vec/is/is/interface/index.c [0]PETSC ERROR: DMPlexCreatePartitionClosure() line 2637 ihttps://gist.github.com/ScottWales/2758b5ec96573c63e31an src/dm/impls/plex/plex.c [0]PETSC ERROR: DMPlexDistribute() line 2810 in src/dm/impls/plex/plex.c I've created the DMPlex using `DMPlexCreateFromCellList`, added a default section and then called `DMPlexDistribute` to spread the grid points across all of the processors. You can see my test code at https://gist.github.com/ScottWales/2758b5ec96573c63e31a#file-petsc-test-c-L164. Have I missed a step in the grid setup? 
Thanks, Scott From bsmith at mcs.anl.gov Sun May 18 22:36:27 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 18 May 2014 22:36:27 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <53797A4D.6090602@gmail.com> References: <534C9A2C.5060404@gmail.com> <53514B8A.90901@gmail.com> <495519DB-D3A1-4190-AED2-4ABA885C2835@mcs.anl.gov> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> Message-ID: On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: > On 19/5/2014 9:53 AM, Matthew Knepley wrote: >> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >> Hi Barry, >> >> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >> >> Yes it works with Intel. Is this using optimization? > Hi Matt, > > I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? No. Does it run clean under valgrind? >> >> Matt >> >> >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 14/5/2014 12:03 AM, Barry Smith wrote: >> Please send you current code. So we may compile and run it. >> >> Barry >> >> >> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >> >> Hi, >> >> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >> >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 21/4/2014 8:58 AM, Barry Smith wrote: >> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >> >> Barry >> >> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >> >> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >> On 19/4/2014 1:17 PM, Barry Smith wrote: >> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >> >> On 19/4/2014 12:10 PM, Barry Smith wrote: >> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >> >> On 19/4/2014 3:53 AM, Barry Smith wrote: >> Hmm, >> >> Interface DMDAVecGetArrayF90 >> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >> USE_DM_HIDE >> DM_HIDE da1 >> VEC_HIDE v >> PetscScalar,pointer :: d1(:,:,:) >> PetscErrorCode ierr >> End Subroutine >> >> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? 
>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >> Hi, >> >> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >> >> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >> >> Also, supposed I call: >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> >> u_array .... >> >> v_array .... etc >> >> Now to restore the array, does it matter the sequence they are restored? >> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >> >> Hi, >> >> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >> >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >> >> u_array = 0.d0 >> >> v_array = 0.d0 >> >> w_array = 0.d0 >> >> p_array = 0.d0 >> >> >> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >> >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >> >> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >> Hi Matt, >> >> Do you mean putting the above lines into ex11f90.F and test? >> >> It already has DMDAVecGetArray(). Just run it. >> Hi, >> >> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >> >> No the global/local difference should not matter. >> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >> >> DMGetLocalVector() >> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >> >> If so, when should I call them? >> >> You just need a local vector from somewhere. >> Hi, >> >> Anyone can help with the questions below? Still trying to find why my code doesn't work. >> >> Thanks. 
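For the "local vector from somewhere" remark, the usual borrow/fill/return sequence is sketched below (placeholder names; this assumes the surrounding subroutine already has the standard finclude headers so Vec and INSERT_VALUES are defined):

      Vec u_local

      call DMGetLocalVector(da_u, u_local, ierr)
      ! pull the current global values, including ghost points, into u_local
      call DMGlobalToLocalBegin(da_u, u_global, INSERT_VALUES, u_local, ierr)
      call DMGlobalToLocalEnd(da_u, u_global, INSERT_VALUES, u_local, ierr)
      ! ... DMDAVecGetArrayF90 / DMDAVecRestoreArrayF90 on u_local as above ...
      call DMRestoreLocalVector(da_u, u_local, ierr)

DMRestoreLocalVector hands the vector back to the DM's internal pool, so the same storage is reused on the next DMGetLocalVector call rather than being reallocated.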
>> Hi, >> >> I insert part of my error region code into ex11f90: >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >> >> u_array = 0.d0 >> v_array = 0.d0 >> w_array = 0.d0 >> p_array = 0.d0 >> >> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> It worked w/o error. I'm going to change the way the modules are defined in my code. >> >> My code contains a main program and a number of modules files, with subroutines inside e.g. >> >> module solve >> <- add include file? >> subroutine RRK >> <- add include file? >> end subroutine RRK >> >> end module solve >> >> So where should the include files (#include ) be placed? >> >> After the module or inside the subroutine? >> >> Thanks. >> Matt >> Thanks. >> Matt >> Thanks. >> Matt >> Thanks >> >> Regards. >> Matt >> As in w, then v and u? >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> thanks >> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >> Hi, >> >> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >> >> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >> >> >> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >> >> Barry >> >> Thanks. >> Barry >> >> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >> >> Hi, >> >> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >> >> However, by re-writing my code, I found out a few things: >> >> 1. if I write my code this way: >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> >> u_array = .... >> >> v_array = .... >> >> w_array = .... >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> The code runs fine. >> >> 2. 
if I write my code this way: >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> >> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >> >> where the subroutine is: >> >> subroutine uvw_array_change(u,v,w) >> >> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >> >> u ... >> v... >> w ... >> >> end subroutine uvw_array_change. >> >> The above will give an error at : >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> 3. Same as above, except I change the order of the last 3 lines to: >> >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >> >> So they are now in reversed order. Now it works. >> >> 4. Same as 2 or 3, except the subroutine is changed to : >> >> subroutine uvw_array_change(u,v,w) >> >> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >> >> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >> >> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >> >> u ... >> v... >> w ... >> >> end subroutine uvw_array_change. >> >> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >> >> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >> >> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >> >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 15/4/2014 8:00 PM, Barry Smith wrote: >> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> >> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >> >> Hi Barry, >> >> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >> >> I have attached my code. >> >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 15/4/2014 2:26 AM, Barry Smith wrote: >> Please send the code that creates da_w and the declarations of w_array >> >> Barry >> >> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >> >> wrote: >> >> >> Hi Barry, >> >> I'm not too sure how to do it. I'm running mpi. So I run: >> >> mpirun -n 4 ./a.out -start_in_debugger >> >> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >> >> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >> >> mpirun -n 4 ./a.out -start_in_debugger >> -------------------------------------------------------------------------- >> An MPI process has executed an operation involving a call to the >> "fork()" system call to create a child process. 
Open MPI is currently >> operating in a condition that could result in memory corruption or >> other system errors; your MPI job may hang, crash, or produce silent >> data corruption. The use of fork() (or system() or other calls that >> create child processes) is strongly discouraged. >> >> The process that invoked fork was: >> >> Local host: n12-76 (PID 20235) >> MPI_COMM_WORLD rank: 2 >> >> If you are *absolutely sure* that your application will successfully >> and correctly survive a call to fork(), you may disable this warning >> by setting the mpi_warn_on_fork MCA parameter to 0. >> -------------------------------------------------------------------------- >> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >> >> .... >> >> 1 >> [1]PETSC ERROR: ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >> on GNU/linux and Apple Mac OS X to find memory corruption errors >> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >> [1]PETSC ERROR: to get more information on the crash. >> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >> [3]PETSC ERROR: ------------------------------------------------------------------------ >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [3]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >> on GNU/linux and Apple Mac OS X to find memory corruption errors >> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >> [3]PETSC ERROR: to get more information on the crash. >> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >> >> ... >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 14/4/2014 9:05 PM, Barry Smith wrote: >> >> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >> >> Barry >> >> This routines don?t have any parallel communication in them so are unlikely to hang. >> >> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >> >> >> >> wrote: >> >> >> >> Hi, >> >> My code hangs and I added in mpi_barrier and print to catch the bug. 
I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >> -- >> Thank you. >> >> Yours sincerely, >> >> TAY wee-beng >> >> >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From zonexo at gmail.com Mon May 19 01:26:59 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Mon, 19 May 2014 14:26:59 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> Message-ID: <5379A433.5000401@gmail.com> On 19/5/2014 11:36 AM, Barry Smith wrote: > On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: > >> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>> Hi Barry, >>> >>> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >>> >>> Yes it works with Intel. Is this using optimization? >> Hi Matt, >> >> I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. 
Does it definitely mean that it's a bug in ifort? > No. Does it run clean under valgrind? Hi, Do you mean the debug or optimized version? Thanks. > >>> Matt >>> >>> >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>> Please send you current code. So we may compile and run it. >>> >>> Barry >>> >>> >>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>> >>> Hi, >>> >>> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >>> >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. >>> >>> Barry >>> >>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>> >>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>> >>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>> >>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>> Hmm, >>> >>> Interface DMDAVecGetArrayF90 >>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>> USE_DM_HIDE >>> DM_HIDE da1 >>> VEC_HIDE v >>> PetscScalar,pointer :: d1(:,:,:) >>> PetscErrorCode ierr >>> End Subroutine >>> >>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>> Hi, >>> >>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>> >>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>> >>> Also, supposed I call: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> u_array .... >>> >>> v_array .... etc >>> >>> Now to restore the array, does it matter the sequence they are restored? >>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>> >>> Hi, >>> >>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. 
In optimized mode, in some cases, the code aborts even doing simple initialization: >>> >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>> >>> u_array = 0.d0 >>> >>> v_array = 0.d0 >>> >>> w_array = 0.d0 >>> >>> p_array = 0.d0 >>> >>> >>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>> >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>> >>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>> Hi Matt, >>> >>> Do you mean putting the above lines into ex11f90.F and test? >>> >>> It already has DMDAVecGetArray(). Just run it. >>> Hi, >>> >>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>> >>> No the global/local difference should not matter. >>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>> >>> DMGetLocalVector() >>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>> >>> If so, when should I call them? >>> >>> You just need a local vector from somewhere. >>> Hi, >>> >>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>> >>> Thanks. >>> Hi, >>> >>> I insert part of my error region code into ex11f90: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>> >>> u_array = 0.d0 >>> v_array = 0.d0 >>> w_array = 0.d0 >>> p_array = 0.d0 >>> >>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>> >>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>> >>> module solve >>> <- add include file? >>> subroutine RRK >>> <- add include file? >>> end subroutine RRK >>> >>> end module solve >>> >>> So where should the include files (#include ) be placed? >>> >>> After the module or inside the subroutine? >>> >>> Thanks. >>> Matt >>> Thanks. >>> Matt >>> Thanks. >>> Matt >>> Thanks >>> >>> Regards. >>> Matt >>> As in w, then v and u? 
>>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> thanks >>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>> Hi, >>> >>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>> >>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>> >>> >>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>> >>> Barry >>> >>> Thanks. >>> Barry >>> >>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>> >>> Hi, >>> >>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>> >>> However, by re-writing my code, I found out a few things: >>> >>> 1. if I write my code this way: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> u_array = .... >>> >>> v_array = .... >>> >>> w_array = .... >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> The code runs fine. >>> >>> 2. if I write my code this way: >>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> >>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>> >>> where the subroutine is: >>> >>> subroutine uvw_array_change(u,v,w) >>> >>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>> >>> u ... >>> v... >>> w ... >>> >>> end subroutine uvw_array_change. >>> >>> The above will give an error at : >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> 3. Same as above, except I change the order of the last 3 lines to: >>> >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>> >>> So they are now in reversed order. Now it works. >>> >>> 4. 
Same as 2 or 3, except the subroutine is changed to : >>> >>> subroutine uvw_array_change(u,v,w) >>> >>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>> >>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>> >>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>> >>> u ... >>> v... >>> w ... >>> >>> end subroutine uvw_array_change. >>> >>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>> >>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>> >>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>> >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> >>> >>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>> >>> Hi Barry, >>> >>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>> >>> I have attached my code. >>> >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>> Please send the code that creates da_w and the declarations of w_array >>> >>> Barry >>> >>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>> >>> wrote: >>> >>> >>> Hi Barry, >>> >>> I'm not too sure how to do it. I'm running mpi. So I run: >>> >>> mpirun -n 4 ./a.out -start_in_debugger >>> >>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>> >>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>> >>> mpirun -n 4 ./a.out -start_in_debugger >>> -------------------------------------------------------------------------- >>> An MPI process has executed an operation involving a call to the >>> "fork()" system call to create a child process. Open MPI is currently >>> operating in a condition that could result in memory corruption or >>> other system errors; your MPI job may hang, crash, or produce silent >>> data corruption. The use of fork() (or system() or other calls that >>> create child processes) is strongly discouraged. >>> >>> The process that invoked fork was: >>> >>> Local host: n12-76 (PID 20235) >>> MPI_COMM_WORLD rank: 2 >>> >>> If you are *absolutely sure* that your application will successfully >>> and correctly survive a call to fork(), you may disable this warning >>> by setting the mpi_warn_on_fork MCA parameter to 0. 
>>> -------------------------------------------------------------------------- >>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>> >>> .... >>> >>> 1 >>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [1]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>> [1]PETSC ERROR: to get more information on the crash. >>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [3]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>> [3]PETSC ERROR: to get more information on the crash. >>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>> >>> ... >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>> >>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>> >>> Barry >>> >>> This routines don?t have any parallel communication in them so are unlikely to hang. >>> >>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>> >>> >>> >>> wrote: >>> >>> >>> >>> Hi, >>> >>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. 
>>> >>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>> -- >>> Thank you. >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener From jon.the.wong at gmail.com Mon May 19 03:57:16 2014 From: jon.the.wong at gmail.com (Jonathan Wong) Date: Mon, 19 May 2014 01:57:16 -0700 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block Message-ID: I'm running into an issue for a symmetric (may not be pd) finite element problem where I am using the cg method and getting an indefinite_mat or indefinite_pc error using the jacobi preconditioner. If I change the pc type to bjacobi, it converges nicely. I am only using 1 process, and I assumed they would produce the same result, as I'm using default options. Does anyone have any ideas why this would happen? It also works fine with gmres + jacobi pc. Thanks, Jon -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Mon May 19 04:14:10 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Mon, 19 May 2014 11:14:10 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero Message-ID: On Thu, May 15, 2014 at 2:08 AM, Jed Brown wrote: > Christophe Ortiz writes: > > > Hi all, > > > > I am experiencing some problems of memory corruption with PetscMemzero(). > > > > I set the values of the Jacobian by blocks using MatSetValuesBlocked(). > To > > do so, I use some temporary two-dimensional arrays[dof][dof] that I must > > reset at each loop. 
> > > > Inside FormIJacobian, for instance, I declare the following > two-dimensional > > array: > > > > PetscScalar diag[dof][dof]; > > > > and then, to zero the array diag[][] I do > > > > ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); > > Note that this can also be spelled > > PetscMemzero(diag,sizeof diag); > Ok. > > > Then, inside main(), once dof is determined, I allocate memory for diag > as > > follows: > > > > diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > > > > for (k = 0; k < dof; k++){ > > diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > > } > > That is, the classical way to allocate memory using the pointer notation. > > Note that you can do a contiguous allocation by creating a Vec, then use > VecGetArray2D to get 2D indexing of it. > Good to know. I'll try. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 05:32:09 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 05:32:09 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <5379A433.5000401@gmail.com> References: <534C9A2C.5060404@gmail.com> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> <5379A433.5000401@gmail.com> Message-ID: On Mon, May 19, 2014 at 1:26 AM, TAY wee-beng wrote: > On 19/5/2014 11:36 AM, Barry Smith wrote: > >> On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: >> >> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>> >>>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>>> Hi Barry, >>>> >>>> I am trying to sort out the details so that it's easier to pinpoint the >>>> error. However, I tried on gnu gfortran and it worked well. On intel ifort, >>>> it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that >>>> it's a bug in ifort? Do you work with both intel and gnu? >>>> >>>> Yes it works with Intel. Is this using optimization? >>>> >>> Hi Matt, >>> >>> I forgot to add that in non-optimized cases, it works with gnu and >>> intel. However, in optimized cases, it works with gnu, but not intel. Does >>> it definitely mean that it's a bug in ifort? >>> >> No. Does it run clean under valgrind? >> > Hi, > > Do you mean the debug or optimized version? > optimized. Have you tried using a lower optimization level? Matt > > Thanks. > > >> Matt >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>>> Please send you current code. So we may compile and run it. >>>> >>>> Barry >>>> >>>> >>>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I have sent the entire code a while ago. Is there any answer? I was >>>> also trying myself but it worked for some intel compiler, and some not. I'm >>>> still not able to find the answer. gnu compilers for most cluster are old >>>> versions so they are not able to compile since I have allocatable >>>> structures. >>>> >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>>> Please send the entire code. 
If we can run it and reproduce the >>>> problem we can likely track down the issue much faster than through endless >>>> rounds of email. >>>> >>>> Barry >>>> >>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>> >>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng >>>> wrote: >>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng >>>> wrote: >>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>> >>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>> >>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>> Hmm, >>>> >>>> Interface DMDAVecGetArrayF90 >>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>> USE_DM_HIDE >>>> DM_HIDE da1 >>>> VEC_HIDE v >>>> PetscScalar,pointer :: d1(:,:,:) >>>> PetscErrorCode ierr >>>> End Subroutine >>>> >>>> So the d1 is a F90 POINTER. But your subroutine seems to be >>>> treating it as a ?plain old Fortran array?? >>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>> Hi, >>>> >>>> So d1 is a pointer, and it's different if I declare it as "plain old >>>> Fortran array"? Because I declare it as a Fortran array and it works w/o >>>> any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 >>>> with "u". >>>> >>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", >>>> "v" and "w", error starts to happen. I wonder why... >>>> >>>> Also, supposed I call: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> u_array .... >>>> >>>> v_array .... etc >>>> >>>> Now to restore the array, does it matter the sequence they are restored? >>>> No it should not matter. If it matters that is a sign that memory >>>> has been written to incorrectly earlier in the code. >>>> >>>> Hi, >>>> >>>> Hmm, I have been getting different results on different intel >>>> compilers. I'm not sure if MPI played a part but I'm only using a single >>>> processor. In the debug mode, things run without problem. In optimized >>>> mode, in some cases, the code aborts even doing simple initialization: >>>> >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> >>>> v_array = 0.d0 >>>> >>>> w_array = 0.d0 >>>> >>>> p_array = 0.d0 >>>> >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), >>>> giving segmentation error. But other >>>> version of intel compiler >>>> passes thru this part w/o error. Since the response is different among >>>> different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>> >>>> We do this is a bunch of examples. 
Can you reproduce this different >>>> behavior in src/dm/examples/tutorials/ex11f90.F? >>>> Hi Matt, >>>> >>>> Do you mean putting the above lines into ex11f90.F and test? >>>> >>>> It already has DMDAVecGetArray(). Just run it. >>>> Hi, >>>> >>>> It worked. The differences between mine and the code is the way the >>>> fortran modules are defined, and the ex11f90 only uses global vectors. Does >>>> it make a difference whether global or local vectors are used? Because the >>>> way it accesses x1 only touches the local region. >>>> >>>> No the global/local difference should not matter. >>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be >>>> used 1st, is that so? I can't find the equivalent for local vector though. >>>> >>>> DMGetLocalVector() >>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my >>>> code. Does it matter? >>>> >>>> If so, when should I call them? >>>> >>>> You just need a local vector from somewhere. >>>> Hi, >>>> >>>> Anyone can help with the questions below? Still trying to find why my >>>> code doesn't work. >>>> >>>> Thanks. >>>> Hi, >>>> >>>> I insert part of my error region code into ex11f90: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> v_array = 0.d0 >>>> w_array = 0.d0 >>>> p_array = 0.d0 >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> It worked w/o error. I'm going to change the way the modules are >>>> defined in my code. >>>> >>>> My code contains a main program and a number of modules files, with >>>> subroutines inside e.g. >>>> >>>> module solve >>>> <- add include file? >>>> subroutine RRK >>>> <- add include file? >>>> end subroutine RRK >>>> >>>> end module solve >>>> >>>> So where should the include files (#include ) >>>> be placed? >>>> >>>> After the module or inside the subroutine? >>>> >>>> Thanks. >>>> Matt >>>> Thanks. >>>> Matt >>>> Thanks. >>>> Matt >>>> Thanks >>>> >>>> Regards. >>>> Matt >>>> As in w, then v and u? >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> thanks >>>> Note also that the beginning and end indices of the u,v,w, are >>>> different for each process see for example >>>> http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/ >>>> tutorials/ex11f90.F (and they do not start at 1). This is how to get >>>> the loop bounds. >>>> Hi, >>>> >>>> In my case, I fixed the u,v,w such that their indices are the same. I >>>> also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem >>>> lies in my subroutine treating it as a ?plain old Fortran array?. >>>> >>>> If I declare them as pointers, their indices follow the C 0 start >>>> convention, is that so? >>>> Not really. It is that in each process you need to access them >>>> from the indices indicated by DMDAGetCorners() for global vectors and >>>> DMDAGetGhostCorners() for local vectors. So >>>> really C or Fortran >>>> doesn?t make any difference. >>>> >>>> >>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the >>>> Fortran 1 start convention. 
Is there some way to manipulate such that I do >>>> not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>> If you code wishes to access them with indices plus one from the >>>> values returned by DMDAGetCorners() for global vectors and >>>> DMDAGetGhostCorners() for local vectors then you need to manually subtract >>>> off the 1. >>>> >>>> Barry >>>> >>>> Thanks. >>>> Barry >>>> >>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I tried to pinpoint the problem. I reduced my job size and hence I can >>>> run on 1 processor. Tried using valgrind but perhaps I'm using the >>>> optimized version, it didn't catch the error, besides saying "Segmentation >>>> fault (core dumped)" >>>> >>>> However, by re-writing my code, I found out a few things: >>>> >>>> 1. if I write my code this way: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> u_array = .... >>>> >>>> v_array = .... >>>> >>>> w_array = .... >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> The code runs fine. >>>> >>>> 2. if I write my code this way: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does >>>> the same modification as the above. >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>> >>>> where the subroutine is: >>>> >>>> subroutine uvw_array_change(u,v,w) >>>> >>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>> >>>> u ... >>>> v... >>>> w ... >>>> >>>> end subroutine uvw_array_change. >>>> >>>> The above will give an error at : >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> 3. Same as above, except I change the order of the last 3 lines to: >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> So they are now in reversed order. Now it works. >>>> >>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>> >>>> subroutine uvw_array_change(u,v,w) >>>> >>>> real(8), intent(inout) :: u(start_indices(1):end_ >>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>> 3):end_indices(3)) >>>> >>>> real(8), intent(inout) :: v(start_indices(1):end_ >>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>> 3):end_indices(3)) >>>> >>>> real(8), intent(inout) :: w(start_indices(1):end_ >>>> indices(1),start_indices(2):end_indices(2),start_indices( >>>> 3):end_indices(3)) >>>> >>>> u ... >>>> v... >>>> w ... >>>> >>>> end subroutine uvw_array_change. >>>> >>>> The start_indices and end_indices are simply to shift the 0 indices of >>>> C convention to that of the 1 indices of the Fortran convention. This is >>>> necessary in my case because most of my codes start array counting at 1, >>>> hence the "trick". 
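One way to keep the 1-based index convention inside the worker routine, while still using the arrays returned by DMDAVecGetArrayF90, is to pass the local array extents in as arguments and declare the dummy arrays with explicit shape starting at 1. The sketch below is only an illustration with made-up names, not the code attached in this thread. The essential requirement is that the dummy extents equal the extents of the array actually returned, which for a local (ghosted) vector are the widths reported by DMDAGetGhostCorners(); if the extents do not match, the (i,j,k) mapping into the vector is wrong even when the code appears to run.

      subroutine uvw_array_change(u, v, w, nx, ny, nz)
      implicit none
      integer, intent(in) :: nx, ny, nz
      real(8), intent(inout) :: u(1:nx, 1:ny, 1:nz)
      real(8), intent(inout) :: v(1:nx, 1:ny, 1:nz)
      real(8), intent(inout) :: w(1:nx, 1:ny, 1:nz)
      integer :: i, j, k

      do k = 1, nz
        do j = 1, ny
          do i = 1, nx
            ! local (i,j,k); the corresponding DMDA global index is
            ! (gxs+i-1, gys+j-1, gzs+k-1), with gxs,gys,gzs taken at the call site
            u(i, j, k) = 0.d0
            v(i, j, k) = 0.d0
            w(i, j, k) = 0.d0
          end do
        end do
      end do
      end subroutine uvw_array_change

At the call site one would first call DMDAGetGhostCorners(da_u, gxs, gys, gzs, gxm, gym, gzm, ierr) and then call uvw_array_change(u_array, v_array, w_array, gxm, gym, gzm), so the sizes always travel together with the arrays instead of living in separate module variables.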
>>>> >>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in >>>> 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> " >>>> >>>> So did I violate and cause memory corruption due to the trick above? >>>> But I can't think of any way other >>>> than the "trick" to continue using the 1 indices >>>> convention. >>>> >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>> Try running under valgrind http://www.mcs.anl.gov/petsc/ >>>> documentation/faq.html#valgrind >>>> >>>> >>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>> >>>> Hi Barry, >>>> >>>> As I mentioned earlier, the code works fine in PETSc debug mode but >>>> fails in non-debug mode. >>>> >>>> I have attached my code. >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>> Please send the code that creates da_w and the declarations of >>>> w_array >>>> >>>> Barry >>>> >>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>> >>>> wrote: >>>> >>>> >>>> Hi Barry, >>>> >>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>> >>>> mpirun -n 4 ./a.out -start_in_debugger >>>> >>>> I got the msg below. Before the gdb windows appear (thru x11), the >>>> program aborts. >>>> >>>> Also I tried running in another cluster and it worked. Also tried in >>>> the current cluster in debug mode and it worked too. >>>> >>>> mpirun -n 4 ./a.out -start_in_debugger >>>> ------------------------------------------------------------ >>>> -------------- >>>> An MPI process has executed an operation involving a call to the >>>> "fork()" system call to create a child process. Open MPI is currently >>>> operating in a condition that could result in memory corruption or >>>> other system errors; your MPI job may hang, crash, or produce silent >>>> data corruption. The use of fork() (or system() or other calls that >>>> create child processes) is strongly discouraged. >>>> >>>> The process that invoked fork was: >>>> >>>> Local host: n12-76 (PID 20235) >>>> MPI_COMM_WORLD rank: 2 >>>> >>>> If you are *absolutely sure* that your application will successfully >>>> and correctly survive a call to fork(), you may disable this warning >>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>> ------------------------------------------------------------ >>>> -------------- >>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display >>>> localhost:50.0 on machine n12-76 >>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display >>>> localhost:50.0 on machine n12-76 >>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display >>>> localhost:50.0 on machine n12-76 >>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display >>>> localhost:50.0 on machine n12-76 >>>> [n12-76:20232] 3 more processes have sent help message >>>> help-mpi-runtime.txt / mpi_init:warn-fork >>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see >>>> all help / error messages >>>> >>>> .... 
>>>> >>>> 1 >>>> [1]PETSC ERROR: ------------------------------ >>>> ------------------------------------------ >>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> [1]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [1]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSCERROR: or try >>>> http://valgrind.org >>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>>> and run >>>> [1]PETSC ERROR: to get more information on the crash. >>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file (null) >>>> [3]PETSC ERROR: ------------------------------ >>>> ------------------------------------------ >>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> [3]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> [3]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSCERROR: or try >>>> http://valgrind.org >>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>>> and run >>>> [3]PETSC ERROR: to get more information on the crash. >>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file (null) >>>> >>>> ... >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>> >>>> Because IO doesn?t always get flushed immediately it may not be >>>> hanging at this point. It is better to use the option -start_in_debugger >>>> then type cont in each debugger window and then when you think it is >>>> ?hanging? do a control C in each debugger window and type where to see >>>> where each process is you can also look around in the debugger at variables >>>> to see why it is ?hanging? at that point. >>>> >>>> Barry >>>> >>>> This routines don?t have any parallel communication in them so are >>>> unlikely to hang. >>>> >>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>> >>>> >>>> >>>> wrote: >>>> >>>> >>>> >>>> Hi, >>>> >>>> My code hangs and I added in mpi_barrier and print to catch the bug. I >>>> found that it hangs after printing "7". Is it because I'm doing something >>>> wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After >>>> access, I use DMDAVecRestoreArrayF90. 
>>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"3" >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"4" >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"5" >>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_ >>>> cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"6" >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> !must be in reverse order >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"7" >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print >>>> *,"8" >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> -- >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 05:41:13 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 05:41:13 -0500 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: Message-ID: On Mon, May 19, 2014 at 3:57 AM, Jonathan Wong wrote: > I'm running into an issue for a symmetric (may not be pd) finite element > problem where I am using the cg method and getting an indefinite_mat or > indefinite_pc error using the jacobi preconditioner. If I change the pc > type to bjacobi, it converges nicely. I am only using 1 process, and I > assumed they would produce the same result, as I'm using default options. > > Does anyone have any ideas why this would happen? > No, Block-Jacobi and Jacobi are completely different. If you are not positive definite, you should be using MINRES. Matt > It also works fine with gmres + jacobi pc. > > Thanks, > Jon > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 05:46:02 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 05:46:02 -0500 Subject: [petsc-users] DMPlexDistribute error In-Reply-To: <53797A73.9070106@unimelb.edu.au> References: <53797A73.9070106@unimelb.edu.au> Message-ID: On Sun, May 18, 2014 at 10:28 PM, Scott Wales wrote: > Hi, > > I'm trying to create a distributed unstructured grid in PETSc, and have > encountered the following error: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: bin/celllist_square on a arch-linux2-c-debug named > raijin5 by saw562 Mon May 19 13:00:31 2014 > [0]PETSC ERROR: Libraries linked from /home/562/saw562/opt/petsc/3. > 4.4/lib > [0]PETSC ERROR: Configure run at Fri May 16 14:23:02 2014 > [0]PETSC ERROR: Configure options --with-shared-libraries=1 > --prefix=/home/562/saw562/opt/petsc/3.4.4 --with-blas-lapack-lib="-L/ > apps/intel-ct/12.1.9.293/mkl/lib/intel64 -lmkl_intel_lp64 > -lmkl_sequential -lmkl_core -lpthread" --with-mpi-dir=/apps/openmpi/1.6.3 > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: ISGetIndices() line 372 in > src/vec/is/is/interface/index.c > [0]PETSC ERROR: DMPlexCreatePartitionClosure() line 2637 ihttps:// > gist.github.com/ScottWales/2758b5ec96573c63e31an src/dm/impls/plex/plex.c > [0]PETSC ERROR: DMPlexDistribute() line 2810 in > src/dm/impls/plex/plex.c > > I've created the DMPlex using `DMPlexCreateFromCellList`, added a default > section and then called `DMPlexDistribute` to spread the grid points across > all of the processors. You can see my test code at > https://gist.github.com/ScottWales/2758b5ec96573c63e31a#file- > petsc-test-c-L164. Have I missed a step in the grid setup? > 1) I have better checking in the 'master' branch, and we are about to release, so I recommend upgrading 2) You did not install with any mesh partitioner, so it freaked out. You need something like --download-chaco in configure. Thanks, Matt > Thanks, Scott > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 19 08:35:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 19 May 2014 07:35:28 -0600 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: Message-ID: <87egzpsxgf.fsf@jedbrown.org> Matthew Knepley writes: > No, Block-Jacobi and Jacobi are completely different. If you are not > positive definite, you should be using MINRES. MINRES requires an SPD preconditioner. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon May 19 12:43:48 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 19 May 2014 12:43:48 -0500 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: <5379A433.5000401@gmail.com> References: <534C9A2C.5060404@gmail.com> <5351E62B.6060201@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> <5379A433.5000401@gmail.com> Message-ID: On May 19, 2014, at 1:26 AM, TAY wee-beng wrote: > On 19/5/2014 11:36 AM, Barry Smith wrote: >> On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: >> >>> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>>> Hi Barry, >>>> >>>> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >>>> >>>> Yes it works with Intel. Is this using optimization? >>> Hi Matt, >>> >>> I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? >> No. Does it run clean under valgrind? > Hi, > > Do you mean the debug or optimized version? Both. > > Thanks. >> >>>> Matt >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>>> Please send you current code. So we may compile and run it. >>>> >>>> Barry >>>> >>>> >>>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >>>> >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. 
>>>> >>>> Barry >>>> >>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>> >>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>> >>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>> >>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>> Hmm, >>>> >>>> Interface DMDAVecGetArrayF90 >>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>> USE_DM_HIDE >>>> DM_HIDE da1 >>>> VEC_HIDE v >>>> PetscScalar,pointer :: d1(:,:,:) >>>> PetscErrorCode ierr >>>> End Subroutine >>>> >>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>> Hi, >>>> >>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>> >>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>> >>>> Also, supposed I call: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> u_array .... >>>> >>>> v_array .... etc >>>> >>>> Now to restore the array, does it matter the sequence they are restored? >>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>> >>>> Hi, >>>> >>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>> >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> >>>> v_array = 0.d0 >>>> >>>> w_array = 0.d0 >>>> >>>> p_array = 0.d0 >>>> >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>> >>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? >>>> Hi Matt, >>>> >>>> Do you mean putting the above lines into ex11f90.F and test? >>>> >>>> It already has DMDAVecGetArray(). 
Just run it. >>>> Hi, >>>> >>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>> >>>> No the global/local difference should not matter. >>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>> >>>> DMGetLocalVector() >>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>> >>>> If so, when should I call them? >>>> >>>> You just need a local vector from somewhere. >>>> Hi, >>>> >>>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>>> >>>> Thanks. >>>> Hi, >>>> >>>> I insert part of my error region code into ex11f90: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> u_array = 0.d0 >>>> v_array = 0.d0 >>>> w_array = 0.d0 >>>> p_array = 0.d0 >>>> >>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>> >>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>> >>>> module solve >>>> <- add include file? >>>> subroutine RRK >>>> <- add include file? >>>> end subroutine RRK >>>> >>>> end module solve >>>> >>>> So where should the include files (#include ) be placed? >>>> >>>> After the module or inside the subroutine? >>>> >>>> Thanks. >>>> Matt >>>> Thanks. >>>> Matt >>>> Thanks. >>>> Matt >>>> Thanks >>>> >>>> Regards. >>>> Matt >>>> As in w, then v and u? >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> thanks >>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>> Hi, >>>> >>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>> >>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>> >>>> >>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? >>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. 
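For reference, the access pattern described above, written out in C; the F90 pointer version obeys the same bounds. This is only a sketch: it assumes a DMDA called da with dof=1, a global vector U created from it, and the usual ierr declaration.

    #include <petscdmda.h>

    PetscInt    xs, ys, zs, xm, ym, zm, i, j, k;
    PetscScalar ***u;

    ierr = DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm);CHKERRQ(ierr); /* owned range, no ghosts */
    ierr = DMDAVecGetArray(da, U, &u);CHKERRQ(ierr);                       /* U is a global vector of da */
    for (k = zs; k < zs + zm; k++)
      for (j = ys; j < ys + ym; j++)
        for (i = xs; i < xs + xm; i++)
          u[k][j][i] = 0.0;   /* always index with the global indices returned by DMDAGetCorners */
    ierr = DMDAVecRestoreArray(da, U, &u);CHKERRQ(ierr);

For a local (ghosted) vector obtained with DMGetLocalVector, the only change is to take the loop bounds from DMDAGetGhostCorners instead.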
>>>> >>>> Barry >>>> >>>> Thanks. >>>> Barry >>>> >>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>> >>>> Hi, >>>> >>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>> >>>> However, by re-writing my code, I found out a few things: >>>> >>>> 1. if I write my code this way: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> u_array = .... >>>> >>>> v_array = .... >>>> >>>> w_array = .... >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> The code runs fine. >>>> >>>> 2. if I write my code this way: >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>> >>>> where the subroutine is: >>>> >>>> subroutine uvw_array_change(u,v,w) >>>> >>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>> >>>> u ... >>>> v... >>>> w ... >>>> >>>> end subroutine uvw_array_change. >>>> >>>> The above will give an error at : >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> 3. Same as above, except I change the order of the last 3 lines to: >>>> >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>> >>>> So they are now in reversed order. Now it works. >>>> >>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>> >>>> subroutine uvw_array_change(u,v,w) >>>> >>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>> >>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>> >>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>> >>>> u ... >>>> v... >>>> w ... >>>> >>>> end subroutine uvw_array_change. >>>> >>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". >>>> >>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>> >>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>> >>>> Thank you. 
>>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>> >>>> >>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>> >>>> Hi Barry, >>>> >>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>> >>>> I have attached my code. >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>> Please send the code that creates da_w and the declarations of w_array >>>> >>>> Barry >>>> >>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>> >>>> wrote: >>>> >>>> >>>> Hi Barry, >>>> >>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>> >>>> mpirun -n 4 ./a.out -start_in_debugger >>>> >>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>> >>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>> >>>> mpirun -n 4 ./a.out -start_in_debugger >>>> -------------------------------------------------------------------------- >>>> An MPI process has executed an operation involving a call to the >>>> "fork()" system call to create a child process. Open MPI is currently >>>> operating in a condition that could result in memory corruption or >>>> other system errors; your MPI job may hang, crash, or produce silent >>>> data corruption. The use of fork() (or system() or other calls that >>>> create child processes) is strongly discouraged. >>>> >>>> The process that invoked fork was: >>>> >>>> Local host: n12-76 (PID 20235) >>>> MPI_COMM_WORLD rank: 2 >>>> >>>> If you are *absolutely sure* that your application will successfully >>>> and correctly survive a call to fork(), you may disable this warning >>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>> -------------------------------------------------------------------------- >>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>> >>>> .... >>>> >>>> 1 >>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [1]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>> [1]PETSC ERROR: to get more information on the crash. 
>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [3]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>> [3]PETSC ERROR: to get more information on the crash. >>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>> >>>> ... >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>> >>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>> >>>> Barry >>>> >>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>> >>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>> >>>> >>>> >>>> wrote: >>>> >>>> >>>> >>>> Hi, >>>> >>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>> >>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>> -- >>>> Thank you. >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener > From jon.the.wong at gmail.com Mon May 19 13:42:18 2014 From: jon.the.wong at gmail.com (Jonathan Wong) Date: Mon, 19 May 2014 11:42:18 -0700 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: <87egzpsxgf.fsf@jedbrown.org> References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: Thanks for the input. To clarify, I'm trying to compare GPU algorithms to Petsc, and they only have cg/jacobi for what I'm comparing at the moment. This is why I'm not using gmres (which also works well). I can solve the problem with the GPU (custom code) using CG + jacobi for all the meshes. On the CPU side, I can solve everything with cg/bjacobi and almost all of my meshes with cg/jacobi except for my 50k node mesh. I can solve the problem with my finite element built-in direct solver (just takes awhile) on one processor. I've been reading that by default the bjacobi pc uses one block per processor. So I had assumed that for one processor block-jacobi and jacobi would give similar results. cg+bjacobi works fine. cg+jacobi does not. I'll just look into the preconditioner code and use KSPview to try to figure out what the differences are for one processor. I'm not sure why the GPU can consistently solve the problem with cg/jacobi. I'm assuming this is due to the way round-off or the order of operations differences between the two. On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: > Matthew Knepley writes: > > No, Block-Jacobi and Jacobi are completely different. If you are not > > positive definite, you should be using MINRES. > > MINRES requires an SPD preconditioner. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 13:45:32 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 13:45:32 -0500 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong wrote: > Thanks for the input. To clarify, I'm trying to compare GPU algorithms to > Petsc, and they only have cg/jacobi for what I'm comparing at the moment. > This is why I'm not using gmres (which also works well). > > I can solve the problem with the GPU (custom code) using CG + jacobi for > all the meshes. On the CPU side, I can solve everything with cg/bjacobi and > almost all of my meshes with cg/jacobi except for my 50k node mesh. I can > solve the problem with my finite element built-in direct solver (just takes > awhile) on one processor. I've been reading that by default the bjacobi pc > uses one block per processor. So I had assumed that for one processor > block-jacobi and jacobi would give similar results. cg+bjacobi works fine. > cg+jacobi does not. > "Jacobi" means preconditioning by the inverse of the diagonal of the matrix. Block-Jacobi means using a preconditioner formed from each of the blocks, in this case 1 block. By default the inner preconditioner is ILU(0), not jacobi. 
You can make them equivalent using -sub_pc_type jacobi. Matt > I'll just look into the preconditioner code and use KSPview to try to > figure out what the differences are for one processor. I'm not sure why the > GPU can consistently solve the problem with cg/jacobi. I'm assuming this is > due to the way round-off or the order of operations differences between the > two. > > > On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: > >> Matthew Knepley writes: >> > No, Block-Jacobi and Jacobi are completely different. If you are not >> > positive definite, you should be using MINRES. >> >> MINRES requires an SPD preconditioner. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From paulmullowney at gmail.com Mon May 19 14:07:53 2014 From: paulmullowney at gmail.com (Paul Mullowney) Date: Mon, 19 May 2014 13:07:53 -0600 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: I don't think bjacobi is working on GPUs. I know Dominic made a pull request a few months ago, but I don't know if its been integrated into next. -Paul On Mon, May 19, 2014 at 12:45 PM, Matthew Knepley wrote: > On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong wrote: > >> Thanks for the input. To clarify, I'm trying to compare GPU algorithms to >> Petsc, and they only have cg/jacobi for what I'm comparing at the moment. >> This is why I'm not using gmres (which also works well). >> >> I can solve the problem with the GPU (custom code) using CG + jacobi for >> all the meshes. On the CPU side, I can solve everything with cg/bjacobi and >> almost all of my meshes with cg/jacobi except for my 50k node mesh. I can >> solve the problem with my finite element built-in direct solver (just takes >> awhile) on one processor. I've been reading that by default the bjacobi pc >> uses one block per processor. So I had assumed that for one processor >> block-jacobi and jacobi would give similar results. cg+bjacobi works fine. >> cg+jacobi does not. >> > > "Jacobi" means preconditioning by the inverse of the diagonal of the > matrix. Block-Jacobi means using a preconditioner > formed from each of the blocks, in this case 1 block. By default the inner > preconditioner is ILU(0), not jacobi. You can > make them equivalent using -sub_pc_type jacobi. > > Matt > > >> I'll just look into the preconditioner code and use KSPview to try to >> figure out what the differences are for one processor. I'm not sure why the >> GPU can consistently solve the problem with cg/jacobi. I'm assuming this is >> due to the way round-off or the order of operations differences between the >> two. >> >> >> On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: >> >>> Matthew Knepley writes: >>> > No, Block-Jacobi and Jacobi are completely different. If you are not >>> > positive definite, you should be using MINRES. >>> >>> MINRES requires an SPD preconditioner. >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
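For a one-process, apples-to-apples check of the two preconditioners discussed above, a pair of run lines (the executable name is a placeholder; -sub_ksp_type preonly is already the bjacobi default and is spelled out only for clarity):

    ./myapp -ksp_type cg -pc_type jacobi  -ksp_converged_reason -ksp_view
    ./myapp -ksp_type cg -pc_type bjacobi -sub_ksp_type preonly -sub_pc_type jacobi -ksp_converged_reason -ksp_view

With a single block these two should behave identically; -ksp_view prints which sub-preconditioner was actually used, so any remaining difference points back to the default ILU(0) inside bjacobi.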
URL: From jon.the.wong at gmail.com Mon May 19 15:16:16 2014 From: jon.the.wong at gmail.com (Jonathan Wong) Date: Mon, 19 May 2014 13:16:16 -0700 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: Matthew: Thanks for clarifying about the block-jacobi. Paul: I'm only using bjacobi with PETSc to show that the problem is solvable, and to provide some "estimation" as to the performance of the jacobi preconditioner. On the GPU, I am using CUSP to do cg+jacobi which works fine for this 50k node mesh. On Mon, May 19, 2014 at 12:07 PM, Paul Mullowney wrote: > I don't think bjacobi is working on GPUs. I know Dominic made a pull > request a few months ago, but I don't know if its been integrated into next. > -Paul > > > On Mon, May 19, 2014 at 12:45 PM, Matthew Knepley wrote: > >> On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong wrote: >> >>> Thanks for the input. To clarify, I'm trying to compare GPU algorithms >>> to Petsc, and they only have cg/jacobi for what I'm comparing at the >>> moment. This is why I'm not using gmres (which also works well). >>> >>> I can solve the problem with the GPU (custom code) using CG + jacobi for >>> all the meshes. On the CPU side, I can solve everything with cg/bjacobi and >>> almost all of my meshes with cg/jacobi except for my 50k node mesh. I can >>> solve the problem with my finite element built-in direct solver (just takes >>> awhile) on one processor. I've been reading that by default the bjacobi pc >>> uses one block per processor. So I had assumed that for one processor >>> block-jacobi and jacobi would give similar results. cg+bjacobi works fine. >>> cg+jacobi does not. >>> >> >> "Jacobi" means preconditioning by the inverse of the diagonal of the >> matrix. Block-Jacobi means using a preconditioner >> formed from each of the blocks, in this case 1 block. By default the >> inner preconditioner is ILU(0), not jacobi. You can >> make them equivalent using -sub_pc_type jacobi. >> >> Matt >> >> >>> I'll just look into the preconditioner code and use KSPview to try to >>> figure out what the differences are for one processor. I'm not sure why the >>> GPU can consistently solve the problem with cg/jacobi. I'm assuming this is >>> due to the way round-off or the order of operations differences between the >>> two. >>> >>> >>> On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: >>> >>>> Matthew Knepley writes: >>>> > No, Block-Jacobi and Jacobi are completely different. If you are not >>>> > positive definite, you should be using MINRES. >>>> >>>> MINRES requires an SPD preconditioner. >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 15:20:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 15:20:05 -0500 Subject: [petsc-users] Differences between jacobi and bjacobi preconditioner for cg method 1-processor/block In-Reply-To: References: <87egzpsxgf.fsf@jedbrown.org> Message-ID: On Mon, May 19, 2014 at 3:16 PM, Jonathan Wong wrote: > Matthew: Thanks for clarifying about the block-jacobi. 
> > Paul: I'm only using bjacobi with PETSc to show that the problem is > solvable, and to provide some "estimation" as to the performance of the > jacobi preconditioner. On the GPU, I am using CUSP to do cg+jacobi which > works fine for this 50k node mesh. > Its possible the GPU CG code is just ignoring breakdown and continuing the solve. This may work sometimes, but could give incorrect answers. Also, it seems simply beyond belief that CG+Jacobi could solve any FEM problem other than the identity. For example, the Laplacian has a condition number that is proportional to h^{-2}, so it grows like N for linear finite elements in 2D. Are you trying to solve something with an extremely small timestep so that it looks like the identity? Matt > On Mon, May 19, 2014 at 12:07 PM, Paul Mullowney wrote: > >> I don't think bjacobi is working on GPUs. I know Dominic made a pull >> request a few months ago, but I don't know if its been integrated into next. >> -Paul >> >> >> On Mon, May 19, 2014 at 12:45 PM, Matthew Knepley wrote: >> >>> On Mon, May 19, 2014 at 1:42 PM, Jonathan Wong wrote: >>> >>>> Thanks for the input. To clarify, I'm trying to compare GPU algorithms >>>> to Petsc, and they only have cg/jacobi for what I'm comparing at the >>>> moment. This is why I'm not using gmres (which also works well). >>>> >>>> I can solve the problem with the GPU (custom code) using CG + jacobi >>>> for all the meshes. On the CPU side, I can solve everything with cg/bjacobi >>>> and almost all of my meshes with cg/jacobi except for my 50k node mesh. I >>>> can solve the problem with my finite element built-in direct solver (just >>>> takes awhile) on one processor. I've been reading that by default the >>>> bjacobi pc uses one block per processor. So I had assumed that for one >>>> processor block-jacobi and jacobi would give similar results. cg+bjacobi >>>> works fine. cg+jacobi does not. >>>> >>> >>> "Jacobi" means preconditioning by the inverse of the diagonal of the >>> matrix. Block-Jacobi means using a preconditioner >>> formed from each of the blocks, in this case 1 block. By default the >>> inner preconditioner is ILU(0), not jacobi. You can >>> make them equivalent using -sub_pc_type jacobi. >>> >>> Matt >>> >>> >>>> I'll just look into the preconditioner code and use KSPview to try to >>>> figure out what the differences are for one processor. I'm not sure why the >>>> GPU can consistently solve the problem with cg/jacobi. I'm assuming this is >>>> due to the way round-off or the order of operations differences between the >>>> two. >>>> >>>> >>>> On Mon, May 19, 2014 at 6:35 AM, Jed Brown wrote: >>>> >>>>> Matthew Knepley writes: >>>>> > No, Block-Jacobi and Jacobi are completely different. If you are not >>>>> > positive definite, you should be using MINRES. >>>>> >>>>> MINRES requires an SPD preconditioner. >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
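A rough scaling sketch behind the condition-number remark above, assuming a quasi-uniform mesh with spacing h: for the Laplacian with linear elements, kappa(A) = O(h^{-2}), and in 2D the number of unknowns is N = O(h^{-2}), so kappa = O(N). Jacobi only rescales the diagonal and does not change this h-dependence, so CG needs on the order of sqrt(kappa) = O(N^{1/2}) iterations, i.e. the iteration count keeps growing under refinement rather than staying bounded the way it does for multigrid.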
URL: From mrosso at uci.edu Mon May 19 17:18:29 2014 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 19 May 2014 15:18:29 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <87y4y0uar8.fsf@jedbrown.org> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> Message-ID: <537A8335.4080702@uci.edu> Jed, thanks for your reply. By using the options you suggested, namely /-mg_levels_ksp_type richardson -mg_levels_pc_type sor/, I was able to solve without bumping into the DIVERGED_INDEFINITE_PC message. Nevertheless, the number of iterations increases drastically as the simulation progresses. The Poisson's equation I am solving arises from a variable-density projection method for incompressible multi-phase flows. At each time step the system matrix coefficients change as a consequence of the change in location of the heavier phase; the rhs changes in time because of the change in the velocity field. Usually the black-box multigrid or the deflated conjugate gradient method are used to solve efficiently this type of problem: it is my understanding - please correct me if I am wrong - that AMG is a generalization of the former. The only source term acting is gravity; the hydrostatic pressure is removed from the governing equation in order to accommodate periodic boundary conditions: this is more a hack than a clean solution. Could it be the reason behind the poor performances/ DIVERGED_INDEFINITE_PC problem I am experiencing? Thanks, Michele On 05/17/2014 12:26 AM, Jed Brown wrote: > Michele Rosso writes: > >> Hi, >> >> I am solving an inhomogeneous Laplacian in 3D (basically a slightly >> modified version of example ex34). >> The laplacian is discretized by using a cell-center finite difference >> 7-point stencil with periodic BCs. >> I am solving a time-dependent problem so the solution of the laplacian >> is repeated at each time step with a different matrix (always SPD >> though) and rhs. Also, the laplacian features large magnitude variations >> in the coefficients. I solve by means of CG + GAMG as preconditioner. >> Everything works fine for a while until I receive a >> DIVERGED_INDEFINITE_PC message. > What is changing as you time step? Is there a nonlinearity that > activates suddenly? Especially a bifurcation or perhaps a source term > that is incompatible with the boundary conditions? You could try > -mg_levels_ksp_type richardson -mg_levels_pc_type sor. Can you > reproduce with a small problem? > > The configuration looks okay to me. > >> Before checking my model is incorrect I would like to rule out the >> possibility of improper use of the linear solver. I attached the full >> output of a serial run with -log-summary -ksp_view >> -ksp_converged_reason ksp_monitor_true_residual. I would appreciate if >> you could help me in locating the issue. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 19 17:30:06 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 19 May 2014 16:30:06 -0600 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <537A8335.4080702@uci.edu> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> Message-ID: <8761l1qu4x.fsf@jedbrown.org> Michele Rosso writes: > Jed, > > thanks for your reply. > By using the options you suggested, namely /-mg_levels_ksp_type > richardson -mg_levels_pc_type sor/, I was able to > solve without bumping into the DIVERGED_INDEFINITE_PC message. 
> Nevertheless, the number of iterations increases drastically as the > simulation progresses. What about SOR with Chebyshev? (A little weird, but sometimes it's a good choice.) If the solve is expensive, you can add a few more iterations for eigenvalue estimation. > The Poisson's equation I am solving arises from a variable-density > projection method for incompressible multi-phase flows. > At each time step the system matrix coefficients change as a consequence > of the change in location of the heavier phase; the rhs changes > in time because of the change in the velocity field. Usually the > black-box multigrid or the deflated conjugate gradient method are used > to solve efficiently this type of problem: it is my understanding - > please correct me if I am wrong - that AMG is a generalization of the > former. Dendy's "black-box MG" is a semi-geometric method for cell-centered discretizations. AMG is not a superset or subset of those methods. > The only source term acting is gravity; the hydrostatic pressure is > removed from the governing equation in order to accommodate periodic > boundary conditions: this is more a hack than a clean solution. Could it > be the reason behind the poor performances/ DIVERGED_INDEFINITE_PC > problem I am experiencing? If you have periodic boundary conditions, then you also have a pressure null space. Have you removed the null space from the RHS and supplied the null space to the solver? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From mrosso at uci.edu Mon May 19 17:41:48 2014 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 19 May 2014 15:41:48 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <8761l1qu4x.fsf@jedbrown.org> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> Message-ID: <537A88AC.3060308@uci.edu> Jed, thank you very much! I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type sor/ and report back. Yes, I removed the nullspace from both the system matrix and the rhs. Is there a way to have something similar to Dendy's multigrid or the deflated conjugate gradient method with PETSc? Thank you, Michele // On 05/19/2014 03:30 PM, Jed Brown wrote: > Michele Rosso writes: > >> Jed, >> >> thanks for your reply. >> By using the options you suggested, namely /-mg_levels_ksp_type >> richardson -mg_levels_pc_type sor/, I was able to >> solve without bumping into the DIVERGED_INDEFINITE_PC message. >> Nevertheless, the number of iterations increases drastically as the >> simulation progresses. > What about SOR with Chebyshev? (A little weird, but sometimes it's a > good choice.) If the solve is expensive, you can add a few more > iterations for eigenvalue estimation. > >> The Poisson's equation I am solving arises from a variable-density >> projection method for incompressible multi-phase flows. >> At each time step the system matrix coefficients change as a consequence >> of the change in location of the heavier phase; the rhs changes >> in time because of the change in the velocity field. Usually the >> black-box multigrid or the deflated conjugate gradient method are used >> to solve efficiently this type of problem: it is my understanding - >> please correct me if I am wrong - that AMG is a generalization of the >> former. 
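For completeness, a minimal C sketch of what supplying the constant (pressure) null space and making the right-hand side consistent looks like; A, b and ierr are assumed to exist already, and the two-argument MatNullSpaceRemove is the newer calling sequence (petsc-3.4 takes an extra trailing NULL):

    MatNullSpace nullsp;

    ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, NULL, &nullsp);CHKERRQ(ierr); /* constant vector */
    ierr = MatSetNullSpace(A, nullsp);CHKERRQ(ierr);    /* petsc-3.6+ KSP picks this up from the matrix;  */
                                                        /* older versions use KSPSetNullSpace(ksp,nullsp) */
    ierr = MatNullSpaceRemove(nullsp, b);CHKERRQ(ierr); /* project the constant out of the rhs            */
    ierr = MatNullSpaceDestroy(&nullsp);CHKERRQ(ierr);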
> Dendy's "black-box MG" is a semi-geometric method for cell-centered > discretizations. AMG is not a superset or subset of those methods. > >> The only source term acting is gravity; the hydrostatic pressure is >> removed from the governing equation in order to accommodate periodic >> boundary conditions: this is more a hack than a clean solution. Could it >> be the reason behind the poor performances/ DIVERGED_INDEFINITE_PC >> problem I am experiencing? > If you have periodic boundary conditions, then you also have a pressure > null space. Have you removed the null space from the RHS and supplied > the null space to the solver? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 19 17:49:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 19 May 2014 16:49:21 -0600 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <537A88AC.3060308@uci.edu> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> Message-ID: <871tvpqt8u.fsf@jedbrown.org> Michele Rosso writes: > Jed, > > thank you very much! > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type > sor/ and report back. > Yes, I removed the nullspace from both the system matrix and the rhs. > Is there a way to have something similar to Dendy's multigrid or the > deflated conjugate gradient method with PETSc? Dendy's MG needs geometry. The algorithm to produce the interpolation operators is not terribly complicated so it could be done, though DMDA support for cell-centered is a somewhat awkward. "Deflated CG" can mean lots of things so you'll have to be more precise. (Most everything in the "deflation" world has a clear analogue in the MG world, but the deflation community doesn't have a precise language to talk about their methods so you always have to read the paper carefully to find out if it's completely standard or if there is something new.) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From andrewdalecramer at gmail.com Mon May 19 20:50:38 2014 From: andrewdalecramer at gmail.com (Andrew Cramer) Date: Tue, 20 May 2014 11:50:38 +1000 Subject: [petsc-users] Accessing Global Vectors Message-ID: Hi All, I'm new to PETSc and would like to use it as my linear elasticity solver within a structural optimisation program. Originally I was using GP-GPUs and CUDA for my solver but I would like to shift to using PETSc to leverage it's breadth of trustworthy solvers. We have some SMP servers and a couple compute clusters (one with GPUs, one without). I've been digging through the docs and I'd like some feedback on my plan and perhaps some pointers if at all possible. The plan is to keep the 6000 lines or so of current code and try as much as possible to use PETSc as a 'drop-in'. This would require giving one field (array) of densities and receiving a 3d field (array) of displacements back. Providing the density field would be easy with the usual array construction functions on one node/process but pulling the displacements back to the 'controlling' node would be difficult. I understand that this goes against the ethos of PETSc which is distributed all the way. My code is highly modular with differing objective functions and optimisers (some of which are written by other research groups) that I drop in and pull out. 
I don't want to throw all that away. I would need to relearn object oriented programming within PETSc (currently I use c++) and rewrite my entire code base. In terms of performance the optimisers typically rely heavily on tight loops of reductions once the solve is completed so I'm not sure that the speed-up would be too great rewriting them as distributed anyway. Sorry for the long winded post but I'm just not sure how to move forward, I'm sick of implementing every solver I want to try in CUDA especially knowing that people have done it better than I can in PETSc. But it's a framework that I don't know how to interface with, all the examples seem to have the solve as the main thing rather than one part of a broader program. Andrew Cramer University of Queensland, Australia PhD Candidate -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 19 21:27:36 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 May 2014 21:27:36 -0500 Subject: [petsc-users] Accessing Global Vectors In-Reply-To: References: Message-ID: On Mon, May 19, 2014 at 8:50 PM, Andrew Cramer wrote: > Hi All, > > I'm new to PETSc and would like to use it as my linear elasticity solver > within a structural optimisation program. Originally I was using GP-GPUs > and CUDA for my solver but I would like to shift to using PETSc to leverage > it's breadth of trustworthy solvers. We have some SMP servers and a couple > compute clusters (one with GPUs, one without). I've been digging through > the docs and I'd like some feedback on my plan and perhaps some pointers if > at all possible. > > The plan is to keep the 6000 lines or so of current code and try as much > as possible to use PETSc as a 'drop-in'. This would require giving one > field (array) of densities and receiving a 3d field (array) of > displacements back. Providing the density field would be easy with the > usual array construction functions on one node/process but pulling the > displacements back to the 'controlling' node would be difficult. > > I understand that this goes against the ethos of PETSc which is > distributed all the way. My code is highly modular with differing objective > functions and optimisers (some of which are written by other research > groups) that I drop in and pull out. I don't want to throw all that away. I > would need to relearn object oriented programming within PETSc (currently I > use c++) and rewrite my entire code base. In terms of performance the > optimisers typically rely heavily on tight loops of reductions once the > solve is completed so I'm not sure that the speed-up would be too great > rewriting them as distributed anyway. > > Sorry for the long winded post but I'm just not sure how to move forward, > I'm sick of implementing every solver I want to try in CUDA especially > knowing that people have done it better than I can in PETSc. But it's a > framework that I don't know how to interface with, all the examples seem to > have the solve as the main thing rather than one part of a broader program. > 1) PETSc can do a good job on linear elasticity. GAMG is particularly effective, and we have an example: http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex56.c.html 2) You can use this function to go back and forth from 1 process http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Vec/VecScatterCreateToZero.html 3) The expense of pushing all that data to nodes can large. 
You might be better off just using GAMG on 1 process, which is how I would start. Matt > Andrew Cramer > University of Queensland, Australia > PhD Candidate > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewdalecramer at gmail.com Tue May 20 00:57:58 2014 From: andrewdalecramer at gmail.com (Andrew Cramer) Date: Tue, 20 May 2014 15:57:58 +1000 Subject: [petsc-users] Accessing Global Vectors In-Reply-To: References: Message-ID: On 20 May 2014 12:27, Matthew Knepley wrote: > On Mon, May 19, 2014 at 8:50 PM, Andrew Cramer > wrote: > >> Hi All, >> >> I'm new to PETSc and would like to use it as my linear elasticity solver >> within a structural optimisation program. Originally I was using GP-GPUs >> and CUDA for my solver but I would like to shift to using PETSc to leverage >> it's breadth of trustworthy solvers. We have some SMP servers and a couple >> compute clusters (one with GPUs, one without). I've been digging through >> the docs and I'd like some feedback on my plan and perhaps some pointers if >> at all possible. >> >> The plan is to keep the 6000 lines or so of current code and try as much >> as possible to use PETSc as a 'drop-in'. This would require giving one >> field (array) of densities and receiving a 3d field (array) of >> displacements back. Providing the density field would be easy with the >> usual array construction functions on one node/process but pulling the >> displacements back to the 'controlling' node would be difficult. >> >> I understand that this goes against the ethos of PETSc which is >> distributed all the way. My code is highly modular with differing objective >> functions and optimisers (some of which are written by other research >> groups) that I drop in and pull out. I don't want to throw all that away. I >> would need to relearn object oriented programming within PETSc (currently I >> use c++) and rewrite my entire code base. In terms of performance the >> optimisers typically rely heavily on tight loops of reductions once the >> solve is completed so I'm not sure that the speed-up would be too great >> rewriting them as distributed anyway. >> >> Sorry for the long winded post but I'm just not sure how to move forward, >> I'm sick of implementing every solver I want to try in CUDA especially >> knowing that people have done it better than I can in PETSc. But it's a >> framework that I don't know how to interface with, all the examples seem to >> have the solve as the main thing rather than one part of a broader program. >> > > 1) PETSc can do a good job on linear elasticity. GAMG is particularly > effective, and we have an example: > > > http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex56.c.html > > 2) You can use this function to go back and forth from 1 process > > > http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Vec/VecScatterCreateToZero.html > > 3) The expense of pushing all that data to nodes can large. You might be > better off just using GAMG on 1 process, which is how I would start. > > Matt > > >> Andrew Cramer >> University of Queensland, Australia >> PhD Candidate >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > Thanks for your help, I was eyeing off ksp/ex29 as it uses DMDA which I thought would simplify things. I'll take a look at ex56 instead and see what I can do. Andrew -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 20 05:22:35 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 05:22:35 -0500 Subject: [petsc-users] Accessing Global Vectors In-Reply-To: References: Message-ID: On Tue, May 20, 2014 at 12:57 AM, Andrew Cramer wrote: > On 20 May 2014 12:27, Matthew Knepley wrote: > >> On Mon, May 19, 2014 at 8:50 PM, Andrew Cramer < >> andrewdalecramer at gmail.com> wrote: >> >>> Hi All, >>> >>> I'm new to PETSc and would like to use it as my linear elasticity solver >>> within a structural optimisation program. Originally I was using GP-GPUs >>> and CUDA for my solver but I would like to shift to using PETSc to leverage >>> it's breadth of trustworthy solvers. We have some SMP servers and a couple >>> compute clusters (one with GPUs, one without). I've been digging through >>> the docs and I'd like some feedback on my plan and perhaps some pointers if >>> at all possible. >>> >>> The plan is to keep the 6000 lines or so of current code and try as much >>> as possible to use PETSc as a 'drop-in'. This would require giving one >>> field (array) of densities and receiving a 3d field (array) of >>> displacements back. Providing the density field would be easy with the >>> usual array construction functions on one node/process but pulling the >>> displacements back to the 'controlling' node would be difficult. >>> >>> I understand that this goes against the ethos of PETSc which is >>> distributed all the way. My code is highly modular with differing objective >>> functions and optimisers (some of which are written by other research >>> groups) that I drop in and pull out. I don't want to throw all that away. I >>> would need to relearn object oriented programming within PETSc (currently I >>> use c++) and rewrite my entire code base. In terms of performance the >>> optimisers typically rely heavily on tight loops of reductions once the >>> solve is completed so I'm not sure that the speed-up would be too great >>> rewriting them as distributed anyway. >>> >>> Sorry for the long winded post but I'm just not sure how to move >>> forward, I'm sick of implementing every solver I want to try in CUDA >>> especially knowing that people have done it better than I can in PETSc. But >>> it's a framework that I don't know how to interface with, all the examples >>> seem to have the solve as the main thing rather than one part of a broader >>> program. >>> >> >> 1) PETSc can do a good job on linear elasticity. GAMG is particularly >> effective, and we have an example: >> >> >> http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/examples/tutorials/ex56.c.html >> >> 2) You can use this function to go back and forth from 1 process >> >> >> http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Vec/VecScatterCreateToZero.html >> >> 3) The expense of pushing all that data to nodes can large. You might be >> better off just using GAMG on 1 process, which is how I would start. >> >> Matt >> >> >>> Andrew Cramer >>> University of Queensland, Australia >>> PhD Candidate >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
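Since the plan above is to pull the displacement field back to a single controlling rank, here is a minimal sketch of the VecScatterCreateToZero pattern pointed to earlier in the thread; x is assumed to be the distributed solution vector and ierr the usual error code:

    #include <petscvec.h>

    Vec        xseq;     /* on rank 0 this holds every entry of x; on other ranks it has length 0 */
    VecScatter scatter;

    ierr = VecScatterCreateToZero(x, &scatter, &xseq);CHKERRQ(ierr);
    ierr = VecScatterBegin(scatter, x, xseq, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
    ierr = VecScatterEnd(scatter, x, xseq, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
    /* rank 0 can now VecGetArray(xseq,...) and hand the full field to the optimiser */
    ierr = VecScatterDestroy(&scatter);CHKERRQ(ierr);
    ierr = VecDestroy(&xseq);CHKERRQ(ierr);

The same scatter run with SCATTER_REVERSE pushes data from the rank-0 vector back out to the distributed one, which covers the density field going the other way.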
>> -- Norbert Wiener >> > > > Thanks for your help, I was eyeing off ksp/ex29 as it uses DMDA which I > thought would simplify things. I'll take a look at ex56 instead and see > what I can do. > If you have a completely structured grid, DMDA is definitely simpler, although it is a little awkward for cell-centered discretizations. We have some really new support for arbitrary discretizations on top of DMDA, but it is alpha. Thanks, Matt > Andrew > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Tue May 20 07:12:45 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Tue, 20 May 2014 14:12:45 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87zjij6gzu.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> Message-ID: I found another problem when using two-dimensional arrays defined using pointers of pointers. When I use a "classical" two-dimensional array defined by PetscScalar array[dof][dof]; and then build the Jacobian using ierr = MatSetValuesBlocked(*Jpre,1,&row,1,&col,&array[0][0],INSERT_VALUES); It works fine. The problem comes when I define the two-dimensional array as follows: PetscScalar **array; array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); for (k = 0; k < dof; k++){ array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); } When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc complains because I am not passing it the right way or when it accepts it, wrong data is passed because the solution is not correct. Maybe Petsc expect dof*dof values and only dof are passed ? How a two-dimensional array declared with pointers of pointers should be passed to MatSetValuesBlocked() ? Many thanks in advance. Christophe On Thu, May 15, 2014 at 2:08 AM, Jed Brown wrote: > Christophe Ortiz writes: > > > Hi all, > > > > I am experiencing some problems of memory corruption with PetscMemzero(). > > > > I set the values of the Jacobian by blocks using MatSetValuesBlocked(). > To > > do so, I use some temporary two-dimensional arrays[dof][dof] that I must > > reset at each loop. > > > > Inside FormIJacobian, for instance, I declare the following > two-dimensional > > array: > > > > PetscScalar diag[dof][dof]; > > > > and then, to zero the array diag[][] I do > > > > ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); > > Note that this can also be spelled > > PetscMemzero(diag,sizeof diag); > > > Then, inside main(), once dof is determined, I allocate memory for diag > as > > follows: > > > > diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > > > > for (k = 0; k < dof; k++){ > > diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > > } > > That is, the classical way to allocate memory using the pointer notation. > > Note that you can do a contiguous allocation by creating a Vec, then use > VecGetArray2D to get 2D indexing of it. > -------------- next part -------------- An HTML attachment was scrubbed... 
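A possible workaround for the question above, sketched in C with the names from that message (row, col, dof, Jpre) and an identity block purely as a placeholder: allocate the dof x dof block as one flat, contiguous array and index it by hand, since MatSetValuesBlocked() reads dof*dof contiguous values (row-major by default) for a single block:

    PetscScalar *block;
    PetscInt    i;

    block = (PetscScalar*) malloc(dof*dof*sizeof(PetscScalar));    /* one allocation, so it is contiguous      */
    ierr  = PetscMemzero(block, dof*dof*sizeof(PetscScalar));CHKERRQ(ierr);
    for (i = 0; i < dof; i++) block[i*dof + i] = 1.0;              /* block[i*dof+j] stands in for array[i][j] */
    ierr  = MatSetValuesBlocked(*Jpre, 1, &row, 1, &col, block, INSERT_VALUES);CHKERRQ(ierr);
    free(block);

If the array[i][j] notation is wanted as well, the row pointers can simply be pointed into this single slab instead of being malloc'ed one by one.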
URL: From knepley at gmail.com Tue May 20 07:16:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 07:16:05 -0500 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 7:12 AM, Christophe Ortiz < christophe.ortiz at ciemat.es> wrote: > I found another problem when using two-dimensional arrays defined using > pointers of pointers. > > When I use a "classical" two-dimensional array defined by > > PetscScalar array[dof][dof]; > This declaration will use contiguous memory since its on the stack. > and then build the Jacobian using > > ierr = MatSetValuesBlocked(*Jpre,1,&row,1,&col,&array[0][0],INSERT_VALUES); > > It works fine. > > The problem comes when I define the two-dimensional array as follows: > > PetscScalar **array; > This one uses non-contiguous memory since its on the heap. > array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > for (k = 0; k < dof; k++){ > array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > } > > When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc > complains because I am not passing it the right way or when it accepts it, > wrong data is passed because the solution is not correct. Maybe Petsc > expect dof*dof values and only dof are passed ? > You can only pass contiguous memory to MatSetValues(). Matt > How a two-dimensional array declared with pointers of pointers should be > passed to MatSetValuesBlocked() ? > > Many thanks in advance. > Christophe > > > > > On Thu, May 15, 2014 at 2:08 AM, Jed Brown wrote: > >> Christophe Ortiz writes: >> >> > Hi all, >> > >> > I am experiencing some problems of memory corruption with >> PetscMemzero(). >> > >> > I set the values of the Jacobian by blocks using MatSetValuesBlocked(). >> To >> > do so, I use some temporary two-dimensional arrays[dof][dof] that I must >> > reset at each loop. >> > >> > Inside FormIJacobian, for instance, I declare the following >> two-dimensional >> > array: >> > >> > PetscScalar diag[dof][dof]; >> > >> > and then, to zero the array diag[][] I do >> > >> > ierr = PetscMemzero(diag,dof*dof*sizeof(PetscScalar)); >> >> Note that this can also be spelled >> >> PetscMemzero(diag,sizeof diag); >> >> > Then, inside main(), once dof is determined, I allocate memory for diag >> as >> > follows: >> > >> > diag = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); >> > >> > for (k = 0; k < dof; k++){ >> > diag[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); >> > } >> > That is, the classical way to allocate memory using the pointer >> notation. >> >> Note that you can do a contiguous allocation by creating a Vec, then use >> VecGetArray2D to get 2D indexing of it. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From altriaex86 at gmail.com Tue May 20 07:28:34 2014 From: altriaex86 at gmail.com (=?UTF-8?B?5byg5Zu954aZ?=) Date: Tue, 20 May 2014 22:28:34 +1000 Subject: [petsc-users] Execution time of superlu_dist increase in multiprocessing Message-ID: Hi, I'm working on standard eigensolving with spectrum transform. I tried mumps and superlu_dist for ST. But I found that when I run my program with more process, execution time of mumps decrease, but time of superlu_dist increase. 
Both of them are called by options like char common_options[] = "-st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps"; ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); Shall I set more parameters to get benefit of parallel computing when using superlu_dist? My mattype is mpiaij. Your sincerely Guoxi -------------- next part -------------- An HTML attachment was scrubbed... URL: From zyzhang at nuaa.edu.cn Tue May 20 07:31:11 2014 From: zyzhang at nuaa.edu.cn (Zhang) Date: Tue, 20 May 2014 20:31:11 +0800 (GMT+08:00) Subject: [petsc-users] How to run snes ex12 with petsc-3.4.4 Message-ID: <82f67e.5b78.146199d356a.Coremail.zyzhang@nuaa.edu.cn> Dear All, I am trying the PetscFEM solver with petsc-3.4.4. But when I run snes/ex12, I always got run time errors. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] DMPlexProjectFunctionLocal line 230 /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c [1]PETSC ERROR: [1] DMPlexProjectFunction line 338 /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] DMPlexProjectFunctionLocal line 230 /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c [0]PETSC ERROR: [0] DMPlexProjectFunction line 338 /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: Signal received! [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 [1]PETSC ERROR: See docs/changes/index.html for recent updates. [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. 
[1]PETSC ERROR: See docs/index.html for manual pages. [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. ex12 on a arch-linux2-c-opt named toshiba by zhenyu Tue May 20 20:26:56 2014 [1]PETSC ERROR: Libraries linked from /home/zhenyu/petsc-3.4.4/arch-linux2-c-opt/lib [1]PETSC ERROR: Configure run at Mon May 19 23:24:37 2014 [1]PETSC ERROR: Configure options --download-cmake=1 --download-fblaslapack=1 --download-f2cblaslapack=1 --download-fftw=1 --download-ptscotch=1 --download-ctetgen=1 --download-petsc4py=1 --download-ml=1 --download-parmetis=1 --download-metis=1 --download-superlu_dist=1 --download-hypre=1 --download-c2html=1 --download-generator=1 --download-fiat=1 --download-scientificpython=1 --download-sowing=1 --download-triangle=1 --download-chaco=1 --download-boost=1 --download-exodusii=1 --download-netcdf=1 --download-netcdf-shared=1 --download-hdf5=1 --download-moab-shared=1 --download-suitesparse=1 --with-mpi-dir=/home/zhenyu/deps/openmpi-1.6.5 --with-pthread=1 --with-valgrind=1 [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ex12 on a arch-linux2-c-opt named toshiba by zhenyu Tue May 20 20:26:56 2014 [0]PETSC ERROR: Libraries linked from /home/zhenyu/petsc-3.4.4/arch-linux2-c-opt/lib [0]PETSC ERROR: Configure run at Mon May 19 23:24:37 2014 [0]PETSC ERROR: Configure options --download-cmake=1 --download-fblaslapack=1 --download-f2cblaslapack=1 --download-fftw=1 --download-ptscotch=1 --download-ctetgen=1 --download-petsc4py=1 --download-ml=1 --download-parmetis=1 --download-metis=1 --download-superlu_dist=1 --download-hypre=1 --download-c2html=1 --download-generator=1 --download-fiat=1 --download-scientificpython=1 --download-sowing=1 --download-triangle=1 --download-chaco=1 --download-boost=1 --download-exodusii=1 --download-netcdf=1 --download-netcdf-shared=1 --download-hdf5=1 --download-moab-shared=1 --download-suitesparse=1 --with-mpi-dir=/home/zhenyu/deps/openmpi-1.6.5 --with-pthread=1 --with-valgrind=1 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpirun has exited due to process rank 1 with PID 3027 on node toshiba exiting improperly. There are two reasons this could occur: 1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination. 2. this process called "init", but exited without calling "finalize". 
By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination" This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here). -------------------------------------------------------------------------- [toshiba:03025] 1 more process has sent help message help-mpi-api.txt / mpi-abort [toshiba:03025] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages Well, for a smooth compiling, I made two correction to ex12.c Line 195: options->fem.bcFuncs = (void (**)(const PetscReal[], PetscScalar *)) &options->exactFuncs; Line 574: void (*initialGuess[numComponents])(const PetscReal x[],PetscScalar* u); then generate ex12.h by PETSC_DIR=$HOME/petsc-3.4.4 DIM=2 ORDER=1 CASE=ex12 $PETSC_DIR/bin/pythonscripts/PetscGenerateFEMQuadrature.py \ $DIM $ORDER $DIM 1 laplacian \ $DIM $ORDER $DIM 1 boundary \ $PETSC_DIR/src/snes/examples/tutorials/$CASE.h Since I am still not fully master the machnism of PetscFEM, could anyone show me a proper way to run this demo? Many thanks. Zhenyu -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue May 20 08:13:40 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Tue, 20 May 2014 09:13:40 -0400 Subject: [petsc-users] Execution time of superlu_dist increase in multiprocessing In-Reply-To: <22c24f97a03440af9e5ab46fd6ff25e9@LUCKMAN.anl.gov> References: <22c24f97a03440af9e5ab46fd6ff25e9@LUCKMAN.anl.gov> Message-ID: ?? : You have to experiment to find out which package and options to give better performance. Run your code with '-help' to see available runtime options for mumps and superlu_dist. Then try different options. I would try different matrix ordering first. Different packages and solvers give different performance for an application. One cannot expect same performance. Hong > > I'm working on standard eigensolving with spectrum transform. I tried mumps > and superlu_dist for ST. But I found that when I run my program with more > process, execution time of mumps decrease, but time of superlu_dist > increase. Both of them are called by options like > > char common_options[] = "-st_ksp_type preonly -st_pc_type lu > -st_pc_factor_mat_solver_package mumps"; > > ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); > > Shall I set more parameters to get benefit of parallel computing when using > superlu_dist? My mattype is mpiaij. > > Your sincerely > Guoxi From jed at jedbrown.org Tue May 20 08:17:27 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 20 May 2014 07:17:27 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> Message-ID: <87wqdgpp20.fsf@jedbrown.org> Matthew Knepley writes: >> array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); >> for (k = 0; k < dof; k++){ >> array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); >> } >> >> When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc >> complains because I am not passing it the right way or when it accepts it, >> wrong data is passed because the solution is not correct. Maybe Petsc >> expect dof*dof values and only dof are passed ? >> > > You can only pass contiguous memory to MatSetValues(). And, while perhaps atypical, VecGetArray2D will give you contiguous memory behind the scenes, so it would work in this case. 
(Make a Vec of the right size using COMM_SELF instead of malloc.) With C99, you can use VLA pointers to get the "2D indexing" without setting up explicit pointers. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From christophe.ortiz at ciemat.es Tue May 20 08:24:31 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Tue, 20 May 2014 15:24:31 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87wqdgpp20.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 3:17 PM, Jed Brown wrote: > Matthew Knepley writes: > >> array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > >> for (k = 0; k < dof; k++){ > >> array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > >> } > >> > >> When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc > >> complains because I am not passing it the right way or when it accepts > it, > >> wrong data is passed because the solution is not correct. Maybe Petsc > >> expect dof*dof values and only dof are passed ? > >> > > > > You can only pass contiguous memory to MatSetValues(). > > And, while perhaps atypical, VecGetArray2D will give you contiguous > memory behind the scenes, so it would work in this case. (Make a Vec of > the right size using COMM_SELF instead of malloc.) > > With C99, you can use VLA pointers to get the "2D indexing" without > setting up explicit pointers. > Since for some reasons I need global two-dimensional arrays, what I did is the following. I declared a PetscScalar **array outside main(), ie before dof is determined. Then, knowing dof I use malloc inside main() to allocate memory to the array. I use then array in different functions and in order to pass it to MatSetValues, I copy it to a local and classical two-dimensional array[dof][dof] (contiguous memory) which is passed to MatSetValues. It works. But I'll try with VecGetArray2D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue May 20 08:31:36 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 20 May 2014 07:31:36 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> Message-ID: <87tx8kpoef.fsf@jedbrown.org> Christophe Ortiz writes: > Since for some reasons I need global two-dimensional arrays, what I did is > the following. > I declared a PetscScalar **array outside main(), ie before dof is > determined. > Then, knowing dof I use malloc inside main() to allocate memory to the > array. I use then array in different functions and in order to pass it to > MatSetValues, I copy it to a local and classical two-dimensional > array[dof][dof] (contiguous memory) which is passed to MatSetValues. It > works. This sounds like a perverse way to structure your code, but if you insist... -------------- next part -------------- A non-text attachment was scrubbed... 
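The C99 VLA-pointer idea mentioned above could look roughly like this: 2D indexing over a single flat buffer, with no per-row allocations (dof, row, col and Jpre are reused from the thread, and the snippet is only a sketch):

  PetscScalar    *flat = (PetscScalar*)malloc(sizeof(PetscScalar) * dof * dof);
  PetscScalar    (*a)[dof] = (PetscScalar (*)[dof])flat;   /* a[i][j] indexes flat */
  PetscInt       i, j;
  PetscErrorCode ierr;

  for (i = 0; i < dof; i++)
    for (j = 0; j < dof; j++)
      a[i][j] = 0.0;                                        /* fill the block */

  ierr = MatSetValuesBlocked(*Jpre, 1, &row, 1, &col, flat, INSERT_VALUES);CHKERRQ(ierr);
  free(flat);
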
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From christophe.ortiz at ciemat.es Tue May 20 08:34:26 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Tue, 20 May 2014 15:34:26 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87tx8kpoef.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87tx8kpoef.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 3:31 PM, Jed Brown wrote: > Christophe Ortiz writes: > > Since for some reasons I need global two-dimensional arrays, what I did > is > > the following. > > I declared a PetscScalar **array outside main(), ie before dof is > > determined. > > Then, knowing dof I use malloc inside main() to allocate memory to the > > array. I use then array in different functions and in order to pass it to > > MatSetValues, I copy it to a local and classical two-dimensional > > array[dof][dof] (contiguous memory) which is passed to MatSetValues. It > > works. > > This sounds like a perverse way to structure your code, but if you > insist... > Jeje. Just trying different options... -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Tue May 20 09:03:37 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Tue, 20 May 2014 16:03:37 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87wqdgpp20.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 3:17 PM, Jed Brown wrote: > Matthew Knepley writes: > >> array = (PetscScalar**)malloc(sizeof(PetscScalar*) * dof); > >> for (k = 0; k < dof; k++){ > >> array[k] = (PetscScalar*)malloc(sizeof(PetscScalar) * dof); > >> } > >> > >> When I pass it to MatSetValuesBlocked() there is a problem. Either Petsc > >> complains because I am not passing it the right way or when it accepts > it, > >> wrong data is passed because the solution is not correct. Maybe Petsc > >> expect dof*dof values and only dof are passed ? > >> > > > > You can only pass contiguous memory to MatSetValues(). > > And, while perhaps atypical, VecGetArray2D will give you contiguous > memory behind the scenes, so it would work in this case. (Make a Vec of > the right size using COMM_SELF instead of malloc.) > > With C99, you can use VLA pointers to get the "2D indexing" without > setting up explicit pointers. > Would the following be ok ? //Creation of vector X of size dof*dof: VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X); // Using two-dimensional array style: PetscScalar *x; VecGetArray2d(X,dof,dof,0,0,&x); x[i][j] = ...; Is it ok ? Then, what should be passed to MatSetValuesBlocked() ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 20 09:20:06 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 09:20:06 -0500 Subject: [petsc-users] How to run snes ex12 with petsc-3.4.4 In-Reply-To: <82f67e.5b78.146199d356a.Coremail.zyzhang@nuaa.edu.cn> References: <82f67e.5b78.146199d356a.Coremail.zyzhang@nuaa.edu.cn> Message-ID: On Tue, May 20, 2014 at 7:31 AM, Zhang wrote: > Dear All, > > I am trying the PetscFEM solver with petsc-3.4.4. > That is changing quickly since it is very new. Can you use 'master'? http://www.mcs.anl.gov/petsc/developers/index.html If you use that, Python is no longer required. 
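For reference, a corrected version of the snippet above could look like the following: the pointer has to be PetscScalar **, and the array must be restored and the Vec destroyed afterwards, as the reply further down in the thread points out. Treat it as a sketch only:

  Vec            X;
  PetscScalar    **x;
  PetscInt       i, j;
  PetscErrorCode ierr;

  ierr = VecCreateSeq(PETSC_COMM_SELF, dof*dof, &X);CHKERRQ(ierr);
  ierr = VecGetArray2d(X, dof, dof, 0, 0, &x);CHKERRQ(ierr);
  for (i = 0; i < dof; i++)
    for (j = 0; j < dof; j++)
      x[i][j] = 0.0;                       /* fill the dof x dof block */
  /* the storage behind x is contiguous, so the block can go straight in */
  ierr = MatSetValuesBlocked(*Jpre, 1, &row, 1, &col, &x[0][0], INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecRestoreArray2d(X, dof, dof, 0, 0, &x);CHKERRQ(ierr);
  ierr = VecDestroy(&X);CHKERRQ(ierr);
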
Also, we will release very soon, so its not a waste. Thanks, Matt > But when I run snes/ex12, I always got run time errors. > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSCERROR: or try > http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try > http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: [0]PETSC ERROR: Note: The EXACT line numbers in the stack > are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] DMPlexProjectFunctionLocal line 230 > /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c > [1]PETSC ERROR: [1] DMPlexProjectFunction line 338 > /home/zhenyu/petsc-3.4.4/src/dm/impl s/plex/plexfem.c > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] DMPlexProjectFunctionLocal line 230 > /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c > [0]PETSC ERROR: [0] DMPlexProjectFunction line 338 > /home/zhenyu/petsc-3.4.4/src/dm/impls/plex/plexfem.c > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: Signal received! > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: [0]PETSC ERROR: See docs/faq.html for hints about trouble > shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> ex12 on a arch-linux2-c-opt named toshiba by zhenyu Tue May 20 20:26:56 > 2014 > [1]PETSC ERROR: Libraries linked from > /home/zhenyu/petsc-3.4.4/arch-linux2-c-opt/lib > [1]PETSC ERROR: Configure run at Mon May 19 23:24:37 2014 > [1]PETSC ERROR: Configure options --download-cmake=1 > --download-fblaslapack=1 --download-f2cblaslapack=1 --download-fftw=1 > --download-ptscotch=1 --download-ctetgen=1 --download-petsc4py=1 > --download-ml=1 --download-parmetis=1 --download-metis=1 > --download-superlu_dist=1 --download-hypre=1 --download-c2html=1 > --download-generator=1 --download-fiat=1 --download-scientificpython=1 > --download-sowing=1 --download -triangle=1 --download-chaco=1 > --download-boost=1 --download-exodusii=1 --download-netcdf=1 > --download-netcdf-shared=1 --download-hdf5=1 --download-moab-shared=1 > --download-suitesparse=1 --with-mpi-dir=/home/zhenyu/deps/openmpi-1.6.5 > --with-pthread=1 --with-valgrind=1 > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ex12 on a arch-linux2-c-opt named toshiba by zhenyu Tue > May 20 20:26:56 2014 > [0]PETSC ERROR: Libraries linked from > /home/zhenyu/petsc-3.4.4/arch-linux2-c-opt/lib > [0]PETSC ERROR: Configure run at Mon May 19 23:24:37 2014 > [0]PETSC ERROR: Configure options --download-cmake=1 > --download-fblaslapack=1 --download-f2cblaslapack=1 --download-fftw=1 > --download-ptscotch=1 --download-ctetgen=1 --download-petsc4py=1 > --download-ml=1 --download-parmetis=1 --download-metis=1 > --download-superlu_dist=1 --download-hypre=1 --download-c2html=1 > --download-generator=1 --download-fiat=1 --download-scientificpython=1 > --download-sowing=1 --download-triangle=1 --download-chaco=1 > --download-boost=1 --download-exodusii=1 --download-netcdf=1 > --download-netcdf-shared=1 --download-hdf5=1 --download-moab-shared=1 > --download-suitesparse=1 --with-mpi-dir=/home/zhenyu/deps/openmpi-1.6.5 > --with-pthread=1 --with-valgrind=1 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > mpirun has exited due to process rank 1 with PID 3027 on > node toshiba exiting improperly. There are two reasons this could occur: > > 1. this process did not call "init" before exiting, but others in > the job did. This can cause a job to hang indefinitely while it waits > for all processes to call "init". By rule, if one process calls "init", > then ALL processes must call "init" prior to termination. > > 2. this process called "init", but exited without calling "finalize". 
> By rule, all processes that call "init" MUST call "finalize" prior to > exiting or it will be considered an "abnormal termination" > > This may have caused other processes in the application to be > terminated by signals sent by mpirun (as reported here). > ----------------------------------------------------------- --------------- > [toshiba:03025] 1 more process has sent help message help-mpi-api.txt / > mpi-abort > [toshiba:03025] Set MCA parameter "orte_base_help_aggregate" to 0 to see > all help / error messages > > > Well, for a smooth compiling, I made two correction to ex12.c > > Line 195: options->fem.bcFuncs = (void (**)(const PetscReal[], > PetscScalar *)) &options->exactFuncs; > > Line 574: void (*initialGuess[numComponents])(const PetscReal > x[],PetscScalar* u); > > then generate ex12.h by > > PETSC_DIR=$HOME/petsc-3.4.4 > DIM=2 > ORDER=1 > > CASE=ex12 > $PETSC_DIR/bin/pythonscripts/PetscGenerateFEMQuadrature.py \ > $DIM $ORDER $DIM 1 laplacian \ > $DIM $ORDER $DIM 1 boundary \ > $PETSC_DIR/src/snes/examples/tutorials/$CASE.h > > Since I am still not fully master the machnism of PetscFEM, > > could anyone show me a proper way to run this demo? Many thanks. > > > Zhenyu > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue May 20 09:25:00 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 20 May 2014 08:25:00 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> Message-ID: <87lhtwplxf.fsf@jedbrown.org> Christophe Ortiz writes: > Would the following be ok ? > > //Creation of vector X of size dof*dof: > VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X); > > // Using two-dimensional array style: > PetscScalar *x; This needs to be PetscScalar **x; as you would have noticed if you tried to compile. > VecGetArray2d(X,dof,dof,0,0,&x); > > x[i][j] = ...; Yes. > Is it ok ? > Then, what should be passed to MatSetValuesBlocked() ? Since the array starts are (0,0), you can just pass &x[0][0]. Remember to call VecRestoreArray2d() and eventually VecDestroy(). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue May 20 10:20:54 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 20 May 2014 10:20:54 -0500 Subject: [petsc-users] Execution time of superlu_dist increase in multiprocessing In-Reply-To: References: Message-ID: <4E4F2903-433E-4630-B49C-BD3CA0F76A1F@mcs.anl.gov> The time of a direct solver depends on the specific algorithms used by the software and very importantly the nonzero structure of the matrix. We sometimes find that one package scales better than a different package on a particular matrix but then the other package works better on a different matrix. So this is not particularly surprising what you report. Barry On May 20, 2014, at 7:28 AM, ??? wrote: > Hi, > > I'm working on standard eigensolving with spectrum transform. I tried mumps and superlu_dist for ST. But I found that when I run my program with more process, execution time of mumps decrease, but time of superlu_dist increase. 
Both of them are called by options like > > char common_options[] = "-st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps"; > > ierr = PetscOptionsInsertString(common_options);CHKERRQ(ierr); > > Shall I set more parameters to get benefit of parallel computing when using superlu_dist? My mattype is mpiaij. > > Your sincerely > Guoxi From danyang.su at gmail.com Tue May 20 13:31:57 2014 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 20 May 2014 11:31:57 -0700 Subject: [petsc-users] Question about DMDA local vector and global vector Message-ID: <537B9F9D.2070609@gmail.com> Hi All, I use DMDA for a flow problem and found the local vector and global vector does not match for 2D and 3D problem when dof >1. For example, the mesh is as follows: |proc 1| proc 2 | proc 3 | |7 8 9|16 17 18|25 26 27| |4 5 6|13 14 15|22 23 24| |1 2 3|10 11 12|19 20 21| /The following functions are used to create DMDA object, global vector and local vector./ call DMDACreate2d(Petsc_Comm_World,DMDA_BOUNDARY_NONE, & DMDA_BOUNDARY_NONE, DMDA_STENCIL_BOX, & nvxgbl,nvzgbl,PETSC_DECIDE,PETSC_DECIDE, & dmda_flow%dof, dmda_flow%swidth, & PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & dmda_flow%da,ierr) call DMCreateGlobalVector(dmda_flow%da,x_flow,ierr) call VecDuplicate(x_flow,b_flow,ierr) call DMCreateLocalVector(dmda_flow%da,x_flow_loc,ierr) call VecDuplicate(x_flow_loc,b_flow_loc,ierr) /The following functions are used to compute the function (b_flow_loc)/ call VecGetArrayF90(b_flow_loc, vecpointer, ierr) vecpointer = (compute the values here...) call VecRestoreArrayF90(b_flow_loc,vecpointer,ierr) call DMLocalToGlobalBegin(dmda_flow%,b_flow_loc,INSERT_VALUES, & b_flow,ierr) call DMLocalToGlobalEnd(dmda_flow%,b_flow_loc,INSERT_VALUES, & b_flow,ierr) /The data of local vector b_flow_loc for proc1, proc2 and proc3 are as follows (just an example, without ghost value)/ proc 1 proc 2 proc 3 1 10 19 2 11 20 3 12 21 4 13 22 5 14 23 6 15 24 ... ... ... /But the global vector b_flow from Vecview shows that the data is stored as follows (left column). I thought the global vector b_flow is like the right column. Is anything wrong here?/ Process [0] Process [0] 1 1 2 2 3 3 10 4 11 5 12 6 ... ... Process [1] Process [1] 4 10 5 11 6 12 13 13 14 14 15 15 ... ... Process [2] Process [2] ... ... Though the data distribution is different from what I thought before, the code works well for 1D problem and most of the 2D and 3D problem, but failed in newton iteration for some 2D problem with dof > 1. I use KSP solver, not SNES solver at present. Thanks and regards, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 20 14:25:43 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 14:25:43 -0500 Subject: [petsc-users] Question about DMDA local vector and global vector In-Reply-To: <537B9F9D.2070609@gmail.com> References: <537B9F9D.2070609@gmail.com> Message-ID: On Tue, May 20, 2014 at 1:31 PM, Danyang Su wrote: > Hi All, > > I use DMDA for a flow problem and found the local vector and global vector > does not match for 2D and 3D problem when dof >1. 
> > For example, the mesh is as follows: > > |proc 1| proc 2 | proc 3 | > |7 8 9|16 17 18|25 26 27| > |4 5 6|13 14 15|22 23 24| > |1 2 3|10 11 12|19 20 21| > > *The following functions are used to create DMDA object, global vector and > local vector.* > > call DMDACreate2d(Petsc_Comm_World,DMDA_BOUNDARY_NONE, & > DMDA_BOUNDARY_NONE, DMDA_STENCIL_BOX, & > nvxgbl,nvzgbl,PETSC_DECIDE,PETSC_DECIDE, & > dmda_flow%dof, dmda_flow%swidth, & > PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & > dmda_flow%da,ierr) > call DMCreateGlobalVector(dmda_flow%da,x_flow,ierr) > call VecDuplicate(x_flow,b_flow,ierr) > call DMCreateLocalVector(dmda_flow%da,x_flow_loc,ierr) > call VecDuplicate(x_flow_loc,b_flow_loc,ierr) > > *The following functions are used to compute the function (b_flow_loc)* > > call VecGetArrayF90(b_flow_loc, vecpointer, ierr) > vecpointer = (compute the values here...) > call VecRestoreArrayF90(b_flow_loc,vecpointer,ierr) > call DMLocalToGlobalBegin(dmda_flow%,b_flow_loc,INSERT_VALUES, & > b_flow,ierr) > call DMLocalToGlobalEnd(dmda_flow%,b_flow_loc,INSERT_VALUES, & > b_flow,ierr) > > > *The data of local vector b_flow_loc for proc1, proc2 and proc3 are as > follows (just an example, without ghost value)* > proc 1 proc 2 proc 3 > 1 10 19 > 2 11 20 > 3 12 21 > 4 13 22 > 5 14 23 > 6 15 24 > ... ... ... > > *But the global vector b_flow from Vecview shows that the data is stored > as follows (left column). I thought the global vector b_flow is like the > right column. Is anything wrong here?* > On output, the global vectors are automatically permuted to the natural ordering. Matt > > Process [0] Process [0] > 1 1 > 2 2 > 3 3 > 10 4 > 11 5 > 12 6 > ... ... > Process [1] Process [1] > 4 10 > 5 11 > 6 12 > 13 13 > 14 14 > 15 15 > ... ... > Process [2] Process [2] > ... ... > > Though the data distribution is different from what I thought before, the > code works well for 1D problem and most of the 2D and 3D problem, but > failed in newton iteration for some 2D problem with dof > 1. I use KSP > solver, not SNES solver at present. > > Thanks and regards, > > Danyang > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lb2653 at columbia.edu Tue May 20 16:33:18 2014 From: lb2653 at columbia.edu (Luc Berger-Vergiat) Date: Tue, 20 May 2014 17:33:18 -0400 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S Message-ID: <537BCA1E.6010508@columbi.edu> Hi all, I am running an FEM simulation that uses Petsc as a linear solver. I am setting up ISs and pass them to a DMShell in order to use the FieldSplit capabilities of Petsc. When I pass the following options to Petsc: " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_0_fields 1,2 -pc_fieldsplit_1_fields 0 -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type ilu -fieldsplit_Field_0_ksp_type gmres -fieldsplit_Field_0_pc_type mg -malloc_log mlog -log_summary time.log" I get an error message: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector() [0]PETSC ERROR: See http://http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-5071-g1163a46 GIT Date: 2014-03-26 22:20:51 -0500 [0]PETSC ERROR: /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap on a arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 [0]PETSC ERROR: Configure options --download-cmake --download-hypre --download-metis --download-mpich --download-parmetis --with-debugging=no --with-share-libraries=no [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in /home/luc/research/petsc/src/dm/impls/shell/dmshell.c [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in /home/luc/research/petsc/src/dm/interface/dm.c [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in /home/luc/research/petsc/src/dm/interface/dmget.c I am not really sure why this happens but it only happens when -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have no problems. I attached the ksp_view in case that's any use. -- Best, Luc -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-16, divergence=1e+16 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=2000, cols=2000 package used to perform factorization: petsc total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=2000, cols=2000 total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_Field_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_Field_0_) 1 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=1 cycles=v Cycles per PCApply=1 Not using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (fieldsplit_Field_0_mg_levels_0_) 1 MPI processes type: chebyshev 
Chebyshev: eigenvalue estimates: min = 0.937483, max = 10.3123 Chebyshev: estimated using: [0 0.1; 0 1.1] KSP Object: (fieldsplit_Field_0_mg_levels_0_est_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_Field_0_mg_levels_0_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: schurcomplement rows=209, cols=209 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=209, cols=2000 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=2000, cols=2000 package used to perform factorization: petsc total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=2000, cols=2000 total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=2000, cols=209 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_Field_0_mg_levels_0_) 1 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: schurcomplement rows=209, cols=209 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 
119 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=209, cols=2000 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=2000, cols=2000 package used to perform factorization: petsc total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=2000, cols=2000 total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=2000, cols=209 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: schurcomplement rows=209, cols=209 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_Field_0_) 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=209, cols=2000 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=2000, cols=2000 package used to perform factorization: petsc total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=2000, cols=2000 total: nonzeros=40000, allocated nonzeros=40000 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: 
seqaij rows=2000, cols=209 total: nonzeros=14800, allocated nonzeros=14800 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 400 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=209, cols=209 total: nonzeros=3209, allocated nonzeros=3209 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 119 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=2209, cols=2209 total: nonzeros=72809, allocated nonzeros=72809 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 519 nodes, limit used is 5 From danyang.su at gmail.com Tue May 20 16:49:31 2014 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 20 May 2014 14:49:31 -0700 Subject: [petsc-users] Question about DMDA local vector and global vector In-Reply-To: References: <537B9F9D.2070609@gmail.com> Message-ID: <537BCDEB.1070508@gmail.com> Hi Matthew, How about the matview output? Is this automatically permuted to the natural ordering too? Thanks, Danyang On 20/05/2014 12:25 PM, Matthew Knepley wrote: > On Tue, May 20, 2014 at 1:31 PM, Danyang Su > wrote: > > Hi All, > > I use DMDA for a flow problem and found the local vector and > global vector does not match for 2D and 3D problem when dof >1. > > For example, the mesh is as follows: > > |proc 1| proc 2 | proc 3 | > |7 8 9|16 17 18|25 26 27| > |4 5 6|13 14 15|22 23 24| > |1 2 3|10 11 12|19 20 21| > > /The following functions are used to create DMDA object, global > vector and local vector./ > > call DMDACreate2d(Petsc_Comm_World,DMDA_BOUNDARY_NONE, & > DMDA_BOUNDARY_NONE, > DMDA_STENCIL_BOX, & > nvxgbl,nvzgbl,PETSC_DECIDE,PETSC_DECIDE, & > dmda_flow%dof, > dmda_flow%swidth, & > PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, & > dmda_flow%da,ierr) > call DMCreateGlobalVector(dmda_flow%da,x_flow,ierr) > call VecDuplicate(x_flow,b_flow,ierr) > call DMCreateLocalVector(dmda_flow%da,x_flow_loc,ierr) > call VecDuplicate(x_flow_loc,b_flow_loc,ierr) > > /The following functions are used to compute the function > (b_flow_loc)/ > > call VecGetArrayF90(b_flow_loc, vecpointer, ierr) > vecpointer = (compute the values here...) > call VecRestoreArrayF90(b_flow_loc,vecpointer,ierr) > call DMLocalToGlobalBegin(dmda_flow%,b_flow_loc,INSERT_VALUES, & > b_flow,ierr) > call DMLocalToGlobalEnd(dmda_flow%,b_flow_loc,INSERT_VALUES, & > b_flow,ierr) > > > /The data of local vector b_flow_loc for proc1, proc2 and proc3 > are as follows (just an example, without ghost value)/ > proc 1 proc 2 proc 3 > 1 10 19 > 2 11 20 > 3 12 21 > 4 13 22 > 5 14 23 > 6 15 24 > ... ... ... > > /But the global vector b_flow from Vecview shows that the data is > stored as follows (left column). I thought the global vector > b_flow is like the right column. Is anything wrong here?/ > > > On output, the global vectors are automatically permuted to the > natural ordering. > > Matt > > > Process [0] Process [0] > 1 1 > 2 2 > 3 3 > 10 4 > 11 5 > 12 6 > ... ... > Process [1] Process [1] > 4 10 > 5 11 > 6 12 > 13 13 > 14 14 > 15 15 > ... ... > Process [2] Process [2] > ... ... > > Though the data distribution is different from what I thought > before, the code works well for 1D problem and most of the 2D and > 3D problem, but failed in newton iteration for some 2D problem > with dof > 1. I use KSP solver, not SNES solver at present. 
> > Thanks and regards, > > Danyang > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue May 20 16:54:14 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 20 May 2014 15:54:14 -0600 Subject: [petsc-users] Question about DMDA local vector and global vector In-Reply-To: <537BCDEB.1070508@gmail.com> References: <537B9F9D.2070609@gmail.com> <537BCDEB.1070508@gmail.com> Message-ID: <8738g4nmk9.fsf@jedbrown.org> Danyang Su writes: > Hi Matthew, > > How about the matview output? Is this automatically permuted to the > natural ordering too? Yes. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From stali at geology.wisc.edu Tue May 20 18:16:56 2014 From: stali at geology.wisc.edu (Tabrez Ali) Date: Tue, 20 May 2014 18:16:56 -0500 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S In-Reply-To: <537BCA1E.6010508@columbi.edu> References: <537BCA1E.6010508@columbi.edu> Message-ID: <537BE268.5080806@geology.wisc.edu> I saw a similar error sometime back while fooling around with fieldsplit. Can you update petsc-dev (git-pull), rebuild and try again. T On 05/20/2014 04:33 PM, Luc Berger-Vergiat wrote: > Hi all, > I am running an FEM simulation that uses Petsc as a linear solver. > I am setting up ISs and pass them to a DMShell in order to use the > FieldSplit capabilities of Petsc. > > When I pass the following options to Petsc: > > " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur > -pc_fieldsplit_schur_factorization_type full > -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_0_fields > 1,2 -pc_fieldsplit_1_fields 0 -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type ilu -fieldsplit_Field_0_ksp_type gmres > -fieldsplit_Field_0_pc_type mg -malloc_log mlog -log_summary time.log" > > I get an error message: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: Must call DMShellSetGlobalVector() or > DMShellSetCreateGlobalVector() > [0]PETSC ERROR: See > http://http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-5071-g1163a46 > GIT Date: 2014-03-26 22:20:51 -0500 > [0]PETSC ERROR: > /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap on a > arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 > [0]PETSC ERROR: Configure options --download-cmake --download-hypre > --download-metis --download-mpich --download-parmetis > --with-debugging=no --with-share-libraries=no > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in > /home/luc/research/petsc/src/dm/impls/shell/dmshell.c > [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in > /home/luc/research/petsc/src/dm/interface/dm.c > [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in > /home/luc/research/petsc/src/dm/interface/dmget.c > > I am not really sure why this happens but it only happens when > -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have no > problems. I attached the ksp_view in case that's any use. 
> -- > Best, > Luc -- No one trusts a model except the one who wrote it; Everyone trusts an observation except the one who made it - Harlow Shapley -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 20 20:14:16 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 20 May 2014 20:14:16 -0500 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S In-Reply-To: <537BCA1E.6010508@columbi.edu> References: <537BCA1E.6010508@columbi.edu> Message-ID: On Tue, May 20, 2014 at 4:33 PM, Luc Berger-Vergiat wrote: > Hi all, > I am running an FEM simulation that uses Petsc as a linear solver. > I am setting up ISs and pass them to a DMShell in order to use the > FieldSplit capabilities of Petsc. > > When I pass the following options to Petsc: > > " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur > -pc_fieldsplit_schur_factorization_type full > -pc_fieldsplit_schur_precondition selfp -pc_fieldsplit_0_fields 1,2 > -pc_fieldsplit_1_fields 0 -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type ilu -fieldsplit_Field_0_ksp_type gmres > -fieldsplit_Field_0_pc_type mg -malloc_log mlog -log_summary time.log" > > I get an error message: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: Must call DMShellSetGlobalVector() or > DMShellSetCreateGlobalVector() > [0]PETSC ERROR: See > http://http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-5071-g1163a46 GIT > Date: 2014-03-26 22:20:51 -0500 > [0]PETSC ERROR: /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap > on a arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 > [0]PETSC ERROR: Configure options --download-cmake --download-hypre > --download-metis --download-mpich --download-parmetis --with-debugging=no > --with-share-libraries=no > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in > /home/luc/research/petsc/src/dm/impls/shell/dmshell.c > [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in > /home/luc/research/petsc/src/dm/interface/dm.c > [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in > /home/luc/research/petsc/src/dm/interface/dmget.c > Always always always give the entire error message. Matt > I am not really sure why this happens but it only happens when > -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have no > problems. I attached the ksp_view in case that's any use. > > -- > Best, > Luc > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Wed May 21 08:18:06 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 21 May 2014 15:18:06 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87lhtwplxf.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> Message-ID: On Tue, May 20, 2014 at 4:25 PM, Jed Brown wrote: > Christophe Ortiz writes: > > Would the following be ok ? 
> > > > //Creation of vector X of size dof*dof: > > VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X); > > > > // Using two-dimensional array style: > > PetscScalar *x; > > This needs to be > > PetscScalar **x; > > as you would have noticed if you tried to compile. > > > VecGetArray2d(X,dof,dof,0,0,&x); > > > > x[i][j] = ...; > > Yes. > > > Is it ok ? > > Then, what should be passed to MatSetValuesBlocked() ? > > Since the array starts are (0,0), you can just pass &x[0][0]. > > Remember to call VecRestoreArray2d() and eventually VecDestroy(). > I tried and it works. The advantage is that it avoids setting up and using pointers. However, I found out that it is significantly slower than using explicit pointers of pointers **. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 21 08:21:47 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 21 May 2014 07:21:47 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> Message-ID: <87egznmfmc.fsf@jedbrown.org> Christophe Ortiz writes: > I tried and it works. The advantage is that it avoids setting up and > using pointers. However, I found out that it is significantly slower > than using explicit pointers of pointers **. Are you creating and destroying in an inner loop? What gets set up is the same, and it's fewer allocations than what you were doing with many calls to malloc. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From christophe.ortiz at ciemat.es Wed May 21 08:31:47 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 21 May 2014 15:31:47 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87egznmfmc.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> <87egznmfmc.fsf@jedbrown.org> Message-ID: On Wed, May 21, 2014 at 3:21 PM, Jed Brown wrote: > Christophe Ortiz writes: > > I tried and it works. The advantage is that it avoids setting up and > > using pointers. However, I found out that it is significantly slower > > than using explicit pointers of pointers **. > > Are you creating and destroying in an inner loop? In some sense, yes. I create and destroy inside FormIJacobian() (my Jacobian evaluation routine). Therefore it is called at each timestep. I guess this takes time. But it is slower than doing the many malloc. How can I create a global vector that would be passed to FormIJacobian() ? Creating it only once instead of doing it at each timestep would save time. I need to use this vector (size dof*dof) with classes and methods inside FormIJacobian() to calculate the different blocks that are passed to the Jacobian with MatSetValuesBlocked(). However, I cannot pass it as argument of FormIJacobian() since there is no room for it in the arguments. > What gets set up is > the same, and it's fewer allocations than what you were doing with many > calls to malloc. > -------------- next part -------------- An HTML attachment was scrubbed... 
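Putting the pieces of this exchange together, a minimal sketch of the pattern under discussion: one dof*dof work Vec viewed as a two-dimensional array with VecGetArray2d(), filled, and handed to MatSetValuesBlocked() through &x[0][0]. The routine name InsertOneBlock, the block indices row/col and the placeholder identity fill are illustrative only; X is the work vector created elsewhere with VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X) and destroyed when no longer needed.

#include <petscmat.h>

PetscErrorCode InsertOneBlock(Mat J, Vec X, PetscInt dof, PetscInt row, PetscInt col)
{
  PetscScalar  **x;
  PetscInt       i, j;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArray2d(X, dof, dof, 0, 0, &x);CHKERRQ(ierr);   /* view the dof*dof Vec as x[i][j] */
  for (i = 0; i < dof; i++) {
    for (j = 0; j < dof; j++) {
      x[i][j] = (i == j) ? 1.0 : 0.0;                          /* placeholder block entries */
    }
  }
  ierr = MatSetValuesBlocked(J, 1, &row, 1, &col, &x[0][0], INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecRestoreArray2d(X, dof, dof, 0, 0, &x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

MatAssemblyBegin()/MatAssemblyEnd() still have to be called once all blocks have been inserted.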
URL: From jed at jedbrown.org Wed May 21 08:44:13 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 21 May 2014 07:44:13 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> <87egznmfmc.fsf@jedbrown.org> Message-ID: <87a9abmeky.fsf@jedbrown.org> Christophe Ortiz writes: > In some sense, yes. I create and destroy inside FormIJacobian() (my > Jacobian evaluation routine). Therefore it is called at each timestep. I > guess this takes time. But it is slower than doing the many malloc. What communicator (you should use VecCreateSeq)? Be sure to profile a configure --with-debugging=0. How many elements do you have on each process? How big are the elements? > How can I create a global vector that would be passed to FormIJacobian() ? > Creating it only once instead of doing it at each timestep would save time. You can/should always put this stuff in the user context (which comes in via the last argument). > I need to use this vector (size dof*dof) with classes and methods inside > FormIJacobian() to calculate the different blocks that are passed to the > Jacobian with MatSetValuesBlocked(). However, I cannot pass it as argument > of FormIJacobian() since there is no room for it in the arguments. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From christophe.ortiz at ciemat.es Wed May 21 08:49:58 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Wed, 21 May 2014 15:49:58 +0200 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: <87a9abmeky.fsf@jedbrown.org> References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> <87egznmfmc.fsf@jedbrown.org> <87a9abmeky.fsf@jedbrown.org> Message-ID: On Wed, May 21, 2014 at 3:44 PM, Jed Brown wrote: > Christophe Ortiz writes: > > In some sense, yes. I create and destroy inside FormIJacobian() (my > > Jacobian evaluation routine). Therefore it is called at each timestep. I > > guess this takes time. But it is slower than doing the many malloc. > > What communicator (you should use VecCreateSeq)? I used: VecCreateSeq(PETSC_COMM_SELF,dof*dof,&X); inside FormIJacobian() > Be sure to profile a > configure --with-debugging=0. > > How many elements do you have on each process? How big are the > elements? > For the moment, dof is small (dof=4). Still doing some tests with the classes and methods. But should reach 1000-10000 in production. > > > How can I create a global vector that would be passed to FormIJacobian() > ? > > Creating it only once instead of doing it at each timestep would save > time. > > You can/should always put this stuff in the user context (which comes in > via the last argument). > Ahhh...did not think about it ! This would allow to create the vector only once in the main (after dof is determined) and pass it as argument to FormIJacobian(). I will try. Thanks ! > > > I need to use this vector (size dof*dof) with classes and methods inside > > FormIJacobian() to calculate the different blocks that are passed to the > > Jacobian with MatSetValuesBlocked(). However, I cannot pass it as > argument > > of FormIJacobian() since there is no room for it in the arguments. > > -------------- next part -------------- An HTML attachment was scrubbed... 
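A sketch of the user-context pattern Jed is pointing at, written against the PETSc 3.4-style IJacobian signature; the AppCtx layout and the names are assumptions, not Christophe's code. The work vector lives in the context, is created once in main() after dof is known, and simply comes back through the last argument of FormIJacobian().

#include <petscts.h>

typedef struct {
  PetscInt dof;
  Vec      Xblock;          /* dof*dof work vector, created once in main() */
} AppCtx;

/* PETSc 3.4-style IJacobian; the assembly body is elided except for the context use. */
PetscErrorCode FormIJacobian(TS ts, PetscReal t, Vec U, Vec Udot, PetscReal a,
                             Mat *J, Mat *Jpre, MatStructure *flag, void *ptr)
{
  AppCtx        *user = (AppCtx*)ptr;   /* arrives via the last argument */
  PetscScalar  **x;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr  = VecGetArray2d(user->Xblock, user->dof, user->dof, 0, 0, &x);CHKERRQ(ierr);
  /* ... fill x[i][j] and call MatSetValuesBlocked(*Jpre, ...) block by block ... */
  ierr  = VecRestoreArray2d(user->Xblock, user->dof, user->dof, 0, 0, &x);CHKERRQ(ierr);
  ierr  = MatAssemblyBegin(*Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr  = MatAssemblyEnd(*Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  *flag = SAME_NONZERO_PATTERN;
  PetscFunctionReturn(0);
}

/* in main(), after dof is determined:
     AppCtx user;
     user.dof = dof;
     ierr = VecCreateSeq(PETSC_COMM_SELF, dof*dof, &user.Xblock);CHKERRQ(ierr);
     ierr = TSSetIJacobian(ts, J, J, FormIJacobian, &user);CHKERRQ(ierr);
     ...
     ierr = VecDestroy(&user.Xblock);CHKERRQ(ierr);                           */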
URL: From jed at jedbrown.org Wed May 21 08:55:40 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 21 May 2014 07:55:40 -0600 Subject: [petsc-users] Memory corruption with two-dimensional array and PetscMemzero In-Reply-To: References: <87zjij6gzu.fsf@jedbrown.org> <87wqdgpp20.fsf@jedbrown.org> <87lhtwplxf.fsf@jedbrown.org> <87egznmfmc.fsf@jedbrown.org> <87a9abmeky.fsf@jedbrown.org> Message-ID: <874n0jme1v.fsf@jedbrown.org> Christophe Ortiz writes: > For the moment, dof is small (dof=4). Still doing some tests with the > classes and methods. But should reach 1000-10000 in production. This is a huge difference. It's a waste of time to profile cases that don't matter, so try to profile the cases that matter (even if the "physics" is mocked). Note that for 4x4, you can use PetscScalar emat[4][4]; > Ahhh...did not think about it ! This would allow to create the vector only > once in the main (after dof is determined) and pass it as argument to > FormIJacobian(). I will try. Thanks ! Yes, that's what the context is for. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From lb2653 at columbia.edu Wed May 21 14:09:23 2014 From: lb2653 at columbia.edu (Luc Berger-Vergiat) Date: Wed, 21 May 2014 15:09:23 -0400 Subject: [petsc-users] Error message for DMShell using MG preconditioner to solve S In-Reply-To: References: <537BCA1E.6010508@columbi.edu> Message-ID: <537CF9E3.4010908@columbi.edu> So I just pulled an updated version of petsc-dev today (I switched from the *next* branch to the *master* branch due to some compilation error existing with the last commit on *next*). I still have the same error and I believe this is the whole error message I have. I mean I am running multiple time steps for my simulation so I have the same message at each time step, but I don't think that it is important to report these duplicates, is it? Best, Luc On 05/20/2014 09:14 PM, Matthew Knepley wrote: > > On Tue, May 20, 2014 at 4:33 PM, Luc Berger-Vergiat > > wrote: > > Hi all, > I am running an FEM simulation that uses Petsc as a linear solver. > I am setting up ISs and pass them to a DMShell in order to use the > FieldSplit capabilities of Petsc. > > When I pass the following options to Petsc: > > " -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type > schur -pc_fieldsplit_schur_factorization_type full > -pc_fieldsplit_schur_precondition selfp > -pc_fieldsplit_0_fields 1,2 -pc_fieldsplit_1_fields 0 > -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type ilu > -fieldsplit_Field_0_ksp_type gmres -fieldsplit_Field_0_pc_type > mg -malloc_log mlog -log_summary time.log" > > I get an error message: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: Must call DMShellSetGlobalVector() or > DMShellSetCreateGlobalVector() > [0]PETSC ERROR: See > http://http://www.mcs.anl.gov/petsc/documentation/faq.html for > trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: > v3.4.4-5071-g1163a46 GIT Date: 2014-03-26 22:20:51 -0500 > [0]PETSC ERROR: > /home/luc/research/feap_repo/ShearBands/parfeap-dev/feap on a > arch-linux2-c-opt named euler by luc Tue May 20 11:31:11 2014 > [0]PETSC ERROR: Configure options --download-cmake > --download-hypre --download-metis --download-mpich > --download-parmetis --with-debugging=no --with-share-libraries=no > [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() line 245 in > /home/luc/research/petsc/src/dm/impls/shell/dmshell.c > [0]PETSC ERROR: #2 DMCreateGlobalVector() line 669 in > /home/luc/research/petsc/src/dm/interface/dm.c > [0]PETSC ERROR: #3 DMGetGlobalVector() line 154 in > /home/luc/research/petsc/src/dm/interface/dmget.c > > > Always always always give the entire error message. > > Matt > > I am not really sure why this happens but it only happens when > -fieldsplit_Field_0_pc_type mg, with other preconditioners, I have > no problems. I attached the ksp_view in case that's any use. > > -- > Best, > Luc > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From stali at geology.wisc.edu Wed May 21 16:48:51 2014 From: stali at geology.wisc.edu (Tabrez Ali) Date: Wed, 21 May 2014 16:48:51 -0500 Subject: [petsc-users] Valgrind unhandled instruction Message-ID: <537D1F43.8010507@geology.wisc.edu> Hello With petsc-dev I get the following error with my own code and also with ex56 as shown below. Both run fine otherwise. This is with Valgrind 3.7 (in Debian stable). Is this a PETSc or Valgrind issue? T stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ valgrind ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace ==16123== Memcheck, a memory error detector ==16123== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. ==16123== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info ==16123== Command: ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace ==16123== vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x38 0x39 ==16123== valgrind: Unrecognised instruction at address 0x4228928. 
==16123== at 0x4228928: ISCreateGeneral_Private (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4228D54: ISGeneralSetIndices_General (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4229504: ISGeneralSetIndices (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x422976F: ISCreateGeneral (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4A94CAA: PCGAMGCoarsen_AGG (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4A84FD6: PCSetUp_GAMG (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x49E8163: PCSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x4AE6023: KSPSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) ==16123== by 0x804C7D6: main (in /home/stali/petsc-dev/src/ksp/ksp/examples/tutorials/ex56) ==16123== Your program just tried to execute an instruction that Valgrind ==16123== did not recognise. There are two possible reasons for this. ==16123== 1. Your program has a bug and erroneously jumped to a non-code ==16123== location. If you are running Memcheck and you just saw a ==16123== warning about a bad jump, it's probably your program's fault. ==16123== 2. The instruction is legitimate but Valgrind doesn't handle it, ==16123== i.e. it's Valgrind's fault. If you think this is the case or ==16123== you are not sure, please let us know and we'll try to fix it. ==16123== Either way, Valgrind will now raise a SIGILL signal which will ==16123== probably kill your program. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 4 Illegal instruction: Likely due to memory corruption [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] ISCreateGeneral_Private line 575 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c [0]PETSC ERROR: [0] ISGeneralSetIndices_General line 674 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c [0]PETSC ERROR: [0] ISGeneralSetIndices line 662 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c [0]PETSC ERROR: [0] ISCreateGeneral line 631 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c [0]PETSC ERROR: [0] PCGAMGCoarsen_AGG line 976 /home/stali/petsc-dev/src/ksp/pc/impls/gamg/agg.c [0]PETSC ERROR: [0] PCSetUp_GAMG line 487 /home/stali/petsc-dev/src/ksp/pc/impls/gamg/gamg.c [0]PETSC ERROR: [0] KSPSetUp line 219 /home/stali/petsc-dev/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4344-ge0d8a6f GIT Date: 2014-05-21 16:02:44 -0500 [0]PETSC ERROR: ./ex56 on a arch-linux2-c-debug named i5 by stali Wed May 21 16:41:07 2014 [0]PETSC ERROR: Configure options --with-fc=gfortran --with-cc=gcc --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native" --with-shared-libraries --with-debugging=1 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 ==16123== ==16123== HEAP SUMMARY: ==16123== in use at exit: 4,627,684 bytes in 1,188 blocks ==16123== total heap usage: 1,649 allocs, 461 frees, 6,073,192 bytes allocated ==16123== ==16123== LEAK SUMMARY: ==16123== definitely lost: 0 bytes in 0 blocks ==16123== indirectly lost: 0 bytes in 0 blocks ==16123== possibly lost: 0 bytes in 0 blocks ==16123== still reachable: 4,627,684 bytes in 1,188 blocks ==16123== suppressed: 0 bytes in 0 blocks ==16123== Rerun with --leak-check=full to see details of leaked memory ==16123== ==16123== For counts of detected and suppressed errors, rerun with: -v ==16123== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 61 from 8) From balay at mcs.anl.gov Wed May 21 17:27:53 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 21 May 2014 17:27:53 -0500 Subject: [petsc-users] Valgrind unhandled instruction In-Reply-To: <537D1F43.8010507@geology.wisc.edu> References: <537D1F43.8010507@geology.wisc.edu> Message-ID: Looks like valgrind-3.7 doesn't know all instructions generated by "-O3 -march=native". And generally one should run valgrind with code compiled with '-g' anyway. I see similar issue with valgrind-3.7 - but the error goes away with valgrind-3.9 [compiled from source] Satish ----------- balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --version valgrind-3.7.0 balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --tool=memcheck -q ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace vex amd64->IR: unhandled instruction bytes: 0xC5 0xFB 0x2A 0xC2 0xBA 0x1 0x0 0x0 ==19041== valgrind: Unrecognised instruction at address 0x4ef760e. ==19041== at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56) ==19041== Your program just tried to execute an instruction that Valgrind ==19041== did not recognise. There are two possible reasons for this. ==19041== 1. Your program has a bug and erroneously jumped to a non-code ==19041== location. If you are running Memcheck and you just saw a ==19041== warning about a bad jump, it's probably your program's fault. ==19041== 2. The instruction is legitimate but Valgrind doesn't handle it, ==19041== i.e. it's Valgrind's fault. If you think this is the case or ==19041== you are not sure, please let us know and we'll try to fix it. ==19041== Either way, Valgrind will now raise a SIGILL signal which will ==19041== probably kill your program. 
==19041== ==19041== Process terminating with default action of signal 4 (SIGILL) ==19041== Illegal opcode at address 0x4EF760E ==19041== at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) ==19041== by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56) Illegal instruction (core dumped) balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --version valgrind-3.9.0 balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --tool=memcheck -q ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace 0 KSP Residual norm 740.547 1 KSP Residual norm 104.004 2 KSP Residual norm 79.1334 3 KSP Residual norm 50.0497 4 KSP Residual norm 4.40859 5 KSP Residual norm 1.56451 6 KSP Residual norm 0.601773 7 KSP Residual norm 0.225864 8 KSP Residual norm 0.0122203 9 KSP Residual norm 0.00290625 0 KSP Residual norm 0.00740547 1 KSP Residual norm 0.00104004 2 KSP Residual norm 0.000791334 3 KSP Residual norm 0.000500497 4 KSP Residual norm 4.40859e-05 5 KSP Residual norm 1.56451e-05 6 KSP Residual norm 6.01773e-06 7 KSP Residual norm 2.25864e-06 8 KSP Residual norm 1.22203e-07 9 KSP Residual norm 2.90625e-08 0 KSP Residual norm 7.40547e-08 1 KSP Residual norm 1.04004e-08 2 KSP Residual norm 7.91334e-09 3 KSP Residual norm 5.00497e-09 4 KSP Residual norm 4.409e-10 5 KSP Residual norm 1.565e-10 6 KSP Residual norm 6.018e-11 7 KSP Residual norm 2.259e-11 8 KSP Residual norm < 1.e-11 9 KSP Residual norm < 1.e-11 [0]main |b-Ax|/|b|=6.068344e-05, |b|=5.391826e+00, emax=9.964453e-01 balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ On Wed, 21 May 2014, Tabrez Ali wrote: > Hello > > With petsc-dev I get the following error with my own code and also with ex56 > as shown below. Both run fine otherwise. This is with Valgrind 3.7 (in Debian > stable). > > Is this a PETSc or Valgrind issue? > > T > > stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ valgrind ./ex56 -ne 9 > -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 > -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves > -ksp_monitor_short -use_mat_nearnullspace > ==16123== Memcheck, a memory error detector > ==16123== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. > ==16123== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info > ==16123== Command: ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg > -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 > -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short > -use_mat_nearnullspace > ==16123== > vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x38 0x39 > ==16123== valgrind: Unrecognised instruction at address 0x4228928. 
> ==16123== at 0x4228928: ISCreateGeneral_Private (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4228D54: ISGeneralSetIndices_General (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4229504: ISGeneralSetIndices (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x422976F: ISCreateGeneral (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4A94CAA: PCGAMGCoarsen_AGG (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4A84FD6: PCSetUp_GAMG (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x49E8163: PCSetUp (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x4AE6023: KSPSetUp (in > /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) > ==16123== by 0x804C7D6: main (in > /home/stali/petsc-dev/src/ksp/ksp/examples/tutorials/ex56) > ==16123== Your program just tried to execute an instruction that Valgrind > ==16123== did not recognise. There are two possible reasons for this. > ==16123== 1. Your program has a bug and erroneously jumped to a non-code > ==16123== location. If you are running Memcheck and you just saw a > ==16123== warning about a bad jump, it's probably your program's fault. > ==16123== 2. The instruction is legitimate but Valgrind doesn't handle it, > ==16123== i.e. it's Valgrind's fault. If you think this is the case or > ==16123== you are not sure, please let us know and we'll try to fix it. > ==16123== Either way, Valgrind will now raise a SIGILL signal which will > ==16123== probably kill your program. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 4 Illegal instruction: Likely due to > memory corruption > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or > try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] ISCreateGeneral_Private line 575 > /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c > [0]PETSC ERROR: [0] ISGeneralSetIndices_General line 674 > /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c > [0]PETSC ERROR: [0] ISGeneralSetIndices line 662 > /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c > [0]PETSC ERROR: [0] ISCreateGeneral line 631 > /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c > [0]PETSC ERROR: [0] PCGAMGCoarsen_AGG line 976 > /home/stali/petsc-dev/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: [0] PCSetUp_GAMG line 487 > /home/stali/petsc-dev/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: [0] KSPSetUp line 219 > /home/stali/petsc-dev/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4344-ge0d8a6f GIT > Date: 2014-05-21 16:02:44 -0500 > [0]PETSC ERROR: ./ex56 on a arch-linux2-c-debug named i5 by stali Wed May 21 > 16:41:07 2014 > [0]PETSC ERROR: Configure options --with-fc=gfortran --with-cc=gcc > --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS="-O3 > -march=native" --FOPTFLAGS="-O3 -march=native" --with-shared-libraries > --with-debugging=1 > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > ==16123== > ==16123== HEAP SUMMARY: > ==16123== in use at exit: 4,627,684 bytes in 1,188 blocks > ==16123== total heap usage: 1,649 allocs, 461 frees, 6,073,192 bytes > allocated > ==16123== > ==16123== LEAK SUMMARY: > ==16123== definitely lost: 0 bytes in 0 blocks > ==16123== indirectly lost: 0 bytes in 0 blocks > ==16123== possibly lost: 0 bytes in 0 blocks > ==16123== still reachable: 4,627,684 bytes in 1,188 blocks > ==16123== suppressed: 0 bytes in 0 blocks > ==16123== Rerun with --leak-check=full to see details of leaked memory > ==16123== > ==16123== For counts of detected and suppressed errors, rerun with: -v > ==16123== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 61 from 8) > > From stali at geology.wisc.edu Wed May 21 17:57:52 2014 From: stali at geology.wisc.edu (Tabrez Ali) Date: Wed, 21 May 2014 17:57:52 -0500 Subject: [petsc-users] Valgrind unhandled instruction In-Reply-To: References: <537D1F43.8010507@geology.wisc.edu> Message-ID: <537D2F70.5000703@geology.wisc.edu> Sorry I missed the flags. Thanks for the clarification. Tabrez On 05/21/2014 05:27 PM, Satish Balay wrote: > Looks like valgrind-3.7 doesn't know all instructions generated by > "-O3 -march=native". > > And generally one should run valgrind with code compiled with '-g' > anyway. > > I see similar issue with valgrind-3.7 - but the error goes away with > valgrind-3.9 [compiled from source] > > Satish > > ----------- > > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --version > valgrind-3.7.0 > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --tool=memcheck -q ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace > vex amd64->IR: unhandled instruction bytes: 0xC5 0xFB 0x2A 0xC2 0xBA 0x1 0x0 0x0 > ==19041== valgrind: Unrecognised instruction at address 0x4ef760e. > ==19041== at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56) > ==19041== Your program just tried to execute an instruction that Valgrind > ==19041== did not recognise. There are two possible reasons for this. > ==19041== 1. Your program has a bug and erroneously jumped to a non-code > ==19041== location. If you are running Memcheck and you just saw a > ==19041== warning about a bad jump, it's probably your program's fault. > ==19041== 2. The instruction is legitimate but Valgrind doesn't handle it, > ==19041== i.e. it's Valgrind's fault. 
If you think this is the case or > ==19041== you are not sure, please let us know and we'll try to fix it. > ==19041== Either way, Valgrind will now raise a SIGILL signal which will > ==19041== probably kill your program. > ==19041== > ==19041== Process terminating with default action of signal 4 (SIGILL) > ==19041== Illegal opcode at address 0x4EF760E > ==19041== at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4) > ==19041== by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56) > Illegal instruction (core dumped) > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --version > valgrind-3.9.0 > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --tool=memcheck -q ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace > 0 KSP Residual norm 740.547 > 1 KSP Residual norm 104.004 > 2 KSP Residual norm 79.1334 > 3 KSP Residual norm 50.0497 > 4 KSP Residual norm 4.40859 > 5 KSP Residual norm 1.56451 > 6 KSP Residual norm 0.601773 > 7 KSP Residual norm 0.225864 > 8 KSP Residual norm 0.0122203 > 9 KSP Residual norm 0.00290625 > 0 KSP Residual norm 0.00740547 > 1 KSP Residual norm 0.00104004 > 2 KSP Residual norm 0.000791334 > 3 KSP Residual norm 0.000500497 > 4 KSP Residual norm 4.40859e-05 > 5 KSP Residual norm 1.56451e-05 > 6 KSP Residual norm 6.01773e-06 > 7 KSP Residual norm 2.25864e-06 > 8 KSP Residual norm 1.22203e-07 > 9 KSP Residual norm 2.90625e-08 > 0 KSP Residual norm 7.40547e-08 > 1 KSP Residual norm 1.04004e-08 > 2 KSP Residual norm 7.91334e-09 > 3 KSP Residual norm 5.00497e-09 > 4 KSP Residual norm 4.409e-10 > 5 KSP Residual norm 1.565e-10 > 6 KSP Residual norm 6.018e-11 > 7 KSP Residual norm 2.259e-11 > 8 KSP Residual norm< 1.e-11 > 9 KSP Residual norm< 1.e-11 > [0]main |b-Ax|/|b|=6.068344e-05, |b|=5.391826e+00, emax=9.964453e-01 > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ > > > On Wed, 21 May 2014, Tabrez Ali wrote: > >> Hello >> >> With petsc-dev I get the following error with my own code and also with ex56 >> as shown below. Both run fine otherwise. This is with Valgrind 3.7 (in Debian >> stable). >> >> Is this a PETSc or Valgrind issue? >> >> T >> >> stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ valgrind ./ex56 -ne 9 >> -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 >> -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves >> -ksp_monitor_short -use_mat_nearnullspace >> ==16123== Memcheck, a memory error detector >> ==16123== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al. >> ==16123== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info >> ==16123== Command: ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg >> -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 >> -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short >> -use_mat_nearnullspace >> ==16123== >> vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x38 0x39 >> ==16123== valgrind: Unrecognised instruction at address 0x4228928. 
>> ==16123== at 0x4228928: ISCreateGeneral_Private (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4228D54: ISGeneralSetIndices_General (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4229504: ISGeneralSetIndices (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x422976F: ISCreateGeneral (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4A94CAA: PCGAMGCoarsen_AGG (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4A84FD6: PCSetUp_GAMG (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x49E8163: PCSetUp (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x4AE6023: KSPSetUp (in >> /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4) >> ==16123== by 0x804C7D6: main (in >> /home/stali/petsc-dev/src/ksp/ksp/examples/tutorials/ex56) >> ==16123== Your program just tried to execute an instruction that Valgrind >> ==16123== did not recognise. There are two possible reasons for this. >> ==16123== 1. Your program has a bug and erroneously jumped to a non-code >> ==16123== location. If you are running Memcheck and you just saw a >> ==16123== warning about a bad jump, it's probably your program's fault. >> ==16123== 2. The instruction is legitimate but Valgrind doesn't handle it, >> ==16123== i.e. it's Valgrind's fault. If you think this is the case or >> ==16123== you are not sure, please let us know and we'll try to fix it. >> ==16123== Either way, Valgrind will now raise a SIGILL signal which will >> ==16123== probably kill your program. >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 4 Illegal instruction: Likely due to >> memory corruption >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or >> try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory >> corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. 
>> [0]PETSC ERROR: [0] ISCreateGeneral_Private line 575 >> /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c >> [0]PETSC ERROR: [0] ISGeneralSetIndices_General line 674 >> /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c >> [0]PETSC ERROR: [0] ISGeneralSetIndices line 662 >> /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c >> [0]PETSC ERROR: [0] ISCreateGeneral line 631 >> /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c >> [0]PETSC ERROR: [0] PCGAMGCoarsen_AGG line 976 >> /home/stali/petsc-dev/src/ksp/pc/impls/gamg/agg.c >> [0]PETSC ERROR: [0] PCSetUp_GAMG line 487 >> /home/stali/petsc-dev/src/ksp/pc/impls/gamg/gamg.c >> [0]PETSC ERROR: [0] KSPSetUp line 219 >> /home/stali/petsc-dev/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Signal received >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for >> trouble shooting. >> [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4344-ge0d8a6f GIT >> Date: 2014-05-21 16:02:44 -0500 >> [0]PETSC ERROR: ./ex56 on a arch-linux2-c-debug named i5 by stali Wed May 21 >> 16:41:07 2014 >> [0]PETSC ERROR: Configure options --with-fc=gfortran --with-cc=gcc >> --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS="-O3 >> -march=native" --FOPTFLAGS="-O3 -march=native" --with-shared-libraries >> --with-debugging=1 >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> [unset]: aborting job: >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> ==16123== >> ==16123== HEAP SUMMARY: >> ==16123== in use at exit: 4,627,684 bytes in 1,188 blocks >> ==16123== total heap usage: 1,649 allocs, 461 frees, 6,073,192 bytes >> allocated >> ==16123== >> ==16123== LEAK SUMMARY: >> ==16123== definitely lost: 0 bytes in 0 blocks >> ==16123== indirectly lost: 0 bytes in 0 blocks >> ==16123== possibly lost: 0 bytes in 0 blocks >> ==16123== still reachable: 4,627,684 bytes in 1,188 blocks >> ==16123== suppressed: 0 bytes in 0 blocks >> ==16123== Rerun with --leak-check=full to see details of leaked memory >> ==16123== >> ==16123== For counts of detected and suppressed errors, rerun with: -v >> ==16123== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 61 from 8) >> >> From likunt at caltech.edu Thu May 22 09:32:10 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Thu, 22 May 2014 07:32:10 -0700 (PDT) Subject: [petsc-users] output vec Message-ID: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> Dear Petsc developers, I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like this: m1 m2 m3 m4 m5 m6 ... Here is my code to do this: ================================================================== PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); for(int step=0; step References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> Message-ID: <878uptj2n2.fsf@jedbrown.org> likunt at caltech.edu writes: > Dear Petsc developers, > > I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like this: > > m1 m2 m3 > m4 m5 m6 > ... 
> > Here is my code to do this: > > ================================================================== > PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); > PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); > PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); > for(int step=0; step { > //calculate M at current step > DMDAVecGetArray(da, M, &aM); > DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); > for(int node=xs; node { > PetscViewerASCIISynchronizedPrintf(view, "%3.12f %3.12f %3.12f\n", > aM[node].x, aM[node].y, aM[node].z); > PetscViewerFlush(view); > } > DMDAVecRestoreArray(da, M, &aM); > } > ================================================================= > > but this turns out to be very slow. Yes, ASCII output is slow. > I am trying to write it in a binary file, but I cannot find the > corresponding functionality (such as PETSC_VIEWER_ASCII_SYMMODU and > PetscViewerASCIISynchronizedPrintf in binary form). Just use VecView to a binary viewer. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From knepley at gmail.com Thu May 22 09:42:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 09:42:52 -0500 Subject: [petsc-users] output vec In-Reply-To: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> Message-ID: On Thu, May 22, 2014 at 9:32 AM, wrote: > Dear Petsc developers, > > I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like this: > > m1 m2 m3 > m4 m5 m6 > ... > > Here is my code to do this: > > ================================================================== > PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); > PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); > PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); > for(int step=0; step { > //calculate M at current step > DMDAVecGetArray(da, M, &aM); > DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); > for(int node=xs; node { > PetscViewerASCIISynchronizedPrintf(view, "%3.12f %3.12f %3.12f\n", > aM[node].x, aM[node].y, aM[node].z); > PetscViewerFlush(view); > } > DMDAVecRestoreArray(da, M, &aM); > } > ================================================================= > > but this turns out to be very slow. I am trying to write it in a binary > file, but I cannot find the corresponding functionality (such as > PETSC_VIEWER_ASCII_SYMMODU and PetscViewerASCIISynchronizedPrintf in > binary form). Thanks. > There is PetscViewerBindaryWrite(), but what do you really want to do? What you suggest doing will be very slow. Why not just use PETSc binary output? Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From likunt at caltech.edu Thu May 22 10:02:29 2014 From: likunt at caltech.edu (Likun Tan) Date: Thu, 22 May 2014 11:02:29 -0400 Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> Message-ID: <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> Thanks for your suggestion. Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. m1 m2 m3 m4 m5 m6 But I prefer the form m1 m2 m3 m4 m5 m6 Since in the end I will have about 1e+7 elements in the vec. 
If there is no way to output the vec in the second form, I will simply use VecView. Thanks. > On May 22, 2014, at 10:42 AM, Matthew Knepley wrote: > >> On Thu, May 22, 2014 at 9:32 AM, wrote: >> Dear Petsc developers, >> >> I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like this: >> >> m1 m2 m3 >> m4 m5 m6 >> ... >> >> Here is my code to do this: >> >> ================================================================== >> PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); >> PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); >> PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); >> for(int step=0; step> { >> //calculate M at current step >> DMDAVecGetArray(da, M, &aM); >> DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); >> for(int node=xs; node> { >> PetscViewerASCIISynchronizedPrintf(view, "%3.12f %3.12f %3.12f\n", >> aM[node].x, aM[node].y, aM[node].z); >> PetscViewerFlush(view); >> } >> DMDAVecRestoreArray(da, M, &aM); >> } >> ================================================================= >> >> but this turns out to be very slow. I am trying to write it in a binary >> file, but I cannot find the corresponding functionality (such as >> PETSC_VIEWER_ASCII_SYMMODU and PetscViewerASCIISynchronizedPrintf in >> binary form). Thanks. > > There is PetscViewerBindaryWrite(), but what do you really want to do? What you suggest > doing will be very slow. Why not just use PETSc binary output? > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 22 10:07:06 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 09:07:06 -0600 Subject: [petsc-users] output vec In-Reply-To: <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> Message-ID: <874n0hj1id.fsf@jedbrown.org> Likun Tan writes: > Thanks for your suggestion. > Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. > m1 > m2 > m3 > m4 > m5 > m6 The binary viewer writes a *binary* file. No formatting or line breaks. > But I prefer the form > > m1 m2 m3 > m4 m5 m6 > > Since in the end I will have about 1e+7 elements in the vec. If there is no way to output the vec in the second form, I will simply use VecView. Thanks. Use VecView to write a binary (not ASCII) file. See PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, etc. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From knepley at gmail.com Thu May 22 10:20:42 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 10:20:42 -0500 Subject: [petsc-users] output vec In-Reply-To: <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> Message-ID: On Thu, May 22, 2014 at 10:02 AM, Likun Tan wrote: > Thanks for your suggestion. > Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. > No it won't. Binary files have no newlines or spaces. 
Matt > m1 > m2 > m3 > m4 > m5 > m6 > > But I prefer the form > > m1 m2 m3 > m4 m5 m6 > > Since in the end I will have about 1e+7 elements in the vec. If there is > no way to output the vec in the second form, I will simply use VecView. > Thanks. > > On May 22, 2014, at 10:42 AM, Matthew Knepley wrote: > > On Thu, May 22, 2014 at 9:32 AM, wrote: > >> Dear Petsc developers, >> >> I am trying to output my vec M={m1 m2 m3 m4 m5 m6 ...} in a form like >> this: >> >> m1 m2 m3 >> m4 m5 m6 >> ... >> >> Here is my code to do this: >> >> ================================================================== >> PetscViewerASCIIOpen(PETSC_COMM_WORLD, NAME, &view); >> PetscViewerSetFormat(view, PETSC_VIEWER_ASCII_SYMMODU); >> PetscViewerASCIISynchronizedAllow(view, PETSC_TRUE); >> for(int step=0; step> { >> //calculate M at current step >> DMDAVecGetArray(da, M, &aM); >> DMDAGetCorners(da, &xs, 0, 0, &xm, 0, 0); >> for(int node=xs; node> { >> PetscViewerASCIISynchronizedPrintf(view, "%3.12f %3.12f %3.12f\n", >> aM[node].x, aM[node].y, aM[node].z); >> PetscViewerFlush(view); >> } >> DMDAVecRestoreArray(da, M, &aM); >> } >> ================================================================= >> >> but this turns out to be very slow. I am trying to write it in a binary >> file, but I cannot find the corresponding functionality (such as >> PETSC_VIEWER_ASCII_SYMMODU and PetscViewerASCIISynchronizedPrintf in >> binary form). Thanks. >> > > There is PetscViewerBindaryWrite(), but what do you really want to do? > What you suggest > doing will be very slow. Why not just use PETSc binary output? > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From likunt at caltech.edu Thu May 22 11:20:13 2014 From: likunt at caltech.edu (Likun Tan) Date: Thu, 22 May 2014 12:20:13 -0400 Subject: [petsc-users] output vec In-Reply-To: <874n0hj1id.fsf@jedbrown.org> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: I am using VecView to output the vec in a binary file and tried to open it in Matlab. I define the precision to be double, but Matlab does not give reasonable values of my vec (almost extremely large or small or NaN values). Here is my code ======================================== PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); for(step=0; step On May 22, 2014, at 11:07 AM, Jed Brown wrote: > > Likun Tan writes: > >> Thanks for your suggestion. >> Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. >> m1 >> m2 >> m3 >> m4 >> m5 >> m6 > > The binary viewer writes a *binary* file. No formatting or line breaks. > >> But I prefer the form >> >> m1 m2 m3 >> m4 m5 m6 >> >> Since in the end I will have about 1e+7 elements in the vec. If there is no way to output the vec in the second form, I will simply use VecView. Thanks. > > Use VecView to write a binary (not ASCII) file. See > PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, > etc. 
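For reference, a sketch of the write side being discussed, with a note on why a raw read gives garbage: each VecView() record begins with a small header (class id and vector length) and the data are written big-endian, so loading the file as a flat stream of native doubles in MATLAB produces the wildly wrong or NaN values described above. The PetscBinaryRead.m script shipped with PETSc (under bin/matlab in this release series) understands the format. NSTEPS and the file name are placeholders.

#include <petscvec.h>

PetscErrorCode WriteHistory(Vec M, PetscInt NSTEPS)
{
  PetscViewer    viewer;
  PetscInt       step;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "M.bin", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  for (step = 0; step < NSTEPS; step++) {
    /* ... update M for this step ... */
    ierr = VecView(M, viewer);CHKERRQ(ierr);   /* appends one header + data record per step */
  }
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Once a vector has been read back correctly, reshaping it into three columns (the m1 m2 m3 layout) is a one-line operation on the MATLAB side.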
From knepley at gmail.com Thu May 22 11:26:14 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 11:26:14 -0500 Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: On Thu, May 22, 2014 at 11:20 AM, Likun Tan wrote: > I am using VecView to output the vec in a binary file and tried to open it > in Matlab. I define the precision to be double, but Matlab does not give > reasonable values of my vec (almost extremely large or small or NaN > values). Here is my code > Are you using PetscBinaryRead.m in Matlab? If so, send the code snippet for a small vector, all the output, and the binary file. Matt > ======================================== > PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); > for(step=0; step { > //compute M at current step > VecView(M, view); > } > PetscViewerDestroy(&view); > ======================================= > > I am not sure if there is any problem of my Petsc code. Your comment is > well appreciated. > > > On May 22, 2014, at 11:07 AM, Jed Brown wrote: > > > > Likun Tan writes: > > > >> Thanks for your suggestion. > >> Using VecView or PetscViewerBinaryWrite will print the vec vertically, > i.e. > >> m1 > >> m2 > >> m3 > >> m4 > >> m5 > >> m6 > > > > The binary viewer writes a *binary* file. No formatting or line breaks. > > > >> But I prefer the form > >> > >> m1 m2 m3 > >> m4 m5 m6 > >> > >> Since in the end I will have about 1e+7 elements in the vec. If there > is no way to output the vec in the second form, I will simply use VecView. > Thanks. > > > > Use VecView to write a binary (not ASCII) file. See > > PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, > > etc. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 22 11:34:39 2014 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 22 May 2014 12:34:39 -0400 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <871tvpqt8u.fsf@jedbrown.org> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> Message-ID: If the solver is degrading as the coefficients change, and I would assume get more nasty, you can try deleting the solver at each time step. This will be about 2x more expensive, because it does the setup each solve, but it might fix your problem. You also might try: -pc_type hypre -pc_hypre_type boomeramg On Mon, May 19, 2014 at 6:49 PM, Jed Brown wrote: > Michele Rosso writes: > > > Jed, > > > > thank you very much! > > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type > > sor/ and report back. > > Yes, I removed the nullspace from both the system matrix and the rhs. > > Is there a way to have something similar to Dendy's multigrid or the > > deflated conjugate gradient method with PETSc? > > Dendy's MG needs geometry. The algorithm to produce the interpolation > operators is not terribly complicated so it could be done, though DMDA > support for cell-centered is a somewhat awkward. "Deflated CG" can mean > lots of things so you'll have to be more precise. 
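A sketch of Mark's first suggestion above ("deleting the solver at each time step") in terms of KSPReset(), which throws away the built preconditioner so the next KSPSolve() redoes the setup with the current coefficients; destroying and recreating the KSP would have the same effect. The routine name is a placeholder and the KSPSetOperators() call uses the PETSc 3.4-style signature.

#include <petscksp.h>

PetscErrorCode SolveStep(KSP ksp, Mat A, Vec b, Vec x)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPReset(ksp);CHKERRQ(ierr);                                          /* drop the built PC data */
  ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);  /* re-attach the updated operators */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* e.g. -pc_type hypre -pc_hypre_type boomeramg, as suggested above */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}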
(Most everything in > the "deflation" world has a clear analogue in the MG world, but the > deflation community doesn't have a precise language to talk about their > methods so you always have to read the paper carefully to find out if > it's completely standard or if there is something new.) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fiona at epcc.ed.ac.uk Thu May 22 11:59:56 2014 From: fiona at epcc.ed.ac.uk (Fiona Reid) Date: Thu, 22 May 2014 17:59:56 +0100 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps Message-ID: <537E2D0C.5050606@epcc.ed.ac.uk> Dear PETSc Users, Can anyone advise as to how I can obtain output from the TS rosw solver at regular time steps, e.g. every 0.05 seconds? I'm using a very slightly modified version of the example code from petsc-3.4.3/src/ts/examples/tutorials/ex20.c (changes are user.mu = 1.0 and user->next_output += 0.05). I set the initial time step to be 0.05 via TSSetInitialTimeStep. If I use the default solver (beuler) and the -monitor option I get output looking like: ./ex20 -ts_type beuler -monitor | more [0.0] 0 TS 0.000000 (dt = 0.050000) X 2.000000e+00 0.000000e+00 [0.1] 1 TS 0.050000 (dt = 0.050000) X 1.995658e+00 -8.683325e-02 [0.1] 2 TS 0.100000 (dt = 0.050000) X 1.987545e+00 -1.622726e-01 [0.2] 3 TS 0.150000 (dt = 0.050000) X 1.976146e+00 -2.279661e-01 [0.2] 4 TS 0.200000 (dt = 0.050000) X 1.961876e+00 -2.854046e-01 [0.2] 5 TS 0.250000 (dt = 0.050000) X 1.945081e+00 -3.359109e-01 [0.3] 6 TS 0.300000 (dt = 0.050000) X 1.926048e+00 -3.806427e-01 [0.3] 7 TS 0.350000 (dt = 0.050000) X 1.905018e+00 -4.206033e-01 [0.4] 8 TS 0.400000 (dt = 0.050000) X 1.882185e+00 -4.566572e-01 [0.4] 9 TS 0.450000 (dt = 0.050000) X 1.857708e+00 -4.895467e-01 [0.5] 10 TS 0.500000 (dt = 0.050000) X 1.831713e+00 -5.199087e-01 However if I switch to using the rosw solver instead I get: ./ex20 -ts_type rosw -monitor | more [0.0] 0 TS 0.000000 (dt = 0.050000) X 0.000000e+00 0.000000e+00 [0.1] 1 TS 0.050000 (dt = 0.061949) X 1.997620e+00 -9.284729e-02 [0.1] 2 TS 0.111949 (dt = 0.065192) X 1.990961e+00 -1.726821e-01 [0.2] 3 TS 0.177141 (dt = 0.068763) X 1.980577e+00 -2.414006e-01 [0.2] 4 TS 0.245904 (dt = 0.073732) X 1.966977e+00 -3.007620e-01 [0.2] 5 TS 0.319635 (dt = 0.080204) X 1.950593e+00 -3.523419e-01 [0.3] 5 TS 0.319635 (dt = 0.080204) X 1.931848e+00 -3.974691e-01 [0.3] 6 TS 0.399840 (dt = 0.088357) X 1.910959e+00 -4.373211e-01 [0.4] 7 TS 0.488197 (dt = 0.098465) X 1.888151e+00 -4.729449e-01 [0.4] 7 TS 0.488197 (dt = 0.098465) X 1.863723e+00 -5.051196e-01 [0.5] 8 TS 0.586663 (dt = 0.110786) X 1.837695e+00 -5.346152e-01 [0.5] 8 TS 0.586663 (dt = 0.110786) X 1.810291e+00 -5.620222e-01 Sometimes I get two different X values for the same value of TS. Ideally I'd like to have the output from rosw for exactly the same time values as beuler, e.g. every 0.05 seconds, such that it's possible to directly compare the two solvers. Is there a way to fix the time step in PETSc for the rosw solver such that I can get output every 0.05 seconds? if so how can I do this? Thank you very much in advance. Fiona -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
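A sketch of one way to get output at fixed intervals from an adaptive integrator like rosw: keep the adaptive steps, but have a monitor interpolate the dense output to the requested times, much as the monitor in ex20 (with its user->next_output) already does. The 0.05 interval, the routine name and the context layout are assumptions. Alternatively, -ts_adapt_type none keeps dt fixed at the initial 0.05, at the cost of losing the adaptive stepping.

#include <petscts.h>

typedef struct {
  PetscReal next_output;    /* next requested output time, starts at 0 */
} MonitorCtx;

PetscErrorCode RegularMonitor(TS ts, PetscInt step, PetscReal t, Vec X, void *ctx)
{
  MonitorCtx    *mon = (MonitorCtx*)ctx;
  Vec            Xout;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecDuplicate(X, &Xout);CHKERRQ(ierr);
  while (mon->next_output <= t) {                    /* catch up on all output times passed this step */
    ierr = TSInterpolate(ts, mon->next_output, Xout);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "interpolated t = %g\n", (double)mon->next_output);CHKERRQ(ierr);
    ierr = VecView(Xout, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    mon->next_output += 0.05;
  }
  ierr = VecDestroy(&Xout);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* registered before TSSolve() with:  TSMonitorSet(ts, RegularMonitor, &mon, NULL);  */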
Name: not available URL: From jaolive at MIT.EDU Thu May 22 12:47:55 2014 From: jaolive at MIT.EDU (Jean-Arthur Louis Olive) Date: Thu, 22 May 2014 17:47:55 +0000 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> Message-ID: <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> Hi Barry, sorry about the late reply- We indeed use structured grids (DMDA 2d) - but do not ever provide a Jacobian for our non-linear stokes problem (instead just rely on petsc's FD approximation). I understand "snes_type test" is meant to compare petsc?s Jacobian with a user-provided analytical Jacobian. Are you saying we should provide an exact Jacobian for our simple linear test and see if there?s a problem with the approximate Jacobian? Thanks, Arthur & Eric > If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. > > Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. > > Barry > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: > >> Hi all, >> we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. >> >> So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
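Since the question of supplying an exact Jacobian comes up here: for a test residual as simple as the "one coupling term" case quoted below, the exact Jacobian is a single constant 4x4 block per grid point, roughly as in the sketch that follows. The field ordering (P, vx, vy, T) inside the block and the routine name are assumptions and must match the application's DMDA layout; Jpre is assumed to come from DMCreateMatrix() on the same DMDA so that stencil-based insertion works.

#include <petscsnes.h>
#include <petscdmda.h>

PetscErrorCode FillExactJacobian(DMDALocalInfo *info, Mat Jpre)
{
  PetscInt       i, j;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  for (j = info->ys; j < info->ys + info->ym; j++) {
    for (i = info->xs; i < info->xs + info->xm; i++) {
      MatStencil  row = {0}, col = {0};
      PetscScalar B[16] = {1, 0, 0, 0,      /* dfP /d(P,vx,vy,T) */
                           0, 1,-3, 0,      /* dfvx/d(P,vx,vy,T) */
                           0, 0, 1, 0,      /* dfvy/d(P,vx,vy,T) */
                           0, 0, 0, 1};     /* dfT /d(P,vx,vy,T) */
      row.i = i; row.j = j;
      col.i = i; col.j = j;
      ierr = MatSetValuesBlockedStencil(Jpre, 1, &row, 1, &col, B, INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}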
>> >> RESIDUAL 1 (NO COUPLING): >> for (j=info->ys; jys+info->ym; j++) { >> for (i=info->xs; ixs+info->xm; i++) { >> f[j][i].P = x[j][i].P - 3000000; >> f[j][i].vx= 2*x[j][i].vx; >> f[j][i].vy= 3*x[j][i].vy - 2; >> f[j][i].T = x[j][i].T; >> } >> >> RESIDUAL 2 (ONE COUPLING TERM): >> for (j=info->ys; jys+info->ym; j++) { >> for (i=info->xs; ixs+info->xm; i++) { >> f[j][i].P = x[j][i].P - 3; >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; >> f[j][i].vy= x[j][i].vy - 2; >> f[j][i].T = x[j][i].T; >> } >> } >> >> >> and our default set of options is: >> >> >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp >> >> >> With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: >> >> >> Result from Solve - RESIDUAL 1 >> 0 SNES Function norm 8.485281374240e+07 >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 >> 1 SNES Function norm 1.131370849896e+02 >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 2 SNES Function norm 1.131370849896e+02 >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 >> >> >> With the coupled residual (Residual 2), the norms do not match, see below >> >> >> Result from Solve - RESIDUAL 2: >> 0 SNES Function norm 1.019803902719e+02 >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 >> 1 SNES Function norm 1.697056274848e+02 >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 >> 2 SNES Function norm 3.236770473841e-07 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> >> >> Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. 
>> >> >> Result from Solve with -snes_fd - RESIDUAL 2 >> 0 SNES Function norm 8.485281374240e+07 >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 >> 1 SNES Function norm 2.039607805429e+02 >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 >> 3 SNES Function norm 2.549509757105e+01 >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 >> >> >> Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? >> >> Thanks a lot, >> Arthur and Eric > -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1855 bytes Desc: not available URL: From knepley at gmail.com Thu May 22 12:52:07 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 12:52:07 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> Message-ID: On Thu, May 22, 2014 at 12:47 PM, Jean-Arthur Louis Olive wrote: > Hi Barry, > sorry about the late reply- > We indeed use structured grids (DMDA 2d) - but do not ever provide a > Jacobian for our non-linear stokes problem (instead just rely on petsc's FD > approximation). I understand "snes_type test" is meant to compare petsc?s > Jacobian with a user-provided analytical Jacobian. > Are you saying we should provide an exact Jacobian for our simple linear > test and see if there?s a problem with the approximate Jacobian? > The Jacobian computed by PETSc uses a finite-difference approximation, and thus is only accurate to maybe 1.0e-7 depending on the conditioning of your system. Are you trying to compare things that are more precise than that? You can provide an exact Jacobian to get machine accuracy. Matt > Thanks, > Arthur & Eric > > > > > If you are using DMDA and either DMGetColoring or the SNESSetDM > approach and dof is 4 then we color each of the 4 variables per grid point > with a different color so coupling between variables within a grid point is > not a problem. This would not explain the problem you are seeing below. > > > > Run your code with -snes_type test and read the results and follow the > directions to debug your Jacobian. > > > > Barry > > > > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive > wrote: > > > >> Hi all, > >> we are using PETSc to solve the steady state Stokes equations with > non-linear viscosities using finite difference. Recently we have realized > that our true residual norm after the last KSP solve did not match next > SNES function norm when solving the linear Stokes equations. 
> >> > >> So to understand this better, we set up two extremely simple linear > residuals, one with no coupling between variables (vx, vy, P and T), the > other with one coupling term (shown below). > >> > >> RESIDUAL 1 (NO COUPLING): > >> for (j=info->ys; jys+info->ym; j++) { > >> for (i=info->xs; ixs+info->xm; i++) { > >> f[j][i].P = x[j][i].P - 3000000; > >> f[j][i].vx= 2*x[j][i].vx; > >> f[j][i].vy= 3*x[j][i].vy - 2; > >> f[j][i].T = x[j][i].T; > >> } > >> > >> RESIDUAL 2 (ONE COUPLING TERM): > >> for (j=info->ys; jys+info->ym; j++) { > >> for (i=info->xs; ixs+info->xm; i++) { > >> f[j][i].P = x[j][i].P - 3; > >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > >> f[j][i].vy= x[j][i].vy - 2; > >> f[j][i].T = x[j][i].T; > >> } > >> } > >> > >> > >> and our default set of options is: > >> > >> > >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 > -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor > -snes_converged_reason -snes_view -log_summary -options_left 1 > -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > >> > >> > >> With the uncoupled residual (Residual 1), we get matching KSP and SNES > norm, highlighted below: > >> > >> > >> Result from Solve - RESIDUAL 1 > >> 0 SNES Function norm 8.485281374240e+07 > >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm > 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm > 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > >> 1 SNES Function norm 1.131370849896e+02 > >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm > 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 2 SNES Function norm 1.131370849896e+02 > >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > >> > >> > >> With the coupled residual (Residual 2), the norms do not match, see > below > >> > >> > >> Result from Solve - RESIDUAL 2: > >> 0 SNES Function norm 1.019803902719e+02 > >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm > 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm > 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > >> 1 SNES Function norm 1.697056274848e+02 > >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm > 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm > 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > >> 2 SNES Function norm 3.236770473841e-07 > >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > >> > >> > >> Lastly, if we add -snes_fd to our options, the norms for residual 2 get > better - they match after the first iteration but not after the second. 
> >> > >> > >> Result from Solve with -snes_fd - RESIDUAL 2 > >> 0 SNES Function norm 8.485281374240e+07 > >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm > 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm > 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > >> 1 SNES Function norm 2.039607805429e+02 > >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm > 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm > 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm > 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > >> 3 SNES Function norm 2.549509757105e+01 > >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > >> > >> > >> Does this mean that our Jacobian is not approximated properly by the > default ?coloring? method when it has off-diagonal terms? > >> > >> Thanks a lot, > >> Arthur and Eric > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 22 12:57:37 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 11:57:37 -0600 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <537E2D0C.5050606@epcc.ed.ac.uk> References: <537E2D0C.5050606@epcc.ed.ac.uk> Message-ID: <87vbsxhf1q.fsf@jedbrown.org> Fiona Reid writes: > Dear PETSc Users, > > Can anyone advise as to how I can obtain output from the TS rosw solver > at regular time steps, e.g. every 0.05 seconds? > > I'm using a very slightly modified version of the example code from > petsc-3.4.3/src/ts/examples/tutorials/ex20.c (changes are user.mu = 1.0 > and user->next_output += 0.05). I set the initial time step to be 0.05 > via TSSetInitialTimeStep. If you want a constant time step size, use -ts_adapt_type none. By default, RosW uses an adaptive step size with an embedded error estimator. 
Note that RosW is a multi-stage method, so you might, for example, compare the accuracy and efficiency of -ts_type rosw -ts_rosw_type ra34pw2 -ts_dt 0.2 to -ts_type beuler -ts_dt 0.05 > If I use the default solver (beuler) and the -monitor option I get > output looking like: > > ./ex20 -ts_type beuler -monitor | more > [0.0] 0 TS 0.000000 (dt = 0.050000) X 2.000000e+00 0.000000e+00 > [0.1] 1 TS 0.050000 (dt = 0.050000) X 1.995658e+00 -8.683325e-02 > [0.1] 2 TS 0.100000 (dt = 0.050000) X 1.987545e+00 -1.622726e-01 > [0.2] 3 TS 0.150000 (dt = 0.050000) X 1.976146e+00 -2.279661e-01 > [0.2] 4 TS 0.200000 (dt = 0.050000) X 1.961876e+00 -2.854046e-01 > [0.2] 5 TS 0.250000 (dt = 0.050000) X 1.945081e+00 -3.359109e-01 > [0.3] 6 TS 0.300000 (dt = 0.050000) X 1.926048e+00 -3.806427e-01 > [0.3] 7 TS 0.350000 (dt = 0.050000) X 1.905018e+00 -4.206033e-01 > [0.4] 8 TS 0.400000 (dt = 0.050000) X 1.882185e+00 -4.566572e-01 > [0.4] 9 TS 0.450000 (dt = 0.050000) X 1.857708e+00 -4.895467e-01 > [0.5] 10 TS 0.500000 (dt = 0.050000) X 1.831713e+00 -5.199087e-01 > > However if I switch to using the rosw solver instead I get: > > ./ex20 -ts_type rosw -monitor | more > [0.0] 0 TS 0.000000 (dt = 0.050000) X 0.000000e+00 0.000000e+00 > [0.1] 1 TS 0.050000 (dt = 0.061949) X 1.997620e+00 -9.284729e-02 > [0.1] 2 TS 0.111949 (dt = 0.065192) X 1.990961e+00 -1.726821e-01 > [0.2] 3 TS 0.177141 (dt = 0.068763) X 1.980577e+00 -2.414006e-01 > [0.2] 4 TS 0.245904 (dt = 0.073732) X 1.966977e+00 -3.007620e-01 > [0.2] 5 TS 0.319635 (dt = 0.080204) X 1.950593e+00 -3.523419e-01 > [0.3] 5 TS 0.319635 (dt = 0.080204) X 1.931848e+00 -3.974691e-01 > [0.3] 6 TS 0.399840 (dt = 0.088357) X 1.910959e+00 -4.373211e-01 > [0.4] 7 TS 0.488197 (dt = 0.098465) X 1.888151e+00 -4.729449e-01 > [0.4] 7 TS 0.488197 (dt = 0.098465) X 1.863723e+00 -5.051196e-01 > [0.5] 8 TS 0.586663 (dt = 0.110786) X 1.837695e+00 -5.346152e-01 > [0.5] 8 TS 0.586663 (dt = 0.110786) X 1.810291e+00 -5.620222e-01 > > Sometimes I get two different X values for the same value of TS. Ideally > I'd like to have the output from rosw for exactly the same time values > as beuler, e.g. every 0.05 seconds, such that it's possible to directly > compare the two solvers. > > Is there a way to fix the time step in PETSc for the rosw solver such > that I can get output every 0.05 seconds? if so how can I do this? > > Thank you very much in advance. > > Fiona > > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jaolive at MIT.EDU Thu May 22 13:01:52 2014 From: jaolive at MIT.EDU (Jean-Arthur Louis Olive) Date: Thu, 22 May 2014 18:01:52 +0000 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> Message-ID: <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> Hi Matt, our underlying problem is the mismatch between KSP ans SNES norms, even when solving a simple linear system, e.g., for (j=info->ys; jys+info->ym; j++) { for (i=info->xs; ixs+info->xm; i++) { f[j][i].P = x[j][i].P - 3; f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; f[j][i].vy= x[j][i].vy - 2; f[j][i].T = x[j][i].T; } } which should not have any conditioning issue. 
So I don?t think in this case it?s an accuracy problem- but something could be wrong with the FD estimation of our Jacobian (?) Arthur On May 22, 2014, at 11:52 AM, Matthew Knepley wrote: > On Thu, May 22, 2014 at 12:47 PM, Jean-Arthur Louis Olive wrote: > Hi Barry, > sorry about the late reply- > We indeed use structured grids (DMDA 2d) - but do not ever provide a Jacobian for our non-linear stokes problem (instead just rely on petsc's FD approximation). I understand "snes_type test" is meant to compare petsc?s Jacobian with a user-provided analytical Jacobian. > Are you saying we should provide an exact Jacobian for our simple linear test and see if there?s a problem with the approximate Jacobian? > > The Jacobian computed by PETSc uses a finite-difference approximation, and thus is only accurate to maybe 1.0e-7 > depending on the conditioning of your system. Are you trying to compare things that are more precise than that? You > can provide an exact Jacobian to get machine accuracy. > > Matt > > Thanks, > Arthur & Eric > > > > > If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. > > > > Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. > > > > Barry > > > > > > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: > > > >> Hi all, > >> we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. > >> > >> So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). 
> >> > >> RESIDUAL 1 (NO COUPLING): > >> for (j=info->ys; jys+info->ym; j++) { > >> for (i=info->xs; ixs+info->xm; i++) { > >> f[j][i].P = x[j][i].P - 3000000; > >> f[j][i].vx= 2*x[j][i].vx; > >> f[j][i].vy= 3*x[j][i].vy - 2; > >> f[j][i].T = x[j][i].T; > >> } > >> > >> RESIDUAL 2 (ONE COUPLING TERM): > >> for (j=info->ys; jys+info->ym; j++) { > >> for (i=info->xs; ixs+info->xm; i++) { > >> f[j][i].P = x[j][i].P - 3; > >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > >> f[j][i].vy= x[j][i].vy - 2; > >> f[j][i].T = x[j][i].T; > >> } > >> } > >> > >> > >> and our default set of options is: > >> > >> > >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp > >> > >> > >> With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: > >> > >> > >> Result from Solve - RESIDUAL 1 > >> 0 SNES Function norm 8.485281374240e+07 > >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 > >> 1 SNES Function norm 1.131370849896e+02 > >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 2 SNES Function norm 1.131370849896e+02 > >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 > >> > >> > >> With the coupled residual (Residual 2), the norms do not match, see below > >> > >> > >> Result from Solve - RESIDUAL 2: > >> 0 SNES Function norm 1.019803902719e+02 > >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 > >> 1 SNES Function norm 1.697056274848e+02 > >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 > >> 2 SNES Function norm 3.236770473841e-07 > >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 > >> > >> > >> Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. 
> >> > >> > >> Result from Solve with -snes_fd - RESIDUAL 2 > >> 0 SNES Function norm 8.485281374240e+07 > >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 > >> 1 SNES Function norm 2.039607805429e+02 > >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 > >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] > >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 > >> 3 SNES Function norm 2.549509757105e+01 > >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 > >> > >> > >> Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? > >> > >> Thanks a lot, > >> Arthur and Eric > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1855 bytes Desc: not available URL: From knepley at gmail.com Thu May 22 13:19:50 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 13:19:50 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> Message-ID: On Thu, May 22, 2014 at 1:01 PM, Jean-Arthur Louis Olive wrote: > Hi Matt, > our underlying problem is the mismatch between KSP ans SNES norms, even > when solving a simple linear system, e.g., > > for (j=info->ys; jys+info->ym; j++) { > for (i=info->xs; ixs+info->xm; i++) { > f[j][i].P = x[j][i].P - 3; > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > f[j][i].vy= x[j][i].vy - 2; > f[j][i].T = x[j][i].T; > } > } > > which should not have any conditioning issue. So I don?t think in this > case it?s an accuracy problem- but something could be wrong with the FD > estimation of our Jacobian (?) > I think you are misinterpreting the output. As I said before, the FD Jacobian will only be accurate to about 1.0e-7 (which is what I see with my own code). Thus it will only match the SNES residual to this precision. If you want an exact match, you need to code up the exact Jacobian. Matt > Arthur > > > On May 22, 2014, at 11:52 AM, Matthew Knepley wrote: > > On Thu, May 22, 2014 at 12:47 PM, Jean-Arthur Louis Olive > wrote: > >> Hi Barry, >> sorry about the late reply- >> We indeed use structured grids (DMDA 2d) - but do not ever provide a >> Jacobian for our non-linear stokes problem (instead just rely on petsc's FD >> approximation). I understand "snes_type test" is meant to compare petsc?s >> Jacobian with a user-provided analytical Jacobian. 
>> Are you saying we should provide an exact Jacobian for our simple linear >> test and see if there?s a problem with the approximate Jacobian? >> > > The Jacobian computed by PETSc uses a finite-difference approximation, and > thus is only accurate to maybe 1.0e-7 > depending on the conditioning of your system. Are you trying to compare > things that are more precise than that? You > can provide an exact Jacobian to get machine accuracy. > > Matt > > >> Thanks, >> Arthur & Eric >> >> >> >> > If you are using DMDA and either DMGetColoring or the SNESSetDM >> approach and dof is 4 then we color each of the 4 variables per grid point >> with a different color so coupling between variables within a grid point is >> not a problem. This would not explain the problem you are seeing below. >> > >> > Run your code with -snes_type test and read the results and follow >> the directions to debug your Jacobian. >> > >> > Barry >> > >> > >> > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive >> wrote: >> > >> >> Hi all, >> >> we are using PETSc to solve the steady state Stokes equations with >> non-linear viscosities using finite difference. Recently we have realized >> that our true residual norm after the last KSP solve did not match next >> SNES function norm when solving the linear Stokes equations. >> >> >> >> So to understand this better, we set up two extremely simple linear >> residuals, one with no coupling between variables (vx, vy, P and T), the >> other with one coupling term (shown below). >> >> >> >> RESIDUAL 1 (NO COUPLING): >> >> for (j=info->ys; jys+info->ym; j++) { >> >> for (i=info->xs; ixs+info->xm; i++) { >> >> f[j][i].P = x[j][i].P - 3000000; >> >> f[j][i].vx= 2*x[j][i].vx; >> >> f[j][i].vy= 3*x[j][i].vy - 2; >> >> f[j][i].T = x[j][i].T; >> >> } >> >> >> >> RESIDUAL 2 (ONE COUPLING TERM): >> >> for (j=info->ys; jys+info->ym; j++) { >> >> for (i=info->xs; ixs+info->xm; i++) { >> >> f[j][i].P = x[j][i].P - 3; >> >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; >> >> f[j][i].vy= x[j][i].vy - 2; >> >> f[j][i].T = x[j][i].T; >> >> } >> >> } >> >> >> >> >> >> and our default set of options is: >> >> >> >> >> >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 >> -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor >> -snes_converged_reason -snes_view -log_summary -options_left 1 >> -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp >> >> >> >> >> >> With the uncoupled residual (Residual 1), we get matching KSP and SNES >> norm, highlighted below: >> >> >> >> >> >> Result from Solve - RESIDUAL 1 >> >> 0 SNES Function norm 8.485281374240e+07 >> >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid >> norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid >> norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 >> >> 1 SNES Function norm 1.131370849896e+02 >> >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid >> norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 2 SNES Function norm 1.131370849896e+02 >> >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 >> >> >> >> >> >> With the coupled residual (Residual 2), the norms do not match, see >> below >> >> >> >> >> >> Result from Solve - RESIDUAL 2: >> >> 0 SNES Function norm 1.019803902719e+02 >> >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid >> norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 
8.741176309016e+01 true resid >> norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 >> >> 1 SNES Function norm 1.697056274848e+02 >> >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid >> norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid >> norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 >> >> 2 SNES Function norm 3.236770473841e-07 >> >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> >> >> >> >> >> Lastly, if we add -snes_fd to our options, the norms for residual 2 >> get better - they match after the first iteration but not after the second. >> >> >> >> >> >> Result from Solve with -snes_fd - RESIDUAL 2 >> >> 0 SNES Function norm 8.485281374240e+07 >> >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid >> norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid >> norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 >> >> 1 SNES Function norm 2.039607805429e+02 >> >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid >> norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid >> norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 >> >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] >> >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid >> norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 >> >> 3 SNES Function norm 2.549509757105e+01 >> >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 >> >> >> >> >> >> Does this mean that our Jacobian is not approximated properly by the >> default ?coloring? method when it has off-diagonal terms? >> >> >> >> Thanks a lot, >> >> Arthur and Eric >> > >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 22 13:21:19 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 May 2014 13:21:19 -0500 Subject: [petsc-users] Non-matching KSP and SNES norms during SNES solve In-Reply-To: <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> References: <53725A86.3070804@uidaho.edu> <591D878E-0113-42FE-923D-07E0282E9597@mit.edu> <2ED30B91-02C3-4116-98C2-CA8CEFF63A1F@mcs.anl.gov> <0DA92020-DFAA-4A11-BF79-30DB31BE04AA@mit.edu> <07EE1E22-AD0E-4104-8545-0ED77A750574@mit.edu> Message-ID: 1 SNES Function norm 1.697056274848e+02 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 2 SNES Function norm 3.236770473841e-07 With matrix free multiply 2 SNES Function norm 3.236770473841e-07 will not and should not match true resid norm 5.777940247956e-12 they are computed in completely different ways and one has lost half the digits in the finite differencing thus the linear system only approximates the ?nonlinear system? 
(which also happens to be linear) to roughly half the decimal digits so the results above are completely reasonable and expected (in single precision 5.777940247956e-12 and 3.236770473841e-07 compared to O(1) are both zero . Of course with "exact arithmetic? and no differencing they will be same. Barry -snes_fd is not practical in any way. Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. Result from Solve with -snes_fd - RESIDUAL 2 0 SNES Function norm 8.485281374240e+07 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 1 SNES Function norm 2.039607805429e+02 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 3 SNES Function norm 2.549509757105e+01 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 On May 22, 2014, at 1:01 PM, Jean-Arthur Louis Olive wrote: > Hi Matt, > our underlying problem is the mismatch between KSP ans SNES norms, even when solving a simple linear system, e.g., > > for (j=info->ys; jys+info->ym; j++) { > for (i=info->xs; ixs+info->xm; i++) { > f[j][i].P = x[j][i].P - 3; > f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; > f[j][i].vy= x[j][i].vy - 2; > f[j][i].T = x[j][i].T; > } > } > > which should not have any conditioning issue. So I don?t think in this case it?s an accuracy problem- but something could be wrong with the FD estimation of our Jacobian (?) > > Arthur > > > On May 22, 2014, at 11:52 AM, Matthew Knepley wrote: > >> On Thu, May 22, 2014 at 12:47 PM, Jean-Arthur Louis Olive wrote: >> Hi Barry, >> sorry about the late reply- >> We indeed use structured grids (DMDA 2d) - but do not ever provide a Jacobian for our non-linear stokes problem (instead just rely on petsc's FD approximation). I understand "snes_type test" is meant to compare petsc?s Jacobian with a user-provided analytical Jacobian. >> Are you saying we should provide an exact Jacobian for our simple linear test and see if there?s a problem with the approximate Jacobian? >> >> The Jacobian computed by PETSc uses a finite-difference approximation, and thus is only accurate to maybe 1.0e-7 >> depending on the conditioning of your system. Are you trying to compare things that are more precise than that? You >> can provide an exact Jacobian to get machine accuracy. >> >> Matt >> >> Thanks, >> Arthur & Eric >> >> >> >> > If you are using DMDA and either DMGetColoring or the SNESSetDM approach and dof is 4 then we color each of the 4 variables per grid point with a different color so coupling between variables within a grid point is not a problem. This would not explain the problem you are seeing below. >> > >> > Run your code with -snes_type test and read the results and follow the directions to debug your Jacobian. 
>> > >> > Barry >> > >> > >> > On May 13, 2014, at 1:20 PM, Jean-Arthur Louis Olive wrote: >> > >> >> Hi all, >> >> we are using PETSc to solve the steady state Stokes equations with non-linear viscosities using finite difference. Recently we have realized that our true residual norm after the last KSP solve did not match next SNES function norm when solving the linear Stokes equations. >> >> >> >> So to understand this better, we set up two extremely simple linear residuals, one with no coupling between variables (vx, vy, P and T), the other with one coupling term (shown below). >> >> >> >> RESIDUAL 1 (NO COUPLING): >> >> for (j=info->ys; jys+info->ym; j++) { >> >> for (i=info->xs; ixs+info->xm; i++) { >> >> f[j][i].P = x[j][i].P - 3000000; >> >> f[j][i].vx= 2*x[j][i].vx; >> >> f[j][i].vy= 3*x[j][i].vy - 2; >> >> f[j][i].T = x[j][i].T; >> >> } >> >> >> >> RESIDUAL 2 (ONE COUPLING TERM): >> >> for (j=info->ys; jys+info->ym; j++) { >> >> for (i=info->xs; ixs+info->xm; i++) { >> >> f[j][i].P = x[j][i].P - 3; >> >> f[j][i].vx= x[j][i].vx - 3*x[j][i].vy; >> >> f[j][i].vy= x[j][i].vy - 2; >> >> f[j][i].T = x[j][i].T; >> >> } >> >> } >> >> >> >> >> >> and our default set of options is: >> >> >> >> >> >> OPTIONS: mpiexec -np $np ../Stokes -snes_max_it 4 -ksp_atol 2.0e+2 -ksp_max_it 20 -ksp_rtol 9.0e-1 -ksp_type fgmres -snes_monitor -snes_converged_reason -snes_view -log_summary -options_left 1 -ksp_monitor_true_residual -pc_type none -snes_linesearch_type cp >> >> >> >> >> >> With the uncoupled residual (Residual 1), we get matching KSP and SNES norm, highlighted below: >> >> >> >> >> >> Result from Solve - RESIDUAL 1 >> >> 0 SNES Function norm 8.485281374240e+07 >> >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.333333333330e-06 >> >> 1 SNES Function norm 1.131370849896e+02 >> >> 0 KSP unpreconditioned resid norm 1.131370849896e+02 true resid norm 1.131370849896e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 2 SNES Function norm 1.131370849896e+02 >> >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2 >> >> >> >> >> >> With the coupled residual (Residual 2), the norms do not match, see below >> >> >> >> >> >> Result from Solve - RESIDUAL 2: >> >> 0 SNES Function norm 1.019803902719e+02 >> >> 0 KSP unpreconditioned resid norm 1.019803902719e+02 true resid norm 1.019803902719e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 8.741176309016e+01 true resid norm 8.741176309016e+01 ||r(i)||/||b|| 8.571428571429e-01 >> >> 1 SNES Function norm 1.697056274848e+02 >> >> 0 KSP unpreconditioned resid norm 1.697056274848e+02 true resid norm 1.697056274848e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 5.828670868165e-12 true resid norm 5.777940247956e-12 ||r(i)||/||b|| 3.404683942184e-14 >> >> 2 SNES Function norm 3.236770473841e-07 >> >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 2 >> >> >> >> >> >> Lastly, if we add -snes_fd to our options, the norms for residual 2 get better - they match after the first iteration but not after the second. 
>> >> >> >> >> >> Result from Solve with -snes_fd - RESIDUAL 2 >> >> 0 SNES Function norm 8.485281374240e+07 >> >> 0 KSP unpreconditioned resid norm 8.485281374240e+07 true resid norm 8.485281374240e+07 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 2.403700850300e-06 >> >> 1 SNES Function norm 2.039607805429e+02 >> >> 0 KSP unpreconditioned resid norm 2.039607805429e+02 true resid norm 2.039607805429e+02 ||r(i)||/||b|| 1.000000000000e+00 >> >> 1 KSP unpreconditioned resid norm 2.529822128436e+01 true resid norm 2.529822128436e+01 ||r(i)||/||b|| 1.240347346045e-01 >> >> 2 SNES Function norm 2.549509757105e+01 [SLIGHTLY DIFFERENT] >> >> 0 KSP unpreconditioned resid norm 2.549509757105e+01 true resid norm 2.549509757105e+01 ||r(i)||/||b|| 1.000000000000e+00 >> >> 3 SNES Function norm 2.549509757105e+01 >> >> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 3 >> >> >> >> >> >> Does this mean that our Jacobian is not approximated properly by the default ?coloring? method when it has off-diagonal terms? >> >> >> >> Thanks a lot, >> >> Arthur and Eric >> > >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From fiona at epcc.ed.ac.uk Thu May 22 13:26:38 2014 From: fiona at epcc.ed.ac.uk (Fiona Reid) Date: Thu, 22 May 2014 19:26:38 +0100 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <87vbsxhf1q.fsf@jedbrown.org> References: <537E2D0C.5050606@epcc.ed.ac.uk> <87vbsxhf1q.fsf@jedbrown.org> Message-ID: <537E415E.7010409@epcc.ed.ac.uk> On 22/05/2014 18:57, Jed Brown wrote: > If you want a constant time step size, use -ts_adapt_type none. By > default, RosW uses an adaptive step size with an embedded error > estimator. Note that RosW is a multi-stage method, so you might, for > example, compare the accuracy and efficiency of > > -ts_type rosw -ts_rosw_type ra34pw2 -ts_dt 0.2 > > to > > -ts_type beuler -ts_dt 0.05 Many thanks Jed, that's brilliant and does exactly what I need. Fiona -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From fiona at epcc.ed.ac.uk Thu May 22 13:53:52 2014 From: fiona at epcc.ed.ac.uk (Fiona Reid) Date: Thu, 22 May 2014 19:53:52 +0100 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <87vbsxhf1q.fsf@jedbrown.org> References: <537E2D0C.5050606@epcc.ed.ac.uk> <87vbsxhf1q.fsf@jedbrown.org> Message-ID: <537E47C0.5010205@epcc.ed.ac.uk> Apologies everyone, I have another somewhat related question. If I actually want to use a variable time step with RosW is there any way to output the results at regular 0.05 seconds intervals? I realise this will interpolate between two points but it would be good to be able to plot all my data for the same time values. Using "-ts_adapt_type none" doesn't quite give me a "good enough" solution. Many thanks, Fiona -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From likunt at caltech.edu Thu May 22 13:55:16 2014 From: likunt at caltech.edu (Likun Tan) Date: Thu, 22 May 2014 14:55:16 -0400 Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: Hi Matt, I am not using PetscBinaryRead. I wrote a binary file from Petsc and use Matlab's function to read it, I.e. fileID = fopen('result.bin', 'w'); data = fread(fileID, 'double'); But this gives me unreasonable values of data. I checked this example http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex54f.F.html which is exactly what I need for my problem. Do you have a C version of it ? Many thanks. > On May 22, 2014, at 12:26 PM, Matthew Knepley wrote: > >> On Thu, May 22, 2014 at 11:20 AM, Likun Tan wrote: >> I am using VecView to output the vec in a binary file and tried to open it in Matlab. I define the precision to be double, but Matlab does not give reasonable values of my vec (almost extremely large or small or NaN values). Here is my code > > Are you using PetscBinaryRead.m in Matlab? If so, send the code snippet for a small vector, all the output, and the binary file. > > Matt > >> ======================================== >> PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); >> for(step=0; step> { >> //compute M at current step >> VecView(M, view); >> } >> PetscViewerDestroy(&view); >> ======================================= >> >> I am not sure if there is any problem of my Petsc code. Your comment is well appreciated. >> >> > On May 22, 2014, at 11:07 AM, Jed Brown wrote: >> > >> > Likun Tan writes: >> > >> >> Thanks for your suggestion. >> >> Using VecView or PetscViewerBinaryWrite will print the vec vertically, i.e. >> >> m1 >> >> m2 >> >> m3 >> >> m4 >> >> m5 >> >> m6 >> > >> > The binary viewer writes a *binary* file. No formatting or line breaks. >> > >> >> But I prefer the form >> >> >> >> m1 m2 m3 >> >> m4 m5 m6 >> >> >> >> Since in the end I will have about 1e+7 elements in the vec. If there is no way to output the vec in the second form, I will simply use VecView. Thanks. >> > >> > Use VecView to write a binary (not ASCII) file. See >> > PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, >> > etc. > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Thu May 22 13:58:10 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 22 May 2014 11:58:10 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 Message-ID: <537E48C2.3060105@gmail.com> Hi All, I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. The whole domain has 10 nodes in z direction. The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. The following is used to set the rhs value. 
call VecGetArrayF90(x_vec_loc, vecpointer, ierr) vecpointer = (calculate the rhs value here) call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) *Vecview Correct * *Vecview Wrong* dof local node Process [0] _Process [0] _ /_Process [0] _/ 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 1 2 0.000000000000000E+000 0 0 1 3 0.000000000000000E+000 0 0 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 1 5 0.000000000000000E+000 0 0 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A 2 1 0.000000000000000E+000 7.52316e-037 0 2 2 0.000000000000000E+000 0 0 2 3 0.000000000000000E+000 1.68459e-016 0 2 4 4.814824860968090E-035 0.1296 4.81482e-035 2 5 0.000000000000000E+000 _/Process [1]/_ Line B 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C 0 0 Process [1] 0 1.68459e-016 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 1 5 1.684590875336239E-016 0 0 1 6 0.129600000000000 128623 128623 2 1 1.371273884908092E-019 0 0 Line F 2 2 -7.222237291452134E-035 2 3 7.222237291452134E-035 2 4 0.000000000000000E+000 2 5 128623.169844761 2 6 0.000000000000000E+000 The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. *How can I handle this kind of local vector to global vector assembly?* *In fact, the codes can work if the dof and local node is as follows.* dof local node 1 1 2 1 1 2 2 2 1 3 2 3 Thanks and regards, Danyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 22 13:59:09 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 13:59:09 -0500 Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: On Thu, May 22, 2014 at 1:55 PM, Likun Tan wrote: > Hi Matt, > > I am not using PetscBinaryRead. I wrote a binary file from Petsc and use > Matlab's function to read it, I.e. > > fileID = fopen('result.bin', 'w'); > data = fread(fileID, 'double'); > > But this gives me unreasonable values of data. I checked this example > > > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex54f.F.html > > which is exactly what I need for my problem. Do you have a C version of it > ? Many thanks. > Why would you rewrite this? https://bitbucket.org/petsc/petsc/src/2c43c009db31f079231059c9efed501d4deca8bf/bin/matlab/PetscBinaryRead.m?at=master Matt > On May 22, 2014, at 12:26 PM, Matthew Knepley wrote: > > On Thu, May 22, 2014 at 11:20 AM, Likun Tan wrote: > >> I am using VecView to output the vec in a binary file and tried to open >> it in Matlab. I define the precision to be double, but Matlab does not give >> reasonable values of my vec (almost extremely large or small or NaN >> values). Here is my code >> > > Are you using PetscBinaryRead.m in Matlab? If so, send the code snippet > for a small vector, all the output, and the binary file. 
> > Matt > > >> ======================================== >> PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); >> for(step=0; step> { >> //compute M at current step >> VecView(M, view); >> } >> PetscViewerDestroy(&view); >> ======================================= >> >> I am not sure if there is any problem of my Petsc code. Your comment is >> well appreciated. >> >> > On May 22, 2014, at 11:07 AM, Jed Brown wrote: >> > >> > Likun Tan writes: >> > >> >> Thanks for your suggestion. >> >> Using VecView or PetscViewerBinaryWrite will print the vec vertically, >> i.e. >> >> m1 >> >> m2 >> >> m3 >> >> m4 >> >> m5 >> >> m6 >> > >> > The binary viewer writes a *binary* file. No formatting or line breaks. >> > >> >> But I prefer the form >> >> >> >> m1 m2 m3 >> >> m4 m5 m6 >> >> >> >> Since in the end I will have about 1e+7 elements in the vec. If there >> is no way to output the vec in the second form, I will simply use VecView. >> Thanks. >> > >> > Use VecView to write a binary (not ASCII) file. See >> > PetscViewerBinaryOpen(). You can look at it with python, matlab/octave, >> > etc. >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 22 14:01:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 May 2014 14:01:05 -0500 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E48C2.3060105@gmail.com> References: <537E48C2.3060105@gmail.com> Message-ID: On Thu, May 22, 2014 at 1:58 PM, Danyang Su wrote: > Hi All, > > I have a 1D transient flow problem (1 dof) coupled with energy balance (1 > dof), so the total dof per node is 2. > > The whole domain has 10 nodes in z direction. > > The program runs well with 1 processor but failed in 2 processors. The > matrix is the same for 1 processor and 2 processor but the rhs are > different. > > The following is used to set the rhs value. 
> > call VecGetArrayF90(x_vec_loc, vecpointer, ierr) > vecpointer = (calculate the rhs value here) > call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) > call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > > > *Vecview Correct * *Vecview Wrong* > dof local node Process [0] *Process > [0] * * Process [0] * > 1 1 1.395982780116148E-021 > 1.39598e-021 1.39598e-021 > 1 2 0.000000000000000E+000 > 0 0 > 1 3 0.000000000000000E+000 > 0 0 > 1 4 5.642372883946980E-037 > 5.64237e-037 5.64237e-037 > 1 5 0.000000000000000E+000 > 0 0 > 1 6 -1.395982780116148E-021 -7.52316e-037 > -1.39598e-021 Line A > 2 1 0.000000000000000E+000 > 7.52316e-037 0 > 2 2 0.000000000000000E+000 0 > 0 > 2 3 0.000000000000000E+000 > 1.68459e-016 0 > 2 4 4.814824860968090E-035 0.1296 > 4.81482e-035 > 2 5 0.000000000000000E+000 > *Process [1]* Line B > 2 6 -1.371273884908092E-019 0 > 7.52316e-037 Line C > > 0 0 > Process [1] > 0 1.68459e-016 > 1 1 1.395982780116148E-021 > 4.81482e-035 0.1296 > Line D > 1 2 -7.523163845262640E-037 0 > 1.37127e-019 Line E > 1 3 7.523163845262640E-037 > -7.22224e-035 -7.22224e-035 > 1 4 0.000000000000000E+000 > 7.22224e-035 7.22224e-035 > 1 5 1.684590875336239E-016 > 0 0 > 1 6 0.129600000000000 > 128623 128623 > 2 1 1.371273884908092E-019 > 0 > 0 Line F > 2 2 -7.222237291452134E-035 > 2 3 7.222237291452134E-035 > 2 4 0.000000000000000E+000 > 2 5 128623.169844761 > 2 6 0.000000000000000E+000 > > The red line (Line A, C, D and F) is the ghost values for 2 subdomains, > but when run with 2 processor, the program treates Line B, C, D, and E as > ghost values. > *How can I handle this kind of local vector to global vector assembly?* > Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. Matt > > *In fact, the codes can work if the dof and local node is as follows.* > dof local node > 1 1 > 2 1 > 1 2 > 2 2 > 1 3 > 2 3 > > Thanks and regards, > > Danyang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 22 15:22:30 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 14:22:30 -0600 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <537E47C0.5010205@epcc.ed.ac.uk> References: <537E2D0C.5050606@epcc.ed.ac.uk> <87vbsxhf1q.fsf@jedbrown.org> <537E47C0.5010205@epcc.ed.ac.uk> Message-ID: <87ha4hh8c9.fsf@jedbrown.org> Fiona Reid writes: > Apologies everyone, I have another somewhat related question. > > If I actually want to use a variable time step with RosW is there any > way to output the results at regular 0.05 seconds intervals? I realise > this will interpolate between two points but it would be good to be able > to plot all my data for the same time values. > > Using "-ts_adapt_type none" doesn't quite give me a "good enough" solution. I recommend writing a monitor (TSMonitorSet) that checks whether an "interesting" time has been passed on the step that just completed, then use TSInterpolate() to obtain a solution at that "interesting" time. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From danyang.su at gmail.com Thu May 22 16:44:47 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 22 May 2014 14:44:47 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: References: <537E48C2.3060105@gmail.com> Message-ID: <537E6FCF.4030408@gmail.com> On 22/05/2014 12:01 PM, Matthew Knepley wrote: > On Thu, May 22, 2014 at 1:58 PM, Danyang Su > wrote: > > Hi All, > > I have a 1D transient flow problem (1 dof) coupled with energy > balance (1 dof), so the total dof per node is 2. > > The whole domain has 10 nodes in z direction. > > The program runs well with 1 processor but failed in 2 processors. > The matrix is the same for 1 processor and 2 processor but the rhs > are different. > > The following is used to set the rhs value. > > call VecGetArrayF90(x_vec_loc, vecpointer, ierr) > vecpointer = (calculate the rhs value here) > call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) > call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > > *Vecview Correct * *Vecview Wrong* > dof local node Process [0] _Process [0] _ /_Process > [0] _/ > 1 1 1.395982780116148E-021 1.39598e-021 > 1.39598e-021 > 1 2 0.000000000000000E+000 > 0 0 > 1 3 0.000000000000000E+000 0 > 0 > 1 4 5.642372883946980E-037 5.64237e-037 > 5.64237e-037 > 1 5 0.000000000000000E+000 > 0 0 > 1 6 -1.395982780116148E-021 -7.52316e-037 > -1.39598e-021 Line A > 2 1 0.000000000000000E+000 7.52316e-037 0 > 2 2 0.000000000000000E+000 0 0 > 2 3 0.000000000000000E+000 1.68459e-016 0 > 2 4 4.814824860968090E-035 0.1296 4.81482e-035 > 2 5 0.000000000000000E+000 _/Process [1]/_ Line B > 2 6 -1.371273884908092E-019 0 > 7.52316e-037 Line C > 0 0 > Process [1] 0 1.68459e-016 > 1 1 1.395982780116148E-021 4.81482e-035 > 0.1296 Line D > 1 2 -7.523163845262640E-037 0 > 1.37127e-019 Line E > 1 3 7.523163845262640E-037 -7.22224e-035 > -7.22224e-035 > 1 4 0.000000000000000E+000 7.22224e-035 > 7.22224e-035 > 1 5 1.684590875336239E-016 > 0 0 > 1 6 0.129600000000000 > 128623 128623 > 2 1 1.371273884908092E-019 0 > 0 Line F > 2 2 -7.222237291452134E-035 > 2 3 7.222237291452134E-035 > 2 4 0.000000000000000E+000 > 2 5 128623.169844761 > 2 6 0.000000000000000E+000 > > The red line (Line A, C, D and F) is the ghost values for 2 > subdomains, but when run with 2 processor, the program treates > Line B, C, D, and E as ghost values. > *How can I handle this kind of local vector to global vector > assembly?* > > > Why are you not using DMDAVecGetArrayF90()? This is exactly what it is > for. Thanks, Matthew. I tried the following codes but still cannot get the correct global rhs vector call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) do i = 1,nvz !nvz is local node amount, here is 6 vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) vecpointer1d(1,i-1) = x_array_loc(i+nvz) end do call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) Now the rhs for 1 processor is as follows. It is not what I want. 
1.39598e-021 0 -0 -0 -0 -0 5.64237e-037 4.81482e-035 -0 -0 -7.52316e-037 -7.22224e-035 7.52316e-037 7.22224e-035 -0 -0 1.68459e-016 128623 0.1296 0 > > Matt > > > *In fact, the codes can work if the dof and local node is as follows.* > dof local node > 1 1 > 2 1 > 1 2 > 2 2 > 1 3 > 2 3 > > Thanks and regards, > > Danyang > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From likunt at caltech.edu Thu May 22 18:01:06 2014 From: likunt at caltech.edu (likunt at caltech.edu) Date: Thu, 22 May 2014 16:01:06 -0700 (PDT) Subject: [petsc-users] output vec In-Reply-To: References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> Message-ID: <58214.131.215.248.200.1400799666.squirrel@webmail.caltech.edu> Thanks for your suggestion. I've successfully read some binary files from petsc examples using PetscBinaryRead, but I still have problem when reading the binary file from my code. The issue is that only the first 8 elements are read correctly and the rest are either extremely large or small numbers. I am using the following commands for data output: PetscViewerBinaryOpen(PETSC_COMM_WORLD, 'result.txt', FILE_MODE_WRITE, &view); VecView(field.M, view); PetscViewerDestroy(&view); Would you please give comments on the possible reason for this? Thank you. > On Thu, May 22, 2014 at 1:55 PM, Likun Tan wrote: > >> Hi Matt, >> I am not using PetscBinaryRead. I wrote a binary file from Petsc and use >> Matlab's function to read it, I.e. >> fileID = fopen('result.bin', 'w'); >> data = fread(fileID, 'double'); >> But this gives me unreasonable values of data. I checked this example http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex54f.F.html which is exactly what I need for my problem. Do you have a C version of it >> ? Many thanks. > > Why would you rewrite this? > > > https://bitbucket.org/petsc/petsc/src/2c43c009db31f079231059c9efed501d4deca8bf/bin/matlab/PetscBinaryRead.m?at=master > > Matt > > >> On May 22, 2014, at 12:26 PM, Matthew Knepley wrote: >> On Thu, May 22, 2014 at 11:20 AM, Likun Tan wrote: >>> I am using VecView to output the vec in a binary file and tried to open >>> it in Matlab. I define the precision to be double, but Matlab does not give >>> reasonable values of my vec (almost extremely large or small or NaN values). Here is my code >> Are you using PetscBinaryRead.m in Matlab? If so, send the code snippet for a small vector, all the output, and the binary file. >> Matt >>> ======================================== >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD, NAME, FILE_MODE_WRITE, & view); >>> for(step=0; step>> { >>> //compute M at current step >>> VecView(M, view); >>> } >>> PetscViewerDestroy(&view); >>> ======================================= >>> I am not sure if there is any problem of my Petsc code. Your comment is >>> well appreciated. >>> > On May 22, 2014, at 11:07 AM, Jed Brown wrote: >>> > >>> > Likun Tan writes: >>> > >>> >> Thanks for your suggestion. >>> >> Using VecView or PetscViewerBinaryWrite will print the vec >>> vertically, >>> i.e. >>> >> m1 >>> >> m2 >>> >> m3 >>> >> m4 >>> >> m5 >>> >> m6 >>> > >>> > The binary viewer writes a *binary* file. No formatting or line >>> breaks. 
>>> > >>> >> But I prefer the form >>> >> >>> >> m1 m2 m3 >>> >> m4 m5 m6 >>> >> >>> >> Since in the end I will have about 1e+7 elements in the vec. If >>> there >>> is no way to output the vec in the second form, I will simply use VecView. >>> Thanks. >>> > >>> > Use VecView to write a binary (not ASCII) file. See >>> > PetscViewerBinaryOpen(). You can look at it with python, >>> matlab/octave, >>> > etc. >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > From bsmith at mcs.anl.gov Thu May 22 18:03:01 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 May 2014 18:03:01 -0500 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E6FCF.4030408@gmail.com> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> Message-ID: Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on x_vec_loc only on the first process. Is it what you expect? Also what is vecpointer1d declared to be? Barry On May 22, 2014, at 4:44 PM, Danyang Su wrote: > On 22/05/2014 12:01 PM, Matthew Knepley wrote: >> On Thu, May 22, 2014 at 1:58 PM, Danyang Su wrote: >> Hi All, >> >> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >> >> The whole domain has 10 nodes in z direction. >> >> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >> >> The following is used to set the rhs value. 
>> >> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >> vecpointer = (calculate the rhs value here) >> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >> >> Vecview Correct Vecview Wrong >> dof local node Process [0] Process [0] Process [0] >> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >> 1 2 0.000000000000000E+000 0 0 >> 1 3 0.000000000000000E+000 0 0 >> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >> 1 5 0.000000000000000E+000 0 0 >> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >> 2 1 0.000000000000000E+000 7.52316e-037 0 >> 2 2 0.000000000000000E+000 0 0 >> 2 3 0.000000000000000E+000 1.68459e-016 0 >> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >> 2 5 0.000000000000000E+000 Process [1] Line B >> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >> 0 0 >> Process [1] 0 1.68459e-016 >> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >> 1 5 1.684590875336239E-016 0 0 >> 1 6 0.129600000000000 128623 128623 >> 2 1 1.371273884908092E-019 0 0 Line F >> 2 2 -7.222237291452134E-035 >> 2 3 7.222237291452134E-035 >> 2 4 0.000000000000000E+000 >> 2 5 128623.169844761 >> 2 6 0.000000000000000E+000 >> >> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >> How can I handle this kind of local vector to global vector assembly? >> >> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. > Thanks, Matthew. > > I tried the following codes but still cannot get the correct global rhs vector > > call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) > do i = 1,nvz !nvz is local node amount, here is 6 > vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) > vecpointer1d(1,i-1) = x_array_loc(i+nvz) > end do > call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) > call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) > > > Now the rhs for 1 processor is as follows. It is not what I want. > > 1.39598e-021 > 0 > -0 > -0 > -0 > -0 > 5.64237e-037 > 4.81482e-035 > -0 > -0 > -7.52316e-037 > -7.22224e-035 > 7.52316e-037 > 7.22224e-035 > -0 > -0 > 1.68459e-016 > 128623 > 0.1296 > 0 >> >> Matt >> >> >> In fact, the codes can work if the dof and local node is as follows. >> dof local node >> 1 1 >> 2 1 >> 1 2 >> 2 2 >> 1 3 >> 2 3 >> >> Thanks and regards, >> >> Danyang >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener > From jed at jedbrown.org Thu May 22 18:04:31 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 17:04:31 -0600 Subject: [petsc-users] output vec In-Reply-To: <58214.131.215.248.200.1400799666.squirrel@webmail.caltech.edu> References: <53462.131.215.248.200.1400769130.squirrel@webmail.caltech.edu> <1738CF80-E5AF-47CE-BB0B-D36DA97AF5CB@caltech.edu> <874n0hj1id.fsf@jedbrown.org> <58214.131.215.248.200.1400799666.squirrel@webmail.caltech.edu> Message-ID: <87ppj5fm9s.fsf@jedbrown.org> likunt at caltech.edu writes: > Thanks for your suggestion. > I've successfully read some binary files from petsc examples using > PetscBinaryRead, but I still have problem when reading the binary file > from my code. The issue is that only the first 8 elements are read > correctly and the rest are either extremely large or small numbers. I am > using the following commands for data output: It sounds like your code for reading the file is incorrect. You can look at the implementation (in C, Matlab, or Python), or you can just call VecLoad(). > PetscViewerBinaryOpen(PETSC_COMM_WORLD, 'result.txt', FILE_MODE_WRITE, > &view); > VecView(field.M, view); > PetscViewerDestroy(&view); -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From danyang.su at gmail.com Thu May 22 18:33:01 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 22 May 2014 16:33:01 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> Message-ID: <537E892D.9030808@gmail.com> Hi Barry, I use the following routine to reorder from the local rhs to global rhs. PetscScalar, pointer :: vecpointer1d(:,:) call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector do i = nvzls,nvzle !local node number without ghost node vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example end do call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. x_vec_gbl x_vec_gbl dof node VecView(Current) dof node VecView (Expected) 1 1 1.39598e-021 1 1 1.39598e-021 2 1 0 1 2 0 1 2 -0 1 3 0 2 2 -0 1 4 5.64237e-037 1 3 -0 1 5 0 2 3 -0 1 6 -7.52316e-037 1 4 5.64237e-037 1 7 7.52316e-037 2 4 4.81482e-035 1 8 0 1 5 -0 1 9 1.68459e-016 2 5 -0 1 10 0.1296 1 6 -7.52316e-037 2 1 0 2 6 -7.22224e-035 2 2 0 1 7 7.52316e-037 2 3 0 2 7 7.22224e-035 2 4 4.81482e-035 1 8 -0 2 5 0 2 8 -0 2 6 -7.22224e-035 1 9 1.68459e-016 2 7 7.22224e-035 2 9 128623 2 8 0 1 10 0.1296 2 9 128623 2 10 0 2 10 0 Thanks and regards, Danyang On 22/05/2014 4:03 PM, Barry Smith wrote: > Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on > x_vec_loc only on the first process. Is it what you expect? > > Also what is vecpointer1d declared to be? 
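For reference, a minimal self-contained C sketch of the VecLoad() route suggested above. This is not from the original thread: the file name "result.txt" is taken from the quoted code, while the variable names and the final VecView() to stdout are made up for illustration.

#include <petscvec.h>

int main(int argc,char **argv)
{
  Vec            u;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* open the file that VecView() wrote, this time for reading */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"result.txt",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
  ierr = VecCreate(PETSC_COMM_WORLD,&u);CHKERRQ(ierr);
  ierr = VecLoad(u,viewer);CHKERRQ(ierr);        /* global size and values are read from the file */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = VecView(u,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = VecDestroy(&u);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

If several vectors were written to the same viewer inside a loop, as in the code quoted earlier, repeated VecLoad() calls on the read viewer should return them in the order they were written.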
> > > Barry > > On May 22, 2014, at 4:44 PM, Danyang Su wrote: > >> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su wrote: >>> Hi All, >>> >>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>> >>> The whole domain has 10 nodes in z direction. >>> >>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>> >>> The following is used to set the rhs value. >>> >>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>> vecpointer = (calculate the rhs value here) >>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>> >>> Vecview Correct Vecview Wrong >>> dof local node Process [0] Process [0] Process [0] >>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>> 1 2 0.000000000000000E+000 0 0 >>> 1 3 0.000000000000000E+000 0 0 >>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>> 1 5 0.000000000000000E+000 0 0 >>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>> 2 2 0.000000000000000E+000 0 0 >>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>> 2 5 0.000000000000000E+000 Process [1] Line B >>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>> 0 0 >>> Process [1] 0 1.68459e-016 >>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>> 1 5 1.684590875336239E-016 0 0 >>> 1 6 0.129600000000000 128623 128623 >>> 2 1 1.371273884908092E-019 0 0 Line F >>> 2 2 -7.222237291452134E-035 >>> 2 3 7.222237291452134E-035 >>> 2 4 0.000000000000000E+000 >>> 2 5 128623.169844761 >>> 2 6 0.000000000000000E+000 >>> >>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >>> How can I handle this kind of local vector to global vector assembly? >>> >>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >> Thanks, Matthew. >> >> I tried the following codes but still cannot get the correct global rhs vector >> >> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >> do i = 1,nvz !nvz is local node amount, here is 6 >> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >> end do >> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >> >> >> Now the rhs for 1 processor is as follows. It is not what I want. >> >> 1.39598e-021 >> 0 >> -0 >> -0 >> -0 >> -0 >> 5.64237e-037 >> 4.81482e-035 >> -0 >> -0 >> -7.52316e-037 >> -7.22224e-035 >> 7.52316e-037 >> 7.22224e-035 >> -0 >> -0 >> 1.68459e-016 >> 128623 >> 0.1296 >> 0 >>> Matt >>> >>> >>> In fact, the codes can work if the dof and local node is as follows. 
>>> dof local node >>> 1 1 >>> 2 1 >>> 1 2 >>> 2 2 >>> 1 3 >>> 2 3 >>> >>> Thanks and regards, >>> >>> Danyang >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 22 19:34:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 May 2014 19:34:52 -0500 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E892D.9030808@gmail.com> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> Message-ID: <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. Barry On May 22, 2014, at 6:33 PM, Danyang Su wrote: > Hi Barry, > > I use the following routine to reorder from the local rhs to global rhs. > > PetscScalar, pointer :: vecpointer1d(:,:) > > call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector > do i = nvzls,nvzle !local node number without ghost node > vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs > vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example > end do > call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) > > Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. > > x_vec_gbl x_vec_gbl > dof node VecView(Current) dof node VecView (Expected) > 1 1 1.39598e-021 1 1 1.39598e-021 > 2 1 0 1 2 0 > 1 2 -0 1 3 0 > 2 2 -0 1 4 5.64237e-037 > 1 3 -0 1 5 0 > 2 3 -0 1 6 -7.52316e-037 > 1 4 5.64237e-037 1 7 7.52316e-037 > 2 4 4.81482e-035 1 8 0 > 1 5 -0 1 9 1.68459e-016 > 2 5 -0 1 10 0.1296 > > 1 6 -7.52316e-037 2 1 0 > 2 6 -7.22224e-035 2 2 0 > 1 7 7.52316e-037 2 3 0 > 2 7 7.22224e-035 2 4 4.81482e-035 > 1 8 -0 2 5 0 > 2 8 -0 2 6 -7.22224e-035 > 1 9 1.68459e-016 2 7 7.22224e-035 > 2 9 128623 2 8 0 > 1 10 0.1296 2 9 128623 > 2 10 0 2 10 0 > > Thanks and regards, > > Danyang > > > On 22/05/2014 4:03 PM, Barry Smith wrote: >> Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on >> x_vec_loc only on the first process. Is it what you expect? >> >> Also what is vecpointer1d declared to be? >> >> >> Barry >> >> On May 22, 2014, at 4:44 PM, Danyang Su >> >> wrote: >> >> >>> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>> >>>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su >>>> wrote: >>>> Hi All, >>>> >>>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>>> >>>> The whole domain has 10 nodes in z direction. >>>> >>>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>>> >>>> The following is used to set the rhs value. 
>>>> >>>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>>> vecpointer = (calculate the rhs value here) >>>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>> >>>> Vecview Correct Vecview Wrong >>>> dof local node Process [0] Process [0] Process [0] >>>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>>> 1 2 0.000000000000000E+000 0 0 >>>> 1 3 0.000000000000000E+000 0 0 >>>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>>> 1 5 0.000000000000000E+000 0 0 >>>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>>> 2 2 0.000000000000000E+000 0 0 >>>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>>> 2 5 0.000000000000000E+000 Process [1] Line B >>>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>>> 0 0 >>>> Process [1] 0 1.68459e-016 >>>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>>> 1 5 1.684590875336239E-016 0 0 >>>> 1 6 0.129600000000000 128623 128623 >>>> 2 1 1.371273884908092E-019 0 0 Line F >>>> 2 2 -7.222237291452134E-035 >>>> 2 3 7.222237291452134E-035 >>>> 2 4 0.000000000000000E+000 >>>> 2 5 128623.169844761 >>>> 2 6 0.000000000000000E+000 >>>> >>>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >>>> How can I handle this kind of local vector to global vector assembly? >>>> >>>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >>>> >>> Thanks, Matthew. >>> >>> I tried the following codes but still cannot get the correct global rhs vector >>> >>> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>> do i = 1,nvz !nvz is local node amount, here is 6 >>> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >>> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >>> end do >>> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>> >>> >>> Now the rhs for 1 processor is as follows. It is not what I want. >>> >>> 1.39598e-021 >>> 0 >>> -0 >>> -0 >>> -0 >>> -0 >>> 5.64237e-037 >>> 4.81482e-035 >>> -0 >>> -0 >>> -7.52316e-037 >>> -7.22224e-035 >>> 7.52316e-037 >>> 7.22224e-035 >>> -0 >>> -0 >>> 1.68459e-016 >>> 128623 >>> 0.1296 >>> 0 >>> >>>> Matt >>>> >>>> >>>> In fact, the codes can work if the dof and local node is as follows. >>>> dof local node >>>> 1 1 >>>> 2 1 >>>> 1 2 >>>> 2 2 >>>> 1 3 >>>> 2 3 >>>> >>>> Thanks and regards, >>>> >>>> Danyang >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> > From danyang.su at gmail.com Thu May 22 19:42:53 2014 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 22 May 2014 17:42:53 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> Message-ID: <537E998D.8040101@gmail.com> On 22/05/2014 5:34 PM, Barry Smith wrote: > DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. Then, is there any routine convert matrix to be "interlaced"? Thanks, Danyang > > Barry > > On May 22, 2014, at 6:33 PM, Danyang Su wrote: > >> Hi Barry, >> >> I use the following routine to reorder from the local rhs to global rhs. >> >> PetscScalar, pointer :: vecpointer1d(:,:) >> >> call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector >> do i = nvzls,nvzle !local node number without ghost node >> vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs >> vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example >> end do >> call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) >> >> Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. >> >> x_vec_gbl x_vec_gbl >> dof node VecView(Current) dof node VecView (Expected) >> 1 1 1.39598e-021 1 1 1.39598e-021 >> 2 1 0 1 2 0 >> 1 2 -0 1 3 0 >> 2 2 -0 1 4 5.64237e-037 >> 1 3 -0 1 5 0 >> 2 3 -0 1 6 -7.52316e-037 >> 1 4 5.64237e-037 1 7 7.52316e-037 >> 2 4 4.81482e-035 1 8 0 >> 1 5 -0 1 9 1.68459e-016 >> 2 5 -0 1 10 0.1296 >> >> 1 6 -7.52316e-037 2 1 0 >> 2 6 -7.22224e-035 2 2 0 >> 1 7 7.52316e-037 2 3 0 >> 2 7 7.22224e-035 2 4 4.81482e-035 >> 1 8 -0 2 5 0 >> 2 8 -0 2 6 -7.22224e-035 >> 1 9 1.68459e-016 2 7 7.22224e-035 >> 2 9 128623 2 8 0 >> 1 10 0.1296 2 9 128623 >> 2 10 0 2 10 0 >> >> Thanks and regards, >> >> Danyang >> >> >> On 22/05/2014 4:03 PM, Barry Smith wrote: >>> Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on >>> x_vec_loc only on the first process. Is it what you expect? >>> >>> Also what is vecpointer1d declared to be? >>> >>> >>> Barry >>> >>> On May 22, 2014, at 4:44 PM, Danyang Su >>> >>> wrote: >>> >>> >>>> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>>> >>>>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su >>>>> wrote: >>>>> Hi All, >>>>> >>>>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>>>> >>>>> The whole domain has 10 nodes in z direction. >>>>> >>>>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>>>> >>>>> The following is used to set the rhs value. 
>>>>> >>>>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>>>> vecpointer = (calculate the rhs value here) >>>>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>> >>>>> Vecview Correct Vecview Wrong >>>>> dof local node Process [0] Process [0] Process [0] >>>>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>>>> 1 2 0.000000000000000E+000 0 0 >>>>> 1 3 0.000000000000000E+000 0 0 >>>>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>>>> 1 5 0.000000000000000E+000 0 0 >>>>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>>>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>>>> 2 2 0.000000000000000E+000 0 0 >>>>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>>>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>>>> 2 5 0.000000000000000E+000 Process [1] Line B >>>>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>>>> 0 0 >>>>> Process [1] 0 1.68459e-016 >>>>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>>>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>>>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>>>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>>>> 1 5 1.684590875336239E-016 0 0 >>>>> 1 6 0.129600000000000 128623 128623 >>>>> 2 1 1.371273884908092E-019 0 0 Line F >>>>> 2 2 -7.222237291452134E-035 >>>>> 2 3 7.222237291452134E-035 >>>>> 2 4 0.000000000000000E+000 >>>>> 2 5 128623.169844761 >>>>> 2 6 0.000000000000000E+000 >>>>> >>>>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >>>>> How can I handle this kind of local vector to global vector assembly? >>>>> >>>>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >>>>> >>>> Thanks, Matthew. >>>> >>>> I tried the following codes but still cannot get the correct global rhs vector >>>> >>>> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>> do i = 1,nvz !nvz is local node amount, here is 6 >>>> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >>>> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >>>> end do >>>> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>> >>>> >>>> Now the rhs for 1 processor is as follows. It is not what I want. >>>> >>>> 1.39598e-021 >>>> 0 >>>> -0 >>>> -0 >>>> -0 >>>> -0 >>>> 5.64237e-037 >>>> 4.81482e-035 >>>> -0 >>>> -0 >>>> -7.52316e-037 >>>> -7.22224e-035 >>>> 7.52316e-037 >>>> 7.22224e-035 >>>> -0 >>>> -0 >>>> 1.68459e-016 >>>> 128623 >>>> 0.1296 >>>> 0 >>>> >>>>> Matt >>>>> >>>>> >>>>> In fact, the codes can work if the dof and local node is as follows. >>>>> dof local node >>>>> 1 1 >>>>> 2 1 >>>>> 1 2 >>>>> 2 2 >>>>> 1 3 >>>>> 2 3 >>>>> >>>>> Thanks and regards, >>>>> >>>>> Danyang >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>>> -- Norbert Wiener >>>>> From jed at jedbrown.org Thu May 22 19:51:12 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 18:51:12 -0600 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E998D.8040101@gmail.com> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> <537E998D.8040101@gmail.com> Message-ID: <87egzlfhbz.fsf@jedbrown.org> Danyang Su writes: > On 22/05/2014 5:34 PM, Barry Smith wrote: >> DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. > Then, is there any routine convert matrix to be "interlaced"? I don't know what you mean. DMCreateMatrix() will give you a Mat preallocated for interlaced and that's how you should assemble it (e.g., with MatSetValuesBlockedStencil()). There are several examples that use this interface. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Thu May 22 19:51:38 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 22 May 2014 19:51:38 -0500 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: <537E998D.8040101@gmail.com> References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> <537E998D.8040101@gmail.com> Message-ID: VecStrideScatter,Gather can be used to go from an interlaced vector to a separate vector for each component. You can also write code as below where you have a non-interlated ARRAY and you put/take the values into the array obtained with the DMDAVecGetArrayF90. In other words PETSc vectors remained interlaced but you work with other arrays that are not interlaced. Barry On May 22, 2014, at 7:42 PM, Danyang Su wrote: > On 22/05/2014 5:34 PM, Barry Smith wrote: >> DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. > Then, is there any routine convert matrix to be "interlaced"? > > Thanks, > > Danyang >> >> Barry >> >> On May 22, 2014, at 6:33 PM, Danyang Su wrote: >> >>> Hi Barry, >>> >>> I use the following routine to reorder from the local rhs to global rhs. >>> >>> PetscScalar, pointer :: vecpointer1d(:,:) >>> >>> call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector >>> do i = nvzls,nvzle !local node number without ghost node >>> vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs >>> vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example >>> end do >>> call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) >>> >>> Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. 
>>> >>> x_vec_gbl x_vec_gbl >>> dof node VecView(Current) dof node VecView (Expected) >>> 1 1 1.39598e-021 1 1 1.39598e-021 >>> 2 1 0 1 2 0 >>> 1 2 -0 1 3 0 >>> 2 2 -0 1 4 5.64237e-037 >>> 1 3 -0 1 5 0 >>> 2 3 -0 1 6 -7.52316e-037 >>> 1 4 5.64237e-037 1 7 7.52316e-037 >>> 2 4 4.81482e-035 1 8 0 >>> 1 5 -0 1 9 1.68459e-016 >>> 2 5 -0 1 10 0.1296 >>> 1 6 -7.52316e-037 2 1 0 >>> 2 6 -7.22224e-035 2 2 0 >>> 1 7 7.52316e-037 2 3 0 >>> 2 7 7.22224e-035 2 4 4.81482e-035 >>> 1 8 -0 2 5 0 >>> 2 8 -0 2 6 -7.22224e-035 >>> 1 9 1.68459e-016 2 7 7.22224e-035 >>> 2 9 128623 2 8 0 >>> 1 10 0.1296 2 9 128623 >>> 2 10 0 2 10 0 >>> >>> Thanks and regards, >>> >>> Danyang >>> >>> >>> On 22/05/2014 4:03 PM, Barry Smith wrote: >>>> Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on >>>> x_vec_loc only on the first process. Is it what you expect? >>>> >>>> Also what is vecpointer1d declared to be? >>>> >>>> >>>> Barry >>>> >>>> On May 22, 2014, at 4:44 PM, Danyang Su >>>> >>>> wrote: >>>> >>>> >>>>> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>>>> >>>>>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su >>>>>> wrote: >>>>>> Hi All, >>>>>> >>>>>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>>>>> >>>>>> The whole domain has 10 nodes in z direction. >>>>>> >>>>>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>>>>> >>>>>> The following is used to set the rhs value. >>>>>> >>>>>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>>>>> vecpointer = (calculate the rhs value here) >>>>>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>> >>>>>> Vecview Correct Vecview Wrong >>>>>> dof local node Process [0] Process [0] Process [0] >>>>>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>>>>> 1 2 0.000000000000000E+000 0 0 >>>>>> 1 3 0.000000000000000E+000 0 0 >>>>>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>>>>> 1 5 0.000000000000000E+000 0 0 >>>>>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>>>>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>>>>> 2 2 0.000000000000000E+000 0 0 >>>>>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>>>>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>>>>> 2 5 0.000000000000000E+000 Process [1] Line B >>>>>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>>>>> 0 0 >>>>>> Process [1] 0 1.68459e-016 >>>>>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>>>>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>>>>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>>>>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>>>>> 1 5 1.684590875336239E-016 0 0 >>>>>> 1 6 0.129600000000000 128623 128623 >>>>>> 2 1 1.371273884908092E-019 0 0 Line F >>>>>> 2 2 -7.222237291452134E-035 >>>>>> 2 3 7.222237291452134E-035 >>>>>> 2 4 0.000000000000000E+000 >>>>>> 2 5 128623.169844761 >>>>>> 2 6 0.000000000000000E+000 >>>>>> >>>>>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. >>>>>> How can I handle this kind of local vector to global vector assembly? 
>>>>>> >>>>>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >>>>>> >>>>> Thanks, Matthew. >>>>> >>>>> I tried the following codes but still cannot get the correct global rhs vector >>>>> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>>> do i = 1,nvz !nvz is local node amount, here is 6 >>>>> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >>>>> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >>>>> end do >>>>> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>> >>>>> >>>>> Now the rhs for 1 processor is as follows. It is not what I want. >>>>> >>>>> 1.39598e-021 >>>>> 0 >>>>> -0 >>>>> -0 >>>>> -0 >>>>> -0 >>>>> 5.64237e-037 >>>>> 4.81482e-035 >>>>> -0 >>>>> -0 >>>>> -7.52316e-037 >>>>> -7.22224e-035 >>>>> 7.52316e-037 >>>>> 7.22224e-035 >>>>> -0 >>>>> -0 >>>>> 1.68459e-016 >>>>> 128623 >>>>> 0.1296 >>>>> 0 >>>>> >>>>>> Matt >>>>>> >>>>>> In fact, the codes can work if the dof and local node is as follows. >>>>>> dof local node >>>>>> 1 1 >>>>>> 2 1 >>>>>> 1 2 >>>>>> 2 2 >>>>>> 1 3 >>>>>> 2 3 >>>>>> >>>>>> Thanks and regards, >>>>>> >>>>>> Danyang >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> > From avi.mosher at wyss.harvard.edu Thu May 22 23:56:08 2014 From: avi.mosher at wyss.harvard.edu (Robinson-Mosher, Avram Lev) Date: Fri, 23 May 2014 00:56:08 -0400 Subject: [petsc-users] Is it possible to create a BAIJ matrix with non-square blocks? Message-ID: Hi all, I'm interested in using PETSc's sparse matrices with block elements, but I would like the elements to be small non-square matrices (e.g., 4 by 3). Is this possible? I see that the general construction functions assume that the elements will be square. Regards, -Avi From jed at jedbrown.org Fri May 23 00:14:32 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 22 May 2014 23:14:32 -0600 Subject: [petsc-users] Is it possible to create a BAIJ matrix with non-square blocks? In-Reply-To: References: Message-ID: <8738g1dqkn.fsf@jedbrown.org> "Robinson-Mosher, Avram Lev" writes: > Hi all, I'm interested in using PETSc's sparse matrices with block > elements, but I would like the elements to be small non-square > matrices (e.g., 4 by 3). Is this possible? I see that the general > construction functions assume that the elements will be square. The constant block-size matrices are only for square blocks. But you can often get some benefits by using AIJ matrices with Inodes (default). So just create an AIJ matrix with fields interlaced so that 4x3 blocks exist, then PETSc will coalesce the consecutive rows with identical sparsity pattern, making the result similar to 4x1 blocks. This already provides most of the bandwidth benefit of blocked matrices. -------------- next part -------------- A non-text attachment was scrubbed... 
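To make the interlacing suggestion concrete, here is a rough self-contained C sketch (not from the original mail). The 4 by 3 block size matches the question, but the matrix dimensions, preallocation counts, and numerical values are placeholders. Rows are grouped in fours and columns in threes, so one 4 by 3 element block is inserted with a single MatSetValues() call, and rows 4*i..4*i+3 end up with identical sparsity patterns, which is what lets the Inode code coalesce them.

#include <petscmat.h>

int main(int argc,char **argv)
{
  Mat            A;
  PetscInt       mb = 8,nb = 8;                  /* number of 4-row and 3-column groups (placeholders) */
  PetscInt       i = 2,j = 5,k,idxm[4],idxn[3];
  PetscScalar    vals[12] = {1,2,3,4,5,6,7,8,9,10,11,12};  /* one dense 4x3 block, row-major */
  PetscMPIInt    rank;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
  /* AIJ matrix with 4*mb rows and 3*nb columns; preallocation numbers are placeholders */
  ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,4*mb,3*nb,
                      12,NULL,12,NULL,&A);CHKERRQ(ierr);
  if (!rank) {
    /* global rows 4*i..4*i+3 and columns 3*j..3*j+2 form the (i,j) 4x3 "block" */
    for (k=0; k<4; k++) idxm[k] = 4*i+k;
    for (k=0; k<3; k++) idxn[k] = 3*j+k;
    ierr = MatSetValues(A,4,idxm,3,idxn,vals,ADD_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

Viewing the matrix information (for example with MatView() in the info format) reports whether the I-node routines are being used; the option -mat_no_inode turns them off if one wants to compare.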
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From fiona at epcc.ed.ac.uk Fri May 23 08:14:04 2014 From: fiona at epcc.ed.ac.uk (Fiona Reid) Date: Fri, 23 May 2014 14:14:04 +0100 Subject: [petsc-users] Obtaining TS rosw solver output at regular time steps In-Reply-To: <87ha4hh8c9.fsf@jedbrown.org> References: <537E2D0C.5050606@epcc.ed.ac.uk> <87vbsxhf1q.fsf@jedbrown.org> <537E47C0.5010205@epcc.ed.ac.uk> <87ha4hh8c9.fsf@jedbrown.org> Message-ID: <537F499C.2000803@epcc.ed.ac.uk> On 22/05/2014 21:22, Jed Brown wrote: > I recommend writing a monitor (TSMonitorSet) that checks whether an > "interesting" time has been passed on the step that just completed, then > use TSInterpolate() to obtain a solution at that "interesting" time. Many thanks Jed. I have that all working now. Cheers, Fiona -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From info at jubileedvds.com Fri May 23 10:01:46 2014 From: info at jubileedvds.com (Jubilee DVDs) Date: Fri, 23 May 2014 17:01:46 +0200 (SAST) Subject: [petsc-users] Jubilee DVDs Newsletter Message-ID: <1195896-1400857192269-133838-250313049-1-0@b.ss40.shsend.com> An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Fri May 23 13:16:04 2014 From: danyang.su at gmail.com (Danyang Su) Date: Fri, 23 May 2014 11:16:04 -0700 Subject: [petsc-users] Question on local vec to global vec for dof > 1 In-Reply-To: References: <537E48C2.3060105@gmail.com> <537E6FCF.4030408@gmail.com> <537E892D.9030808@gmail.com> <3C547E6A-61DF-40A8-9B77-E38DC783FB08@mcs.anl.gov> <537E998D.8040101@gmail.com> Message-ID: <537F9064.90202@gmail.com> Hi All, Thanks for all your kindly reply. I convert the my codes from non-interlaced structure to interlaced structured and it can work now. Thanks and regards, Danyang On 22/05/2014 5:51 PM, Barry Smith wrote: > VecStrideScatter,Gather can be used to go from an interlaced vector to a separate vector for each component. You can also write code as below where you have a non-interlated ARRAY and you put/take the values into the array obtained with the DMDAVecGetArrayF90. In other words PETSc vectors remained interlaced but you work with other arrays that are not interlaced. > > Barry > > On May 22, 2014, at 7:42 PM, Danyang Su wrote: > >> On 22/05/2014 5:34 PM, Barry Smith wrote: >>> DMDA does not work that way. Local and global vectors associated with DA?s are always ?interlaced?. >> Then, is there any routine convert matrix to be "interlaced"? >> >> Thanks, >> >> Danyang >>> Barry >>> >>> On May 22, 2014, at 6:33 PM, Danyang Su wrote: >>> >>>> Hi Barry, >>>> >>>> I use the following routine to reorder from the local rhs to global rhs. >>>> >>>> PetscScalar, pointer :: vecpointer1d(:,:) >>>> >>>> call DMDAVecGetArrayF90(da,x_vec_gbl,vecpointer1d,ierr) !x_vec_gbl is a global vector created by DMCreateGlobalVector >>>> do i = nvzls,nvzle !local node number without ghost node >>>> vecpointer1d(0,i-1) = x_array_loc(i-nvzgls+1) !x_array_loc is local rhs >>>> vecpointer1d(1,i-1) = x_array_loc(i-nvzgls+1+nvz) !nvz = 6 for the present 1d example >>>> end do >>>> call DMDAVecRestoreArrayF90(da,x_vec_gbl,vecpointer1d,ierr) >>>> >>>> Now Vecview gives the same rhs for 1 processor and 2 processors, but rhs order is not what I expected. I want global rhs vector hold the values of dof=1 first and then dof=2 as the local matrix and rhs hold value in this order. 
>>>> >>>> x_vec_gbl x_vec_gbl >>>> dof node VecView(Current) dof node VecView (Expected) >>>> 1 1 1.39598e-021 1 1 1.39598e-021 >>>> 2 1 0 1 2 0 >>>> 1 2 -0 1 3 0 >>>> 2 2 -0 1 4 5.64237e-037 >>>> 1 3 -0 1 5 0 >>>> 2 3 -0 1 6 -7.52316e-037 >>>> 1 4 5.64237e-037 1 7 7.52316e-037 >>>> 2 4 4.81482e-035 1 8 0 >>>> 1 5 -0 1 9 1.68459e-016 >>>> 2 5 -0 1 10 0.1296 >>>> 1 6 -7.52316e-037 2 1 0 >>>> 2 6 -7.22224e-035 2 2 0 >>>> 1 7 7.52316e-037 2 3 0 >>>> 2 7 7.22224e-035 2 4 4.81482e-035 >>>> 1 8 -0 2 5 0 >>>> 2 8 -0 2 6 -7.22224e-035 >>>> 1 9 1.68459e-016 2 7 7.22224e-035 >>>> 2 9 128623 2 8 0 >>>> 1 10 0.1296 2 9 128623 >>>> 2 10 0 2 10 0 >>>> >>>> Thanks and regards, >>>> >>>> Danyang >>>> >>>> >>>> On 22/05/2014 4:03 PM, Barry Smith wrote: >>>>> Always do things one small step at at time. On one process what is x_vec_loc (use VecView again on it). Is it what you expect? Then run on two processes but call VecView on >>>>> x_vec_loc only on the first process. Is it what you expect? >>>>> >>>>> Also what is vecpointer1d declared to be? >>>>> >>>>> >>>>> Barry >>>>> >>>>> On May 22, 2014, at 4:44 PM, Danyang Su >>>>> >>>>> wrote: >>>>> >>>>> >>>>>> On 22/05/2014 12:01 PM, Matthew Knepley wrote: >>>>>> >>>>>>> On Thu, May 22, 2014 at 1:58 PM, Danyang Su >>>>>>> wrote: >>>>>>> Hi All, >>>>>>> >>>>>>> I have a 1D transient flow problem (1 dof) coupled with energy balance (1 dof), so the total dof per node is 2. >>>>>>> >>>>>>> The whole domain has 10 nodes in z direction. >>>>>>> >>>>>>> The program runs well with 1 processor but failed in 2 processors. The matrix is the same for 1 processor and 2 processor but the rhs are different. >>>>>>> >>>>>>> The following is used to set the rhs value. >>>>>>> >>>>>>> call VecGetArrayF90(x_vec_loc, vecpointer, ierr) >>>>>>> vecpointer = (calculate the rhs value here) >>>>>>> call VecRestoreArrayF90(x_vec_loc,vecpointer,ierr) >>>>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>>> >>>>>>> Vecview Correct Vecview Wrong >>>>>>> dof local node Process [0] Process [0] Process [0] >>>>>>> 1 1 1.395982780116148E-021 1.39598e-021 1.39598e-021 >>>>>>> 1 2 0.000000000000000E+000 0 0 >>>>>>> 1 3 0.000000000000000E+000 0 0 >>>>>>> 1 4 5.642372883946980E-037 5.64237e-037 5.64237e-037 >>>>>>> 1 5 0.000000000000000E+000 0 0 >>>>>>> 1 6 -1.395982780116148E-021 -7.52316e-037 -1.39598e-021 Line A >>>>>>> 2 1 0.000000000000000E+000 7.52316e-037 0 >>>>>>> 2 2 0.000000000000000E+000 0 0 >>>>>>> 2 3 0.000000000000000E+000 1.68459e-016 0 >>>>>>> 2 4 4.814824860968090E-035 0.1296 4.81482e-035 >>>>>>> 2 5 0.000000000000000E+000 Process [1] Line B >>>>>>> 2 6 -1.371273884908092E-019 0 7.52316e-037 Line C >>>>>>> 0 0 >>>>>>> Process [1] 0 1.68459e-016 >>>>>>> 1 1 1.395982780116148E-021 4.81482e-035 0.1296 Line D >>>>>>> 1 2 -7.523163845262640E-037 0 1.37127e-019 Line E >>>>>>> 1 3 7.523163845262640E-037 -7.22224e-035 -7.22224e-035 >>>>>>> 1 4 0.000000000000000E+000 7.22224e-035 7.22224e-035 >>>>>>> 1 5 1.684590875336239E-016 0 0 >>>>>>> 1 6 0.129600000000000 128623 128623 >>>>>>> 2 1 1.371273884908092E-019 0 0 Line F >>>>>>> 2 2 -7.222237291452134E-035 >>>>>>> 2 3 7.222237291452134E-035 >>>>>>> 2 4 0.000000000000000E+000 >>>>>>> 2 5 128623.169844761 >>>>>>> 2 6 0.000000000000000E+000 >>>>>>> >>>>>>> The red line (Line A, C, D and F) is the ghost values for 2 subdomains, but when run with 2 processor, the program treates Line B, C, D, and E as ghost values. 
>>>>>>> How can I handle this kind of local vector to global vector assembly? >>>>>>> >>>>>>> Why are you not using DMDAVecGetArrayF90()? This is exactly what it is for. >>>>>>> >>>>>> Thanks, Matthew. >>>>>> >>>>>> I tried the following codes but still cannot get the correct global rhs vector >>>>>> call DMDAVecGetArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>>>> do i = 1,nvz !nvz is local node amount, here is 6 >>>>>> vecpointer1d(0,i-1) = x_array_loc(i) !assume x_array_loc is the local rhs (the third column in the above mentioned data) >>>>>> vecpointer1d(1,i-1) = x_array_loc(i+nvz) >>>>>> end do >>>>>> call DMDAVecRestoreArrayF90(da,x_vec_loc,vecpointer1d,ierr) >>>>>> call DMLocalToGlobalBegin(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>> call DMLocalToGlobalEnd(da,x_vec_loc,INSERT_VALUES, x_vec_gbl,ierr) >>>>>> >>>>>> >>>>>> Now the rhs for 1 processor is as follows. It is not what I want. >>>>>> >>>>>> 1.39598e-021 >>>>>> 0 >>>>>> -0 >>>>>> -0 >>>>>> -0 >>>>>> -0 >>>>>> 5.64237e-037 >>>>>> 4.81482e-035 >>>>>> -0 >>>>>> -0 >>>>>> -7.52316e-037 >>>>>> -7.22224e-035 >>>>>> 7.52316e-037 >>>>>> 7.22224e-035 >>>>>> -0 >>>>>> -0 >>>>>> 1.68459e-016 >>>>>> 128623 >>>>>> 0.1296 >>>>>> 0 >>>>>> >>>>>>> Matt >>>>>>> >>>>>>> In fact, the codes can work if the dof and local node is as follows. >>>>>>> dof local node >>>>>>> 1 1 >>>>>>> 2 1 >>>>>>> 1 2 >>>>>>> 2 2 >>>>>>> 1 3 >>>>>>> 2 3 >>>>>>> >>>>>>> Thanks and regards, >>>>>>> >>>>>>> Danyang >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> From spk at ldeo.columbia.edu Sat May 24 07:04:07 2014 From: spk at ldeo.columbia.edu (Samar Khatiwala) Date: Sat, 24 May 2014 08:04:07 -0400 Subject: [petsc-users] Installation problems on IBM machine Message-ID: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> Hello, I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: ========================================== Building PETSc using CMake with 21 build threads ========================================== make: 1254-002 Cannot find a rule to create target 21 from dependencies. Stop. make: 1254-004 The error code from the last command is 2. ? Please see attached logs. I configured with: config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no avail. Any help would be appreciated. Thanks very much! Samar -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log.gz Type: application/x-gzip Size: 3162 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log.gz Type: application/x-gzip Size: 140308 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat May 24 08:22:40 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 24 May 2014 08:22:40 -0500 Subject: [petsc-users] Installation problems on IBM machine In-Reply-To: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> References: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> Message-ID: <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> Try using make all-legacy Barry That is some cmake problem; fortunately we are abandoning cmake for the future. The reason the problem persists is likely because cmake has cached something somewhere that doesn?t get rebuilt. On May 24, 2014, at 7:04 AM, Samar Khatiwala wrote: > Hello, > > I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: > > ========================================== > Building PETSc using CMake with 21 build threads > ========================================== > make: 1254-002 Cannot find a rule to create target 21 from dependencies. > Stop. > make: 1254-004 The error code from the last command is 2. > ? > > Please see attached logs. I configured with: > > config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" > > What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with > --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now > get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no > avail. > > Any help would be appreciated. Thanks very much! > > Samar > > > > From s_g at berkeley.edu Sat May 24 11:12:09 2014 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Sat, 24 May 2014 09:12:09 -0700 Subject: [petsc-users] Installation problems on IBM machine In-Reply-To: <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> References: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> Message-ID: <5380C4D9.7060602@berkeley.edu> > fortunately we are abandoning cmake for the future. Barry, Are you going to go back to standard make? or have you selected a new make system? -sanjay On 5/24/14 6:22 AM, Barry Smith wrote: > > Try using make all-legacy > > Barry > > That is some cmake problem; fortunately we are abandoning cmake for the future. The reason the problem persists is likely because cmake has cached something somewhere that doesn?t get rebuilt. > > > On May 24, 2014, at 7:04 AM, Samar Khatiwala wrote: > >> Hello, >> >> I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: >> >> ========================================== >> Building PETSc using CMake with 21 build threads >> ========================================== >> make: 1254-002 Cannot find a rule to create target 21 from dependencies. >> Stop. >> make: 1254-004 The error code from the last command is 2. >> ? >> >> Please see attached logs. 
I configured with: >> >> config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" >> >> What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with >> --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now >> get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no >> avail. >> >> Any help would be appreciated. Thanks very much! >> >> Samar >> >> >> >> From spk at ldeo.columbia.edu Sat May 24 11:13:56 2014 From: spk at ldeo.columbia.edu (Samar Khatiwala) Date: Sat, 24 May 2014 12:13:56 -0400 Subject: [petsc-users] Installation problems on IBM machine In-Reply-To: <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> References: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> Message-ID: <30DC2CFB-6950-469D-B6A4-4B1DB1D40C96@ldeo.columbia.edu> Hi Barry, That solved the problem! Thanks so much for the fast and helpful reply! Best, Samar On May 24, 2014, at 9:22 AM, Barry Smith wrote: > > > Try using make all-legacy > > Barry > > That is some cmake problem; fortunately we are abandoning cmake for the future. The reason the problem persists is likely because cmake has cached something somewhere that doesn?t get rebuilt. > > > On May 24, 2014, at 7:04 AM, Samar Khatiwala wrote: > >> Hello, >> >> I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: >> >> ========================================== >> Building PETSc using CMake with 21 build threads >> ========================================== >> make: 1254-002 Cannot find a rule to create target 21 from dependencies. >> Stop. >> make: 1254-004 The error code from the last command is 2. >> ? >> >> Please see attached logs. I configured with: >> >> config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" >> >> What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with >> --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now >> get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no >> avail. >> >> Any help would be appreciated. Thanks very much! >> >> Samar >> >> >> >> > From bsmith at mcs.anl.gov Sat May 24 11:33:35 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 24 May 2014 11:33:35 -0500 Subject: [petsc-users] Installation problems on IBM machine In-Reply-To: <5380C4D9.7060602@berkeley.edu> References: <9164138C-C00E-468D-841D-F8161FBF9906@ldeo.columbia.edu> <31E95C29-7E2C-4585-9C92-F4457B653704@mcs.anl.gov> <5380C4D9.7060602@berkeley.edu> Message-ID: <1EC2EB94-D76B-4058-92C7-0C4DF65B70E7@mcs.anl.gov> In the development version we have the default now be very simple code that uses gnumake (much simpler than the cmake stuff). 
We?ve found most machines have gnumake installed and if not, gnumake is very portable and ./configure has ?download-gnumake so the user doesn?t even need to deal with installing it. In our next release we will support legacy, cmake, gnumake for compiling. After that if all goes well with gnumake we may remove the legacy and cmake stuff. Barry On May 24, 2014, at 11:12 AM, Sanjay Govindjee wrote: >> fortunately we are abandoning cmake for the future. > > Barry, > Are you going to go back to standard make? > or have you selected a new make system? > -sanjay > > On 5/24/14 6:22 AM, Barry Smith wrote: >> >> Try using make all-legacy >> >> Barry >> >> That is some cmake problem; fortunately we are abandoning cmake for the future. The reason the problem persists is likely because cmake has cached something somewhere that doesn?t get rebuilt. >> >> >> On May 24, 2014, at 7:04 AM, Samar Khatiwala wrote: >> >>> Hello, >>> >>> I'm having trouble installing PETSc on an IBM machine (power6). After some trial and error I managed to configure but the 'make all' step fails with: >>> >>> ========================================== >>> Building PETSc using CMake with 21 build threads >>> ========================================== >>> make: 1254-002 Cannot find a rule to create target 21 from dependencies. >>> Stop. >>> make: 1254-004 The error code from the last command is 2. >>> ? >>> >>> Please see attached logs. I configured with: >>> >>> config/configure.py --with-cc=mpcc --with-cxx=mpCC --with-clanguage=c --with-fc=mpxlf90 --with-debugging=0 FFLAGS="-qextname" --with-batch=1 --known-mpi-shared-libraries=0 --with-blas-lapack-lib="[libessl.a]" >>> >>> What is odd is that this worked once but 'make test' failed because of a missing LAPACK routine in ESSL. I reconfigured with >>> --with-blas-lib=libessl.a --with-lapack-lib=/sw/aix53/lapack-3.2.0/lib/liblapack.a but then 'make all' failed with the same error as I now >>> get *even after* reverting to the original (above) configure options. I've now tried this several times with a fresh copy of PETSc to no >>> avail. >>> >>> Any help would be appreciated. Thanks very much! >>> >>> Samar >>> >>> >>> >>> > From qince168 at gmail.com Sat May 24 22:20:44 2014 From: qince168 at gmail.com (Ce Qin) Date: Sun, 25 May 2014 11:20:44 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. Message-ID: Dear all, Now I'm using TAO to solve an optimization problem. I want to output some internal data at each iteration in the monitor routine. In the cg solver I found that TaoMonitor is called after TaoLineSearchApply. So I'm wondering that is the objective and gradient TaoLineSearchApply returns lastest computed? Does TAO guarantee this? Sorry for my poor English. If you have any question please let me know. Any help will be appreciated. Thanks in advance. Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 24 22:37:15 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 24 May 2014 22:37:15 -0500 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: Message-ID: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> On May 24, 2014, at 10:20 PM, Ce Qin wrote: > Dear all, > > Now I'm using TAO to solve an optimization problem. I want to output some internal data at each iteration in the monitor routine. In the cg solver I found that TaoMonitor is called after TaoLineSearchApply. 
So I'm wondering that is the objective and gradient TaoLineSearchApply returns lastest computed? Does TAO guarantee this? According to the documentation and code yes it is suppose to always return the object function and gradient at the new solution value. Of course it is possible there is an error in one of our line search routines that results in not computing the final values. If you think think there is an error please point to the exact line search you are using and parameters etc (a test code is best and we will investigate if there is an error). Barry > > Sorry for my poor English. If you have any question please let me know. Any help will be appreciated. Thanks in advance. > > Best regards, > Ce Qin > From qince168 at gmail.com Sun May 25 07:11:41 2014 From: qince168 at gmail.com (Ce Qin) Date: Sun, 25 May 2014 20:11:41 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> Message-ID: Hi Barry, I think there is confusion about my question. The purpose of line search is to find an \alpha_{k} that minimize f(x + \alpha p). This procedure usually choose several \alpha and returns a proper \alpha_{k}. For each \alpha, we need to compute the objective function and gradient of (x + \alpha p). In my FormFunctionGradient function, I save some internal data(model response) which will be printed in the monitor routine. If the objective function and gradient returned by TaoLineSearchApply isn't the latest computed, my internal data becomes invalid in the monitor routine. My question is the order of the objective function and gradient corresponding to \alpha_{k} in this line search procedure. If it is not clear, please let me know. Thanks. Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.sarich at gmail.com Sun May 25 07:52:09 2014 From: jason.sarich at gmail.com (Jason Sarich) Date: Sun, 25 May 2014 07:52:09 -0500 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: Hi Ce Qin, What you are doing will work fine. There is no back-tracking in a linesearch to a previously computed x, the last x computed is the current solution. Jason Sarich On Sun, May 25, 2014 at 7:11 AM, Ce Qin wrote: > Hi Barry, > > I think there is confusion about my question. > > The purpose of line search is to find an \alpha_{k} that minimize f(x + > \alpha p). This procedure usually choose several \alpha and returns a > proper \alpha_{k}. For each \alpha, we need to compute the objective > function and gradient of (x + \alpha p). In my FormFunctionGradient > function, I save some internal data(model response) which will be printed > in the monitor routine. If the objective function and gradient returned by > TaoLineSearchApply isn't the latest computed, my internal data becomes > invalid in the monitor routine. My question is the order of the objective > function and gradient corresponding to \alpha_{k} in this line search > procedure. > > If it is not clear, please let me know. Thanks. > > Best regards, > Ce Qin > -------------- next part -------------- An HTML attachment was scrubbed... 
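Below is a bare-bones sketch of the monitor-plus-context pattern being discussed, illustrative only and not taken from the original thread. The AppCtx field is invented, and it assumes a TAO version that provides TaoSetMonitor() and TaoGetSolutionStatus() with the signatures shown (in the older standalone TAO releases the object type is TaoSolver rather than Tao and the header differs).

#include <petsctao.h>   /* assumes TAO bundled with PETSc; standalone TAO uses its own header */

typedef struct {
  PetscReal last_response;   /* stashed by FormFunctionGradient() at its most recent evaluation */
} AppCtx;

/* Called once per TAO iteration, after the line search has accepted a point,
   so the stashed data corresponds to the current solution, as described above. */
PetscErrorCode MyMonitor(Tao tao,void *ctx)
{
  AppCtx             *user = (AppCtx*)ctx;
  PetscInt            its;
  PetscReal           f,gnorm,cnorm,xdiff;
  TaoConvergedReason  reason;
  PetscErrorCode      ierr;

  ierr = TaoGetSolutionStatus(tao,&its,&f,&gnorm,&cnorm,&xdiff,&reason);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"iter %D: f = %g, ||g|| = %g, response = %g\n",
                     its,(double)f,(double)gnorm,(double)user->last_response);CHKERRQ(ierr);
  return 0;
}

/* registration, somewhere after TaoCreate():
   ierr = TaoSetMonitor(tao,MyMonitor,&user,NULL);CHKERRQ(ierr);                     */

Because the accepted point is the last one at which the function and gradient were evaluated, whatever FormFunctionGradient() stored in the context on its final call of the iteration is consistent with the f and gnorm reported here.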
URL: From qince168 at gmail.com Sun May 25 08:12:50 2014 From: qince168 at gmail.com (Ce Qin) Date: Sun, 25 May 2014 21:12:50 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: Thanks, Jason. That's what I need. One more question, How many function and gradient evaluations one MT line search need? I'm doing geophysical inversion, function and gradient evaluation is very expensive, so I want to minimize the function calls. Do you have any suggestions? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.sarich at gmail.com Sun May 25 09:10:36 2014 From: jason.sarich at gmail.com (Jason Sarich) Date: Sun, 25 May 2014 09:10:36 -0500 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: Hi Ce Qin, Typically the MT line search accepts the first trial point, it is mostly used to throw out a few terrible guesses and to help prevent stalling. You should get a general idea of how much time is spent line searching just be checking the number of TAO iterations versus the number of function evaluations in -tao_view, and there are simple ways to access this information directly if you need to. There are some parameters for the line search that make it more or less selective, but saving a few function evaluations here in the line search will probably cost more evaluations in the broader view of the optimization in general. If you haven't done so yet, try the lmvm algorithm as a substitute for cg, it works on the same information and usually performs a little better. Jason On Sun, May 25, 2014 at 8:12 AM, Ce Qin wrote: > Thanks, Jason. That's what I need. > > One more question, How many function and gradient evaluations one MT line > search need? > I'm doing geophysical inversion, function and gradient evaluation is very > expensive, so I want to minimize the function calls. Do you have any > suggestions? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From qince168 at gmail.com Sun May 25 09:22:26 2014 From: qince168 at gmail.com (Ce Qin) Date: Sun, 25 May 2014 22:22:26 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: Thanks, Jason. I will try it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun May 25 11:16:34 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 May 2014 11:16:34 -0500 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> Message-ID: <54A85E1A-3579-43AB-94C1-FDED6D6FBBBA@mcs.anl.gov> Also run with -log_summary and you?ll see the percentage of time in the line search and also in the various function evaluations. The time for the line search includes the time of the function and gradient evaluations IT does, while the time for the functions and gradients includes the time for ALL function and gradient evaluations. Barry On May 25, 2014, at 8:12 AM, Ce Qin wrote: > Thanks, Jason. That's what I need. 
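A side note on the lmvm suggestion above: switching the solver type programmatically is a one-line change. The fragment below uses the Tao API bundled with PETSc; the standalone TAO releases of this period spell the calls a little differently.

Tao            tao;
PetscErrorCode ierr;

ierr = TaoCreate(PETSC_COMM_WORLD,&tao);CHKERRQ(ierr);
ierr = TaoSetType(tao,TAOLMVM);CHKERRQ(ierr);    /* default to lmvm instead of cg */
ierr = TaoSetFromOptions(tao);CHKERRQ(ierr);     /* still overridable with -tao_type cg; -tao_view
                                                    reports iteration and function-evaluation counts */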
> > One more question, How many function and gradient evaluations one MT line search need? > I'm doing geophysical inversion, function and gradient evaluation is very expensive, so I want to minimize the function calls. Do you have any suggestions? From Lukasz.Kaczmarczyk at glasgow.ac.uk Sun May 25 12:34:56 2014 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Sun, 25 May 2014 17:34:56 +0000 Subject: [petsc-users] PetscLayou Message-ID: Hello, I use PetscLayout to set the ranges for matrix partitioning (petsc-3.4.3), however, for one of the problems, I get from warring from valgrind. This warning is only for particular problem, for other code executions with different input data, valgrind do not shows any errors at that point. Could you tell me if this could lead to segmentation fault, for example for code compiled with Intel compiler? Where is my mistake? Kind regards, Lukasz 533 PetscLayout layout; 534 ierr = PetscLayoutCreate(PETSC_COMM_WORLD,&layout); CHKERRQ(ierr); 535 ierr = PetscLayoutSetBlockSize(layout,1); CHKERRQ(ierr); 536 ierr = PetscLayoutSetSize(layout,nb_dofs_row); CHKERRQ(ierr); 537 ierr = PetscLayoutSetUp(layout); CHKERRQ(ierr); 538 PetscInt rstart,rend; 539 ierr = PetscLayoutGetRange(layout,&rstart,&rend); CHKERRQ Partition problem COUPLED_PROBLEM create_Mat: row lower 0 row upper 837 create_Mat: row lower 837 row upper 1674 create_Mat: row lower 1674 row upper 2511 create_Mat: row lower 2511 row upper 3347 create_Mat: row lower 3347 row upper 4183 create_Mat: row lower 4183 row upper 5019 create_Mat: row lower 5019 row upper 5855 create_Mat: row lower 5855 row upper 6691 create_Mat: row lower 6691 row upper 7527 create_Mat: row lower 7527 row upper 8363 create_Mat: row lower 8363 row upper 9199 create_Mat: row lower 9199 row upper 10035 ==80351== Source and destination overlap in memcpy(0x1e971f94, 0x1e971fa8, 28) ==80351== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==80351== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) ==80351== by 0x159EE48A: ??? (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) ==80351== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) ==80351== by 0x565C4C8: PetscLayoutSetUp (pmap.c:158) ==80355== Source and destination overlap in memcpy(0x1e97e784, 0x1e97e788, 44) ==80355== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==80355== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) ==80355== by 0x159EE48A: ??? (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) ==80355== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) ==80355== by 0x565C4C8: PetscLayoutSetUp (pmap.c:158) ==80355== by 0xDEFB18: int MoFEM::FieldCore::create_Mat(std::string const&, _p_Mat**, char const*, int**, int**, double**, bool, int) (FieldCore.hpp:537) Full code source can be viewed from here, https://bitbucket.org/likask/mofem-joseph/src/3185671a406bc3f02336b3886775dff037dbd4fe/mofem_v0.1/do_not_blink/FieldCore.hpp?at=release From bsmith at mcs.anl.gov Sun May 25 13:03:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 May 2014 13:03:52 -0500 Subject: [petsc-users] PetscLayou In-Reply-To: References: Message-ID: <57628F62-A4CC-4DEB-A538-3B3B20D8126F@mcs.anl.gov> Looking at your code and the PETSc source code I see nothing wrong. 
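For reference, the complete PetscLayout pattern in question, written out as a self-contained sketch that mirrors the snippet quoted in this thread (the global size 10035 is the one from the run shown above, and the final destroy is easy to forget):

PetscLayout    layout;
PetscInt       rstart,rend,N = 10035;
PetscErrorCode ierr;

ierr = PetscLayoutCreate(PETSC_COMM_WORLD,&layout);CHKERRQ(ierr);
ierr = PetscLayoutSetBlockSize(layout,1);CHKERRQ(ierr);
ierr = PetscLayoutSetSize(layout,N);CHKERRQ(ierr);     /* global size; local sizes chosen by PETSc */
ierr = PetscLayoutSetUp(layout);CHKERRQ(ierr);         /* the MPI_Allgather flagged by valgrind happens here */
ierr = PetscLayoutGetRange(layout,&rstart,&rend);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_SELF,"row lower %D row upper %D\n",rstart,rend);CHKERRQ(ierr);
ierr = PetscLayoutDestroy(&layout);CHKERRQ(ierr);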
> Source and destination overlap in memcpy(0x1e97e784, 0x1e97e788, 44) > ==80355== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==80355== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80355== by 0x159EE48A: ??? (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) > ==80355== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) I can only guess an error in the MPI implementation at this point. Maybe more with the MPI then the compiler? But that is guessing Barry On May 25, 2014, at 12:34 PM, Lukasz Kaczmarczyk wrote: > Hello, > > I use PetscLayout to set the ranges for matrix partitioning (petsc-3.4.3), however, for one of the problems, I get from warring from valgrind. This warning is only for particular problem, for other code executions with different input data, valgrind do not shows any errors at that point. > > Could you tell me if this could lead to segmentation fault, for example for code compiled with Intel compiler? Where is my mistake? > > Kind regards, > Lukasz > > 533 PetscLayout layout; > 534 ierr = PetscLayoutCreate(PETSC_COMM_WORLD,&layout); CHKERRQ(ierr); > 535 ierr = PetscLayoutSetBlockSize(layout,1); CHKERRQ(ierr); > 536 ierr = PetscLayoutSetSize(layout,nb_dofs_row); CHKERRQ(ierr); > 537 ierr = PetscLayoutSetUp(layout); CHKERRQ(ierr); > 538 PetscInt rstart,rend; > 539 ierr = PetscLayoutGetRange(layout,&rstart,&rend); CHKERRQ > > > Partition problem COUPLED_PROBLEM > create_Mat: row lower 0 row upper 837 > create_Mat: row lower 837 row upper 1674 > create_Mat: row lower 1674 row upper 2511 > create_Mat: row lower 2511 row upper 3347 > create_Mat: row lower 3347 row upper 4183 > create_Mat: row lower 4183 row upper 5019 > create_Mat: row lower 5019 row upper 5855 > create_Mat: row lower 5855 row upper 6691 > create_Mat: row lower 6691 row upper 7527 > create_Mat: row lower 7527 row upper 8363 > create_Mat: row lower 8363 row upper 9199 > create_Mat: row lower 9199 row upper 10035 > ==80351== Source and destination overlap in memcpy(0x1e971f94, 0x1e971fa8, 28) > ==80351== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==80351== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80351== by 0x159EE48A: ??? (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) > ==80351== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80351== by 0x565C4C8: PetscLayoutSetUp (pmap.c:158) > ==80355== Source and destination overlap in memcpy(0x1e97e784, 0x1e97e788, 44) > ==80355== at 0x4C2CFA0: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) > ==80355== by 0x85AAB7E: ompi_ddt_copy_content_same_ddt (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80355== by 0x159EE48A: ??? 
(in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so) > ==80355== by 0x85B15B2: PMPI_Allgather (in /usr/lib/openmpi/lib/libmpi.so.0.0.2) > ==80355== by 0x565C4C8: PetscLayoutSetUp (pmap.c:158) > ==80355== by 0xDEFB18: int MoFEM::FieldCore::create_Mat(std::string const&, _p_Mat**, char const*, int**, int**, double**, bool, int) (FieldCore.hpp:537) > > > Full code source can be viewed from here, > https://bitbucket.org/likask/mofem-joseph/src/3185671a406bc3f02336b3886775dff037dbd4fe/mofem_v0.1/do_not_blink/FieldCore.hpp?at=release From epscodes at gmail.com Sun May 25 19:23:27 2014 From: epscodes at gmail.com (Xiangdong) Date: Sun, 25 May 2014 20:23:27 -0400 Subject: [petsc-users] DM vector restriction questions Message-ID: Hello everyone, I have a questions about vectors in an DMDA with DOF>1. For example, in 1d with number of grid N and DOF=2 (two fields u and v), the length of the global vector is 2*N. What is the best way to restrict this vector (length 2*N) to a vector (length N) corresponding to the field u only? This will help me obtain the properties of field u by using VecSum and other vec functions. With DMDAVecGetArray and looping over only u field can do the job. However, I am just wondering whether any petsc function can provide either the restriction matrix or the vector restricted to a single field. Thank you. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun May 25 19:40:48 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 25 May 2014 18:40:48 -0600 Subject: [petsc-users] DM vector restriction questions In-Reply-To: References: Message-ID: <87zji574of.fsf@jedbrown.org> Xiangdong writes: > Hello everyone, > > I have a questions about vectors in an DMDA with DOF>1. For example, in 1d > with number of grid N and DOF=2 (two fields u and v), the length of the > global vector is 2*N. > > What is the best way to restrict this vector (length 2*N) to a vector > (length N) corresponding to the field u only? This will help me obtain the > properties of field u by using VecSum and other vec functions. http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecStrideGather.html -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sun May 25 19:51:08 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 25 May 2014 19:51:08 -0500 Subject: [petsc-users] DM vector restriction questions In-Reply-To: <87zji574of.fsf@jedbrown.org> References: <87zji574of.fsf@jedbrown.org> Message-ID: <0619379E-275E-418F-A91C-78A9AED1384E@mcs.anl.gov> And VecStrideNorm(), VecStrideScale(), VecStrideNormAll() etc. Let us know what is missing and what you need? Barry On May 25, 2014, at 7:40 PM, Jed Brown wrote: > Xiangdong writes: > >> Hello everyone, >> >> I have a questions about vectors in an DMDA with DOF>1. For example, in 1d >> with number of grid N and DOF=2 (two fields u and v), the length of the >> global vector is 2*N. >> >> What is the best way to restrict this vector (length 2*N) to a vector >> (length N) corresponding to the field u only? This will help me obtain the >> properties of field u by using VecSum and other vec functions. 
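Concretely, the stride routines pointed to here can be used as in the sketch below, where U is assumed to be the dof=2 global vector obtained from DMCreateGlobalVector() and component 0 is the field u (names are illustrative, not from the poster's code):

Vec            u_only;
PetscInt       nlocal;
PetscReal      unorm;
PetscScalar    usum;
PetscErrorCode ierr;

ierr = VecGetLocalSize(U,&nlocal);CHKERRQ(ierr);
ierr = VecCreateMPI(PETSC_COMM_WORLD,nlocal/2,PETSC_DETERMINE,&u_only);CHKERRQ(ierr);  /* one entry per grid point */
ierr = VecStrideGather(U,0,u_only,INSERT_VALUES);CHKERRQ(ierr);                        /* component 0 = field u */
ierr = VecSum(u_only,&usum);CHKERRQ(ierr);

/* norms of a single component do not even need the copy */
ierr = VecStrideNorm(U,0,NORM_2,&unorm);CHKERRQ(ierr);

ierr = VecDestroy(&u_only);CHKERRQ(ierr);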
> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecStrideGather.html From qince168 at gmail.com Sun May 25 20:31:39 2014 From: qince168 at gmail.com (Ce Qin) Date: Mon, 26 May 2014 09:31:39 +0800 Subject: [petsc-users] Question about TaoLineSearchApply. In-Reply-To: <54A85E1A-3579-43AB-94C1-FDED6D6FBBBA@mcs.anl.gov> References: <392C773B-43B5-4D85-BC11-92D1BA8FB4DE@mcs.anl.gov> <0e7182b4afd4493e8580a52e0a72b234@NAGURSKI.anl.gov> <54A85E1A-3579-43AB-94C1-FDED6D6FBBBA@mcs.anl.gov> Message-ID: Thanks for all your kind help! Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjsgr100 at gmail.com Mon May 26 04:32:29 2014 From: pjsgr100 at gmail.com (Pedro Rodrigues) Date: Mon, 26 May 2014 10:32:29 +0100 Subject: [petsc-users] ExodusII Message-ID: Hi I successfully built EXODUSII under Windows using netCDF, HDF5 (ZLIB and SZIP also). I ran an example to mesh generation with success. I would like to ask you to add support to EXODUSII under this platform. Fortran bindings are not there (or don't work) but I made that with Fortran interfaces (that can also be done with 'c' directives). VS2012 does not contain inttypes.h but I modified a cygwin file to make that available. regards -- Pedro Rodrigues -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 26 08:17:17 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 26 May 2014 07:17:17 -0600 Subject: [petsc-users] ExodusII In-Reply-To: References: Message-ID: <87egzg7k82.fsf@jedbrown.org> Pedro Rodrigues writes: > Hi > > I successfully built EXODUSII under Windows using netCDF, HDF5 > (ZLIB and SZIP also). I ran an example to mesh generation with success. I > would like to ask you to add support to EXODUSII under this > platform. Fortran bindings are not there (or don't work) but I made that > with Fortran interfaces (that can also be done with 'c' directives). The best way to add support is to have upstream accept your patch to make ExodusII compatible. The second best way is to submit a patch to PETSc. > VS2012 does not contain inttypes.h but I modified a cygwin file to > make that available. This sounds problematic because installing packages should not involve modifying system files. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From hgbk2008 at gmail.com Mon May 26 11:02:10 2014 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Mon, 26 May 2014 18:02:10 +0200 Subject: [petsc-users] row scale the matrix Message-ID: <53836582.6010804@gmail.com> Hi My matrix contains some overshoot entries in the diagonal and I want to row scale by a factor that I defined. How can I do that with petsc ? (I don't want to use MatDiagonalScale instead, I also don't want to create a diagonal matrix and left multiply to the system.) BR Bui From mrosso at uci.edu Mon May 26 11:20:25 2014 From: mrosso at uci.edu (Michele Rosso) Date: Mon, 26 May 2014 09:20:25 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> Message-ID: <538369C9.6010209@uci.edu> Mark, thank you for your input and sorry my late reply: I saw your email only now. 
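As an aside on the "set up the solver each time step" idea being discussed in this thread, the bluntest version is simply to create and destroy the KSP inside the time loop. A sketch, assuming A, b, x and the loop variables already exist, and not taken from the actual code:

for (step = 0; step < nsteps; step++) {
  KSP            ksp;
  PetscErrorCode ierr;

  /* ... reassemble A and b for this step ... */

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);   /* 3-argument form; petsc 3.4 and earlier take a fourth MatStructure flag */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);     /* e.g. -pc_type gamg or -pc_type hypre -pc_hypre_type boomeramg */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);           /* forces a full preconditioner setup next step */
}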
By setting up the solver each time step you mean re-defining the KSP context every time? Why should this help? I will definitely try that as well as the hypre solution and report back. Again, thank you. Michele On 05/22/2014 09:34 AM, Mark Adams wrote: > If the solver is degrading as the coefficients change, and I would > assume get more nasty, you can try deleting the solver at each time > step. This will be about 2x more expensive, because it does the setup > each solve, but it might fix your problem. > > You also might try: > > -pc_type hypre > -pc_hypre_type boomeramg > > > > > On Mon, May 19, 2014 at 6:49 PM, Jed Brown > wrote: > > Michele Rosso > writes: > > > Jed, > > > > thank you very much! > > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type > > sor/ and report back. > > Yes, I removed the nullspace from both the system matrix and the > rhs. > > Is there a way to have something similar to Dendy's multigrid or the > > deflated conjugate gradient method with PETSc? > > Dendy's MG needs geometry. The algorithm to produce the interpolation > operators is not terribly complicated so it could be done, though DMDA > support for cell-centered is a somewhat awkward. "Deflated CG" > can mean > lots of things so you'll have to be more precise. (Most everything in > the "deflation" world has a clear analogue in the MG world, but the > deflation community doesn't have a precise language to talk about > their > methods so you always have to read the paper carefully to find out if > it's completely standard or if there is something new.) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 26 13:44:13 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 26 May 2014 13:44:13 -0500 Subject: [petsc-users] row scale the matrix In-Reply-To: <53836582.6010804@gmail.com> References: <53836582.6010804@gmail.com> Message-ID: <78644A10-9A8F-452F-8306-B445B5C3D60E@mcs.anl.gov> Why not MatDiagonalScale()? The left diagonal matrix l scales each row i of the matrix by l[i,i] so it seems to do exactly what you want. Barry On May 26, 2014, at 11:02 AM, Hoang Giang Bui wrote: > Hi > > My matrix contains some overshoot entries in the diagonal and I want to row scale by a factor that I defined. How can I do that with petsc ? (I don't want to use MatDiagonalScale instead, I also don't want to create a diagonal matrix and left multiply to the system.) > > BR > Bui > From vbaros at hsr.ch Mon May 26 14:56:39 2014 From: vbaros at hsr.ch (Baros Vladimir) Date: Mon, 26 May 2014 19:56:39 +0000 Subject: [petsc-users] ExodusII Message-ID: Necessary header files, can be found here: https://code.google.com/p/msinttypes/ It contains necessary inttypes.h and stdint.h headers I successfully used them to build exodus lib with Visual Studio. Can anyone enable the support for exodus in Windows? From C.Klaij at marin.nl Tue May 27 08:47:55 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 27 May 2014 13:47:55 +0000 Subject: [petsc-users] MatNestGetISs in fortran Message-ID: <64c4658aeb7441abbe20e4aa252554a2@MAR190N1.marin.local> I'm trying to use MatNestGetISs in a fortran program but it seems to be missing from the fortran include file (PETSc 3.4). dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. 
Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From tlk0812 at hotmail.com Tue May 27 15:17:46 2014 From: tlk0812 at hotmail.com (Likun Tan) Date: Tue, 27 May 2014 13:17:46 -0700 Subject: [petsc-users] Set the directory of output file Message-ID: Hello, I want to create and write my simulation result in a binary file called result.dat, but I want to set my file in a different folder, say /home/username/output I am using PetscViewerBinaryOpen(PETSC_COMM_WORLD, result.dat, FILE_MODE_WRITE, & view) But this will create the file in the current folder by default. How could I modify the command to set a new path? Thank you. From balay at mcs.anl.gov Tue May 27 15:40:27 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 27 May 2014 15:40:27 -0500 Subject: [petsc-users] Set the directory of output file In-Reply-To: References: Message-ID: On Tue, 27 May 2014, Likun Tan wrote: > Hello, > > I want to create and write my simulation result in a binary file called result.dat, but I want to set my file in a different folder, say /home/username/output > > I am using > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, result.dat, FILE_MODE_WRITE, & view) > > But this will create the file in the current folder by default. How could I modify the command to set a new path? Thank you. You should be able to do: PetscViewerBinaryOpen(PETSC_COMM_WORLD, "/home/username/output/result.dat", FILE_MODE_WRITE, & view) [or specify the path to the output file at runtime and use PetscOptionsGetString() to extract this string and use with PetscViewerBinaryOpen(). For ex: check '-f0' usage in src/ksp/ksp/examples/tutorials/ex10.c] Satish From tlk0812 at hotmail.com Tue May 27 15:57:44 2014 From: tlk0812 at hotmail.com (Likun Tan) Date: Tue, 27 May 2014 13:57:44 -0700 Subject: [petsc-users] Set the directory of output file In-Reply-To: References: Message-ID: It works. Thank you very much. > On May 27, 2014, at 1:40 PM, Satish Balay wrote: > >> On Tue, 27 May 2014, Likun Tan wrote: >> >> Hello, >> >> I want to create and write my simulation result in a binary file called result.dat, but I want to set my file in a different folder, say /home/username/output >> >> I am using >> >> PetscViewerBinaryOpen(PETSC_COMM_WORLD, result.dat, FILE_MODE_WRITE, & view) >> >> But this will create the file in the current folder by default. How could I modify the command to set a new path? Thank you. > > You should be able to do: > > PetscViewerBinaryOpen(PETSC_COMM_WORLD, "/home/username/output/result.dat", FILE_MODE_WRITE, & view) > > [or specify the path to the output file at runtime and use > PetscOptionsGetString() to extract this string and use with > PetscViewerBinaryOpen(). 
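Put together, the runtime-path variant described here looks roughly like the fragment below. The option name -output and the default path are invented for the example, and the PetscOptionsGetString signature shown is the petsc-3.4-era one; later releases add a leading options-object argument.

char           fname[PETSC_MAX_PATH_LEN] = "/home/username/output/result.dat";
PetscBool      flg;
PetscViewer    view;
PetscErrorCode ierr;

/* allow "-output /some/other/path.dat" on the command line */
ierr = PetscOptionsGetString(NULL,"-output",fname,sizeof(fname),&flg);CHKERRQ(ierr);
ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,fname,FILE_MODE_WRITE,&view);CHKERRQ(ierr);
/* ... VecView / MatView into view ... */
ierr = PetscViewerDestroy(&view);CHKERRQ(ierr);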
For ex: check '-f0' usage in > src/ksp/ksp/examples/tutorials/ex10.c] > > Satish > > From zonexo at gmail.com Tue May 27 20:09:09 2014 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 28 May 2014 09:09:09 +0800 Subject: [petsc-users] Problem with DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 In-Reply-To: References: <534C9A2C.5060404@gmail.com> <53520587.6010606@gmail.com> <62DF81C7-0C35-410C-8D4C-206FBB22576A@mcs.anl.gov> <535248E8.2070002@gmail.com> <535284E0.8010901@gmail.com> <5352934C.1010306@gmail.com> <53529B09.8040009@gmail.com> <5353173D.60609@gmail.com> <53546B03.1010407@gmail.com> <537188D8.2030307@gmail.com> <53795BCC.8020500@gmail.com> <53797A4D.6090602@gmail.com> <5379A433.5000401@gmail.com> Message-ID: <53853735.5080500@gmail.com> On 20/5/2014 1:43 AM, Barry Smith wrote: > On May 19, 2014, at 1:26 AM, TAY wee-beng wrote: > >> On 19/5/2014 11:36 AM, Barry Smith wrote: >>> On May 18, 2014, at 10:28 PM, TAY wee-beng wrote: >>> >>>> On 19/5/2014 9:53 AM, Matthew Knepley wrote: >>>>> On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng wrote: >>>>> Hi Barry, >>>>> >>>>> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on gnu gfortran and it worked well. On intel ifort, it stopped at one of the "DMDAVecGetArrayF90". Does it definitely mean that it's a bug in ifort? Do you work with both intel and gnu? >>>>> >>>>> Yes it works with Intel. Is this using optimization? >>>> Hi Matt, >>>> >>>> I forgot to add that in non-optimized cases, it works with gnu and intel. However, in optimized cases, it works with gnu, but not intel. Does it definitely mean that it's a bug in ifort? >>> No. Does it run clean under valgrind? >> Hi, >> >> Do you mean the debug or optimized version? > Both. Hi, has anyone tested the code I sent? I am still not able to pinpoint the error. Thanks. > >> Thanks. >>>>> Matt >>>>> >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 14/5/2014 12:03 AM, Barry Smith wrote: >>>>> Please send you current code. So we may compile and run it. >>>>> >>>>> Barry >>>>> >>>>> >>>>> On May 12, 2014, at 9:52 PM, TAY wee-beng wrote: >>>>> >>>>> Hi, >>>>> >>>>> I have sent the entire code a while ago. Is there any answer? I was also trying myself but it worked for some intel compiler, and some not. I'm still not able to find the answer. gnu compilers for most cluster are old versions so they are not able to compile since I have allocatable structures. >>>>> >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 21/4/2014 8:58 AM, Barry Smith wrote: >>>>> Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email. 
>>>>> >>>>> Barry >>>>> >>>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng wrote: >>>>> >>>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote: >>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote: >>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng wrote: >>>>> On 19/4/2014 1:17 PM, Barry Smith wrote: >>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng wrote: >>>>> >>>>> On 19/4/2014 12:10 PM, Barry Smith wrote: >>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng wrote: >>>>> >>>>> On 19/4/2014 3:53 AM, Barry Smith wrote: >>>>> Hmm, >>>>> >>>>> Interface DMDAVecGetArrayF90 >>>>> Subroutine DMDAVecGetArrayF903(da1, v,d1,ierr) >>>>> USE_DM_HIDE >>>>> DM_HIDE da1 >>>>> VEC_HIDE v >>>>> PetscScalar,pointer :: d1(:,:,:) >>>>> PetscErrorCode ierr >>>>> End Subroutine >>>>> >>>>> So the d1 is a F90 POINTER. But your subroutine seems to be treating it as a ?plain old Fortran array?? >>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>> Hi, >>>>> >>>>> So d1 is a pointer, and it's different if I declare it as "plain old Fortran array"? Because I declare it as a Fortran array and it works w/o any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u". >>>>> >>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", error starts to happen. I wonder why... >>>>> >>>>> Also, supposed I call: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> u_array .... >>>>> >>>>> v_array .... etc >>>>> >>>>> Now to restore the array, does it matter the sequence they are restored? >>>>> No it should not matter. If it matters that is a sign that memory has been written to incorrectly earlier in the code. >>>>> >>>>> Hi, >>>>> >>>>> Hmm, I have been getting different results on different intel compilers. I'm not sure if MPI played a part but I'm only using a single processor. In the debug mode, things run without problem. In optimized mode, in some cases, the code aborts even doing simple initialization: >>>>> >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> u_array = 0.d0 >>>>> >>>>> v_array = 0.d0 >>>>> >>>>> w_array = 0.d0 >>>>> >>>>> p_array = 0.d0 >>>>> >>>>> >>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> The code aborts at call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr), giving segmentation error. But other version of intel compiler passes thru this part w/o error. Since the response is different among different compilers, is this PETSc or intel 's bug? Or mvapich or openmpi? >>>>> >>>>> We do this is a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F? 
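Since this thread mixes several issues (local vectors, loop bounds, the order of the restore calls), here is the C analogue of the pattern being debugged; the F90 interface mirrors it with DMDAVecGetArrayF90/DMDAVecRestoreArrayF90. The DMDA da_u and global vector u_global are assumed to exist (3D, dof=1), and none of this is taken from the code attached to the thread.

Vec            u_local;
PetscScalar    ***u;
PetscInt       xs,ys,zs,xm,ym,zm,i,j,k;
PetscErrorCode ierr;

ierr = DMGetLocalVector(da_u,&u_local);CHKERRQ(ierr);
ierr = DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local);CHKERRQ(ierr);
ierr = DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local);CHKERRQ(ierr);

ierr = DMDAVecGetArray(da_u,u_local,&u);CHKERRQ(ierr);
ierr = DMDAGetGhostCorners(da_u,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr);  /* owned plus ghost range of a local vector */
for (k=zs; k<zs+zm; k++)
  for (j=ys; j<ys+ym; j++)
    for (i=xs; i<xs+xm; i++)
      u[k][j][i] = 0.0;   /* global indices, [k][j][i] ordering, no manual offset */
ierr = DMDAVecRestoreArray(da_u,u_local,&u);CHKERRQ(ierr);   /* restore with exactly the same da/Vec/array triple used in the get */

ierr = DMRestoreLocalVector(da_u,&u_local);CHKERRQ(ierr);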
>>>>> Hi Matt, >>>>> >>>>> Do you mean putting the above lines into ex11f90.F and test? >>>>> >>>>> It already has DMDAVecGetArray(). Just run it. >>>>> Hi, >>>>> >>>>> It worked. The differences between mine and the code is the way the fortran modules are defined, and the ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region. >>>>> >>>>> No the global/local difference should not matter. >>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used 1st, is that so? I can't find the equivalent for local vector though. >>>>> >>>>> DMGetLocalVector() >>>>> Ops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter? >>>>> >>>>> If so, when should I call them? >>>>> >>>>> You just need a local vector from somewhere. >>>>> Hi, >>>>> >>>>> Anyone can help with the questions below? Still trying to find why my code doesn't work. >>>>> >>>>> Thanks. >>>>> Hi, >>>>> >>>>> I insert part of my error region code into ex11f90: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> u_array = 0.d0 >>>>> v_array = 0.d0 >>>>> w_array = 0.d0 >>>>> p_array = 0.d0 >>>>> >>>>> call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> It worked w/o error. I'm going to change the way the modules are defined in my code. >>>>> >>>>> My code contains a main program and a number of modules files, with subroutines inside e.g. >>>>> >>>>> module solve >>>>> <- add include file? >>>>> subroutine RRK >>>>> <- add include file? >>>>> end subroutine RRK >>>>> >>>>> end module solve >>>>> >>>>> So where should the include files (#include ) be placed? >>>>> >>>>> After the module or inside the subroutine? >>>>> >>>>> Thanks. >>>>> Matt >>>>> Thanks. >>>>> Matt >>>>> Thanks. >>>>> Matt >>>>> Thanks >>>>> >>>>> Regards. >>>>> Matt >>>>> As in w, then v and u? >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> thanks >>>>> Note also that the beginning and end indices of the u,v,w, are different for each process see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds. >>>>> Hi, >>>>> >>>>> In my case, I fixed the u,v,w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating it as a ?plain old Fortran array?. >>>>> >>>>> If I declare them as pointers, their indices follow the C 0 start convention, is that so? >>>>> Not really. It is that in each process you need to access them from the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn?t make any difference. >>>>> >>>>> >>>>> So my problem now is that in my old MPI code, the u(i,j,k) follow the Fortran 1 start convention. Is there some way to manipulate such that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)? 
>>>>> If you code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors then you need to manually subtract off the 1. >>>>> >>>>> Barry >>>>> >>>>> Thanks. >>>>> Barry >>>>> >>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng wrote: >>>>> >>>>> Hi, >>>>> >>>>> I tried to pinpoint the problem. I reduced my job size and hence I can run on 1 processor. Tried using valgrind but perhaps I'm using the optimized version, it didn't catch the error, besides saying "Segmentation fault (core dumped)" >>>>> >>>>> However, by re-writing my code, I found out a few things: >>>>> >>>>> 1. if I write my code this way: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> u_array = .... >>>>> >>>>> v_array = .... >>>>> >>>>> w_array = .... >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> The code runs fine. >>>>> >>>>> 2. if I write my code this way: >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call uvw_array_change(u_array,v_array,w_array) -> this subroutine does the same modification as the above. >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) -> error >>>>> >>>>> where the subroutine is: >>>>> >>>>> subroutine uvw_array_change(u,v,w) >>>>> >>>>> real(8), intent(inout) :: u(:,:,:),v(:,:,:),w(:,:,:) >>>>> >>>>> u ... >>>>> v... >>>>> w ... >>>>> >>>>> end subroutine uvw_array_change. >>>>> >>>>> The above will give an error at : >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> 3. Same as above, except I change the order of the last 3 lines to: >>>>> >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) >>>>> >>>>> So they are now in reversed order. Now it works. >>>>> >>>>> 4. Same as 2 or 3, except the subroutine is changed to : >>>>> >>>>> subroutine uvw_array_change(u,v,w) >>>>> >>>>> real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3)) >>>>> >>>>> u ... >>>>> v... >>>>> w ... >>>>> >>>>> end subroutine uvw_array_change. >>>>> >>>>> The start_indices and end_indices are simply to shift the 0 indices of C convention to that of the 1 indices of the Fortran convention. This is necessary in my case because most of my codes start array counting at 1, hence the "trick". 
>>>>> >>>>> However, now no matter which order of the DMDAVecRestoreArrayF90 (as in 2 or 3), error will occur at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) " >>>>> >>>>> So did I violate and cause memory corruption due to the trick above? But I can't think of any way other than the "trick" to continue using the 1 indices convention. >>>>> >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 15/4/2014 8:00 PM, Barry Smith wrote: >>>>> Try running under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>> >>>>> >>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng wrote: >>>>> >>>>> Hi Barry, >>>>> >>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode. >>>>> >>>>> I have attached my code. >>>>> >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 15/4/2014 2:26 AM, Barry Smith wrote: >>>>> Please send the code that creates da_w and the declarations of w_array >>>>> >>>>> Barry >>>>> >>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng >>>>> >>>>> wrote: >>>>> >>>>> >>>>> Hi Barry, >>>>> >>>>> I'm not too sure how to do it. I'm running mpi. So I run: >>>>> >>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>> >>>>> I got the msg below. Before the gdb windows appear (thru x11), the program aborts. >>>>> >>>>> Also I tried running in another cluster and it worked. Also tried in the current cluster in debug mode and it worked too. >>>>> >>>>> mpirun -n 4 ./a.out -start_in_debugger >>>>> -------------------------------------------------------------------------- >>>>> An MPI process has executed an operation involving a call to the >>>>> "fork()" system call to create a child process. Open MPI is currently >>>>> operating in a condition that could result in memory corruption or >>>>> other system errors; your MPI job may hang, crash, or produce silent >>>>> data corruption. The use of fork() (or system() or other calls that >>>>> create child processes) is strongly discouraged. >>>>> >>>>> The process that invoked fork was: >>>>> >>>>> Local host: n12-76 (PID 20235) >>>>> MPI_COMM_WORLD rank: 2 >>>>> >>>>> If you are *absolutely sure* that your application will successfully >>>>> and correctly survive a call to fork(), you may disable this warning >>>>> by setting the mpi_warn_on_fork MCA parameter to 0. >>>>> -------------------------------------------------------------------------- >>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76 >>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76 >>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76 >>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76 >>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork >>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>>>> >>>>> .... 
>>>>> >>>>> 1 >>>>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>> [1]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org >>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>> [1]PETSC ERROR: to get more information on the crash. >>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>> [3]PETSC ERROR: ------------------------------------------------------------------------ >>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>> [3]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[3]PETSC ERROR: or try http://valgrind.org >>>>> on GNU/linux and Apple Mac OS X to find memory corruption errors >>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>> [3]PETSC ERROR: to get more information on the crash. >>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null) >>>>> >>>>> ... >>>>> Thank you. >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 14/4/2014 9:05 PM, Barry Smith wrote: >>>>> >>>>> Because IO doesn?t always get flushed immediately it may not be hanging at this point. It is better to use the option -start_in_debugger then type cont in each debugger window and then when you think it is ?hanging? do a control C in each debugger window and type where to see where each process is you can also look around in the debugger at variables to see why it is ?hanging? at that point. >>>>> >>>>> Barry >>>>> >>>>> This routines don?t have any parallel communication in them so are unlikely to hang. >>>>> >>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng >>>>> >>>>> >>>>> >>>>> wrote: >>>>> >>>>> >>>>> >>>>> Hi, >>>>> >>>>> My code hangs and I added in mpi_barrier and print to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u,v,w array so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90. >>>>> >>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3" >>>>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4" >>>>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5" >>>>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6" >>>>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7" >>>>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr) >>>>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8" >>>>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr) >>>>> -- >>>>> Thank you. 
>>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener From gmulas at oa-cagliari.inaf.it Wed May 28 12:27:39 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Wed, 28 May 2014 19:27:39 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC Message-ID: Hello. After some time stuck doing less exciting stuff, I got back to trying to use slepc for science. I am trying to use the relatively new functionality for arbitrary selection of the eigenpairs to be found, and I would want some support to understand if I am doing things correctly. I apologise with Jose Roman for disappearing when I should have helped testing this, it was not my choice. But now I am back at it. In particular, I want to obtain the eigenvectors with the maximum projection in a given subspace (defined by a number of normalised vectors, let's call them targets). I don't know in advance how many eigenpairs must be determined: I want to obtain enough eigenvectors that the projection of all targets in the space of these eigenvectors is very nearly identical to the targets themselves. So my strategy, so far, is the following: 1) create and set up the matrix H to be diagonalised ierr = MatCreate(mixpars->slepc_comm, &H); CHKERRQ(ierr); ierr = MatSetSizes(H, PETSC_DECIDE, PETSC_DECIDE, statesinlist, statesinlist); CHKERRQ(ierr); ierr = MatSetFromOptions(H);CHKERRQ(ierr); ierr = MatSetUp(H);CHKERRQ(ierr); ... ... some MatSetValue(H,...); ... 
ierr = MatAssemblyBegin(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatAssemblyEnd(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); ierr = MatGetVecs(H, &xr, PETSC_NULL);CHKERRQ(ierr); ierr = MatGetVecs(H, &xi, PETSC_NULL);CHKERRQ(ierr); 2) create the eps ierr = EPSCreate(mixpars->slepc_comm,&eps);CHKERRQ(ierr); ierr = EPSSetOperators(eps,H,PETSC_NULL);CHKERRQ(ierr); ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); ierr = EPSSetTolerances(eps, tol, PETSC_DECIDE); ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); 3) set the eps to use an arbitrary selection function /* every time I solve, I want to find one eigenvector, and it must be the one with the largest component along the target state */ ierr = EPSSetWhichEigenpairs(eps, EPS_LARGEST_MAGNITUDE); ierr = EPSSetArbitrarySelection(eps, computeprojection(), (void *) &targetcompindex); ierr = EPSSetDimensions(eps, 1, PETSC_IGNORE, PETSC_IGNORE); CHKERRQ(ierr); 4) run a loop over the target vectors, and iteratively call EPSSolve until each target vector is completely contained in the space of the eigenvectors found. Before each call to EPSSolve, I set the initial guess equal to the target vector, and set the deflation space to be the set of eigenvectors found so far. After each call to EPSSolve, I add the new eigenvectors to the deflation space one by one, and check if the target state is (nearly) fully contained in the eigenvectors space. If yes, I move on to the next target state and so on. 5) free everything, destroy eps, matrices, vectors etc. I have some questions about the above: 1) should it work in principle, or am I getting it all wrong? 2) should I destroy and recreate the eps after each call to EPSSolve and before next call? Or, since the underlying matrix is always the same, can I just call EPSSetInitialSpace(), EPSSetDeflationSpace(), update the internal parameter to be passed to the arbitrary selection function and I can call again EPSSolve? 3) Since what I want is going on to find one eigenpair at a time of the same problem until some condition is fulfilled, is there a way in which I can achieve this without setting it up again and again every time? Can I specify an arbitrary function that is called by EPSSolve to decide whether enough eigenpairs were computed or not, instead of doing it in this somewhat awkward manner? 4) more technical: since I add vectors one by one to the deflation space, to begin with I allocate a Vec *Cv with PetscMalloc(statesinlist*sizeof(Vec *), &Cv); where statesinlist is the size of the problem, hence the maximum hypothetical size of the deflation space. I would prefer to allocate this dynamically, enlarging it as needed. Is there something like realloc() in PETSC/SLEPC? Thanks in advance, bye Giacomo Mulas -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. 
: +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From knepley at gmail.com Wed May 28 13:08:19 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 28 May 2014 13:08:19 -0500 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: Message-ID: On Wed, May 28, 2014 at 12:27 PM, Giacomo Mulas wrote: > Hello. > > After some time stuck doing less exciting stuff, I got back to trying to > use > slepc for science. I am trying to use the relatively new functionality for > arbitrary selection of the eigenpairs to be found, and I would want some > support to understand if I am doing things correctly. I apologise with > Jose > Roman for disappearing when I should have helped testing this, it was not > my > choice. But now I am back at it. > > In particular, I want to obtain the eigenvectors with the maximum > projection > in a given subspace (defined by a number of normalised vectors, let's call > them targets). > > I don't know in advance how many eigenpairs must be determined: I want to > obtain enough eigenvectors that the projection of all targets in the space > of these eigenvectors is very nearly identical to the targets themselves. > > So my strategy, so far, is the following: > > 1) create and set up the matrix H to be diagonalised > > ierr = MatCreate(mixpars->slepc_comm, &H); CHKERRQ(ierr); > ierr = MatSetSizes(H, PETSC_DECIDE, PETSC_DECIDE, statesinlist, > statesinlist); > CHKERRQ(ierr); > ierr = MatSetFromOptions(H);CHKERRQ(ierr); > ierr = MatSetUp(H);CHKERRQ(ierr); > ... > ... some MatSetValue(H,...); > ... > ierr = MatAssemblyBegin(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatGetVecs(H, &xr, PETSC_NULL);CHKERRQ(ierr); > ierr = MatGetVecs(H, &xi, PETSC_NULL);CHKERRQ(ierr); > > 2) create the eps > > ierr = EPSCreate(mixpars->slepc_comm,&eps);CHKERRQ(ierr); > ierr = EPSSetOperators(eps,H,PETSC_NULL);CHKERRQ(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > ierr = EPSSetTolerances(eps, tol, PETSC_DECIDE); > ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); > ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); > ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); > ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); > > 3) set the eps to use an arbitrary selection function > > /* every time I solve, I want to find one eigenvector, and it must be > the one with the largest component along the target state */ > ierr = EPSSetWhichEigenpairs(eps, EPS_LARGEST_MAGNITUDE); > ierr = EPSSetArbitrarySelection(eps, computeprojection(), > (void *) &targetcompindex); > ierr = EPSSetDimensions(eps, 1, PETSC_IGNORE, PETSC_IGNORE); > CHKERRQ(ierr); > > 4) run a loop over the target vectors, and iteratively call EPSSolve until > each target vector is completely contained in the space of the > eigenvectors found. Before each call to EPSSolve, I set the initial > guess > equal to the target vector, and set the deflation space to be the set of > eigenvectors found so far. After each call to EPSSolve, I add the new > eigenvectors to the deflation space one by one, and check if the target > state is (nearly) fully contained in the eigenvectors space. If yes, I > move on to the next target state and so on. > > 5) free everything, destroy eps, matrices, vectors etc. 
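To make the arbitrary-selection part concrete, a callback of the shape SLEPc expects could look like the sketch below. The SelectCtx structure and the use of the squared projection onto a single target vector are guesses at what computeprojection() does, not the actual code from this thread; real arithmetic is assumed, so xi and *ri are ignored.

typedef struct {
  Vec target;   /* current target vector, normalised */
} SelectCtx;

/* Called by EPS for each eigenpair approximation (eigr,eigi,xr,xi); the value
   returned in *rr,*ri replaces the eigenvalue in the sorting criterion, so
   EPS_LARGEST_MAGNITUDE then selects the eigenvector with the largest
   projection onto the target. */
PetscErrorCode computeprojection(PetscScalar eigr,PetscScalar eigi,Vec xr,Vec xi,
                                 PetscScalar *rr,PetscScalar *ri,void *ctx)
{
  SelectCtx      *sel = (SelectCtx*)ctx;
  PetscScalar    p;
  PetscErrorCode ierr;

  ierr = VecDot(xr,sel->target,&p);CHKERRQ(ierr);
  *rr  = p*p;    /* squared projection onto the target */
  *ri  = 0.0;
  return 0;
}

/* registration:  ierr = EPSSetArbitrarySelection(eps,computeprojection,(void*)&sel);CHKERRQ(ierr); */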
> > > I have some questions about the above: > > 1) should it work in principle, or am I getting it all wrong? > > 2) should I destroy and recreate the eps after each call to EPSSolve and > before next call? Or, since the underlying matrix is always the same, can I > just call EPSSetInitialSpace(), EPSSetDeflationSpace(), update the internal > parameter to be passed to the arbitrary selection function and I can call > again EPSSolve? > > 3) Since what I want is going on to find one eigenpair at a time of the > same > problem until some condition is fulfilled, is there a way in which I can > achieve this without setting it up again and again every time? Can I > specify > an arbitrary function that is called by EPSSolve to decide whether enough > eigenpairs were computed or not, instead of doing it in this somewhat > awkward manner? > > 4) more technical: since I add vectors one by one to the deflation space, > to > begin with I allocate a Vec *Cv with PetscMalloc(statesinlist*sizeof(Vec > *), &Cv); > where statesinlist is the size of the problem, hence the maximum > hypothetical size of the deflation space. I would prefer to allocate this > dynamically, enlarging it as needed. Is there something like realloc() in > PETSC/SLEPC? > There is not. However, since this is just a set of Vec pointers, allocating and copying should be fine. The amount of memory taken up by the pointers is very very small. Thanks, Matt > Thanks in advance, bye > Giacomo Mulas > > -- > _________________________________________________________________ > > Giacomo Mulas > _________________________________________________________________ > > INAF - Osservatorio Astronomico di Cagliari > via della scienza 5 - 09047 Selargius (CA) > > tel. +39 070 71180244 > mob. : +39 329 6603810 > _________________________________________________________________ > > "When the storms are raging around you, stay right where you are" > (Freddy Mercury) > _________________________________________________________________ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hus003 at ucsd.edu Wed May 28 13:21:53 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Wed, 28 May 2014 18:21:53 +0000 Subject: [petsc-users] Question about dm_view Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 28 13:25:14 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 May 2014 13:25:14 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> Run as./ex5 -help | grep view to see the possibilities. 
It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. Barry On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) From hus003 at ucsd.edu Wed May 28 13:28:33 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Wed, 28 May 2014 18:28:33 +0000 Subject: [petsc-users] Question about dm_view In-Reply-To: <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU>, <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Wednesday, May 28, 2014 11:25 AM To: Sun, Hui Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. Barry On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) From jroman at dsic.upv.es Wed May 28 13:59:27 2014 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 28 May 2014 20:59:27 +0200 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: Message-ID: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> El 28/05/2014, a las 19:27, Giacomo Mulas escribi?: > Hello. > > After some time stuck doing less exciting stuff, I got back to trying to use > slepc for science. I am trying to use the relatively new functionality for > arbitrary selection of the eigenpairs to be found, and I would want some > support to understand if I am doing things correctly. I apologise with Jose > Roman for disappearing when I should have helped testing this, it was not my > choice. But now I am back at it. > > In particular, I want to obtain the eigenvectors with the maximum projection > in a given subspace (defined by a number of normalised vectors, let's call > them targets). > > I don't know in advance how many eigenpairs must be determined: I want to > obtain enough eigenvectors that the projection of all targets in the space > of these eigenvectors is very nearly identical to the targets themselves. 
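As a footnote to this thread, the same views can also be requested from code instead of through the option database; a sketch, assuming da is the DMDA in question and ierr is declared:

/* text description of the decomposition */
ierr = DMView(da,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
/* graphical view; needs an X-capable PETSc build, and -draw_pause -1 keeps the window open */
ierr = DMView(da,PETSC_VIEWER_DRAW_WORLD);CHKERRQ(ierr);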
> > So my strategy, so far, is the following: > > 1) create and set up the matrix H to be diagonalised > > ierr = MatCreate(mixpars->slepc_comm, &H); CHKERRQ(ierr); > ierr = MatSetSizes(H, PETSC_DECIDE, PETSC_DECIDE, statesinlist, > statesinlist); > CHKERRQ(ierr); > ierr = MatSetFromOptions(H);CHKERRQ(ierr); > ierr = MatSetUp(H);CHKERRQ(ierr); > ... > ... some MatSetValue(H,...); > ... > ierr = MatAssemblyBegin(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(H,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatGetVecs(H, &xr, PETSC_NULL);CHKERRQ(ierr); > ierr = MatGetVecs(H, &xi, PETSC_NULL);CHKERRQ(ierr); > > 2) create the eps > > ierr = EPSCreate(mixpars->slepc_comm,&eps);CHKERRQ(ierr); > ierr = EPSSetOperators(eps,H,PETSC_NULL);CHKERRQ(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > ierr = EPSSetTolerances(eps, tol, PETSC_DECIDE); > ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); > ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); > ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); > ierr = PetscOptionsSetFromOptions(); CHKERRQ(ierr); > > 3) set the eps to use an arbitrary selection function > > /* every time I solve, I want to find one eigenvector, and it must be > the one with the largest component along the target state */ > ierr = EPSSetWhichEigenpairs(eps, EPS_LARGEST_MAGNITUDE); > ierr = EPSSetArbitrarySelection(eps, computeprojection(), > (void *) &targetcompindex); > ierr = EPSSetDimensions(eps, 1, PETSC_IGNORE, PETSC_IGNORE); > CHKERRQ(ierr); > > 4) run a loop over the target vectors, and iteratively call EPSSolve until > each target vector is completely contained in the space of the > eigenvectors found. Before each call to EPSSolve, I set the initial guess > equal to the target vector, and set the deflation space to be the set of > eigenvectors found so far. After each call to EPSSolve, I add the new > eigenvectors to the deflation space one by one, and check if the target > state is (nearly) fully contained in the eigenvectors space. If yes, I > move on to the next target state and so on. > > 5) free everything, destroy eps, matrices, vectors etc. > > > I have some questions about the above: > > 1) should it work in principle, or am I getting it all wrong? I don't see much problem. > > 2) should I destroy and recreate the eps after each call to EPSSolve and > before next call? Or, since the underlying matrix is always the same, can I > just call EPSSetInitialSpace(), EPSSetDeflationSpace(), update the internal > parameter to be passed to the arbitrary selection function and I can call > again EPSSolve? No need to recreate the solver. The only thing is EPSSetDeflationSpace() - I would suggest calling EPSRemoveDeflationSpace() and then EPSSetDeflationSpace() again with the extended set of vectors. Do not call EPSSetDeflationSpace() with a single vector every time. > > 3) Since what I want is going on to find one eigenpair at a time of the same > problem until some condition is fulfilled, is there a way in which I can > achieve this without setting it up again and again every time? Can I specify > an arbitrary function that is called by EPSSolve to decide whether enough > eigenpairs were computed or not, instead of doing it in this somewhat > awkward manner? No. > > 4) more technical: since I add vectors one by one to the deflation space, to > begin with I allocate a Vec *Cv with PetscMalloc(statesinlist*sizeof(Vec *), &Cv); > where statesinlist is the size of the problem, hence the maximum > hypothetical size of the deflation space. 
I would prefer to allocate this > dynamically, enlarging it as needed. Is there something like realloc() in > PETSC/SLEPC? > > Thanks in advance, bye > Giacomo Mulas > > -- > _________________________________________________________________ > > Giacomo Mulas > _________________________________________________________________ > > INAF - Osservatorio Astronomico di Cagliari > via della scienza 5 - 09047 Selargius (CA) > > tel. +39 070 71180244 > mob. : +39 329 6603810 > _________________________________________________________________ > > "When the storms are raging around you, stay right where you are" > (Freddy Mercury) > _________________________________________________________________ From knepley at gmail.com Wed May 28 14:10:25 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 28 May 2014 14:10:25 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it > comes out a list of options related to _view, all of which have the tag > , what does this mean? > The is the current value. They are all false because you have not turned them on. IF you are using the release version, the viewing option is -da_view. The -dm_view is the new version which we are about to release. Thanks, Matt > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 11:25 AM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > Run as./ex5 -help | grep view to see the possibilities. It depends on > PETSc version number. When using the graphics want you generally want a > -draw_pause -1 to stop that program at the graphic otherwise it pops up and > disappears immediately. > > Barry > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial > College from this site: Slides. In slide page 28, there is description of > viewing the DA. I'm testing from my MAC the same commands listed on that > page, for example, ex5 -dm_view, nothing interesting happen except the > Number of Newton iterations is outputted. I'm expecting that the PETSc > numbering would show up as a graphic window or something. Can anyone tell > me what's missing here? Thank you! ( Hui ) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hus003 at ucsd.edu Wed May 28 14:13:32 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Wed, 28 May 2014 19:13:32 +0000 Subject: [petsc-users] Question about dm_view In-Reply-To: References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU>, Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> Do I have to turn it on thru ./configure and then make everything again? 
________________________________ From: Matthew Knepley [knepley at gmail.com] Sent: Wednesday, May 28, 2014 12:10 PM To: Sun, Hui Cc: Barry Smith; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view On Wed, May 28, 2014 at 1:28 PM, Sun, Hui > wrote: Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? The is the current value. They are all false because you have not turned them on. IF you are using the release version, the viewing option is -da_view. The -dm_view is the new version which we are about to release. Thanks, Matt ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Wednesday, May 28, 2014 11:25 AM To: Sun, Hui Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. Barry On May 28, 2014, at 1:21 PM, Sun, Hui > wrote: > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 28 14:18:58 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 28 May 2014 14:18:58 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On Wed, May 28, 2014 at 2:13 PM, Sun, Hui wrote: > Do I have to turn it on thru ./configure and then make everything again? > No. You should see that option in the output of -help. By "turned on" I meant that the value of the option is FALSE. Thanks, Matt > ------------------------------ > *From:* Matthew Knepley [knepley at gmail.com] > *Sent:* Wednesday, May 28, 2014 12:10 PM > *To:* Sun, Hui > *Cc:* Barry Smith; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Question about dm_view > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > >> Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it >> comes out a list of options related to _view, all of which have the tag >> , what does this mean? >> > > The is the current value. They are all false because you have > not turned them on. IF you are using the release version, > the viewing option is -da_view. The -dm_view is the new version which we > are about to release. 
> > Thanks, > > Matt > > >> ________________________________________ >> From: Barry Smith [bsmith at mcs.anl.gov] >> Sent: Wednesday, May 28, 2014 11:25 AM >> To: Sun, Hui >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] Question about dm_view >> >> Run as./ex5 -help | grep view to see the possibilities. It depends on >> PETSc version number. When using the graphics want you generally want a >> -draw_pause -1 to stop that program at the graphic otherwise it pops up and >> disappears immediately. >> >> Barry >> >> >> On May 28, 2014, at 1:21 PM, Sun, Hui wrote: >> >> > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial >> College from this site: Slides. In slide page 28, there is description of >> viewing the DA. I'm testing from my MAC the same commands listed on that >> page, for example, ex5 -dm_view, nothing interesting happen except the >> Number of Newton iterations is outputted. I'm expecting that the PETSc >> numbering would show up as a graphic window or something. Can anyone tell >> me what's missing here? Thank you! ( Hui ) >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 28 15:12:44 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 May 2014 15:12:44 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU>, <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov> On May 28, 2014, at 2:13 PM, Sun, Hui wrote: > Do I have to turn it on thru ./configure and then make everything again? No, just run the program with the option. For example if there is printed -dm_view_draw then run the program with -dm_view_draw true Barry > > From: Matthew Knepley [knepley at gmail.com] > Sent: Wednesday, May 28, 2014 12:10 PM > To: Sun, Hui > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? > > The is the current value. They are all false because you have not turned them on. IF you are using the release version, > the viewing option is -da_view. The -dm_view is the new version which we are about to release. > > Thanks, > > Matt > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 11:25 AM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. 
> > Barry > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From hus003 at ucsd.edu Wed May 28 15:16:48 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Wed, 28 May 2014 20:16:48 +0000 Subject: [petsc-users] Question about dm_view In-Reply-To: <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU>, <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU>, <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B70E6@XMAIL-MBX-BH1.AD.UCSD.EDU> Thanks, now I get it working. -Hui ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Wednesday, May 28, 2014 1:12 PM To: Sun, Hui Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view On May 28, 2014, at 2:13 PM, Sun, Hui wrote: > Do I have to turn it on thru ./configure and then make everything again? No, just run the program with the option. For example if there is printed -dm_view_draw then run the program with -dm_view_draw true Barry > > From: Matthew Knepley [knepley at gmail.com] > Sent: Wednesday, May 28, 2014 12:10 PM > To: Sun, Hui > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? > > The is the current value. They are all false because you have not turned them on. IF you are using the release version, > the viewing option is -da_view. The -dm_view is the new version which we are about to release. > > Thanks, > > Matt > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 11:25 AM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. > > Barry > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. 
Can anyone tell me what's missing here? Thank you! ( Hui ) > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From danyang.su at gmail.com Wed May 28 16:57:54 2014 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 28 May 2014 14:57:54 -0700 Subject: [petsc-users] Running problem with pc_type hypre Message-ID: <53865BE2.2060807@gmail.com> Hi All, I am testing my codes under windows with PETSc V3.4.4. When running with option -pc_type hypre using 1 processor, the program exactly uses 6 processors (my computer is 6 processors 12 threads) and the program crashed after many timesteps. The error information is as follows: job aborted: [ranks] message [0] fatal error Fatal error in MPI_Comm_create: Internal MPI error!, error stack: MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, group=0xc80300f2, new_comm=0x000000001EA6DD30) failed MPI_Comm_create(524).......: MPIR_Comm_create_intra(209): MPIR_Get_contextid(253)....: Too many communicators When running with option -pc_type hypre using 2 processors or more, the program exactly uses all the threads, making the system seriously overburden and the program runs very slowly. When running without -pc_type hypre, the program works fine without any problem. Does anybody have the same problem in windows. Thanks and regards, Danyang From bsmith at mcs.anl.gov Wed May 28 18:01:02 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 May 2014 18:01:02 -0500 Subject: [petsc-users] Running problem with pc_type hypre In-Reply-To: <53865BE2.2060807@gmail.com> References: <53865BE2.2060807@gmail.com> Message-ID: Some possibilities: Are you sure that the hypre was compiled with exactly the same MPI as the that used to build PETSc? On May 28, 2014, at 4:57 PM, Danyang Su wrote: > Hi All, > > I am testing my codes under windows with PETSc V3.4.4. > > When running with option -pc_type hypre using 1 processor, the program exactly uses 6 processors (my computer is 6 processors 12 threads) 6 threads? or 6 processes? It should not be possible for it to use more processes then what you start the program with. hypre can be configured to use OpenMP thread parallelism PLUS MPI parallelism. Was it configured/compiled for that? If so you want to turn that off, configure and compile hypre before linking to PETSc so it does not use OpenMP. Are you sure you don?t have a bunch of zombie MPI processes running from previous jobs that crashed. They suck up CPU but are not involved in the current MPI run. Reboot the machine to get rid of them all. Barry > and the program crashed after many timesteps. The error information is as follows: > > job aborted: > [ranks] message > > [0] fatal error > Fatal error in MPI_Comm_create: Internal MPI error!, error stack: > MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, group=0xc80300f2, new_comm=0x000000001EA6DD30) failed > MPI_Comm_create(524).......: > MPIR_Comm_create_intra(209): > MPIR_Get_contextid(253)....: Too many communicators > > When running with option -pc_type hypre using 2 processors or more, the program exactly uses all the threads, making the system seriously overburden and the program runs very slowly. > > When running without -pc_type hypre, the program works fine without any problem. > > Does anybody have the same problem in windows. 
> > Thanks and regards, > > Danyang From danyang.su at gmail.com Wed May 28 18:10:52 2014 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 28 May 2014 16:10:52 -0700 Subject: [petsc-users] Running problem with pc_type hypre In-Reply-To: References: <53865BE2.2060807@gmail.com> Message-ID: <53866CFC.1080401@gmail.com> Hi Barry, I need further check on it. Running this executable file on another machine results into mkl_intel_thread.dll missing error. I am not sure at present if the mkl_intel_thread.dll version causes this problem. Thanks, Danyang On 28/05/2014 4:01 PM, Barry Smith wrote: > Some possibilities: > > Are you sure that the hypre was compiled with exactly the same MPI as the that used to build PETSc? > > On May 28, 2014, at 4:57 PM, Danyang Su wrote: > >> Hi All, >> >> I am testing my codes under windows with PETSc V3.4.4. >> >> When running with option -pc_type hypre using 1 processor, the program exactly uses 6 processors (my computer is 6 processors 12 threads) > 6 threads? or 6 processes? It should not be possible for it to use more processes then what you start the program with. > > hypre can be configured to use OpenMP thread parallelism PLUS MPI parallelism. Was it configured/compiled for that? If so you want to turn that off, > configure and compile hypre before linking to PETSc so it does not use OpenMP. > > Are you sure you don?t have a bunch of zombie MPI processes running from previous jobs that crashed. They suck up CPU but are not involved in the current MPI run. Reboot the machine to get rid of them all. > > Barry > >> and the program crashed after many timesteps. The error information is as follows: >> >> job aborted: >> [ranks] message >> >> [0] fatal error >> Fatal error in MPI_Comm_create: Internal MPI error!, error stack: >> MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, group=0xc80300f2, new_comm=0x000000001EA6DD30) failed >> MPI_Comm_create(524).......: >> MPIR_Comm_create_intra(209): >> MPIR_Get_contextid(253)....: Too many communicators >> >> When running with option -pc_type hypre using 2 processors or more, the program exactly uses all the threads, making the system seriously overburden and the program runs very slowly. >> >> When running without -pc_type hypre, the program works fine without any problem. >> >> Does anybody have the same problem in windows. >> >> Thanks and regards, >> >> Danyang From bsmith at mcs.anl.gov Wed May 28 18:27:46 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 28 May 2014 18:27:46 -0500 Subject: [petsc-users] Running problem with pc_type hypre In-Reply-To: <53866CFC.1080401@gmail.com> References: <53865BE2.2060807@gmail.com> <53866CFC.1080401@gmail.com> Message-ID: <3B72A454-9960-4AA5-BB85-E5598EB973E8@mcs.anl.gov> This could be an issue. In general with PETSc you want to link against MKL libraries that DO NOT use threading. Otherwise you get oversubscription to the threads. Barry On May 28, 2014, at 6:10 PM, Danyang Su wrote: > Hi Barry, > > I need further check on it. Running this executable file on another machine results into mkl_intel_thread.dll missing error. I am not sure at present if the mkl_intel_thread.dll version causes this problem. > > Thanks, > > Danyang > > On 28/05/2014 4:01 PM, Barry Smith wrote: >> Some possibilities: >> >> Are you sure that the hypre was compiled with exactly the same MPI as the that used to build PETSc? >> >> On May 28, 2014, at 4:57 PM, Danyang Su wrote: >> >>> Hi All, >>> >>> I am testing my codes under windows with PETSc V3.4.4. 
>>> >>> When running with option -pc_type hypre using 1 processor, the program exactly uses 6 processors (my computer is 6 processors 12 threads) >> 6 threads? or 6 processes? It should not be possible for it to use more processes then what you start the program with. >> >> hypre can be configured to use OpenMP thread parallelism PLUS MPI parallelism. Was it configured/compiled for that? If so you want to turn that off, >> configure and compile hypre before linking to PETSc so it does not use OpenMP. >> >> Are you sure you don?t have a bunch of zombie MPI processes running from previous jobs that crashed. They suck up CPU but are not involved in the current MPI run. Reboot the machine to get rid of them all. >> >> Barry >> >>> and the program crashed after many timesteps. The error information is as follows: >>> >>> job aborted: >>> [ranks] message >>> >>> [0] fatal error >>> Fatal error in MPI_Comm_create: Internal MPI error!, error stack: >>> MPI_Comm_create(536).......: MPI_Comm_create(comm=0x84000000, group=0xc80300f2, new_comm=0x000000001EA6DD30) failed >>> MPI_Comm_create(524).......: >>> MPIR_Comm_create_intra(209): >>> MPIR_Get_contextid(253)....: Too many communicators >>> >>> When running with option -pc_type hypre using 2 processors or more, the program exactly uses all the threads, making the system seriously overburden and the program runs very slowly. >>> >>> When running without -pc_type hypre, the program works fine without any problem. >>> >>> Does anybody have the same problem in windows. >>> >>> Thanks and regards, >>> >>> Danyang > From mfadams at lbl.gov Wed May 28 21:54:28 2014 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 28 May 2014 22:54:28 -0400 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: <538369C9.6010209@uci.edu> References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> Message-ID: On Mon, May 26, 2014 at 12:20 PM, Michele Rosso wrote: > Mark, > > thank you for your input and sorry my late reply: I saw your email only > now. > By setting up the solver each time step you mean re-defining the KSP > context every time? > THe simplest thing is to just delete the object and create it again. THere are "reset" methods that do the same thing semantically but it is probably just easier to destroy the KSP object and recreate it and redo your setup code. > Why should this help? > AMG methods optimized for a particular operator but "stale" setup data often work well on problems that evolve, at least for a while, and it saves a lot of time to not redo the "setup" every time. How often you should "refresh" the setup data is problem dependant and the application needs to control that. There are some hooks to fine tune how much setup data is recomputed each solve, but we are just trying to see if redoing the setup every time helps. If this fixes the problem then we can think about cost. If it does not fix the problem then it is more serious. > I will definitely try that as well as the hypre solution and report back. > Again, thank you. > > Michele > > > On 05/22/2014 09:34 AM, Mark Adams wrote: > > If the solver is degrading as the coefficients change, and I would assume > get more nasty, you can try deleting the solver at each time step. This > will be about 2x more expensive, because it does the setup each solve, but > it might fix your problem. 
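A minimal sketch of what deleting the solver at each time step could look like inside the time loop (hedged: A, b, x and nsteps are the application's own objects, and KSPSetOperators is shown with the 3.4-style MatStructure argument, which later versions drop):

    for (step = 0; step < nsteps; step++) {
      /* ... update the matrix A and right-hand side b for this step ... */
      ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
      ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
      ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* picks up -pc_type gamg / hypre etc. */
      ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
      ierr = KSPDestroy(&ksp);CHKERRQ(ierr);         /* throws away all preconditioner setup data */
    }

The "reset" route mentioned above (KSPReset() on a solver object that is kept around) achieves much the same thing without recreating the object.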
> > You also might try: > > -pc_type hypre > -pc_hypre_type boomeramg > > > > > On Mon, May 19, 2014 at 6:49 PM, Jed Brown wrote: > >> Michele Rosso writes: >> >> > Jed, >> > >> > thank you very much! >> > I will try with ///-mg_levels_ksp_type chebyshev -mg_levels_pc_type >> > sor/ and report back. >> > Yes, I removed the nullspace from both the system matrix and the rhs. >> > Is there a way to have something similar to Dendy's multigrid or the >> > deflated conjugate gradient method with PETSc? >> >> Dendy's MG needs geometry. The algorithm to produce the interpolation >> operators is not terribly complicated so it could be done, though DMDA >> support for cell-centered is a somewhat awkward. "Deflated CG" can mean >> lots of things so you'll have to be more precise. (Most everything in >> the "deflation" world has a clear analogue in the MG world, but the >> deflation community doesn't have a precise language to talk about their >> methods so you always have to read the paper carefully to find out if >> it's completely standard or if there is something new.) >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From qince168 at gmail.com Thu May 29 01:08:01 2014 From: qince168 at gmail.com (Ce Qin) Date: Thu, 29 May 2014 14:08:01 +0800 Subject: [petsc-users] How to use cmake to get external libraries? Message-ID: Dear all, I'n now using cmake to build my project. I had tried Jed's FindPETSc module, it works pretty fine. However, the variable PETSC_LIBRARIES it generates only have libpetsc.so. And I want to use the external libraries directly, so I need to get the whole libraries like PETSC_LIB in petscvariables. Can anyone provide some hints? Thanks in advance. Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 29 01:15:03 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 29 May 2014 00:15:03 -0600 Subject: [petsc-users] How to use cmake to get external libraries? In-Reply-To: References: Message-ID: <87vbspyuu0.fsf@jedbrown.org> Ce Qin writes: > Dear all, > > I'n now using cmake to build my project. I had tried Jed's FindPETSc > module, it works pretty fine. However, the variable PETSC_LIBRARIES it > generates only have libpetsc.so. And I want to use the external libraries > directly, so I need to get the whole libraries like PETSC_LIB in > petscvariables. Can anyone provide some hints? Best to have Find${OtherLibrary}.cmake for those libraries. It tangles dependencies and will be harder to maintain in the long run if you modify FindPETSc.cmake to provide access to optional third-party libraries. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From qince168 at gmail.com Thu May 29 01:32:25 2014 From: qince168 at gmail.com (Ce Qin) Date: Thu, 29 May 2014 14:32:25 +0800 Subject: [petsc-users] How to use cmake to get external libraries? In-Reply-To: <87vbspyuu0.fsf@jedbrown.org> References: <87vbspyuu0.fsf@jedbrown.org> Message-ID: Thanks for your quick reply, Jed. The find modules of many scientific libraries are not available. I also considered using Makefile to build my project, but writing a extensible Makefile (supporting out-of-source build) is not easy to do. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mrosso at uci.edu Thu May 29 01:44:34 2014 From: mrosso at uci.edu (Michele Rosso) Date: Wed, 28 May 2014 23:44:34 -0700 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> Message-ID: <5386D752.6080103@uci.edu> Thanks Mark! I will try and let you know. On 05/28/2014 07:54 PM, Mark Adams wrote: > > > > On Mon, May 26, 2014 at 12:20 PM, Michele Rosso > wrote: > > Mark, > > thank you for your input and sorry my late reply: I saw your email > only now. > By setting up the solver each time step you mean re-defining the > KSP context every time? > > > THe simplest thing is to just delete the object and create it again. > THere are "reset" methods that do the same thing semantically but it > is probably just easier to destroy the KSP object and recreate it and > redo your setup code. > > Why should this help? > > > AMG methods optimized for a particular operator but "stale" setup data > often work well on problems that evolve, at least for a while, and it > saves a lot of time to not redo the "setup" every time. How often you > should "refresh" the setup data is problem dependant and the > application needs to control that. There are some hooks to fine tune > how much setup data is recomputed each solve, but we are just trying > to see if redoing the setup every time helps. If this fixes the > problem then we can think about cost. If it does not fix the problem > then it is more serious. > > I will definitely try that as well as the hypre solution and > report back. > Again, thank you. > > Michele > > > On 05/22/2014 09:34 AM, Mark Adams wrote: >> If the solver is degrading as the coefficients change, and I >> would assume get more nasty, you can try deleting the solver at >> each time step. This will be about 2x more expensive, because it >> does the setup each solve, but it might fix your problem. >> >> You also might try: >> >> -pc_type hypre >> -pc_hypre_type boomeramg >> >> >> >> >> On Mon, May 19, 2014 at 6:49 PM, Jed Brown > > wrote: >> >> Michele Rosso > writes: >> >> > Jed, >> > >> > thank you very much! >> > I will try with ///-mg_levels_ksp_type chebyshev >> -mg_levels_pc_type >> > sor/ and report back. >> > Yes, I removed the nullspace from both the system matrix >> and the rhs. >> > Is there a way to have something similar to Dendy's >> multigrid or the >> > deflated conjugate gradient method with PETSc? >> >> Dendy's MG needs geometry. The algorithm to produce the >> interpolation >> operators is not terribly complicated so it could be done, >> though DMDA >> support for cell-centered is a somewhat awkward. "Deflated >> CG" can mean >> lots of things so you'll have to be more precise. (Most >> everything in >> the "deflation" world has a clear analogue in the MG world, >> but the >> deflation community doesn't have a precise language to talk >> about their >> methods so you always have to read the paper carefully to >> find out if >> it's completely standard or if there is something new.) >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 29 01:49:26 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 29 May 2014 00:49:26 -0600 Subject: [petsc-users] How to use cmake to get external libraries? 
In-Reply-To: References: <87vbspyuu0.fsf@jedbrown.org> Message-ID: <87sintyt8p.fsf@jedbrown.org> Ce Qin writes: > The find modules of many scientific libraries are not available. I'm sorry, but this is their problem (or your problem, or CMake's problem). PETSc provides interfaces to many other packages, but we can't support every aspect of direct use of those packages. It's easy to make my FindPETSc.cmake use the full list of libraries, but I won't make that change and I won't support that use. I.e., if you are going to tangle dependencies like that, I'm not the one responsible for supporting it. > I also considered using Makefile to build my project, but writing a > extensible Makefile (supporting out-of-source build) is not easy to > do. Makefiles with out-of-source support are not difficult, at least if you use gnumake. The PETSc build (in 'master') is one example. Here is one for a simpler project: https://bitbucket.org/hpgmg/hpgmg/src Look at base.mk and local.mk. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From qince168 at gmail.com Thu May 29 02:01:20 2014 From: qince168 at gmail.com (Ce Qin) Date: Thu, 29 May 2014 15:01:20 +0800 Subject: [petsc-users] How to use cmake to get external libraries? In-Reply-To: <87sintyt8p.fsf@jedbrown.org> References: <87vbspyuu0.fsf@jedbrown.org> <87sintyt8p.fsf@jedbrown.org> Message-ID: Thanks, I will look at it. Best regards, Ce Qin -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmulas at oa-cagliari.inaf.it Thu May 29 04:45:18 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Thu, 29 May 2014 11:45:18 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: Hi Jose, nice hearing from you. Thanks for your help. On Wed, 28 May 2014, Jose E. Roman wrote: > I don't see much problem. > >> >> 2) should I destroy and recreate the eps after each call to EPSSolve and >> before next call? Or, since the underlying matrix is always the same, can I >> just call EPSSetInitialSpace(), EPSSetDeflationSpace(), update the internal >> parameter to be passed to the arbitrary selection function and I can call >> again EPSSolve? > > No need to recreate the solver. The only thing is EPSSetDeflationSpace() - > I would suggest calling EPSRemoveDeflationSpace() and then > EPSSetDeflationSpace() again with the extended set of vectors. Do not > call EPSSetDeflationSpace() with a single vector every time. Yes, indeed. I did not detail it in my description but what I do is keep an array Vec[] (allocated large enough at the beginning) and I attach eigenvectors to it as I find them. After every call to EPSSolve I do ierr = EPSGetConverged(eps, &nconverged); CHKERRQ(ierr); if (nconverged > 0) { for (petsck=0; petsck<=nconverged-1; petsck++) { EPSGetEigenpair(eps, petsck, &lambdar, &lambdai, xr, xi); newtotconverged = totconverged+petsk; ierr = VecDuplicate(xr, Cv+newtotconverged); ierr = VecCopy(xr, Cv[newtotconverged]); } totconverged += nconverged; ierr = EPSSetDeflationSpace(eps, totconverged, Cv); } So every time I call EPSSetDeflationSpace() I do give it the complete set of eigenvectors found so far, and their number, including the previously found ones. 
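A cleaned-up sketch of that accumulation step, with EPSRemoveDeflationSpace() folded in as suggested in the reply above (error checking abbreviated, variable names as in the snippet):

    ierr = EPSGetConverged(eps,&nconverged);CHKERRQ(ierr);
    if (nconverged > 0) {
      for (petsck = 0; petsck < nconverged; petsck++) {
        ierr = EPSGetEigenpair(eps,petsck,&lambdar,&lambdai,xr,xi);CHKERRQ(ierr);
        ierr = VecDuplicate(xr,&Cv[totconverged+petsck]);CHKERRQ(ierr);
        ierr = VecCopy(xr,Cv[totconverged+petsck]);CHKERRQ(ierr);
      }
      totconverged += nconverged;
      ierr = EPSRemoveDeflationSpace(eps);CHKERRQ(ierr);              /* drop the previous space */
      ierr = EPSSetDeflationSpace(eps,totconverged,Cv);CHKERRQ(ierr); /* set the full extended set */
    }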
In the man page of EPSSetDeflationSpace it says that if another deflation space was previously defined, it is killed and replaced by the new one, that's why I did not explicitly call EPSRemoveDeflationSpace() until I'm done. Should I instead call EPSRemoveDeflationSpace every time before calling EPSSetDeflationSpace? i.e. add a call to EPSRemoveDeflationSpace() before the one to EPSSetDeflationSpace() in the snippet above? >> 3) Since what I want is going on to find one eigenpair at a time of the same >> problem until some condition is fulfilled, is there a way in which I can >> achieve this without setting it up again and again every time? Can I specify >> an arbitrary function that is called by EPSSolve to decide whether enough >> eigenpairs were computed or not, instead of doing it in this somewhat >> awkward manner? > > No. ok. Then another question comes after this: how much overhead is involved in setting up the deflation space etc. for every eigenvector that is computed? I guess that EPSSolve, when more than one eigenpair is needed, actually does the internal book-keeping of adding the already converged eigenvectors to the deflation space, so doing it by hand out of EPSSolve looks inefficient (even if maybe it's not so bad, I don't know the internals of the code). If I have a reasonable guess of how many eigenvectors I will need, would it be much more efficient to determine more eigenvectors than one with each EPSSolve call, with the risk of wasting time finding a few more than I need? How expensive is finding one eigenpair compared to the overhead of setting up a new deflation space and calling EPSSolve repeatedly? How do the costs scale with the size of the matrix, and the size of the deflation space? After I get at least one version of my code working properly, I will try doing some tests with this, if it's useful. Bye, thanks Giacomo -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. : +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From gmulas at oa-cagliari.inaf.it Thu May 29 10:03:31 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Thu, 29 May 2014 17:03:31 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: Hi Jose, and list. I am in the process of writing the function to use with EPSSetArbitrarySelection(). Inside it, I will need to take some given component (which one is included in the info passed via ctx) of the eigenvector and square it. To do this, since the eigenvector is not necessarily local, I will need to first do a scatter to a local 1-component vector. So this would be like: ... some omitted machinery to cast the info from *ctx to more easily accessible form... 
ierr = ISCreateStride(PETXC_COMM_WORLD,1,myindex,1,&is1_from);CHKERRQ(ierr); ierr = ISCreateStride(PETSC_COMM_WORLD,1,0,1,&is1_to);CHKERRQ(ierr); ierr = VecCreateSeq(PETSC_COMM_SELF, 1, &localx1);CHKERRQ(ierr); ierr = VecScatterCreate(xr,is1_from,localx1,is1_to,&scatter1); CHKERRQ(ierr); ierr = VecScatterBegin(scatter1,xr,localx1,INSERT_VALUES, SCATTER_FORWARD); ierr = VecScatterEnd(scatter1,xr,localx1,INSERT_VALUES, SCATTER_FORWARD); ierr = VecGetArray(localx1,&comp); *rr = comp*comp; ierr = VecRestoreArray(localx1, &comp); ierr = VecDestroy(localx1); ierr = VecScatterDestroy(&scatter1); ierr = ISDestroy(&is1_from); ierr = ISDestroy(&is1_to); *ri = 0; ... some internal housekeeping omitted return 0; The questions are: 1) when the arbitrary function is called, is it called on all nodes simultaneously, so that collective functions can be expected to work properly, being called on all involved nodes at the same time? Should all processes compute the *rr and *ri to be returned, and return the same value? would it be more efficient to create a unit vector uv containing only one nonzero component, and then use VecDot(xr, uv, &comp), instead of pulling the component I need and squaring it as I did above? 2) since the stride, the 1-component vector, the scatter are presumably the same through all calls within one EPSSolve, can I take them out of the arbitrary function, and make them available to it through *ctx? For this to work, the structure of xr, the eigenvector passed to the arbitrary function, must be known outside of EPSSolve. Thanks, bye Giacomo -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. : +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From knepley at gmail.com Thu May 29 10:29:54 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 10:29:54 -0500 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: On Thu, May 29, 2014 at 10:03 AM, Giacomo Mulas wrote: > Hi Jose, and list. > > I am in the process of writing the function to use with > EPSSetArbitrarySelection(). > > Inside it, I will need to take some given component (which one is included > in the info passed via ctx) of the eigenvector and square it. To do this, > since the eigenvector is not necessarily local, I will need to first do a > scatter to a local 1-component vector. So this would be like: > > ... some omitted machinery to cast the info from *ctx to more easily > accessible form... > There might be an easier way to do this: PetscScalar val = 0.0, gval; VecGetOwnershipRange(xr, &low, &high); if ((myindex >= low) && (myindex < high)) { VecGetArray(localx1,&a); val = a[myindex-low]; VecRestoreArray(localx1, &a); } MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); Now everyone has the value at myindex. 
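Folded into the arbitrary-selection callback, a sketch might look like the following (assuming the (er,ei,xr,xi,rr,ri,ctx) callback signature that EPSSetArbitrarySelection expects, and that ctx simply carries the target component index):

    PetscErrorCode computeprojection(PetscScalar er,PetscScalar ei,Vec xr,Vec xi,
                                     PetscScalar *rr,PetscScalar *ri,void *ctx)
    {
      PetscInt       myindex = *(PetscInt*)ctx;   /* target component, set by the caller */
      PetscInt       low,high;
      PetscScalar    val = 0.0,gval,*a;
      PetscErrorCode ierr;

      ierr = VecGetOwnershipRange(xr,&low,&high);CHKERRQ(ierr);
      if (myindex >= low && myindex < high) {     /* only the owner reads the entry */
        ierr = VecGetArray(xr,&a);CHKERRQ(ierr);
        val  = a[myindex-low];
        ierr = VecRestoreArray(xr,&a);CHKERRQ(ierr);
      }
      ierr = MPI_Allreduce(&val,&gval,1,MPIU_SCALAR,MPI_SUM,PETSC_COMM_WORLD);CHKERRQ(ierr);
      *rr = gval*gval;   /* every process returns the same squared component */
      *ri = 0.0;
      return 0;
    }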
Matt > ierr = ISCreateStride(PETXC_COMM_WORLD,1,myindex,1,&is1_from); > CHKERRQ(ierr); > ierr = ISCreateStride(PETSC_COMM_WORLD,1,0,1,&is1_to);CHKERRQ(ierr); > ierr = VecCreateSeq(PETSC_COMM_SELF, 1, &localx1);CHKERRQ(ierr); > ierr = VecScatterCreate(xr,is1_from,localx1,is1_to,&scatter1); > CHKERRQ(ierr); > ierr = VecScatterBegin(scatter1,xr,localx1,INSERT_VALUES, > SCATTER_FORWARD); > ierr = VecScatterEnd(scatter1,xr,localx1,INSERT_VALUES, > SCATTER_FORWARD); > ierr = VecGetArray(localx1,&comp); > *rr = comp*comp; > ierr = VecRestoreArray(localx1, &comp); > ierr = VecDestroy(localx1); > ierr = VecScatterDestroy(&scatter1); > ierr = ISDestroy(&is1_from); > ierr = ISDestroy(&is1_to); > *ri = 0; > > ... some internal housekeeping omitted > > return 0; > > The questions are: > > 1) when the arbitrary function is called, is it called on all nodes > simultaneously, so that collective functions can be expected to work > properly, being called on all involved nodes at the same time? Should all > processes compute the *rr and *ri to be returned, and return the same > value? > would it be more efficient to create a unit vector uv containing only one > nonzero component, and then use VecDot(xr, uv, &comp), instead of pulling > the component I need and squaring it as I did above? > > > 2) since the stride, the 1-component vector, the scatter are presumably > the same through all calls within one EPSSolve, can I take them out of the > arbitrary function, and make them available to it through *ctx? For this > to work, the structure of xr, the eigenvector passed to the arbitrary > function, must be known outside of EPSSolve. > > > Thanks, bye > Giacomo > > -- > _________________________________________________________________ > > Giacomo Mulas > _________________________________________________________________ > > INAF - Osservatorio Astronomico di Cagliari > via della scienza 5 - 09047 Selargius (CA) > > tel. +39 070 71180244 > mob. : +39 329 6603810 > _________________________________________________________________ > > "When the storms are raging around you, stay right where you are" > (Freddy Mercury) > _________________________________________________________________ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From masghar1397 at gmail.com Thu May 29 10:38:13 2014 From: masghar1397 at gmail.com (M Asghar) Date: Thu, 29 May 2014 16:38:13 +0100 Subject: [petsc-users] Accessing MUMPS INFOG values Message-ID: Hi, Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. Many thanks in advance. M Asghar -------------- next part -------------- An HTML attachment was scrubbed... 
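One possible route, sketched under the assumption that the EPS spectral transform uses a KSP/PC backed by a MUMPS LU factorization and that the PETSc build provides the MatMumpsGetInfog() accessor (INFOG(1) is MUMPS' global error flag, negative on failure):

    ST             st;
    KSP            ksp;
    PC             pc;
    Mat            F;
    PetscInt       infog1;
    PetscErrorCode ierr;

    ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
    ierr = STGetKSP(st,&ksp);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr);       /* the factored (MUMPS) matrix */
    ierr = MatMumpsGetInfog(F,1,&infog1);CHKERRQ(ierr);  /* INFOG(1): 0 = success, < 0 = error */
    if (infog1 < 0) {
      /* take remedial action, e.g. raise the ICNTL(14) workspace relaxation and refactor */
    }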
URL: From hzhang at mcs.anl.gov Thu May 29 10:43:19 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 29 May 2014 10:43:19 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: <6ebee9bb5c5942d69d1f546025e9660f@LUCKMAN.anl.gov> References: <6ebee9bb5c5942d69d1f546025e9660f@LUCKMAN.anl.gov> Message-ID: Asghar: > Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG > etc) via the PETSc interface? Yes. Use the latest petsc (master branch). See petsc/src/ksp/ksp/examples/tutorials/ex52.c Hong > > I am working with SLEPc and am using MUMPS for the factorisation. I would > like to access the contents of their INFOG array within our code > particularly when an error occurs in order to determine whether any remedial > action can be taken. The error code returned from PETSc is useful; any > additional information from MUMPS that can be accessed from within ones code > would be very helpful also. > > Many thanks in advance. > > M Asghar > From gmulas at oa-cagliari.inaf.it Thu May 29 10:58:19 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Thu, 29 May 2014 17:58:19 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: On Thu, 29 May 2014, Matthew Knepley wrote: > There might be an easier way to do this: > ? PetscScalar val = 0.0, gval; > > ??VecGetOwnershipRange(xr, &low, &high); > ? if ((myindex >= low) && (myindex < high)) { > ? ? VecGetArray(localx1,&a); > ? ? val = a[myindex-low]; > ? ? VecRestoreArray(localx1, &a); > ? } > ? MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > > Now everyone has the value at myindex. brilliant, why didn't I think of this? Only, I guess you were copying/pasting and some variable names slipped, namely localx instead of xr. Should it be ? PetscScalar val = 0.0, gval; PetscScalar *a; ??VecGetOwnershipRange(xr, &low, &high); ? if ((myindex >= low) && (myindex < high)) { ? ? VecGetArray(xr,&a); ? ? val = a[myindex-low]; ? ? VecRestoreArray(xr, &a); ? } ? MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); *rr = gval*gval; *ri = 0; ? Thanks! Giacomo -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. : +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From hus003 at ucsd.edu Thu May 29 11:04:39 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Thu, 29 May 2014 16:04:39 +0000 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B70E6@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU>, <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU>, <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov>, <7501CC2B7BBCC44A92ECEEC316170ECB6B70E6@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B738B@XMAIL-MBX-BH1.AD.UCSD.EDU> A continuing problem: While I was running ./ex5, the output is normal. 
When I was running ./ex5 -help | grep whatever, the output is still normal. However, when I tried ./ex5 -help | head -20, it output the first 20 lines from help, then it output the some error message. I'm curious why there is such an error message. The error message is pasted below. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading or writing to a socket [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./ex5 on a darwin-op named blablabla by blablabla Thu May 29 08:57:47 2014 [0]PETSC ERROR: Libraries linked from /usr/local/petsc-3.1-p8/darwin-opt/lib [0]PETSC ERROR: Configure run at Tue Mar 11 16:25:14 2014 [0]PETSC ERROR: Configure options --CC=/usr/local/openmpi-1.4.3/bin/mpicc --CXX=/usr/local/openmpi-1.4.3/bin/mpicxx --FC=/usr/local/openmpi-1.4.3/bin/mpif90 --LDFLAGS="-L/usr/local/openmpi-1.4.3/lib -Wl,-rpath,/usr/local/openmpi-1.4.3/lib" --PETSC_ARCH=darwin-opt --with-debugging=0 --with-hypre=1 --with-blas-lapack-lib --with-c++-support --download-hypre --download-f-blas-lapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. ________________________________________ From: Sun, Hui Sent: Wednesday, May 28, 2014 1:16 PM To: Barry Smith Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Question about dm_view Thanks, now I get it working. -Hui ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Wednesday, May 28, 2014 1:12 PM To: Sun, Hui Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question about dm_view On May 28, 2014, at 2:13 PM, Sun, Hui wrote: > Do I have to turn it on thru ./configure and then make everything again? No, just run the program with the option. 
For example if there is printed -dm_view_draw then run the program with -dm_view_draw true Barry > > From: Matthew Knepley [knepley at gmail.com] > Sent: Wednesday, May 28, 2014 12:10 PM > To: Sun, Hui > Cc: Barry Smith; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it comes out a list of options related to _view, all of which have the tag , what does this mean? > > The is the current value. They are all false because you have not turned them on. IF you are using the release version, > the viewing option is -da_view. The -dm_view is the new version which we are about to release. > > Thanks, > > Matt > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 11:25 AM > To: Sun, Hui > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > Run as./ex5 -help | grep view to see the possibilities. It depends on PETSc version number. When using the graphics want you generally want a -draw_pause -1 to stop that program at the graphic otherwise it pops up and disappears immediately. > > Barry > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in Imperial College from this site: Slides. In slide page 28, there is description of viewing the DA. I'm testing from my MAC the same commands listed on that page, for example, ex5 -dm_view, nothing interesting happen except the Number of Newton iterations is outputted. I'm expecting that the PETSc numbering would show up as a graphic window or something. Can anyone tell me what's missing here? Thank you! ( Hui ) > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From masghar1397 at gmail.com Thu May 29 11:09:14 2014 From: masghar1397 at gmail.com (M Asghar) Date: Thu, 29 May 2014 17:09:14 +0100 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: References: <6ebee9bb5c5942d69d1f546025e9660f@LUCKMAN.anl.gov> Message-ID: Hi, Many thanks for the quick reply! I can see calls to MatMumpsGetInfog in ex52.c. * This is in PETSc's dev copy if I'm not mistaken - will this make it into the next PETSc release? * Will/does this have a Fortran equivalent? Many thanks, M Asghar On Thu, May 29, 2014 at 4:43 PM, Hong Zhang wrote: > Asghar: > > Is it possible to access the contents of MUMPS array INFOG (and INFO, > RINFOG > > etc) via the PETSc interface? > > Yes. Use the latest petsc (master branch). > See petsc/src/ksp/ksp/examples/tutorials/ex52.c > > Hong > > > > I am working with SLEPc and am using MUMPS for the factorisation. I would > > like to access the contents of their INFOG array within our code > > particularly when an error occurs in order to determine whether any > remedial > > action can be taken. The error code returned from PETSc is useful; any > > additional information from MUMPS that can be accessed from within ones > code > > would be very helpful also. > > > > Many thanks in advance. > > > > M Asghar > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From D.Lathouwers at tudelft.nl Thu May 29 11:31:00 2014 From: D.Lathouwers at tudelft.nl (Danny Lathouwers - TNW) Date: Thu, 29 May 2014 16:31:00 +0000 Subject: [petsc-users] rtol meaning Message-ID: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> Dear users, I have a problem where I step time and repeatedly solve a system with differing rhs. At some time step petsc solver returns that initial solution is good enough (converged reason = 2 with 0 iterations done). I do not expect this behaviour. I use rtol = 0.001, atol=0 and dtol =large number. The manual seems to suggest the criterion is: rnorm < MAX (rtol * rnorm_0, abstol) (probably based on preconditioned residual). How could this lead to zero iterations being done? Or is the criterion based on rnorm/bnorm instead (which I found in some reference on the internet concerning petsc and would explain the observed behaviour). Thanks, Danny. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 29 11:58:51 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 11:58:51 -0500 Subject: [petsc-users] rtol meaning In-Reply-To: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> References: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> Message-ID: On Thu, May 29, 2014 at 11:31 AM, Danny Lathouwers - TNW < D.Lathouwers at tudelft.nl> wrote: > Dear users, > > > > I have a problem where I step time and repeatedly solve a system with > differing rhs. > > At some time step petsc solver returns that initial solution is good > enough (converged reason = 2 with 0 iterations done). > > I do not expect this behaviour. I use rtol = 0.001, atol=0 and dtol =large > number. > > > > The manual seems to suggest the criterion is: rnorm < MAX (rtol * rnorm_0, abstol) (probably based on preconditioned residual). > > How could this lead to zero iterations being done? Or is the criterion based on rnorm/bnorm instead (which I found in some reference on the internet concerning petsc and would explain the observed behaviour). > > Its ||b||, not ||r_0||. You can change it to get the other behavior, as detailed here http://www.mcs.anl.gov/petsc/petsc-dev/src/ksp/ksp/interface/iterativ.c.html#KSPConvergedDefault Matt > Thanks, > > Danny. > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 29 12:00:35 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 12:00:35 -0500 Subject: [petsc-users] Question about dm_view In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B738B@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B707E@XMAIL-MBX-BH1.AD.UCSD.EDU> <94E05341-F27B-4DE6-A1F8-4903DAC83ECB@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B708E@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B70B8@XMAIL-MBX-BH1.AD.UCSD.EDU> <364AB800-EEAC-4B5E-9810-DF848C68F2AF@mcs.anl.gov> <7501CC2B7BBCC44A92ECEEC316170ECB6B70E6@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B738B@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On Thu, May 29, 2014 at 11:04 AM, Sun, Hui wrote: > A continuing problem: While I was running ./ex5, the output is normal. > When I was running ./ex5 -help | grep whatever, the output is still normal. 
> However, when I tried ./ex5 -help | head -20, it output the first 20 lines > from help, then it output the some error message. I'm curious why there is > such an error message. The error message is pasted below. > You will notice that the signal is "Broken Pipe". When 'head' is done, it sends a SIGPIPE to the process producing output. This is the standard behavior. Matt > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading > or writing to a socket > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 > CDT 2011 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./ex5 on a darwin-op named blablabla by blablabla Thu May > 29 08:57:47 2014 > [0]PETSC ERROR: Libraries linked from > /usr/local/petsc-3.1-p8/darwin-opt/lib > [0]PETSC ERROR: Configure run at Tue Mar 11 16:25:14 2014 > [0]PETSC ERROR: Configure options --CC=/usr/local/openmpi-1.4.3/bin/mpicc > --CXX=/usr/local/openmpi-1.4.3/bin/mpicxx > --FC=/usr/local/openmpi-1.4.3/bin/mpif90 > --LDFLAGS="-L/usr/local/openmpi-1.4.3/lib > -Wl,-rpath,/usr/local/openmpi-1.4.3/lib" --PETSC_ARCH=darwin-opt > --with-debugging=0 --with-hypre=1 --with-blas-lapack-lib --with-c++-support > --download-hypre --download-f-blas-lapack --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 > --FOPTFLAGS=-O3 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > > > > ________________________________________ > From: Sun, Hui > Sent: Wednesday, May 28, 2014 1:16 PM > To: Barry Smith > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] Question about dm_view > > Thanks, now I get it working. -Hui > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Wednesday, May 28, 2014 1:12 PM > To: Sun, Hui > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Question about dm_view > > On May 28, 2014, at 2:13 PM, Sun, Hui wrote: > > > Do I have to turn it on thru ./configure and then make everything again? > > No, just run the program with the option. 
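To make the Broken Pipe explanation above concrete: once head has read its 20 lines and exited, the next write by the PETSc program goes to a pipe with no reader, and the kernel raises SIGPIPE (signal 13), which PETSc's error handler then reports. A generic POSIX sketch (plain C, not PETSc code) showing the mechanism, and how ignoring SIGPIPE turns the fatal signal into an ordinary EPIPE write error:

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
  int fd[2];

  if (pipe(fd)) return 1;
  close(fd[0]);               /* simulate the reader (head) going away */
  signal(SIGPIPE, SIG_IGN);   /* without this, the write below terminates the
                                 process with signal 13 (Broken Pipe) */
  if (write(fd[1], "hello\n", 6) < 0) {
    fprintf(stderr, "write failed: %s\n", strerror(errno));  /* EPIPE */
  }
  close(fd[1]);
  return 0;
}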
For example if there is > printed -dm_view_draw then run the program with -dm_view_draw true > > Barry > > > > > From: Matthew Knepley [knepley at gmail.com] > > Sent: Wednesday, May 28, 2014 12:10 PM > > To: Sun, Hui > > Cc: Barry Smith; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Question about dm_view > > > > On Wed, May 28, 2014 at 1:28 PM, Sun, Hui wrote: > > Thanks Barry for quick reply. After I type ./ex5 -help | grep view, it > comes out a list of options related to _view, all of which have the tag > , what does this mean? > > > > The is the current value. They are all false because you have > not turned them on. IF you are using the release version, > > the viewing option is -da_view. The -dm_view is the new version which we > are about to release. > > > > Thanks, > > > > Matt > > > > ________________________________________ > > From: Barry Smith [bsmith at mcs.anl.gov] > > Sent: Wednesday, May 28, 2014 11:25 AM > > To: Sun, Hui > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Question about dm_view > > > > Run as./ex5 -help | grep view to see the possibilities. It depends on > PETSc version number. When using the graphics want you generally want a > -draw_pause -1 to stop that program at the graphic otherwise it pops up and > disappears immediately. > > > > Barry > > > > > > On May 28, 2014, at 1:21 PM, Sun, Hui wrote: > > > > > Hello, I'm new to PETSc. I'm reading a tutorial slide given in > Imperial College from this site: Slides. In slide page 28, there is > description of viewing the DA. I'm testing from my MAC the same commands > listed on that page, for example, ex5 -dm_view, nothing interesting happen > except the Number of Newton iterations is outputted. I'm expecting that the > PETSc numbering would show up as a graphic window or something. Can anyone > tell me what's missing here? Thank you! ( Hui ) > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 29 12:01:30 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 12:01:30 -0500 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: On Thu, May 29, 2014 at 10:58 AM, Giacomo Mulas wrote: > On Thu, 29 May 2014, Matthew Knepley wrote: > > There might be an easier way to do this: >> PetscScalar val = 0.0, gval; >> >> VecGetOwnershipRange(xr, &low, &high); >> if ((myindex >= low) && (myindex < high)) { >> VecGetArray(localx1,&a); >> val = a[myindex-low]; >> VecRestoreArray(localx1, &a); >> } >> MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); >> >> Now everyone has the value at myindex. >> > > brilliant, why didn't I think of this? Only, I guess you were > copying/pasting and some variable names slipped, namely localx instead of > xr. 
Should it be >

Yes

   Matt

> PetscScalar val = 0.0, gval;
> PetscScalar *a;
>
> VecGetOwnershipRange(xr, &low, &high);
> if ((myindex >= low) && (myindex < high)) {
>   VecGetArray(xr,&a);
>   val = a[myindex-low];
>   VecRestoreArray(xr, &a);
> }
> MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD);
> *rr = gval*gval;
> *ri = 0;
>
> ?
>
> Thanks!
> Giacomo
>
> --
> _________________________________________________________________
>
> Giacomo Mulas
> _________________________________________________________________
>
> INAF - Osservatorio Astronomico di Cagliari
> via della scienza 5 - 09047 Selargius (CA)
>
> tel. +39 070 71180244
> mob. : +39 329 6603810
> _________________________________________________________________
>
> "When the storms are raging around you, stay right where you are"
> (Freddy Mercury)
> _________________________________________________________________

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lu_qin_2000 at yahoo.com Thu May 29 13:23:53 2014
From: lu_qin_2000 at yahoo.com (Qin Lu)
Date: Thu, 29 May 2014 11:23:53 -0700 (PDT)
Subject: [petsc-users] About parallel performance
Message-ID: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com>

Hello,

I implemented the PETSc parallel linear solver in a program; the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests show that the parallel solver is always a little slower than the serial solver (I have excluded the matrix-generation CPU).

For the serial run I used PCILU as the preconditioner; for the parallel run, I used ASM with ILU(0) on each subblock (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns is around 200,000.

I have used -log_summary to print out the performance summary as attached (log_summary_p1 for the serial run and log_summary_p2 for the run with 2 processes). It seems KSPSolve accounts for less than 20% of Global %T.
My questions are:

1. What is the bottleneck of the parallel run according to the summary?
2. Do you have any suggestions to improve the parallel performance?

Thanks a lot for your suggestions!

Regards,
Qin
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log_summary_p1.txt
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log_summary_p2.txt
URL:

From bsmith at mcs.anl.gov Thu May 29 13:43:54 2014
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 29 May 2014 13:43:54 -0500
Subject: [petsc-users] Accessing MUMPS INFOG values
In-Reply-To:
References:
Message-ID: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov>

We should add direct support for this. These beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C, which is defined in dmumps_c.h (versions also for single precision and complex).
So what PETSc should provide in mumps.c is a function something like #undef __FUNCT__ #define __FUNCT__ "MatMUMPSGetStruc" PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) { Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; PetscFunctionBegin; *struc = (void *) mumps->id PetscFunctionReturn(0); } so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled Also add a prototype for this function in petscmat.h Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. Let us know how it goes and we?ll get this stuff into the development version of PETSc. Barry On May 29, 2014, at 10:38 AM, M Asghar wrote: > Hi, > > Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? > > I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. > > Many thanks in advance. > > M Asghar > From bsmith at mcs.anl.gov Thu May 29 13:45:49 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 13:45:49 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> References: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> Message-ID: <613FC909-426E-48E1-AB49-3E0252809887@mcs.anl.gov> Ignore this email. I see Hong already did it a different way so you already have access to all this information. Barry On May 29, 2014, at 1:43 PM, Barry Smith wrote: > > We should add direct support for this. Thus beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C which is defined in dmumps_c.h > (versions also for single precision and complex). So what PETSc should provide in mumps.c is a function something like > > #undef __FUNCT__ > #define __FUNCT__ "MatMUMPSGetStruc" > PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) > { > Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; > > PetscFunctionBegin; > *struc = (void *) mumps->id > PetscFunctionReturn(0); > } > so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled > Also add a prototype for this function in petscmat.h > > > Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. > > Let us know how it goes and we?ll get this stuff into the development version of PETSc. > > Barry > > > > > On May 29, 2014, at 10:38 AM, M Asghar wrote: > >> Hi, >> >> Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? >> >> I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. >> >> Many thanks in advance. >> >> M Asghar >> > From jroman at dsic.upv.es Thu May 29 13:51:32 2014 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Thu, 29 May 2014 20:51:32 +0200 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: <6680CB15-AAA4-4404-9567-5AFF0F651D57@dsic.upv.es> El 29/05/2014, a las 17:03, Giacomo Mulas escribi?: > Hi Jose, and list. > > I am in the process of writing the function to use with > EPSSetArbitrarySelection(). > > Inside it, I will need to take some given component (which one is included > in the info passed via ctx) of the eigenvector and square it. To do this, > since the eigenvector is not necessarily local, I will need to first do a > scatter to a local 1-component vector. So this would be like: > > ... some omitted machinery to cast the info from *ctx to more easily > accessible form... > > ierr = ISCreateStride(PETXC_COMM_WORLD,1,myindex,1,&is1_from);CHKERRQ(ierr); > ierr = ISCreateStride(PETSC_COMM_WORLD,1,0,1,&is1_to);CHKERRQ(ierr); > ierr = VecCreateSeq(PETSC_COMM_SELF, 1, &localx1);CHKERRQ(ierr); > ierr = VecScatterCreate(xr,is1_from,localx1,is1_to,&scatter1); CHKERRQ(ierr); > ierr = VecScatterBegin(scatter1,xr,localx1,INSERT_VALUES, > SCATTER_FORWARD); > ierr = VecScatterEnd(scatter1,xr,localx1,INSERT_VALUES, > SCATTER_FORWARD); > ierr = VecGetArray(localx1,&comp); > *rr = comp*comp; > ierr = VecRestoreArray(localx1, &comp); > ierr = VecDestroy(localx1); > ierr = VecScatterDestroy(&scatter1); > ierr = ISDestroy(&is1_from); > ierr = ISDestroy(&is1_to); > *ri = 0; > > ... some internal housekeeping omitted > > return 0; > > The questions are: > > 1) when the arbitrary function is called, is it called on all nodes > simultaneously, so that collective functions can be expected to work > properly, being called on all involved nodes at the same time? Should all > processes compute the *rr and *ri to be returned, and return the same value? > would it be more efficient to create a unit vector uv containing only one > nonzero component, and then use VecDot(xr, uv, &comp), instead of pulling > the component I need and squaring it as I did above? > Yes, all processes must have the same values. Use the code snippet proposed by Matt. > > 2) since the stride, the 1-component vector, the scatter are presumably the same through all calls within one EPSSolve, can I take them out of the arbitrary function, and make them available to it through *ctx? For this > to work, the structure of xr, the eigenvector passed to the arbitrary > function, must be known outside of EPSSolve. Yes. Internally, all vectors are basically cloned from a template vector created with MatGetVecs(A,xr,NULL) so you can do the same outside EPSSolve() to determine local sizes. Jose > > > Thanks, bye > Giacomo > > -- > _________________________________________________________________ > > Giacomo Mulas > _________________________________________________________________ > > INAF - Osservatorio Astronomico di Cagliari > via della scienza 5 - 09047 Selargius (CA) > > tel. +39 070 71180244 > mob. 
: +39 329 6603810 > _________________________________________________________________ > > "When the storms are raging around you, stay right where you are" > (Freddy Mercury) > _________________________________________________________________ From bsmith at mcs.anl.gov Thu May 29 14:12:00 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 14:12:00 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> Message-ID: You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. Barry On May 29, 2014, at 1:23 PM, Qin Lu wrote: > Hello, > > I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). > > For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. > > I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. > My questions are: > > 1. what is the bottle neck of the parallel run according to the summary? > 2. Do you have any suggestions to improve the parallel performance? > > Thanks a lot for your suggestions! > > Regards, > Qin From D.Lathouwers at tudelft.nl Thu May 29 14:49:41 2014 From: D.Lathouwers at tudelft.nl (Danny Lathouwers - TNW) Date: Thu, 29 May 2014 19:49:41 +0000 Subject: [petsc-users] rtol meaning In-Reply-To: References: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> Message-ID: <4E6B33F4128CED4DB307BA83146E9A64259081DE@SRV362.tudelft.net> Thanks Matt for your quick response. I got to believe that it was the relative ratio of the residual from the following petsc links: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetTolerances.html and http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPDefaultConverged.html#KSPDefaultConverged Perhaps these pages are outdated? Cheers, Danny. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu May 29 15:03:37 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 15:03:37 -0500 Subject: [petsc-users] rtol meaning In-Reply-To: <4E6B33F4128CED4DB307BA83146E9A64259081DE@SRV362.tudelft.net> References: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> <4E6B33F4128CED4DB307BA83146E9A64259081DE@SRV362.tudelft.net> Message-ID: <91B5A9C1-52F4-4E35-9841-29D396EDFCA4@mcs.anl.gov> Danny, The manual pages are a little sloppy and inconsistent. By default it uses ||b|| or || preconditioned b|| as the starting point. At the bottom of the badly formatted page you?ll see "- - rnorm_0 is the two norm of the right hand side. When initial guess is non-zero you can call KSPDefaultConvergedSetUIRNorm() to use the norm of (b - A*(initial guess)) as the starting point for relative norm convergence testing." Likely you want to call KSPDefaultConvergedSetUIRNorm if that is how you want to detect convergence. We?ll cleanup the manual pages, thanks for pointing out the confusion. Barry You can see the source code at http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/interface/iterativ.c.html#KSPDefaultConverged and confirm that what Matt said is correct. On May 29, 2014, at 2:49 PM, Danny Lathouwers - TNW wrote: > Thanks Matt for your quick response. > > I got to believe that it was the relative ratio of the residual from the following petsc links: > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetTolerances.html > and > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPDefaultConverged.html#KSPDefaultConverged > > Perhaps these pages are outdated? > > Cheers, > Danny. From lu_qin_2000 at yahoo.com Thu May 29 16:06:19 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 14:06:19 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> Message-ID: <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. ? The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of?p2 (143 sec) is a little?faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). It seems I?need a more efficient parallel preconditioner. Do you have any suggestions for that? Many thanks, Qin ----- Original Message ----- From: Barry Smith To: Qin Lu Cc: "petsc-users at mcs.anl.gov" Sent: Thursday, May 29, 2014 2:12 PM Subject: Re: [petsc-users] About parallel performance ? You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). ? Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? 
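Returning briefly to the rtol thread above: Barry's advice maps to a short code sketch. Assumptions: a KSP named ksp already exists, and petsc-3.4 function names are used (later releases call the routine KSPConvergedDefaultSetUIRNorm):

#include <petscksp.h>

PetscErrorCode UseInitialResidualForRtol(KSP ksp)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  /* rtol = 1e-3 as in the original question; leave atol/dtol/maxits at defaults */
  ierr = KSPSetTolerances(ksp, 1.e-3, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
  /* declare the initial guess nonzero and base the relative test on
     ||b - A*x0|| rather than ||b|| */
  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPDefaultConvergedSetUIRNorm(ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}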
If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. ? Barry On May 29, 2014, at 1:23 PM, Qin Lu wrote: > Hello, > > I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). > > For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >? > I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. > My questions are: >? > 1. what is the bottle neck of the parallel run according to the summary? > 2. Do you have any suggestions to improve the parallel performance? >? > Thanks a lot for your suggestions! >? > Regards, > Qin? ? From bsmith at mcs.anl.gov Thu May 29 16:17:28 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 16:17:28 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> Message-ID: <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can cd src/benchmarks/streams/ make MPIVersion mpiexec -n 1 ./MPIVersion mpiexec -n 2 ./MPIVersion and send all the results Barry On May 29, 2014, at 4:06 PM, Qin Lu wrote: > For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. > > The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). > > It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? > > Many thanks, > Qin > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu > Cc: "petsc-users at mcs.anl.gov" > Sent: Thursday, May 29, 2014 2:12 PM > Subject: Re: [petsc-users] About parallel performance > > > You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). 
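A sketch of the event logging suggested in the quoted text just above. The event name "MatFill" and the routine FillMatrix() are made-up placeholders; the event is registered once (for example right after PetscInitialize()) and then appears as its own row in the -log_summary output:

#include <petscmat.h>

static PetscLogEvent MAT_FILL_EVENT;   /* user-defined event, hypothetical name */

PetscErrorCode RegisterUserEvents(void)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscLogEventRegister("MatFill", MAT_CLASSID, &MAT_FILL_EVENT);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

PetscErrorCode FillMatrix(Mat A)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscLogEventBegin(MAT_FILL_EVENT, 0, 0, 0, 0);CHKERRQ(ierr);
  /* ... compute all the entries and call MatSetValues() here ... */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = PetscLogEventEnd(MAT_FILL_EVENT, 0, 0, 0, 0);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}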
> > Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. > > Barry > > > > > > On May 29, 2014, at 1:23 PM, Qin Lu wrote: > >> Hello, >> >> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >> >> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >> >> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >> My questions are: >> >> 1. what is the bottle neck of the parallel run according to the summary? >> 2. Do you have any suggestions to improve the parallel performance? >> >> Thanks a lot for your suggestions! >> >> Regards, >> Qin From hzhang at mcs.anl.gov Thu May 29 16:29:31 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 29 May 2014 16:29:31 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: <4e4cfc6e38dd40cf9e93cb3c5aa39258@NAGURSKI.anl.gov> References: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> <4e4cfc6e38dd40cf9e93cb3c5aa39258@NAGURSKI.anl.gov> Message-ID: Barry : > Ignore this email. I see Hong already did it a different way so you already have access to all this information. He asks * Will/does this have a Fortran equivalent? I'm not sure if the needed Fortran stubs are created automatically or we must create them manually? Hong > > On May 29, 2014, at 1:43 PM, Barry Smith wrote: > >> >> We should add direct support for this. Thus beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C which is defined in dmumps_c.h >> (versions also for single precision and complex). So what PETSc should provide in mumps.c is a function something like >> >> #undef __FUNCT__ >> #define __FUNCT__ "MatMUMPSGetStruc" >> PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) >> { >> Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; >> >> PetscFunctionBegin; >> *struc = (void *) mumps->id >> PetscFunctionReturn(0); >> } >> so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled >> Also add a prototype for this function in petscmat.h >> >> >> Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. >> >> Let us know how it goes and we?ll get this stuff into the development version of PETSc. >> >> Barry >> >> >> >> >> On May 29, 2014, at 10:38 AM, M Asghar wrote: >> >>> Hi, >>> >>> Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? >>> >>> I am working with SLEPc and am using MUMPS for the factorisation. 
I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. >>> >>> Many thanks in advance. >>> >>> M Asghar >>> >> > From bsmith at mcs.anl.gov Thu May 29 16:47:40 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 16:47:40 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: References: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> <4e4cfc6e38dd40cf9e93cb3c5aa39258@NAGURSKI.anl.gov> Message-ID: <644168A2-5E32-44AA-A402-F09B5851E395@mcs.anl.gov> On May 29, 2014, at 4:29 PM, Hong Zhang wrote: > Barry : >> Ignore this email. I see Hong already did it a different way so you already have access to all this information. > > He asks > * Will/does this have a Fortran equivalent? > > I'm not sure if the needed Fortran stubs are created automatically or > we must create them manually? You need to write manual pages for each of these functions and make sure they start with /*@ and end with @*/ then run make allfortranstubs and make sure they get generated. Barry > Hong > >> >> On May 29, 2014, at 1:43 PM, Barry Smith wrote: >> >>> >>> We should add direct support for this. Thus beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C which is defined in dmumps_c.h >>> (versions also for single precision and complex). So what PETSc should provide in mumps.c is a function something like >>> >>> #undef __FUNCT__ >>> #define __FUNCT__ "MatMUMPSGetStruc" >>> PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) >>> { >>> Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; >>> >>> PetscFunctionBegin; >>> *struc = (void *) mumps->id >>> PetscFunctionReturn(0); >>> } >>> so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled >>> Also add a prototype for this function in petscmat.h >>> >>> >>> Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. >>> >>> Let us know how it goes and we?ll get this stuff into the development version of PETSc. >>> >>> Barry >>> >>> >>> >>> >>> On May 29, 2014, at 10:38 AM, M Asghar wrote: >>> >>>> Hi, >>>> >>>> Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? >>>> >>>> I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. >>>> >>>> Many thanks in advance. 
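Pulling the MUMPS thread together: a hedged sketch of what querying INFOG from C can look like once a PETSc version providing MatMumpsGetInfog() is used (the master branch Hong points to, exercised in ex52.c). The factorization sequence and the choice of INFOG(1) are illustrative assumptions, not code from the thread:

#include <petscmat.h>

PetscErrorCode FactorWithMumpsAndCheck(Mat A)
{
  Mat            F;
  IS             isrow, iscol;
  MatFactorInfo  info;
  PetscInt       infog1;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A, MATORDERINGNATURAL, &isrow, &iscol);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A, isrow, iscol, &info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F, A, &info);CHKERRQ(ierr);

  /* MUMPS flags errors through a negative INFOG(1) */
  ierr = MatMumpsGetInfog(F, 1, &infog1);CHKERRQ(ierr);
  if (infog1 < 0) {
    ierr = PetscPrintf(PETSC_COMM_WORLD, "MUMPS returned INFOG(1) = %D\n", infog1);CHKERRQ(ierr);
  }
  ierr = ISDestroy(&isrow);CHKERRQ(ierr);
  ierr = ISDestroy(&iscol);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}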
>>>> >>>> M Asghar >>>> >>> >> From bsmith at mcs.anl.gov Thu May 29 16:54:45 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 16:54:45 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> Message-ID: <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. Barry On May 29, 2014, at 4:37 PM, Qin Lu wrote: > Barry, > > I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): > > ================= > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion > Number of MPI processes 1 > Function Rate (MB/s) > Copy: 21682.9932 > Scale: 21637.5509 > Add: 21583.0395 > Triad: 21504.6563 > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion > Number of MPI processes 2 > Function Rate (MB/s) > Copy: 21369.6976 > Scale: 21632.3203 > Add: 22203.7107 > Triad: 22305.1841 > ======================= > > Thanks a lot, > Qin > > From: Barry Smith > To: Qin Lu > Cc: "petsc-users at mcs.anl.gov" > Sent: Thursday, May 29, 2014 4:17 PM > Subject: Re: [petsc-users] About parallel performance > > > > You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can > > cd src/benchmarks/streams/ > > make MPIVersion > > mpiexec -n 1 ./MPIVersion > > mpiexec -n 2 ./MPIVersion > > and send all the results > > Barry > > > > On May 29, 2014, at 4:06 PM, Qin Lu wrote: > >> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >> >> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >> >> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >> >> Many thanks, >> Qin >> >> ----- Original Message ----- >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 2:12 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). 
>> >> Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >> >> Barry >> >> >> >> >> >> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >> >>> Hello, >>> >>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>> >>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>> >>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>> My questions are: >>> >>> 1. what is the bottle neck of the parallel run according to the summary? >>> 2. Do you have any suggestions to improve the parallel performance? >>> >>> Thanks a lot for your suggestions! >>> >>> Regards, >>> Qin From lu_qin_2000 at yahoo.com Thu May 29 17:15:47 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 15:15:47 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> Message-ID: <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> Barry, ? How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? ? The machine has?very new?Intel chips and is very for serial run.?What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2)?that was not built correctly? Many thanks, Qin ? ----- Original Message ----- From: Barry Smith To: Qin Lu ; petsc-users Cc: Sent: Thursday, May 29, 2014 4:54 PM Subject: Re: [petsc-users] About parallel performance ? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. ? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. ? Barry On May 29, 2014, at 4:37 PM, Qin Lu wrote: > Barry, > > I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? 
I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): > > ================= > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion > Number of MPI processes 1 > Function? ? ? Rate (MB/s) > Copy:? ? ? 21682.9932 > Scale:? ? ? 21637.5509 > Add:? ? ? ? 21583.0395 > Triad:? ? ? 21504.6563 > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion > Number of MPI processes 2 > Function? ? ? Rate (MB/s) > Copy:? ? ? 21369.6976 > Scale:? ? ? 21632.3203 > Add:? ? ? ? 22203.7107 > Triad:? ? ? 22305.1841 > ======================= > > Thanks a lot, > Qin > > From: Barry Smith > To: Qin Lu > Cc: "petsc-users at mcs.anl.gov" > Sent: Thursday, May 29, 2014 4:17 PM > Subject: Re: [petsc-users] About parallel performance > > > >? You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can > > cd? src/benchmarks/streams/ > > make MPIVersion > > mpiexec -n 1 ./MPIVersion > > mpiexec -n 2 ./MPIVersion > >? ? and send all the results > >? ? Barry > > > > On May 29, 2014, at 4:06 PM, Qin Lu wrote: > >> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>? >> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >> >> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >> >> Many thanks, >> Qin >> >> ----- Original Message ----- >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 2:12 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >>? ? You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >> >>? ? Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >> >>? ? Barry >> >> >> >> >> >> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >> >>> Hello, >>> >>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>> >>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>? 
>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T.
>>> My questions are:
>>>
>>> 1. what is the bottle neck of the parallel run according to the summary?
>>> 2. Do you have any suggestions to improve the parallel performance?
>>>
>>> Thanks a lot for your suggestions!
>>>
>>> Regards,
>>> Qin

From lu_qin_2000 at yahoo.com Thu May 29 17:15:47 2014
From: lu_qin_2000 at yahoo.com (Qin Lu)
Date: Thu, 29 May 2014 15:15:47 -0700 (PDT)
Subject: [petsc-users] About parallel performance
In-Reply-To: <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov>
References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov>
Message-ID: <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com>

Barry,

How did you read the test results? For a machine good for parallelism, should the data of np=2 be about half of those of np=1?

The machine has very new Intel chips and is very fast for serial runs. What may cause the bad parallelism - the configuration of the machine, or the MPI lib (MPICH2) I am using not being built correctly?

Many thanks,
Qin

----- Original Message -----
From: Barry Smith
To: Qin Lu ; petsc-users
Cc:
Sent: Thursday, May 29, 2014 4:54 PM
Subject: Re: [petsc-users] About parallel performance

In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark.

But the single core memory bandwidth is pretty good so for problems that don't need parallelism you should get good performance.

Barry

On May 29, 2014, at 4:37 PM, Qin Lu wrote:

> Barry,
>
> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion?
If you are using petsc-3.4 you can > > > > cd src/benchmarks/streams/ > > > > make MPIVersion > > > > mpiexec -n 1 ./MPIVersion > > > > mpiexec -n 2 ./MPIVersion > > > > and send all the results > > > > Barry > > > > > > > > On May 29, 2014, at 4:06 PM, Qin Lu wrote: > > > >> For now I only care about the CPU of PETSc subroutines. I tried to add > PetscLogEventBegin/End and the results are consistent with the log_summary > attached in my first email. > >> > >> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs > are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between > p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little > faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 > sec). So the total CPU of PETSc subtroutines are about the same between p1 > and p2 (502 sec vs. 488 sec). > >> > >> It seems I need a more efficient parallel preconditioner. Do you have > any suggestions for that? > >> > >> Many thanks, > >> Qin > >> > >> ----- Original Message ----- > >> From: Barry Smith > >> To: Qin Lu > >> Cc: "petsc-users at mcs.anl.gov" > >> Sent: Thursday, May 29, 2014 2:12 PM > >> Subject: Re: [petsc-users] About parallel performance > >> > >> > >> You need to determine where the other 80% of the time is. My guess > it is in setting the values into the matrix each time. Use > PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code > that computes all the entries in the matrix and calls MatSetValues() and > MatAssemblyBegin/End(). > >> > >> Likely the reason the linear solver does not scale better is that > you have a machine with multiple cores that share the same memory bandwidth > and the first core is already using well over half the memory bandwidth so > the second core cannot be fully utilized since both cores have to wait for > data to arrive from memory. If you are using the development version of > PETSc you can run make streams NPMAX=2 from the PETSc root directory and > send this to us to confirm this. > >> > >> Barry > >> > >> > >> > >> > >> > >> On May 29, 2014, at 1:23 PM, Qin Lu wrote: > >> > >>> Hello, > >>> > >>> I implemented PETSc parallel linear solver in a program, the > implementation is basically the same as > /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, > and let PETSc partition the matrix through MatGetOwnershipRange. However, a > few tests shows the parallel solver is always a little slower the serial > solver (I have excluded the matrix generation CPU). > >>> > >>> For serial run I used PCILU as preconditioner; for parallel run, I > used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type > preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around > 200,000. > >>> > >>> I have used -log_summary to print out the performance summary as > attached (log_summary_p1 for serial run and log_summary_p2 for the run with > 2 processes). It seems the KSPSolve counts only for less than 20% of Global > %T. > >>> My questions are: > >>> > >>> 1. what is the bottle neck of the parallel run according to the > summary? > >>> 2. Do you have any suggestions to improve the parallel performance? > >>> > >>> Thanks a lot for your suggestions! > >>> > >>> Regards, > >>> Qin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lu_qin_2000 at yahoo.com Thu May 29 17:40:25 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 15:40:25 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> Message-ID: <1401403225.85082.YahooMailNeo@web160202.mail.bf1.yahoo.com> Is this determined by how the machine was built (which I can not do anything), or by how the MPI/meassge-passing?is configured at the cluster (which I can ask IT?people to modify)? - this machine is actually a node of a linux cluster. ? Thanks, Qin? ________________________________ From: Matthew Knepley To: Qin Lu Cc: Barry Smith ; petsc-users Sent: Thursday, May 29, 2014 5:27 PM Subject: Re: [petsc-users] About parallel performance On Thu, May 29, 2014 at 5:15 PM, Qin Lu wrote: Barry, >? >How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? Ideally, the numbers should be about twice as big for np = 2. ? >The machine has?very new?Intel chips and is very for serial run.?What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2)?that was not built correctly? > The cause is machine architecture. The memory bandwidth is only sufficient for one core. ? Thanks, ? ? ?Matt Many thanks, >Qin >? >----- Original Message ----- >From: Barry Smith >To: Qin Lu ; petsc-users >Cc: >Sent: Thursday, May 29, 2014 4:54 PM >Subject: Re: [petsc-users] About parallel performance > > >? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. > >? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. > >? ?Barry > > > > >On May 29, 2014, at 4:37 PM, Qin Lu wrote: > >> Barry, >> >> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >> >> ================= >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >> Number of MPI processes 1 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? ?21682.9932 >> Scale:? ? ? 21637.5509 >> Add:? ? ? ? 21583.0395 >> Triad:? ? ? 21504.6563 >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >> Number of MPI processes 2 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? ?21369.6976 >> Scale:? ? ? 21632.3203 >> Add:? ? ? ? 22203.7107 >> Triad:? ? ? 22305.1841 >> ======================= >> >> Thanks a lot, >> Qin >> >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 4:17 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> >>? ?You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >> >> cd? src/benchmarks/streams/ >> >> make MPIVersion >> >> mpiexec -n 1 ./MPIVersion >> >> mpiexec -n 2 ./MPIVersion >> >>? ? 
and send all the results >> >>? ? Barry >> >> >> >> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >> >>> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>>? >>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>> >>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >>> >>> Many thanks, >>> Qin >>> >>> ----- Original Message ----- >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 2:12 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>>? ? ?You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>> >>>? ? ?Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>> >>>? ? ?Barry >>> >>> >>> >>> >>> >>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>> >>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>>? >>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>> My questions are: >>>>? >>>> 1. what is the bottle neck of the parallel run according to the summary? >>>> 2. Do you have any suggestions to improve the parallel performance? >>>>? >>>> Thanks a lot for your suggestions! >>>>? >>>> Regards, >>>> Qin? ? ? ? ? > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu May 29 17:45:34 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 May 2014 17:45:34 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401403225.85082.YahooMailNeo@web160202.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> <1401403225.85082.YahooMailNeo@web160202.mail.bf1.yahoo.com> Message-ID: On Thu, May 29, 2014 at 5:40 PM, Qin Lu wrote: > Is this determined by how the machine was built (which I can not do > anything), or by how the MPI/meassge-passing is configured at the cluster > (which I can ask IT people to modify)? - this machine is actually a node of > a linux cluster. > It is determined by how the machine was built. Your best bet for scalability is to use one process per node. Thanks, Matt > > Thanks, > Qin > > *From:* Matthew Knepley > *To:* Qin Lu > *Cc:* Barry Smith ; petsc-users < > petsc-users at mcs.anl.gov> > *Sent:* Thursday, May 29, 2014 5:27 PM > *Subject:* Re: [petsc-users] About parallel performance > > On Thu, May 29, 2014 at 5:15 PM, Qin Lu wrote: > > Barry, > > How did you read the test results? For a machine good for parallism, > should the data of np=2 be about half of the those of np=1? > > > Ideally, the numbers should be about twice as big for np = 2. > > > > The machine has very new Intel chips and is very for serial run. What may > cause the bad parallism? - the configurations of the machine, or I am using > a MPI lib (MPICH2) that was not built correctly? > > > The cause is machine architecture. The memory bandwidth is only sufficient > for one core. > > Thanks, > > Matt > > > > > Many thanks, > Qin > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu ; petsc-users > Cc: > Sent: Thursday, May 29, 2014 4:54 PM > Subject: Re: [petsc-users] About parallel performance > > > In that PETSc version BasicVersion is actually the MPI streams benchmark > so you ran the right thing. Your machine is totally worthless for sparse > linear algebra parallelism. The entire memory bandwidth is used by the > first core so adding the second core to the computation gives you no > improvement at all in the streams benchmark. > > But the single core memory bandwidth is pretty good so for problems that > don?t need parallelism you should get good performance. > > Barry > > > > > On May 29, 2014, at 4:37 PM, Qin Lu wrote: > > > Barry, > > > > I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean > BasicVersion? 
I built and ran it (if you did mean MPIVersion, I will get > PETSc-3.4 later): > > > > ================= > > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion > > Number of MPI processes 1 > > Function Rate (MB/s) > > Copy: 21682.9932 > > Scale: 21637.5509 > > Add: 21583.0395 > > Triad: 21504.6563 > > [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion > > Number of MPI processes 2 > > Function Rate (MB/s) > > Copy: 21369.6976 > > Scale: 21632.3203 > > Add: 22203.7107 > > Triad: 22305.1841 > > ======================= > > > > Thanks a lot, > > Qin > > > > From: Barry Smith > > To: Qin Lu > > Cc: "petsc-users at mcs.anl.gov" > > Sent: Thursday, May 29, 2014 4:17 PM > > Subject: Re: [petsc-users] About parallel performance > > > > > > > > You need to run the streams benchmarks are one and two processes to > see how the memory bandwidth changes. If you are using petsc-3.4 you can > > > > cd src/benchmarks/streams/ > > > > make MPIVersion > > > > mpiexec -n 1 ./MPIVersion > > > > mpiexec -n 2 ./MPIVersion > > > > and send all the results > > > > Barry > > > > > > > > On May 29, 2014, at 4:06 PM, Qin Lu wrote: > > > >> For now I only care about the CPU of PETSc subroutines. I tried to add > PetscLogEventBegin/End and the results are consistent with the log_summary > attached in my first email. > >> > >> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs > are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between > p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little > faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 > sec). So the total CPU of PETSc subtroutines are about the same between p1 > and p2 (502 sec vs. 488 sec). > >> > >> It seems I need a more efficient parallel preconditioner. Do you have > any suggestions for that? > >> > >> Many thanks, > >> Qin > >> > >> ----- Original Message ----- > >> From: Barry Smith > >> To: Qin Lu > >> Cc: "petsc-users at mcs.anl.gov" > >> Sent: Thursday, May 29, 2014 2:12 PM > >> Subject: Re: [petsc-users] About parallel performance > >> > >> > >> You need to determine where the other 80% of the time is. My guess > it is in setting the values into the matrix each time. Use > PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code > that computes all the entries in the matrix and calls MatSetValues() and > MatAssemblyBegin/End(). > >> > >> Likely the reason the linear solver does not scale better is that > you have a machine with multiple cores that share the same memory bandwidth > and the first core is already using well over half the memory bandwidth so > the second core cannot be fully utilized since both cores have to wait for > data to arrive from memory. If you are using the development version of > PETSc you can run make streams NPMAX=2 from the PETSc root directory and > send this to us to confirm this. > >> > >> Barry > >> > >> > >> > >> > >> > >> On May 29, 2014, at 1:23 PM, Qin Lu wrote: > >> > >>> Hello, > >>> > >>> I implemented PETSc parallel linear solver in a program, the > implementation is basically the same as > /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, > and let PETSc partition the matrix through MatGetOwnershipRange. However, a > few tests shows the parallel solver is always a little slower the serial > solver (I have excluded the matrix generation CPU). 
> >>> > >>> For serial run I used PCILU as preconditioner; for parallel run, I > used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type > preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around > 200,000. > >>> > >>> I have used -log_summary to print out the performance summary as > attached (log_summary_p1 for serial run and log_summary_p2 for the run with > 2 processes). It seems the KSPSolve counts only for less than 20% of Global > %T. > >>> My questions are: > >>> > >>> 1. what is the bottle neck of the parallel run according to the > summary? > >>> 2. Do you have any suggestions to improve the parallel performance? > >>> > >>> Thanks a lot for your suggestions! > >>> > >>> Regards, > >>> Qin > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu May 29 17:46:08 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 29 May 2014 17:46:08 -0500 Subject: [petsc-users] About parallel performance In-Reply-To: <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> Message-ID: <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> For the parallel case a perfect machine would have twice the memory bandwidth when using 2 cores as opposed to 1 core. For yours it is almost exactly the same. The issue is not with the MPI or software. It depends on how many memory sockets there are and how they are shared by the various cores. As I said the initial memory bandwidth for one core 21,682. gigabytes per second is good so it is a very good sequential machine. Here are the results on my laptop Number of MPI processes 1 Process 0 Barrys-MacBook-Pro.local Function Rate (MB/s) Copy: 7928.7346 Scale: 8271.5103 Add: 11017.0430 Triad: 10843.9018 Number of MPI processes 2 Process 0 Barrys-MacBook-Pro.local Process 1 Barrys-MacBook-Pro.local Function Rate (MB/s) Copy: 13513.0365 Scale: 13516.7086 Add: 15455.3952 Triad: 15562.0822 ------------------------------------------------ np speedup 1 1.0 2 1.44 Note that the memory bandwidth is much lower than your machine but there is an increase in speedup from one to two cores because one core cannot utilize all the memory bandwidth. But even with two cores my laptop will be slower on PETSc then one core on your machine. 
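As a back-of-envelope illustration (a sketch, not part of the original exchange) of what the Triad rates quoted in this thread imply: the 12-bytes-per-nonzero figure below is a common rough model for an AIJ matrix-vector product (8-byte value plus 4-byte column index), an assumption rather than a measurement.

#include <stdio.h>

int main(void)
{
  /* Triad rates (MB/s) reported earlier in this thread for 1 and 2 processes */
  double triad_np1 = 21504.6563, triad_np2 = 22305.1841;
  /* rough model (assumption): an AIJ MatMult streams about 12 bytes per nonzero */
  double bytes_per_nonzero = 12.0;

  printf("STREAMS speedup, 1 -> 2 cores: %.2f\n", triad_np2/triad_np1);
  printf("bandwidth-limited MatMult estimate: roughly %.0f million nonzeros/s\n",
         triad_np1/bytes_per_nonzero);
  return 0;
}

With these numbers the speedup is about 1.04, consistent with the point that the second core adds essentially no usable memory bandwidth.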
Here is the performance on a workstation we have that has multiple CPUs and multiple memory sockets Number of MPI processes 1 Process 0 es Function Rate (MB/s) Copy: 13077.8260 Scale: 12867.1966 Add: 14637.6757 Triad: 14414.4478 Number of MPI processes 2 Process 0 es Process 1 es Function Rate (MB/s) Copy: 22663.3116 Scale: 22102.5495 Add: 25768.1550 Triad: 26076.0410 Number of MPI processes 3 Process 0 es Process 1 es Process 2 es Function Rate (MB/s) Copy: 27501.7610 Scale: 26971.2183 Add: 30433.3276 Triad: 31302.9396 Number of MPI processes 4 Process 0 es Process 1 es Process 2 es Process 3 es Function Rate (MB/s) Copy: 29302.3183 Scale: 30165.5295 Add: 34577.3458 Triad: 35195.8067 ------------------------------------------------ np speedup 1 1.0 2 1.81 3 2.17 4 2.44 Note that one core has a lower memory bandwidth than your machine but as I add more cores the memory bandwidth increases by a factor of 2.4 There is nothing wrong with your machine, it is just not suitable to run sparse linear algebra on multiple cores for it. Barry On May 29, 2014, at 5:15 PM, Qin Lu wrote: > Barry, > > How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? > > The machine has very new Intel chips and is very for serial run. What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2) that was not built correctly? > Many thanks, > Qin > > ----- Original Message ----- > From: Barry Smith > To: Qin Lu ; petsc-users > Cc: > Sent: Thursday, May 29, 2014 4:54 PM > Subject: Re: [petsc-users] About parallel performance > > > In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. > > But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. > > Barry > > > > > On May 29, 2014, at 4:37 PM, Qin Lu wrote: > >> Barry, >> >> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >> >> ================= >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >> Number of MPI processes 1 >> Function Rate (MB/s) >> Copy: 21682.9932 >> Scale: 21637.5509 >> Add: 21583.0395 >> Triad: 21504.6563 >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >> Number of MPI processes 2 >> Function Rate (MB/s) >> Copy: 21369.6976 >> Scale: 21632.3203 >> Add: 22203.7107 >> Triad: 22305.1841 >> ======================= >> >> Thanks a lot, >> Qin >> >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 4:17 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> >> You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >> >> cd src/benchmarks/streams/ >> >> make MPIVersion >> >> mpiexec -n 1 ./MPIVersion >> >> mpiexec -n 2 ./MPIVersion >> >> and send all the results >> >> Barry >> >> >> >> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >> >>> For now I only care about the CPU of PETSc subroutines. 
I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>> >>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>> >>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >>> >>> Many thanks, >>> Qin >>> >>> ----- Original Message ----- >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 2:12 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>> You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>> >>> Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory. If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>> >>> Barry >>> >>> >>> >>> >>> >>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>> >>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>> >>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>> My questions are: >>>> >>>> 1. what is the bottle neck of the parallel run according to the summary? >>>> 2. Do you have any suggestions to improve the parallel performance? >>>> >>>> Thanks a lot for your suggestions! 
>>>> >>>> Regards, >>>> Qin From lu_qin_2000 at yahoo.com Thu May 29 17:46:23 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 15:46:23 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> <1401403225.85082.YahooMailNeo@web160202.mail.bf1.yahoo.com> Message-ID: <1401403583.88868.YahooMailNeo@web160206.mail.bf1.yahoo.com> Thanks a lot! I will try that. ? Qin? ________________________________ From: Matthew Knepley To: Qin Lu Cc: Barry Smith ; petsc-users Sent: Thursday, May 29, 2014 5:45 PM Subject: Re: [petsc-users] About parallel performance On Thu, May 29, 2014 at 5:40 PM, Qin Lu wrote: Is this determined by how the machine was built (which I can not do anything), or by how the MPI/meassge-passing?is configured at the cluster (which I can ask IT?people to modify)? - this machine is actually a node of a linux cluster. It is determined by how the machine was built. Your best bet for scalability is to use one process per node. ? Thanks, ? ? ?Matt ? >Thanks, >Qin? > > > From: Matthew Knepley >To: Qin Lu >Cc: Barry Smith ; petsc-users >Sent: Thursday, May 29, 2014 5:27 PM >Subject: Re: [petsc-users] About parallel performance > > > >On Thu, May 29, 2014 at 5:15 PM, Qin Lu wrote: > >Barry, >>? >>How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? > > >Ideally, the numbers should be about twice as big for np = 2. > >? >>The machine has?very new?Intel chips and is very for serial run.?What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2)?that was not built correctly? >> > > >The cause is machine architecture. The memory bandwidth is only sufficient for one core. > > >? Thanks, > > >? ? ?Matt > > > > > >Many thanks, >>Qin >>? >>----- Original Message ----- >>From: Barry Smith >>To: Qin Lu ; petsc-users >>Cc: >>Sent: Thursday, May 29, 2014 4:54 PM >>Subject: Re: [petsc-users] About parallel performance >> >> >>? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. >> >>? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. >> >>? ?Barry >> >> >> >> >>On May 29, 2014, at 4:37 PM, Qin Lu wrote: >> >>> Barry, >>> >>> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >>> >>> ================= >>> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >>> Number of MPI processes 1 >>> Function? ? ? Rate (MB/s) >>> Copy:? ? ? ?21682.9932 >>> Scale:? ? ? 21637.5509 >>> Add:? ? ? ? 21583.0395 >>> Triad:? ? ? 21504.6563 >>> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >>> Number of MPI processes 2 >>> Function? ? ? Rate (MB/s) >>> Copy:? ? ? ?21369.6976 >>> Scale:? ? ? 21632.3203 >>> Add:? ? ? ? 
22203.7107 >>> Triad:? ? ? 22305.1841 >>> ======================= >>> >>> Thanks a lot, >>> Qin >>> >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 4:17 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>> >>>? ?You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >>> >>> cd? src/benchmarks/streams/ >>> >>> make MPIVersion >>> >>> mpiexec -n 1 ./MPIVersion >>> >>> mpiexec -n 2 ./MPIVersion >>> >>>? ? and send all the results >>> >>>? ? Barry >>> >>> >>> >>> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >>> >>>> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>>>? >>>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>>> >>>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? >>>> >>>> Many thanks, >>>> Qin >>>> >>>> ----- Original Message ----- >>>> From: Barry Smith >>>> To: Qin Lu >>>> Cc: "petsc-users at mcs.anl.gov" >>>> Sent: Thursday, May 29, 2014 2:12 PM >>>> Subject: Re: [petsc-users] About parallel performance >>>> >>>> >>>>? ? ?You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>>> >>>>? ? ?Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>>> >>>>? ? ?Barry >>>> >>>> >>>> >>>> >>>> >>>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>>> >>>>> Hello, >>>>> >>>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>>> >>>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>>>? >>>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>>> My questions are: >>>>>? >>>>> 1. what is the bottle neck of the parallel run according to the summary? >>>>> 2. 
Do you have any suggestions to improve the parallel performance? >>>>>? >>>>> Thanks a lot for your suggestions! >>>>>? >>>>> Regards, >>>>> Qin? ? ? ? ? >> > > > >-- >What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >-- Norbert Wiener > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lu_qin_2000 at yahoo.com Thu May 29 17:49:24 2014 From: lu_qin_2000 at yahoo.com (Qin Lu) Date: Thu, 29 May 2014 15:49:24 -0700 (PDT) Subject: [petsc-users] About parallel performance In-Reply-To: <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> References: <1401387833.49733.YahooMailNeo@web160205.mail.bf1.yahoo.com> <1401397579.60556.YahooMailNeo@web160203.mail.bf1.yahoo.com> <174C9433-D481-4F60-8360-9517FC684298@mcs.anl.gov> <1401399434.23310.YahooMailNeo@web160201.mail.bf1.yahoo.com> <3A4E76AD-8CE5-4048-AD7E-3087CFCFED42@mcs.anl.gov> <1401401747.849.YahooMailNeo@web160203.mail.bf1.yahoo.com> <7C5AB63F-4210-45E0-B4F1-1C9927D376EA@mcs.anl.gov> Message-ID: <1401403764.96816.YahooMailNeo@web160202.mail.bf1.yahoo.com> Barry, ? Thanks a lot for the info! I know now?what?was the problem.? ? Qin ________________________________ From: Barry Smith To: Qin Lu Cc: petsc-users Sent: Thursday, May 29, 2014 5:46 PM Subject: Re: [petsc-users] About parallel performance ? For the parallel case a perfect machine would have twice the memory bandwidth when using 2 cores as opposed to 1 core. For yours it is almost exactly the same. The issue is not with the MPI or software. It depends on how many memory sockets there are and how they are shared by the various cores. As I said the initial memory bandwidth for one core 21,682. gigabytes per second is good so it is a very good sequential machine. ? Here are the results on my laptop Number of MPI processes 1 Process 0 Barrys-MacBook-Pro.local Function? ? ? Rate (MB/s) Copy:? ? ? ? 7928.7346 Scale:? ? ? 8271.5103 Add:? ? ? ? 11017.0430 Triad:? ? ? 10843.9018 Number of MPI processes 2 Process 0 Barrys-MacBook-Pro.local Process 1 Barrys-MacBook-Pro.local Function? ? ? Rate (MB/s) Copy:? ? ? 13513.0365 Scale:? ? ? 13516.7086 Add:? ? ? ? 15455.3952 Triad:? ? ? 15562.0822 ------------------------------------------------ np? speedup 1 1.0 2 1.44 Note that the memory bandwidth is much lower than your machine but there is an increase in speedup from one to two cores because one core cannot utilize all the memory bandwidth. But even with two cores my laptop will be slower on PETSc then one core on your machine. Here is the performance on a workstation we have that has multiple CPUs and multiple memory sockets Number of MPI processes 1 Process 0 es Function? ? ? Rate (MB/s) Copy:? ? ? 13077.8260 Scale:? ? ? 12867.1966 Add:? ? ? ? 14637.6757 Triad:? ? ? 14414.4478 Number of MPI processes 2 Process 0 es Process 1 es Function? ? ? Rate (MB/s) Copy:? ? ? 22663.3116 Scale:? ? ? 22102.5495 Add:? ? ? ? 25768.1550 Triad:? ? ? 26076.0410 Number of MPI processes 3 Process 0 es Process 1 es Process 2 es Function? ? ? Rate (MB/s) Copy:? ? ? 27501.7610 Scale:? ? ? 26971.2183 Add:? ? ? ? 30433.3276 Triad:? ? ? 31302.9396 Number of MPI processes 4 Process 0 es Process 1 es Process 2 es Process 3 es Function? ? ? Rate (MB/s) Copy:? ? ? 29302.3183 Scale:? ? ? 30165.5295 Add:? 
? ? ? 34577.3458 Triad:? ? ? 35195.8067 ------------------------------------------------ np? speedup 1 1.0 2 1.81 3 2.17 4 2.44 Note that one core has a lower memory bandwidth than your machine but as I add more cores the memory bandwidth increases by a factor of 2.4 There is nothing wrong with your machine, it is just not suitable to run sparse linear algebra on multiple cores for it. ? Barry On May 29, 2014, at 5:15 PM, Qin Lu wrote: > Barry, >? > How did you read the test results? For a machine good for parallism, should the data of np=2 be about half of the those of np=1? >? > The machine has very new Intel chips and is very for serial run. What may cause the bad parallism? - the configurations of the machine, or I am using a MPI lib (MPICH2) that was not built correctly? > Many thanks, > Qin >? > ----- Original Message ----- > From: Barry Smith > To: Qin Lu ; petsc-users > Cc: > Sent: Thursday, May 29, 2014 4:54 PM > Subject: Re: [petsc-users] About parallel performance > > >? In that PETSc version BasicVersion is actually the MPI streams benchmark so you ran the right thing. Your machine is totally worthless for sparse linear algebra parallelism. The entire memory bandwidth is used by the first core so adding the second core to the computation gives you no improvement at all in the streams benchmark. > >? But the single core memory bandwidth is pretty good so for problems that don?t need parallelism you should get good performance. > >? ? Barry > > > > > On May 29, 2014, at 4:37 PM, Qin Lu wrote: > >> Barry, >> >> I have PETSc-3.4.2 and I didn't see MPIVersion there; do you mean BasicVersion? I built and ran it (if you did mean MPIVersion, I will get PETSc-3.4 later): >> >> ================= >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 1 ./BasicVersion >> Number of MPI processes 1 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? 21682.9932 >> Scale:? ? ? 21637.5509 >> Add:? ? ? ? 21583.0395 >> Triad:? ? ? 21504.6563 >> [/petsc-3.4.2-64bit/src/benchmarks/streams]$ mpiexec -n 2 ./BasicVersion >> Number of MPI processes 2 >> Function? ? ? Rate (MB/s) >> Copy:? ? ? 21369.6976 >> Scale:? ? ? 21632.3203 >> Add:? ? ? ? 22203.7107 >> Triad:? ? ? 22305.1841 >> ======================= >> >> Thanks a lot, >> Qin >> >> From: Barry Smith >> To: Qin Lu >> Cc: "petsc-users at mcs.anl.gov" >> Sent: Thursday, May 29, 2014 4:17 PM >> Subject: Re: [petsc-users] About parallel performance >> >> >> >>? ? You need to run the streams benchmarks are one and two processes to see how the memory bandwidth changes. If you are using petsc-3.4 you can >> >> cd? src/benchmarks/streams/ >> >> make MPIVersion >> >> mpiexec -n 1 ./MPIVersion >> >> mpiexec -n 2 ./MPIVersion >> >>? ? and send all the results >> >>? ? Barry >> >> >> >> On May 29, 2014, at 4:06 PM, Qin Lu wrote: >> >>> For now I only care about the CPU of PETSc subroutines. I tried to add PetscLogEventBegin/End and the results are consistent with the log_summary attached in my first email. >>>? >>> The CPU of MatSetValues and MatAssemblyBegin/End of both p1 and p2 runs are small (< 20 sec). The CPU of PCSetup/PCApply are about the same between p1 and p2 (~120 sec). The CPU of KSPSolve of p2 (143 sec) is a little faster than p1's (176 sec), but p2 spent more time in MatGetSubMatrice (43 sec). So the total CPU of PETSc subtroutines are about the same between p1 and p2 (502 sec vs. 488 sec). >>> >>> It seems I need a more efficient parallel preconditioner. Do you have any suggestions for that? 
>>> >>> Many thanks, >>> Qin >>> >>> ----- Original Message ----- >>> From: Barry Smith >>> To: Qin Lu >>> Cc: "petsc-users at mcs.anl.gov" >>> Sent: Thursday, May 29, 2014 2:12 PM >>> Subject: Re: [petsc-users] About parallel performance >>> >>> >>>? ? ? You need to determine where the other 80% of the time is. My guess it is in setting the values into the matrix each time. Use PetscLogEventRegister() and put a PetscLogEventBegin/End() around the code that computes all the entries in the matrix and calls MatSetValues() and MatAssemblyBegin/End(). >>> >>>? ? ? Likely the reason the linear solver does not scale better is that you have a machine with multiple cores that share the same memory bandwidth and the first core is already using well over half the memory bandwidth so the second core cannot be fully utilized since both cores have to wait for data to arrive from memory.? If you are using the development version of PETSc you can run make streams NPMAX=2 from the PETSc root directory and send this to us to confirm this. >>> >>>? ? ? Barry >>> >>> >>> >>> >>> >>> On May 29, 2014, at 1:23 PM, Qin Lu wrote: >>> >>>> Hello, >>>> >>>> I implemented PETSc parallel linear solver in a program, the implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ, and let PETSc partition the matrix through MatGetOwnershipRange. However, a few tests shows the parallel solver is always a little slower the serial solver (I have excluded the matrix generation CPU). >>>> >>>> For serial run I used PCILU as preconditioner; for parallel run, I used ASM with ILU(0) at each subblocks (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns are around 200,000. >>>>? >>>> I have used -log_summary to print out the performance summary as attached (log_summary_p1 for serial run and log_summary_p2 for the run with 2 processes). It seems the KSPSolve counts only for less than 20% of Global %T. >>>> My questions are: >>>>? >>>> 1. what is the bottle neck of the parallel run according to the summary? >>>> 2. Do you have any suggestions to improve the parallel performance? >>>>? >>>> Thanks a lot for your suggestions! >>>>? >>>> Regards, >>>> Qin? ? ? ? ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu May 29 21:29:15 2014 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Thu, 29 May 2014 21:29:15 -0500 Subject: [petsc-users] Accessing MUMPS INFOG values In-Reply-To: References: <55FE5E1B-8E00-4E92-99E8-3FC8BD1E37A6@mcs.anl.gov> <4e4cfc6e38dd40cf9e93cb3c5aa39258@NAGURSKI.anl.gov> Message-ID: >> He asks >> * Will/does this have a Fortran equivalent? >> >> I'm not sure if the needed Fortran stubs are created automatically or >> we must create them manually? > > You need to write manual pages for each of these functions and make sure they start with /*@ and end with @*/ then run make allfortranstubs and make sure they get generated. OK, I'll get this done. Hong >>> >>> On May 29, 2014, at 1:43 PM, Barry Smith wrote: >>> >>>> >>>> We should add direct support for this. Thus beasties are all stored by MUMPS in the data structure DMUMPS_STRUC_C which is defined in dmumps_c.h >>>> (versions also for single precision and complex). 
So what PETSc should provide in mumps.c is a function something like >>>> >>>> #undef __FUNCT__ >>>> #define __FUNCT__ "MatMUMPSGetStruc" >>>> PetscErrorCode MatMUMPSGetStruc(Mat A,void **struc) >>>> { >>>> Mat_MUMPS *mumps=(Mat_MUMPS*)A->spptr; >>>> >>>> PetscFunctionBegin; >>>> *struc = (void *) mumps->id >>>> PetscFunctionReturn(0); >>>> } >>>> so stick this function into src/mat/impls/aij/mpi/mumps/mumps.c run make at the root directory of PETSc to have PETSc libraries recompiled >>>> Also add a prototype for this function in petscmat.h >>>> >>>> >>>> Now your code would include dumps_c.h and then call MatMUMPSGetStruc() and then you can directly access any thing you like. >>>> >>>> Let us know how it goes and we?ll get this stuff into the development version of PETSc. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> On May 29, 2014, at 10:38 AM, M Asghar wrote: >>>> >>>>> Hi, >>>>> >>>>> Is it possible to access the contents of MUMPS array INFOG (and INFO, RINFOG etc) via the PETSc interface? >>>>> >>>>> I am working with SLEPc and am using MUMPS for the factorisation. I would like to access the contents of their INFOG array within our code particularly when an error occurs in order to determine whether any remedial action can be taken. The error code returned from PETSc is useful; any additional information from MUMPS that can be accessed from within ones code would be very helpful also. >>>>> >>>>> Many thanks in advance. >>>>> >>>>> M Asghar >>>>> >>>> >>> > From D.Lathouwers at tudelft.nl Fri May 30 03:11:02 2014 From: D.Lathouwers at tudelft.nl (Danny Lathouwers - TNW) Date: Fri, 30 May 2014 08:11:02 +0000 Subject: [petsc-users] rtol meaning In-Reply-To: <91B5A9C1-52F4-4E35-9841-29D396EDFCA4@mcs.anl.gov> References: <4E6B33F4128CED4DB307BA83146E9A6425908029@SRV362.tudelft.net> <4E6B33F4128CED4DB307BA83146E9A64259081DE@SRV362.tudelft.net>, <91B5A9C1-52F4-4E35-9841-29D396EDFCA4@mcs.anl.gov> Message-ID: Thanks for the clarification. Petsc saves me a lot of time not having to write the icc etc so i can live with these small issues very well. Petsc behaves as expected now. Danny Sent from my iPad > On 29 mei 2014, at 22:04, "Barry Smith" wrote: > > > Danny, > > The manual pages are a little sloppy and inconsistent. By default it uses ||b|| or || preconditioned b|| as the starting point. At the bottom of the badly formatted page you?ll see "- - rnorm_0 is the two norm of the right hand side. When initial guess is non-zero you can call KSPDefaultConvergedSetUIRNorm() to use the norm of (b - A*(initial guess)) as the starting point for relative norm convergence testing." > > Likely you want to call KSPDefaultConvergedSetUIRNorm if that is how you want to detect convergence. > > We?ll cleanup the manual pages, thanks for pointing out the confusion. > > Barry > > You can see the source code at http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/interface/iterativ.c.html#KSPDefaultConverged and confirm that what Matt said is correct. > > >> On May 29, 2014, at 2:49 PM, Danny Lathouwers - TNW wrote: >> >> Thanks Matt for your quick response. >> >> I got to believe that it was the relative ratio of the residual from the following petsc links: >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetTolerances.html >> and >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPDefaultConverged.html#KSPDefaultConverged >> >> Perhaps these pages are outdated? >> >> Cheers, >> Danny. 
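A minimal sketch of the call Barry points to above, assuming the petsc-3.4-era spelling KSPDefaultConvergedSetUIRNorm (later releases rename it KSPConvergedDefaultSetUIRNorm); it makes the default convergence test measure the residual relative to ||b - A*x0|| instead of ||b|| when a nonzero initial guess is supplied:

#include <petscksp.h>

int main(int argc, char **argv)
{
  KSP            ksp;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPDefaultConvergedSetUIRNorm(ksp);CHKERRQ(ierr);   /* rnorm_0 = ||b - A*x0|| */
  ierr = KSPSetTolerances(ksp, 1.0e-8, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
  /* ... KSPSetOperators(), KSPSetFromOptions(), KSPSolve() as usual ... */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}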
> From jed at jedbrown.org Thu May 29 19:27:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 30 May 2014 02:27:28 +0200 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> Message-ID: <87bnugyutr.fsf@jedbrown.org> Matthew Knepley writes: > There might be an easier way to do this: > PetscScalar val = 0.0, gval; > > VecGetOwnershipRange(xr, &low, &high); > if ((myindex >= low) && (myindex < high)) { > VecGetArray(localx1,&a); > val = a[myindex-low]; > VecRestoreArray(localx1, &a); > } > MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > > Now everyone has the value at myindex. Yes, but VecGetArray is collective so please don't do it quite this way. Instead, write VecGetArray(localx1,&a); if ((myindex >= low) && (myindex < high)) { val = a[myindex-low]; } VecRestoreArray(localx1, &a); MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jed at jedbrown.org Thu May 29 20:07:09 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 30 May 2014 03:07:09 +0200 Subject: [petsc-users] DIVERGED_INDEFINITE_PC in algebraic multigrid In-Reply-To: References: <53753C7B.8010201@uci.edu> <87y4y0uar8.fsf@jedbrown.org> <537A8335.4080702@uci.edu> <8761l1qu4x.fsf@jedbrown.org> <537A88AC.3060308@uci.edu> <871tvpqt8u.fsf@jedbrown.org> <538369C9.6010209@uci.edu> Message-ID: <87zji0xef6.fsf@jedbrown.org> Mark Adams writes: >> thank you for your input and sorry my late reply: I saw your email only >> now. >> By setting up the solver each time step you mean re-defining the KSP >> context every time? >> > > THe simplest thing is to just delete the object and create it again. THere > are "reset" methods that do the same thing semantically but it is probably > just easier to destroy the KSP object and recreate it and redo your setup > code. Mark, if PCReset (via KSPReset) does not produce the same behavior as destroying the KSP and recreating it, it is a bug. I think this is the case, but if it's not, it needs to be fixed. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jed at jedbrown.org Thu May 29 20:51:16 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 30 May 2014 03:51:16 +0200 Subject: [petsc-users] ExodusII In-Reply-To: References: Message-ID: <87ppiwxcdn.fsf@jedbrown.org> Baros Vladimir writes: > Necessary header files, can be found here: > https://code.google.com/p/msinttypes/ > > It contains necessary inttypes.h and stdint.h headers > I successfully used them to build exodus lib with Visual Studio. > > Can anyone enable the support for exodus in Windows? As I said in my reply to Pedro, stdint.h is system functionality that is none of PETSc's business to be installing. PETSc could add tests for those headers and if so, attempt to build Exodus.II. But upstream really has to at least claim to support it. (We are not Exodus.II developers. The --download-* functionality in PETSc configure is supposed to be a convenience only, but it generates a disproportionate support workload. Attempting to support a configuration that upstream does not support and that is not used by any active PETSc developer is inviting increased support workload and a poor user experience. 
You're always welcome to install Exodus.II or any other library yourself.) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From knepley at gmail.com Fri May 30 06:35:30 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 30 May 2014 06:35:30 -0500 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: <87bnugyutr.fsf@jedbrown.org> References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> <87bnugyutr.fsf@jedbrown.org> Message-ID: On Thu, May 29, 2014 at 7:27 PM, Jed Brown wrote: > Matthew Knepley writes: > > There might be an easier way to do this: > > PetscScalar val = 0.0, gval; > > > > VecGetOwnershipRange(xr, &low, &high); > > if ((myindex >= low) && (myindex < high)) { > > VecGetArray(localx1,&a); > > val = a[myindex-low]; > > VecRestoreArray(localx1, &a); > > } > > MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > > > > Now everyone has the value at myindex. > > Yes, but VecGetArray is collective so please don't do it quite this way. > Instead, write > > VecGetArray(localx1,&a); > if ((myindex >= low) && (myindex < high)) { > val = a[myindex-low]; > } > VecRestoreArray(localx1, &a); > MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > I think its better to use the non-collective version: VecGetOwnershipRange(xr, &low, &high); if ((myindex >= low) && (myindex < high)) { VecGetArrayRead(xr,&a); val = a[myindex-low]; VecRestoreArrayRead(xr, &a); } MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); Thanks Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmulas at oa-cagliari.inaf.it Fri May 30 07:42:07 2014 From: gmulas at oa-cagliari.inaf.it (Giacomo Mulas) Date: Fri, 30 May 2014 14:42:07 +0200 (CEST) Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> <87bnugyutr.fsf@jedbrown.org> Message-ID: On Fri, 30 May 2014, Matthew Knepley wrote: > I think its better to use the non-collective version: > > VecGetOwnershipRange(xr, &low, &high); > if ((myindex >= low) && (myindex < high)) { > ? ?VecGetArrayRead(xr,&a); > ? ?val = a[myindex-low]; > ? ?VecRestoreArrayRead(xr, &a); > } > MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); I agree, and I used the above now. In any case, as it came out of this discussion, may I suggest that the man page of EPSSetArbitrarySelection() should document that the arbitrary selection user-defined function is collective, i.e. it is called on all nodes in PETSC_COMM_WORLD (and is thus an implicit MPI syncronisation point)? As it is now, if one just looks at the docs this is unclear, and it is consequently unclear also if one may use collective calls inside that user-defined function (the answer is yes, from this discussion). One may argue that it must be so, since the user-defined function can use the eigenvectors which by definition may be nonlocal, but making this explicit would not hurt. 
Giacomo -- _________________________________________________________________ Giacomo Mulas _________________________________________________________________ INAF - Osservatorio Astronomico di Cagliari via della scienza 5 - 09047 Selargius (CA) tel. +39 070 71180244 mob. : +39 329 6603810 _________________________________________________________________ "When the storms are raging around you, stay right where you are" (Freddy Mercury) _________________________________________________________________ From hus003 at ucsd.edu Sat May 31 00:55:14 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 05:55:14 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> I'm looking at snes example ex19.c, on "nonlinear driven cavity multigrid 2d. You can also access it via the website ( http://acts.nersc.gov/petsc/example3/ex19.c.html ) There are three user defined local functions ( FormFunctionLocal, FormFunctionLocali, FormFunctionLocali4 ) that serves as discretized PDE operators declared before main, and is defined right after main. In the middle of the main, there are these four lines: 1. ierr = DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal);CHKERRQ(ierr); 2. ierr = DMMGSetFromOptions(dmmg);CHKERRQ(ierr); 3. ierr = DMMGSetSNESLocali(dmmg,FormFunctionLocali,0,admf_FormFunctionLocali);CHKERRQ(ierr); 4. ierr = DMMGSetSNESLocalib(dmmg,FormFunctionLocali4,0,admfb_FormFunctionLocali4);CHKERRQ(ierr); I have the following questions: 1. What are ad_FormFunctionLocal, admf_FormFunctionLocal from line 1? They are not defined anywhere in ex19.c. Other terms such as admf_FormFunctionLocali and admfb_FormFunctionLocali4 are also not defined anywhere in the file. 2. To me it seems like DMMGSetSNESLocal, DMMGSetSNESLocali, and DMMGSetSNESLocalib evaluates the function for all grid points, for a single grid point and for a single degree of freedom, respectively. But how does the process choose which one to use? Thanks! - Hui -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat May 31 03:43:27 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 31 May 2014 10:43:27 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ppiutk28.fsf@jedbrown.org> "Sun, Hui" writes: > I'm looking at snes example ex19.c, on "nonlinear driven cavity multigrid 2d. You can also access it via the website ( http://acts.nersc.gov/petsc/example3/ex19.c.html ) These are not the same version. The acts.nersc.gov link is a very old version of that example. The source-transformation algorithmic differentiation tool ADIC is being used to compute derivatives (the ad_* and admf_* functions). ADIC is not maintained, has an unfortunate license, and was not widely used so the "automatic" support has been removed from PETSc. Please look at the current version, which has neither DMMG (removed/merged into SNES some years ago) or ADIC. > There are three user defined local functions ( FormFunctionLocal, FormFunctionLocali, FormFunctionLocali4 ) that serves as discretized PDE operators declared before main, and is defined right after main. In the middle of the main, there are these four lines: > > 1. 
ierr = DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal);CHKERRQ(ierr); > 2. ierr = DMMGSetFromOptions(dmmg);CHKERRQ(ierr); > 3. ierr = DMMGSetSNESLocali(dmmg,FormFunctionLocali,0,admf_FormFunctionLocali);CHKERRQ(ierr); > 4. ierr = DMMGSetSNESLocalib(dmmg,FormFunctionLocali4,0,admfb_FormFunctionLocali4);CHKERRQ(ierr); > > I have the following questions: > > 1. What are ad_FormFunctionLocal, admf_FormFunctionLocal from line 1? They are not defined anywhere in ex19.c. Other terms such as admf_FormFunctionLocali and admfb_FormFunctionLocali4 are also not defined anywhere in the file. > > 2. To me it seems like DMMGSetSNESLocal, DMMGSetSNESLocali, and DMMGSetSNESLocalib evaluates the function for all grid points, for a single grid point and for a single degree of freedom, respectively. But how does the process choose which one to use? > > Thanks! - Hui -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jroman at dsic.upv.es Sat May 31 04:46:50 2014 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 31 May 2014 11:46:50 +0200 Subject: [petsc-users] question about arbitrary eigenvector selection in SLEPC In-Reply-To: References: <5714C5D2-BEFB-49F8-8762-A81DC8849014@dsic.upv.es> <87bnugyutr.fsf@jedbrown.org> Message-ID: El 30/05/2014, a las 14:42, Giacomo Mulas escribi?: > On Fri, 30 May 2014, Matthew Knepley wrote: > >> I think its better to use the non-collective version: >> VecGetOwnershipRange(xr, &low, &high); >> if ((myindex >= low) && (myindex < high)) { >> VecGetArrayRead(xr,&a); >> val = a[myindex-low]; >> VecRestoreArrayRead(xr, &a); >> } >> MPI_Allreduce(&val, &gval, 1, MPIU_SCALAR, MPI_SUM, PETSC_COMM_WORLD); > > I agree, and I used the above now. In any case, as it came out of this > discussion, may I suggest that the man page of EPSSetArbitrarySelection() > should document that the arbitrary selection user-defined function is > collective, i.e. it is called on all nodes in PETSC_COMM_WORLD (and is thus > an implicit MPI syncronisation point)? As it is now, if one just looks at > the docs this is unclear, and it is consequently unclear also if one may use > collective calls inside that user-defined function (the answer is yes, from > this discussion). One may argue that it must be so, since the user-defined > function can use the eigenvectors which by definition may be nonlocal, but > making this explicit would not hurt. > > Giacomo > Done. https://bitbucket.org/slepc/slepc/commits/26293bc Thanks. Jose From hus003 at ucsd.edu Sat May 31 08:27:38 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 13:27:38 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <87ppiutk28.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87ppiutk28.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> Thank you Jed. The version I was using is 3.1, it is too old. ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, May 31, 2014 1:43 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c "Sun, Hui" writes: > I'm looking at snes example ex19.c, on "nonlinear driven cavity multigrid 2d. 
You can also access it via the website ( http://acts.nersc.gov/petsc/example3/ex19.c.html ) These are not the same version. The acts.nersc.gov link is a very old version of that example. The source-transformation algorithmic differentiation tool ADIC is being used to compute derivatives (the ad_* and admf_* functions). ADIC is not maintained, has an unfortunate license, and was not widely used so the "automatic" support has been removed from PETSc. Please look at the current version, which has neither DMMG (removed/merged into SNES some years ago) or ADIC. > There are three user defined local functions ( FormFunctionLocal, FormFunctionLocali, FormFunctionLocali4 ) that serves as discretized PDE operators declared before main, and is defined right after main. In the middle of the main, there are these four lines: > > 1. ierr = DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal);CHKERRQ(ierr); > 2. ierr = DMMGSetFromOptions(dmmg);CHKERRQ(ierr); > 3. ierr = DMMGSetSNESLocali(dmmg,FormFunctionLocali,0,admf_FormFunctionLocali);CHKERRQ(ierr); > 4. ierr = DMMGSetSNESLocalib(dmmg,FormFunctionLocali4,0,admfb_FormFunctionLocali4);CHKERRQ(ierr); > > I have the following questions: > > 1. What are ad_FormFunctionLocal, admf_FormFunctionLocal from line 1? They are not defined anywhere in ex19.c. Other terms such as admf_FormFunctionLocali and admfb_FormFunctionLocali4 are also not defined anywhere in the file. > > 2. To me it seems like DMMGSetSNESLocal, DMMGSetSNESLocali, and DMMGSetSNESLocalib evaluates the function for all grid points, for a single grid point and for a single degree of freedom, respectively. But how does the process choose which one to use? > > Thanks! - Hui From mairhofer at itt.uni-stuttgart.de Sat May 31 10:02:59 2014 From: mairhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Sat, 31 May 2014 17:02:59 +0200 Subject: [petsc-users] Customized Jacobi-Vector action approximation Message-ID: <5389EF23.30603@itt.uni-stuttgart.de> Hi all, I am using PETSc to solve a system of nonlinear equations arising from Density Functional Theory. Depending on the actual problem setup the residulas of the matrix-free linear solver (GMRES) stagnate and the nonlinear system converges only slowly. Besides preconditioning my second idea to improve the performance of the linear solver was to use a higher order approximation of the Jacobi-vector product. Therefore, I am trying to write a user defined subroutine that calculates the approximation of the matrix-free Jacobi-Vector product, i.e. I would like to have a routine which can replace the default 1st order approximation J(x)*v = (F(x+eps*v) - F(x) ) / eps for instance by a 2nd order approximation such as J(x)*v = (F(x+eps*v) - F(x-eps*v) ) / 2eps So assuming that I have a subroutine which claculates the approximation of J(x)*v, how do I get PETSc to use this result in the SNES solver? Thank you very much, Jonas From jed at jedbrown.org Sat May 31 10:11:31 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 31 May 2014 17:11:31 +0200 Subject: [petsc-users] Customized Jacobi-Vector action approximation In-Reply-To: <5389EF23.30603@itt.uni-stuttgart.de> References: <5389EF23.30603@itt.uni-stuttgart.de> Message-ID: <87mwdyt23g.fsf@jedbrown.org> Jonas Mairhofer writes: > Hi all, > > I am using PETSc to solve a system of nonlinear equations arising from > Density Functional Theory. 
Depending on the actual problem setup the > residuals of the matrix-free linear solver (GMRES) > stagnate and the nonlinear system converges only slowly. > Besides preconditioning my second idea to improve the performance of the > linear solver was to use a higher order approximation of the > Jacobi-vector product. Therefore, I am trying to write a user defined > subroutine that calculates the approximation of the matrix-free > Jacobi-Vector product, i.e. I would like to have a routine which can > replace the default 1st order approximation > > J(x)*v = (F(x+eps*v) - F(x) ) / eps > > for instance by a 2nd order approximation such as > > J(x)*v = (F(x+eps*v) - F(x-eps*v) ) / 2eps > > So assuming that I have a subroutine which calculates the approximation > of J(x)*v, how do I get PETSc to use this result in the SNES solver? Unless you are trying to add the centered difference code to the PETSc library, you should create a MatShell that computes the action by your formula. Note that the centered difference does not help with rounding error, so you'll likely want to use a larger step size (eps) and rely on the function having sufficient smoothness if you hope to achieve better accuracy than the one-sided difference. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat May 31 11:07:15 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 31 May 2014 11:07:15 -0500 Subject: [petsc-users] Customized Jacobi-Vector action approximation In-Reply-To: <5389EF23.30603@itt.uni-stuttgart.de> References: <5389EF23.30603@itt.uni-stuttgart.de> Message-ID: You might consider trying some of the non-Newton based nonlinear solvers now available in the development version of PETSc http://www.mcs.anl.gov/petsc/developers/index.html Here is a list of them; see their manual pages for more details -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- On May 31, 2014, at 10:02 AM, Jonas Mairhofer wrote: > Hi all, > > I am using PETSc to solve a system of nonlinear equations arising from Density Functional Theory. Depending on the actual problem setup the residuals of the matrix-free linear solver (GMRES) > stagnate and the nonlinear system converges only slowly. > Besides preconditioning my second idea to improve the performance of the linear solver was to use a higher order approximation of the Jacobi-vector product. Therefore, I am trying to write a user defined subroutine that calculates the approximation of the matrix-free Jacobi-Vector product, i.e. I would like to have a routine which can replace the default 1st order approximation > > J(x)*v = (F(x+eps*v) - F(x) ) / eps > > for instance by a 2nd order approximation such as > > J(x)*v = (F(x+eps*v) - F(x-eps*v) ) / 2eps > > So assuming that I have a subroutine which calculates the approximation of J(x)*v, how do I get PETSc to use this result in the SNES solver?
> > Thank you very much, > Jonas From hus003 at ucsd.edu Sat May 31 12:46:48 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 17:46:48 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87ppiutk28.fsf@jedbrown.org>, <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> Continue this topic. Right now I'm looking at ex19.c from PETSc v3.3 and v3.4, both containing a user defined function NonlinearGS(SNES, Vec, Vec, void*), I'm wondering why the arguments are not passed by reference or pointers? Will a copy been made for the first three arguments once NonlinearGS is called? -Hui ________________________________________ From: Sun, Hui Sent: Saturday, May 31, 2014 6:27 AM To: Jed Brown; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c Thank you Jed. The version I was using is 3.1, it is too old. ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, May 31, 2014 1:43 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c "Sun, Hui" writes: > I'm looking at snes example ex19.c, on "nonlinear driven cavity multigrid 2d. You can also access it via the website ( http://acts.nersc.gov/petsc/example3/ex19.c.html ) These are not the same version. The acts.nersc.gov link is a very old version of that example. The source-transformation algorithmic differentiation tool ADIC is being used to compute derivatives (the ad_* and admf_* functions). ADIC is not maintained, has an unfortunate license, and was not widely used so the "automatic" support has been removed from PETSc. Please look at the current version, which has neither DMMG (removed/merged into SNES some years ago) or ADIC. > There are three user defined local functions ( FormFunctionLocal, FormFunctionLocali, FormFunctionLocali4 ) that serves as discretized PDE operators declared before main, and is defined right after main. In the middle of the main, there are these four lines: > > 1. ierr = DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,ad_FormFunctionLocal,admf_FormFunctionLocal);CHKERRQ(ierr); > 2. ierr = DMMGSetFromOptions(dmmg);CHKERRQ(ierr); > 3. ierr = DMMGSetSNESLocali(dmmg,FormFunctionLocali,0,admf_FormFunctionLocali);CHKERRQ(ierr); > 4. ierr = DMMGSetSNESLocalib(dmmg,FormFunctionLocali4,0,admfb_FormFunctionLocali4);CHKERRQ(ierr); > > I have the following questions: > > 1. What are ad_FormFunctionLocal, admf_FormFunctionLocal from line 1? They are not defined anywhere in ex19.c. Other terms such as admf_FormFunctionLocali and admfb_FormFunctionLocali4 are also not defined anywhere in the file. > > 2. To me it seems like DMMGSetSNESLocal, DMMGSetSNESLocali, and DMMGSetSNESLocalib evaluates the function for all grid points, for a single grid point and for a single degree of freedom, respectively. But how does the process choose which one to use? > > Thanks! 
- Hui From jed at jedbrown.org Sat May 31 12:52:06 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 31 May 2014 19:52:06 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ha45u989.fsf@jedbrown.org> "Sun, Hui" writes: > Continue this topic. Right now I'm looking at ex19.c from PETSc v3.3 > and v3.4, both containing a user defined function NonlinearGS(SNES, > Vec, Vec, void*), I'm wondering why the arguments are not passed by > reference or pointers? All PETSc objects (like SNES, Vec, etc.) are pointers to private structures. typedef struct _p_Vec *Vec; You cannot dereference the pointer because the implementation is private, but it is passed around as a pointer. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jonasmairhofer86 at gmail.com Sat May 31 14:02:47 2014 From: jonasmairhofer86 at gmail.com (Jonas Mairhofer) Date: Sat, 31 May 2014 21:02:47 +0200 Subject: [petsc-users] Customized Jacobi-Vector action approximation In-Reply-To: References: <5389EF23.30603@itt.uni-stuttgart.de> Message-ID: Thank you both for your fast answers! I agree, that it might not make a big difference using the centered difference formula, but just to get it from my list of things that could help I will try and implement it. I don't understand how I could miss this forum discussion when I was looking for a way to implement this all day yesterday, but the second link I got now from google typing "petsc MatShell" is a long discussion you had with another user on exactly what I want to do :) Just in case anyone else is looking for the same thing : http://lists.mcs.anl.gov/pipermail/petsc-users/2010-August/006821.html On Sat, May 31, 2014 at 6:07 PM, Barry Smith wrote: > > You might consider trying some of the non-Newton based nonlinear solvers > now available in the development version of PETSc > http://www.mcs.anl.gov/petsc/developers/index.html Here is a list of > them see their manual pages for more details > > > > > > On May 31, 2014, at 10:02 AM, Jonas Mairhofer < > mairhofer at itt.uni-stuttgart.de> wrote: > > > Hi all, > > > > I am using PETSc to solve a system of nonlinear equations arising from > Density Functional Theory. Depending on the actual problem setup the > residulas of the matrix-free linear solver (GMRES) > > stagnate and the nonlinear system converges only slowly. > > Besides preconditioning my second idea to improve the performance of the > linear solver was to use a higher order approximation of the Jacobi-vector > product. Therefore, I am trying to write a user defined subroutine that > calculates the approximation of the matrix-free Jacobi-Vector product, i.e. > I would like to have a routine which can replace the default 1st order > approximation > > > > J(x)*v = (F(x+eps*v) - F(x) ) / eps > > > > for instance by a 2nd order approximation such as > > > > J(x)*v = (F(x+eps*v) - F(x-eps*v) ) / 2eps > > > > So assuming that I have a subroutine which claculates the approximation > of J(x)*v, how do I get PETSc to use this result in the SNES solver? 
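A minimal sketch of the MatShell route suggested above, for later readers of this thread; it is not from the original posts. The residual routine pointer, the MFCtx context struct and the fixed step eps are assumptions, error handling is abbreviated, and the commented setup calls use 3.4-era names.

#include <petscsnes.h>

typedef struct {
  Vec       x, w, Fminus;   /* current Newton point and two work vectors            */
  PetscReal eps;            /* differencing step, chosen by the user                */
  SNES      snes;
  void      *userctx;
  PetscErrorCode (*F)(SNES,Vec,Vec,void*);  /* the user's residual routine (assumed) */
} MFCtx;

/* y = J(x)*v approximated by the centered difference (F(x+eps*v) - F(x-eps*v))/(2*eps) */
static PetscErrorCode MatMult_CenteredMF(Mat A,Vec v,Vec y)
{
  MFCtx          *ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(A,(void**)&ctx);CHKERRQ(ierr);
  ierr = VecWAXPY(ctx->w,ctx->eps,v,ctx->x);CHKERRQ(ierr);               /* w = x + eps*v    */
  ierr = (*ctx->F)(ctx->snes,ctx->w,y,ctx->userctx);CHKERRQ(ierr);       /* y = F(x + eps*v) */
  ierr = VecWAXPY(ctx->w,-ctx->eps,v,ctx->x);CHKERRQ(ierr);              /* w = x - eps*v    */
  ierr = (*ctx->F)(ctx->snes,ctx->w,ctx->Fminus,ctx->userctx);CHKERRQ(ierr);
  ierr = VecAXPY(y,-1.0,ctx->Fminus);CHKERRQ(ierr);                      /* y = F(x+eps*v) - F(x-eps*v) */
  ierr = VecScale(y,1.0/(2.0*ctx->eps));CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Hook-up sketch: create the shell, register the multiply, and supply a Jacobian
   "compute" callback whose only job is to refresh ctx->x with the current Newton
   iterate (VecCopy); the exact SNESSetJacobian calling sequence differs between
   PETSc 3.4 and later releases, so it is only indicated here.

   ierr = MatCreateShell(PETSC_COMM_WORLD,nlocal,nlocal,N,N,&mfctx,&J);CHKERRQ(ierr);
   ierr = MatShellSetOperation(J,MATOP_MULT,(void(*)(void))MatMult_CenteredMF);CHKERRQ(ierr);
   ierr = SNESSetJacobian(snes,J,J,MyRefreshJacobianPoint,&mfctx);CHKERRQ(ierr);   (MyRefreshJacobianPoint is hypothetical) */

As far as I know, PETSc's built-in matrix-free operator (MatCreateSNESMF) only implements one-sided differencing with different strategies for choosing the step (-mat_mffd_type ds or wp), so a shell like this is the natural place for a centered or higher-order formula.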
> > > > Thank you very much, > > Jonas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hus003 at ucsd.edu Sat May 31 16:38:50 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 21:38:50 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <87ha45u989.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87ha45u989.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> Thank you Jed for explaining this to me. I tried to compile and run with the following options: ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason 1). I use 2 cores and get the following output: lid velocity = 100, prandtl # = 1, grashof # = 10000 0 SNES Function norm 1111.93 1 SNES Function norm 829.129 2 SNES Function norm 532.66 3 SNES Function norm 302.926 4 SNES Function norm 3.64014 5 SNES Function norm 0.0410053 6 SNES Function norm 4.57951e-06 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 Number of SNES iterations = 6 Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. 2). I use 8 cores and get the following output: lid velocity = 100, prandtl # = 1, grashof # = 10000 0 SNES Function norm 1111.93 1 SNES Function norm 829.049 2 SNES Function norm 532.616 3 SNES Function norm 303.165 4 SNES Function norm 3.93436 Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 Number of SNES iterations = 4 Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. First of all, the two runs yields different results. Secondly, the time cost comparison doesn't seem to be scaling correctly. ( I have used petsctime.h to calculate the time cost. ) Do you have any insight of what might be missing? -Hui ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, May 31, 2014 10:52 AM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c "Sun, Hui" writes: > Continue this topic. Right now I'm looking at ex19.c from PETSc v3.3 > and v3.4, both containing a user defined function NonlinearGS(SNES, > Vec, Vec, void*), I'm wondering why the arguments are not passed by > reference or pointers? All PETSc objects (like SNES, Vec, etc.) are pointers to private structures. typedef struct _p_Vec *Vec; You cannot dereference the pointer because the implementation is private, but it is passed around as a pointer. 
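A side note on the hand timing with petsctime.h mentioned in the message above; this is a sketch with assumed variable names, not code from the thread. A per-event breakdown from -log_summary (renamed -log_view in later releases) is usually more telling than a single wall-clock number, and a default debugging build can easily be several times slower than one configured with --with-debugging=0.

#include <petscsnes.h>
#include <petsctime.h>

PetscLogDouble t0,t1;
ierr = PetscTime(&t0);CHKERRQ(ierr);   /* PETSc 3.4 makes PetscTime a function taking a pointer; older releases provided a macro instead */
ierr = SNESSolve(snes,NULL,x);CHKERRQ(ierr);
ierr = PetscTime(&t1);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD,"SNESSolve wall time: %g s\n",(double)(t1-t0));CHKERRQ(ierr);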
From jed at jedbrown.org Sat May 31 16:48:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 31 May 2014 23:48:28 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87zjhxsjpv.fsf@jedbrown.org> "Sun, Hui" writes: > Thank you Jed for explaining this to me. I tried to compile and run with the following options: > ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason > > 1). I use 2 cores and get the following output: > lid velocity = 100, prandtl # = 1, grashof # = 10000 > 0 SNES Function norm 1111.93 > 1 SNES Function norm 829.129 > 2 SNES Function norm 532.66 > 3 SNES Function norm 302.926 > 4 SNES Function norm 3.64014 > 5 SNES Function norm 0.0410053 > 6 SNES Function norm 4.57951e-06 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 > Number of SNES iterations = 6 > Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. > > 2). I use 8 cores and get the following output: > lid velocity = 100, prandtl # = 1, grashof # = 10000 > 0 SNES Function norm 1111.93 > 1 SNES Function norm 829.049 > 2 SNES Function norm 532.616 > 3 SNES Function norm 303.165 > 4 SNES Function norm 3.93436 > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 > Number of SNES iterations = 4 > Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. > > First of all, the two runs yields different results. The linear solve did not converge in the second case. Run a more robust linear solver. These problems can get difficult, but I think -pc_type asm -sub_pc_type lu should be sufficient. > Secondly, the time cost comparison doesn't seem to be scaling correctly. > ( I have used petsctime.h to calculate the time cost. ) 1. Run in optimized mode. 2. Don't use more processes than you have cores (I don't know if this affects you). 3. This problem is too small to take advantage of much (if any) parallelism. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From hus003 at ucsd.edu Sat May 31 18:06:37 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 23:06:37 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <87zjhxsjpv.fsf@jedbrown.org> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87zjhxsjpv.fsf@jedbrown.org> Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> Thanks Jed. It converges now. With a 32 by 32 grid, it takes 3.1076 seconds on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 18.1767s and 55.0017s respectively. That seems quite reasonable. 
By the way, how do I know which matrix solver and which preconditioner is being called? Besides, I have another question: I try to program finite difference for 2D Stokes flow with Dirichlet or Neumann bdry conditions, using staggered MAC grid. I looked up all the examples in snes, there are three stokes flow examples, all of which are finite element. I was thinking about naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), then define u, v, p as three petscscalers on (i,j), but in that case u will have one more column than p and v will have one more row than p. If there is already something there in PETSc about MAC grid, then I don't have to worry about those details. Do you know any examples or references doing that? Hui ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, May 31, 2014 2:48 PM To: Sun, Hui; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c "Sun, Hui" writes: > Thank you Jed for explaining this to me. I tried to compile and run with the following options: > ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason > > 1). I use 2 cores and get the following output: > lid velocity = 100, prandtl # = 1, grashof # = 10000 > 0 SNES Function norm 1111.93 > 1 SNES Function norm 829.129 > 2 SNES Function norm 532.66 > 3 SNES Function norm 302.926 > 4 SNES Function norm 3.64014 > 5 SNES Function norm 0.0410053 > 6 SNES Function norm 4.57951e-06 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 > Number of SNES iterations = 6 > Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. > > 2). I use 8 cores and get the following output: > lid velocity = 100, prandtl # = 1, grashof # = 10000 > 0 SNES Function norm 1111.93 > 1 SNES Function norm 829.049 > 2 SNES Function norm 532.616 > 3 SNES Function norm 303.165 > 4 SNES Function norm 3.93436 > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 > Number of SNES iterations = 4 > Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. > > First of all, the two runs yields different results. The linear solve did not converge in the second case. Run a more robust linear solver. These problems can get difficult, but I think -pc_type asm -sub_pc_type lu should be sufficient. > Secondly, the time cost comparison doesn't seem to be scaling correctly. > ( I have used petsctime.h to calculate the time cost. ) 1. Run in optimized mode. 2. Don't use more processes than you have cores (I don't know if this affects you). 3. This problem is too small to take advantage of much (if any) parallelism. From jed at jedbrown.org Sat May 31 18:13:34 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 01 Jun 2014 01:13:34 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87sinpsfs1.fsf@jedbrown.org> "Sun, Hui" writes: > Thanks Jed. 
It converges now. With a 32 by 32 grid, it takes 3.1076 seconds on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 18.1767s and 55.0017s respectively. That seems quite reasonable. > > By the way, how do I know which matrix solver and which preconditioner is being called? -ksp_view (or -snes_view, which includes the same information once per nonlinear solve). > Besides, I have another question: I try to program finite difference > for 2D Stokes flow with Dirichlet or Neumann bdry conditions, using > staggered MAC grid. I looked up all the examples in snes, there are > three stokes flow examples, all of which are finite element. I was > thinking about naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), > then define u, v, p as three petscscalers on (i,j), but in that case u > will have one more column than p and v will have one more row than > p. If there is already something there in PETSc about MAC grid, then I > don't have to worry about those details. Do you know any examples or > references doing that? What you describe is a common approach. You set trivial "boundary conditions" for those silent dofs and otherwise ignore them. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat May 31 18:16:04 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 31 May 2014 18:16:04 -0500 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: On May 31, 2014, at 6:06 PM, Sun, Hui wrote: > Thanks Jed. It converges now. With a 32 by 32 grid, it takes 3.1076 seconds on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 18.1767s and 55.0017s respectively. That seems quite reasonable. > > By the way, how do I know which matrix solver and which preconditioner is being called? Run with -snes_view (or -ts_view if using the ODE integrators). > > Besides, I have another question: I try to program finite difference for 2D Stokes flow with Dirichlet or Neumann bdry conditions, using staggered MAC grid. I looked up all the examples in snes, there are three stokes flow examples, all of which are finite element. I was thinking about naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), then define u, v, p as three petscscalers on (i,j), but in that case u will have one more column than p and v will have one more row than p. If there is already something there in PETSc about MAC grid, then I don't have to worry about those details. Do you know any examples or references doing that? Unfortunately the DMDA is not ideal for this since it only supports the same number of dof at each grid point. You need to decouple the extra ?variables? and not use their values to do a MAC grid. 
For example, in two dimensions with u (velocity in the x direction), v (velocity in the y direction) and p (pressure at cell centers), and pure Dirichlet boundary conditions, create a DMDA with a dof of three and for each cell treat the first component of the cell as u (on the lower side of the cell), the second as v (on the left side of the cell) and the third as p (at the center of the cell). For the final row of cells across the top there is no v or p, just the u along the bottoms of the cells, and for the final row of cells along the right there is only a v. So make all the "extra" equations be simply f.v[i][j] = x.v[i][j] (or x.p or x.u depending on where) and put a 1 on the diagonal of that row/column of the Jacobian. Yes, it is a little annoyingly cumbersome. Barry > > Hui > > > > ________________________________________ > From: Jed Brown [jed at jedbrown.org] > Sent: Saturday, May 31, 2014 2:48 PM > To: Sun, Hui; petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c > > "Sun, Hui" writes: > >> Thank you Jed for explaining this to me. I tried to compile and run with the following options: >> ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason >> >> 1). I use 2 cores and get the following output: >> lid velocity = 100, prandtl # = 1, grashof # = 10000 >> 0 SNES Function norm 1111.93 >> 1 SNES Function norm 829.129 >> 2 SNES Function norm 532.66 >> 3 SNES Function norm 302.926 >> 4 SNES Function norm 3.64014 >> 5 SNES Function norm 0.0410053 >> 6 SNES Function norm 4.57951e-06 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 >> Number of SNES iterations = 6 >> Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. >> >> 2). I use 8 cores and get the following output: >> lid velocity = 100, prandtl # = 1, grashof # = 10000 >> 0 SNES Function norm 1111.93 >> 1 SNES Function norm 829.049 >> 2 SNES Function norm 532.616 >> 3 SNES Function norm 303.165 >> 4 SNES Function norm 3.93436 >> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 >> Number of SNES iterations = 4 >> Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. >> >> First of all, the two runs yields different results. > > The linear solve did not converge in the second case. > > Run a more robust linear solver. These problems can get difficult, but > I think -pc_type asm -sub_pc_type lu should be sufficient. > >> Secondly, the time cost comparison doesn't seem to be scaling correctly. >> ( I have used petsctime.h to calculate the time cost. ) > > 1. Run in optimized mode. > > 2. Don't use more processes than you have cores (I don't know if this > affects you). > > 3. This problem is too small to take advantage of much (if any) > parallelism.
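To make the layout Barry describes above concrete, here is a minimal sketch (not from the thread) of a residual over a dof-3 DMDA with the unused staggered components in the last row/column pinned by identity equations. The Field struct, the FormFunctionLocalMAC name and the elided interior stencil are placeholders, and the indexing already follows the f[j][i].v convention noted a little further down in the thread.

#include <petscdmda.h>

typedef struct { PetscScalar u,v,p; } Field;   /* one cell: u and v on two of its edges, p at its center */

/* Local residual in the style of the DMDA/SNES "local" callbacks; only the handling
   of the extra staggered unknowns is shown, the interior MAC stencil is elided. */
PetscErrorCode FormFunctionLocalMAC(DMDALocalInfo *info,Field **x,Field **f,void *ctx)
{
  PetscInt i,j;

  PetscFunctionBeginUser;
  for (j=info->ys; j<info->ys+info->ym; j++) {
    for (i=info->xs; i<info->xs+info->xm; i++) {
      if (j == info->my-1 && i == info->mx-1) {      /* top-right corner cell: no real unknowns live here */
        f[j][i].u = x[j][i].u; f[j][i].v = x[j][i].v; f[j][i].p = x[j][i].p;
      } else if (j == info->my-1) {                  /* top row: only the u on the cell bottom is real */
        f[j][i].u = 0.0;  /* momentum residual or boundary condition for that u, elided */
        f[j][i].v = x[j][i].v;                       /* silent dof: identity equation */
        f[j][i].p = x[j][i].p;                       /* silent dof: identity equation */
      } else if (i == info->mx-1) {                  /* right column: only the v on the cell left edge is real */
        f[j][i].u = x[j][i].u;
        f[j][i].v = 0.0;  /* momentum residual or boundary condition for that v, elided */
        f[j][i].p = x[j][i].p;
      } else {
        /* interior cell: standard MAC momentum and continuity residuals, elided */
        f[j][i].u = 0.0; f[j][i].v = 0.0; f[j][i].p = 0.0;
      }
    }
  }
  PetscFunctionReturn(0);
}

If an analytic Jacobian is assembled, the rows and columns belonging to the pinned components simply carry a 1 on the diagonal, exactly as described above.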
From hus003 at ucsd.edu Sat May 31 18:28:20 2014 From: hus003 at ucsd.edu (Sun, Hui) Date: Sat, 31 May 2014 23:28:20 +0000 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU>, <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU>, Message-ID: <7501CC2B7BBCC44A92ECEEC316170ECB6B760C@XMAIL-MBX-BH1.AD.UCSD.EDU> Thank you Jed and Barry for being very helpful answering all my questions! Right now, I have GMRES as the solver, how do I change it to BiCGStab? Best, Hui ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Saturday, May 31, 2014 4:16 PM To: Sun, Hui Cc: Jed Brown; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c On May 31, 2014, at 6:06 PM, Sun, Hui wrote: > Thanks Jed. It converges now. With a 32 by 32 grid, it takes 3.1076 seconds on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 18.1767s and 55.0017s respectively. That seems quite reasonable. > > By the way, how do I know which matrix solver and which preconditioner is being called? Run with -snes_view (or -ts_view if using the ODE integrators). > > Besides, I have another question: I try to program finite difference for 2D Stokes flow with Dirichlet or Neumann bdry conditions, using staggered MAC grid. I looked up all the examples in snes, there are three stokes flow examples, all of which are finite element. I was thinking about naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), then define u, v, p as three petscscalers on (i,j), but in that case u will have one more column than p and v will have one more row than p. If there is already something there in PETSc about MAC grid, then I don't have to worry about those details. Do you know any examples or references doing that? Unfortunately the DMDA is not ideal for this since it only supports the same number of dof at each grid point. You need to decouple the extra ?variables? and not use their values to do a MAC grid. For example in two dimensions with u (velocity in x direction), v (velocity in y direction) and p (pressure at cell centers), and pure Dirichlet boundary conditions then create a DMDA with a dof of three and for each cell treat the first component of the cell as u (on the lower side of cell) , the second as v (on left side of cell) and the third as p (on center of cell). For the final row of cells across the top there is no v or p, just the u along the bottoms of the cells and for the final row of cells along the right there is only a v. So make all the ?extra? equations be simply f.v[i][j] = x.v[i][j] (or x.p or x.u depending on where) and put a 1 on the diagonal of that row/column of the Jacobian). Yes it is a little annoyingly cumbersome. Barry > > Hui > > > > ________________________________________ > From: Jed Brown [jed at jedbrown.org] > Sent: Saturday, May 31, 2014 2:48 PM > To: Sun, Hui; petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c > > "Sun, Hui" writes: > >> Thank you Jed for explaining this to me. 
I tried to compile and run with the following options: >> ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 2 -snes_monitor_short -snes_converged_reason >> >> 1). I use 2 cores and get the following output: >> lid velocity = 100, prandtl # = 1, grashof # = 10000 >> 0 SNES Function norm 1111.93 >> 1 SNES Function norm 829.129 >> 2 SNES Function norm 532.66 >> 3 SNES Function norm 302.926 >> 4 SNES Function norm 3.64014 >> 5 SNES Function norm 0.0410053 >> 6 SNES Function norm 4.57951e-06 >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6 >> Number of SNES iterations = 6 >> Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s. >> >> 2). I use 8 cores and get the following output: >> lid velocity = 100, prandtl # = 1, grashof # = 10000 >> 0 SNES Function norm 1111.93 >> 1 SNES Function norm 829.049 >> 2 SNES Function norm 532.616 >> 3 SNES Function norm 303.165 >> 4 SNES Function norm 3.93436 >> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4 >> Number of SNES iterations = 4 >> Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s. >> >> First of all, the two runs yields different results. > > The linear solve did not converge in the second case. > > Run a more robust linear solver. These problems can get difficult, but > I think -pc_type asm -sub_pc_type lu should be sufficient. > >> Secondly, the time cost comparison doesn't seem to be scaling correctly. >> ( I have used petsctime.h to calculate the time cost. ) > > 1. Run in optimized mode. > > 2. Don't use more processes than you have cores (I don't know if this > affects you). > > 3. This problem is too small to take advantage of much (if any) > parallelism. From jed at jedbrown.org Sat May 31 18:29:18 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 01 Jun 2014 01:29:18 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ppitsf1t.fsf@jedbrown.org> Barry Smith writes: >So make all the ?extra? equations be simply f.v[i][j] = x.v[i][j] (or >x.p or x.u depending on where) and put a 1 on the diagonal of that >row/column of the Jacobian). This would typically be f[j][i].v = x[j][i].v, etc. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From jed at jedbrown.org Sat May 31 18:46:50 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 01 Jun 2014 01:46:50 +0200 Subject: [petsc-users] Question on DMMGSetSNESLocal from snes/example/tutorials/ex19.c In-Reply-To: <7501CC2B7BBCC44A92ECEEC316170ECB6B760C@XMAIL-MBX-BH1.AD.UCSD.EDU> References: <7501CC2B7BBCC44A92ECEEC316170ECB6B74C9@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ppiutk28.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B74E5@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B7501@XMAIL-MBX-BH1.AD.UCSD.EDU> <87ha45u989.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B7590@XMAIL-MBX-BH1.AD.UCSD.EDU> <87zjhxsjpv.fsf@jedbrown.org> <7501CC2B7BBCC44A92ECEEC316170ECB6B75D9@XMAIL-MBX-BH1.AD.UCSD.EDU> <7501CC2B7BBCC44A92ECEEC316170ECB6B760C@XMAIL-MBX-BH1.AD.UCSD.EDU> Message-ID: <87ha45se8l.fsf@jedbrown.org> "Sun, Hui" writes: > Thank you Jed and Barry for being very helpful answering all my > questions! Right now, I have GMRES as the solver, how do I change it > to BiCGStab? -ksp_type bcgs -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL:
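For completeness, the solver choices discussed in this thread can also be fixed in code rather than on the command line. A short sketch follows (the snes handle is assumed to exist already; the LU sub-solver for the ASM blocks is left to the options database):

KSP ksp;
PC  pc;
ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr);
ierr = KSPSetType(ksp,KSPBCGS);CHKERRQ(ierr);    /* BiCGStab, equivalent to -ksp_type bcgs */
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);        /* equivalent to -pc_type asm */
/* The LU sub-solver is easiest to request at run time, e.g.
   ./ex19 -ksp_type bcgs -pc_type asm -sub_pc_type lu -snes_monitor_short -snes_converged_reason */
ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);   /* call last so command-line options can still override */

Running with -snes_view (or -ksp_view) then confirms which Krylov method and preconditioner were actually used, as noted earlier in the thread.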